As part of our project, “Developing Association Mapping in Polyploid Perennial Biofuel Grasses” (DOE-USDA Plant Feedstock Genomics for Bioenergy Program grant DE-A102-07ER64454)*, we carried out two SNP discovery initiatives. The earlier one (2009) was based on EST sequences. The later one (2011-12) adopted a more powerful approach based on genotyping by sequencing (GBS). We believe that the SNP markers identified in these studies will greatly enhance breeding efforts that target the improvement of key biofuel traits and the development of new switchgrass cultivars.
SNP Discovery using GBS
Fei Lu designed the UNEAK pipeline for GBS. The pipeline can be applied to virtually any species, regardless of ploidy, heterozygosity, genome complexity, or the availability of a reference genome, and it was successfully applied to our switchgrass populations. A manuscript describing this research has been submitted for publication (May 2012).
To enable genome-wide association studies (GWAS) and genomic selection (GS) in switchgrass, we genotyped a full-sib population (n = 130), a half-sib population (n = 168) and association populations (66 populations, n = 540). The parents of the linkage populations are upland tetraploids. The association populations are primarily of the upland ecotype, both tetraploid and octoploid, with a few lowland tetraploids as well. A total of 350 GB of sequence was generated from 840 individuals using GBS. Over 1.2 million putative SNPs were discovered with the UNEAK pipeline. In addition, ultra-high-density paternal and maternal linkage maps, of 41K and 46K SNPs, respectively, were constructed based on the conserved synteny between switchgrass and foxtail millet.
The data can be accessed here:
EST-based SNP Discovery
Elhan Ersoz, in collaboration with Jasmyn L. Pangilinan (Joint Genome Institute) and Mark H. Wright (Cornell University), carried out an EST-based SNP discovery initiative in switchgrass. We believe that the SNP markers identified in this study will greatly enhance breeding efforts that target the improvement of key biofuel traits and the development of new switchgrass cultivars.
EST libraries were generated and sequenced from thirteen diverse switchgrass cultivars, representing both upland and lowland ecotypes, as well as tetraploid and octoploid genomes. This was followed by whole-genome, massively parallel sequencing with Illumina GA technology. The EST libraries were used to generate unigene clusters and establish a gene-space reference genome. The short sequence reads were then mapped to the reference sequences to identify single nucleotide polymorphisms (SNPs). A custom software program for alignment and SNP detection was used to identify over 149,000 SNPs across the 13 short-read sequencing libraries (SRSLs). An additional ~25K SNPs were also identified from the entire EST collection available for the species.
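The idea of mapping short reads to a reference and flagging polymorphic positions can be illustrated with a toy example. The sketch below is not the project's custom software, which is not described here. It assumes reads are already aligned without gaps, and the function name `call_snps` and the thresholds `min_depth` and `min_alt_fraction` are hypothetical choices made only for illustration.

```python
from collections import Counter, defaultdict


def call_snps(reference, aligned_reads, min_depth=4, min_alt_fraction=0.2):
    """Toy SNP caller (illustrative only).

    reference     -- reference sequence as a string of A/C/G/T
    aligned_reads -- list of (start, sequence) pairs; start is the 0-based
                     reference position of the first base of an ungapped read
    Returns a list of (position, ref_base, alt_base, alt_count, depth).
    """
    # Count the bases observed at every covered reference position.
    counts = defaultdict(Counter)
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference) and base in "ACGT":
                counts[pos][base] += 1

    snps = []
    for pos in sorted(counts):
        depth = sum(counts[pos].values())
        if depth < min_depth:
            continue  # too few reads to make a call at this position
        ref_base = reference[pos]
        for base, n in sorted(counts[pos].items()):
            # Report a non-reference allele seen in a large enough share of reads.
            if base != ref_base and n / depth >= min_alt_fraction:
                snps.append((pos, ref_base, base, n, depth))
    return snps


if __name__ == "__main__":
    reference = "ACGTACGT"
    reads = [(0, "ACGTACGT")] * 3 + [(0, "ACTTACGT")] * 2
    print(call_snps(reference, reads))
```

Real pipelines additionally handle base qualities, gapped alignments, and, in a polyploid such as switchgrass, the difficulty of separating true alleles from differences between duplicated genome copies. None of that is modeled in this sketch.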
The data files can be accessed here:
If you would like more information on how to use the data files, please click here
* Edward S. Buckler (USDA-ARS, Institute for Genomic Diversity, Cornell University), Principal Investigator; Michael D. Casler (USDA-ARS, U.S. Dairy Forage Research Center, Madison, WI) and Jerome H. Cherney (Cornell University), Co-Principal Investigators; Denise E. Costich, Project Coordinator (USDA-ARS, Institute for Genomic Diversity, Cornell University).