How does reducing library complexity through a targeted sequencing approach translate into reality? – Case studies.
Julian Dupuis, PhD – University of Hawaii
Highly Multiplex Amplicon-Based Phylogenomics Using CleanPlex Technology.
With over 4,900 species, ~200 of which are recognized as serious agricultural pests, fruit flies in the family Tephritidae are a systematically diverse and taxonomically difficult group. Four genera, Anastrepha, Bactrocera, Ceratitis, and Zeugodacus, include some of the most economically important pest species in the tropics and subtropics, and control, eradication, and quarantine efforts are employed across the world to combat these pests. The main taxonomic difficulties across these genera lie in groups of closely related species whose adults, and even more so larvae, are morphologically indistinct. Our goal was to develop a phylogenomic foundation for a rapid, straightforward tool for species identification of invasive tephritid fruit flies that are commonly detected in California, Florida, and South Texas. We developed a bioinformatic locus selection pipeline that takes advantage of a variety of genomic and transcriptomic data sources to identify phylogenetically informative, conserved exons in orthologous genes. Using Paragon Genomics’ CleanPlex technology, we targeted 878 conserved exons in highly multiplexed, single-tube reactions for hundreds of individuals across the four aforementioned genera. This approach yielded a phylogenomic dataset that far exceeded the phylogenetic resolution of existing datasets, containing >40,000 informative characters after reasonable filtering. The wet lab procedure and our analysis pipeline can process hundreds of individuals at a time and return taxonomic, and in some cases population-level, assignment in as few as three days from sample collection. Our approach provides a novel way to combine diverse genomic and transcriptomic data sources, particularly when at least one well-annotated data source is available, and can rapidly produce robust phylogenetic analyses for non-model systems that are scalable and cost-effective.
Scott Geib, PhD, USDA-ARS
Applications of Multiplex Amplicon Sequencing for Rapid Diagnostics Supporting Invasive Species Exclusion
The growth of global trade and transport has greatly increased the speed and frequency with which alien species are introduced to new habitats. When exposed to naïve ecosystems in introduced ranges, these species can quickly become serious agricultural pests and threaten native biodiversity. From a regulatory perspective, an additional and potentially underappreciated complication is that many of these alien species belong to highly diverse groups or complexes of closely related species. The taxonomy and systematics of these groups are often poorly understood, and species can be difficult or impossible to delimit using traditional taxonomic methods. The combination of these factors creates a herculean task for regulatory agencies that rely on robust species identifications to make regulatory decisions. Paragon CleanPlex amplicon sequencing technology is being applied to develop rapid diagnostic tools for species identification and source determination, using tephritid fruit flies as an example. Using a variety of whole genome, transcriptome, or reduced representation (e.g. ddRAD) datasets, we identified diagnostic targets in the genome and evaluated the efficacy of these markers for use in a diagnostic lab.
Kimberly Andrews, PhD, University of Idaho; University of Washington
Samuel S Hunter, PhD, Institute for Bioinformatics and Evolutionary Studies, University of Idaho
Amplicon Sequencing for Parentage Analysis of Mexican Gray Wolves Using Blood, Hair, and Fecal Samples
There is currently a need in the fields of ecological, evolutionary, and conservation genetics for an efficient, cost-effective, and flexible method to generate Illumina sequence data from tens to hundreds of markers across tens to thousands of samples. In addition, many studies require this method to be feasible for samples with low-quantity and low-quality DNA, such as museum samples or non-invasively collected samples like hair and feces. Paragon Genomics’ CleanPlex amplicon sequencing approach may be capable of filling this need with its optimized multiplex primer design, high flexibility in the number of loci and samples that can be run, and reasonable per-sample cost and effort. We tested whether this method would be effective for parentage analyses of Mexican gray wolves using blood, hair, and fecal samples. We first used RADseq data from 72 Mexican gray wolf blood samples to identify 424 SNPs that would be informative for parentage analysis. Paragon then designed multiplex primers for these SNPs, targeting short (105-175 bp) regions to accommodate the DNA degradation expected in fecal samples. We used these primers with the Paragon CleanPlex protocol to generate libraries for 87 blood and tissue samples, 12 hair samples, and 13 fecal samples (each fecal sample had a paired blood sample). Low-quality samples required special treatment and optimization. We then sequenced these libraries on an Illumina MiSeq. Overall, results were good, and amplicons were successfully produced for all sample types. Out of 424 primer pairs, only 28 failed to produce reads reliably, and only 6 samples failed to amplify well. Coverage variance between targets was reasonable, ensuring sufficient coverage for genotyping at almost all loci.
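To put the counts in this abstract on a common scale, the short Python sketch below converts them into success rates. It uses only the figures quoted above. The variable and function names are illustrative, and it assumes the 6 poorly amplifying samples are counted out of all libraries prepared (blood and tissue, hair, and fecal combined), which the abstract does not state explicitly.

```python
# Counts quoted in the abstract above.
PRIMER_PAIRS_DESIGNED = 424
PRIMER_PAIRS_FAILED = 28
LIBRARIES = {"blood_and_tissue": 87, "hair": 12, "fecal": 13}
SAMPLES_FAILED = 6  # assumption: counted out of all libraries combined


def success_rate(total: int, failed: int) -> float:
    """Return the fraction of items that did not fail."""
    if total <= 0 or not 0 <= failed <= total:
        raise ValueError("need total > 0 and 0 <= failed <= total")
    return (total - failed) / total


total_libraries = sum(LIBRARIES.values())
primer_rate = success_rate(PRIMER_PAIRS_DESIGNED, PRIMER_PAIRS_FAILED)
sample_rate = success_rate(total_libraries, SAMPLES_FAILED)

print(f"Primer pairs producing reads reliably: "
      f"{PRIMER_PAIRS_DESIGNED - PRIMER_PAIRS_FAILED}/{PRIMER_PAIRS_DESIGNED} "
      f"({primer_rate:.1%})")
print(f"Libraries amplifying well: "
      f"{total_libraries - SAMPLES_FAILED}/{total_libraries} "
      f"({sample_rate:.1%})")
```

Under that assumption, roughly 93% of primer pairs and 95% of libraries performed well. The abstract does not break the failures down by sample type, so no per-type rate is computed here.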