    • A comparative genomics multitool for scientific discovery and conservation

      Genereux, Diane P.; Serres, Aitor; Armstrong, Joel; Johnson, Jeremy; Marinescu, Voichita D.; Murén, Eva; Juan, David; Bejerano, Gill; Casewell, Nicholas R.; Chemnick, Leona G.; et al. (2020)
      The Zoonomia Project is investigating the genomics of shared and specialized traits in eutherian mammals. Here we provide genome assemblies for 131 species, of which all but 9 are previously uncharacterized, and describe a whole-genome alignment of 240 species of considerable phylogenetic diversity, comprising representatives from more than 80% of mammalian families. We find that regions of reduced genetic diversity are more abundant in species at a high risk of extinction, discern signals of evolutionary selection at high resolution and provide insights from individual reference genomes. By prioritizing phylogenetic diversity and making data available quickly and without restriction, the Zoonomia Project aims to support biological discovery, medical research and the conservation of biodiversity.
    • A comprehensive genomic history of extinct and living elephants

      Palkopoulou, Eleftheria; Lipson, Mark; Mallick, Swapan; Nielsen, Svend; Rohland, Nadin; Baleka, Sina; Karpinski, Emil; Ivancevic, Atma M.; To, Thu-Hien; Kortschak, R. Daniel; et al. (2018)
      Elephantids are the world’s most iconic megafaunal family, yet there is no comprehensive genomic assessment of their relationships. We report a total of 14 genomes, including 2 from the American mastodon, which is an extinct elephantid relative, and 12 spanning all three extant and three extinct elephantid species including an ~120,000-y-old straight-tusked elephant, a Columbian mammoth, and woolly mammoths....
    • A demonstration of conservation genomics for threatened species management

      Wright, Belinda R.; Farquharson, Katherine A.; McLennan, Elspeth A.; Belov, Katherine; Hogg, Carolyn J.; Grueber, Catherine E. (2020)
      ... We conducted whole genome sequencing (WGS) of 25 individuals from the captive breeding programme and reduced‐representation sequencing (RRS) of 98 founders of the same programme. A subset of the WGS samples was also sequenced by RRS, allowing us to directly compare genome‐wide heterozygosity with estimates from RRS data. We found good congruence in interindividual variation and gene‐ontology classifications between the two data sets, indicating that our RRS data reflect the genome well....
    • A high density SNP array for the domestic horse and extant Perissodactyla: Utility for association mapping, genetic diversity, and phylogeny studies

      McCue, Molly E.; Bannasch, Danika L.; Petersen, Jessica L.; Gurr, Jessica; Bailey, Ernie; Binns, Matthew M.; Distl, Ottmar; Guérin, Gérard; Hasegawa, Telhisa; Hill, Emmeline W.; et al. (2012)
      An equine SNP genotyping array was developed and evaluated on a panel of samples representing 14 domestic horse breeds and 18 evolutionarily related species. More than 54,000 polymorphic SNPs provided an average inter-SNP spacing of ~43 kb. The mean minor allele frequency across domestic horse breeds was 0.23, and the number of polymorphic SNPs within breeds ranged from 43,287 to 52,085. Genome-wide linkage disequilibrium (LD) in most breeds declined rapidly over the first 50–100 kb and reached background levels within 1–2 Mb. The extent of LD and the level of inbreeding were highest in the Thoroughbred and lowest in the Mongolian and Quarter Horse. Multidimensional scaling (MDS) analyses demonstrated the tight grouping of individuals within most breeds, close proximity of related breeds, and less tight grouping in admixed breeds. The close relationship between the Przewalski's Horse and the domestic horse was demonstrated by pair-wise genetic distance and MDS. Genotyping of other Perissodactyla (zebras, asses, tapirs, and rhinoceros) was variably successful, with call rates and the number of polymorphic loci varying across taxa. Parsimony analysis placed the modern horse as sister taxa to Equus przewalski. The utility of the SNP array in genome-wide association was confirmed by mapping the known recessive chestnut coat color locus (MC1R) and defining a conserved haplotype of ~750 kb across all breeds. These results demonstrate the high quality of this SNP genotyping resource, its usefulness in diverse genome analyses of the horse, and potential use in related species.
    • A massively parallel sequencing approach uncovers ancient origins and high genetic variability of endangered Przewalski's horses

      Goto, Hiroki; Ryder, Oliver A.; Fisher, Allison R.; Schultz, Bryant; Kosakovsky Pond, Sergei L.; Nekrutenko, Anton; Makova, Kateryna D. (2011)
      The endangered Przewalski's horse is the closest relative of the domestic horse and is the only true wild horse species surviving today. The question of whether Przewalski's horse is the direct progenitor of domestic horse has been hotly debated. Studies of DNA diversity within Przewalski's horses have been sparse but are urgently needed to ensure their successful reintroduction to the wild. In an attempt to resolve the controversy surrounding the phylogenetic position and genetic diversity of Przewalski's horses, we used massively parallel sequencing technology to decipher the complete mitochondrial and partial nuclear genomes for all four surviving maternal lineages of Przewalski's horses. Unlike single-nucleotide polymorphism (SNP) typing usually affected by ascertainment bias, the present method is expected to be largely unbiased. Three mitochondrial haplotypes were discovered—two similar ones, haplotypes I/II, and one substantially divergent from the other two, haplotype III. Haplotypes I/II versus III did not cluster together on a phylogenetic tree, rejecting the monophyly of Przewalski's horse maternal lineages, and were estimated to split 0.117–0.186 Ma, significantly preceding horse domestication. In the phylogeny based on autosomal sequences, Przewalski's horses formed a monophyletic clade, separate from the Thoroughbred domestic horse lineage. Our results suggest that Przewalski's horses have ancient origins and are not the direct progenitors of domestic horses. The analysis of the vast amount of sequence data presented here suggests that Przewalski's and domestic horse lineages diverged at least 0.117 Ma but since then have retained ancestral genetic polymorphism and/or experienced gene flow.
    • Altered gonadal expression of TGF-beta superfamily signaling factors in environmental contaminant-exposed juvenile alligators

      Moore, B.C.; Milnes, Matthew R.; Kohno, S.; Katsu, Y.; Iguchi, T. (2011)
      Environmental contaminant exposure can influence gonadal steroid signaling milieus; however, little research has investigated the vulnerability of non-steroidal signaling pathways in the gonads. Here we use American alligators (Alligator mississippiensis) hatched from field-collected eggs to analyze gonadal mRNA transcript levels of the activin–inhibin–follistatin gene expression network and growth differentiation factor 9....
    • Applying SNP-derived molecular coancestry estimates to captive breeding programs

      Ivy, Jamie A.; Putnam, Andrea S.; Navarro, Asako Y.; Gurr, Jessica; Ryder, Oliver A. (2016)
      ...Although pedigree-based breeding strategies are quite effective at retaining long-term genetic variation, management of zoo-based breeding programs continues to be hampered when pedigrees are poorly known. The objective of this study was to evaluate 2 options for generating single nucleotide polymorphism (SNP) data to resolve unknown relationships within captive breeding programs...
    • Assessing evolutionary processes over time in a conservation breeding program: A combined approach using molecular data, simulations and pedigree analysis

      Wright, Belinda R.; Hogg, Carolyn J.; McLennan, Elspeth A.; Belov, Katherine; Grueber, Catherine E. (2021)
      …We have quantified the effects of selection, drift and gene flow in 503 individuals across five generations from the Tasmanian devil insurance population. To determine whether different processes were acting in different settings, we separately analysed animals housed under individual-based management, versus those that were released to an island site. We found that a greater proportion of alleles were lost over time in the smaller island population than in captivity and propose that genetic drift is the most likely process influencing this result….
    • Broad host range of SARS-CoV-2 predicted by comparative and structural analysis of ACE2 in vertebrates

      Damas, Joana; Hughes, Graham M.; Keough, Kathleen C.; Painter, Corrie A.; Persky, Nicole S.; Corbo, Marco; Hiller, Michael; Koepfli, Klaus-Peter; Pfenning, Andreas R.; Zhao, Huabin; et al. (2020)
      The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of COVID-19. The main receptor of SARS-CoV-2, angiotensin I converting enzyme 2 (ACE2), is now undergoing extensive scrutiny to understand the routes of transmission and sensitivity in different species. Here, we utilized a unique dataset of ACE2 sequences from 410 vertebrate species, including 252 mammals, to study the conservation of ACE2 and its potential to be used as a receptor by SARS-CoV-2. We designed a five-category binding score based on the conservation properties of 25 amino acids important for the binding between ACE2 and the SARS-CoV-2 spike protein. Only mammals fell into the medium to very high categories and only catarrhine primates into the very high category, suggesting that they are at high risk for SARS-CoV-2 infection. We employed a protein structural analysis to qualitatively assess whether amino acid changes at variable residues would be likely to disrupt ACE2/SARS-CoV-2 spike protein binding and found the number of predicted unfavorable changes significantly correlated with the binding score. Extending this analysis to human population data, we found only rare (frequency <0.001) variants in 10/25 binding sites. In addition, we found significant signals of selection and accelerated evolution in the ACE2 coding sequence across all mammals, and specific to the bat lineage. Our results, if confirmed by additional experimental data, may lead to the identification of intermediate host species for SARS-CoV-2, guide the selection of animal models of COVID-19, and assist the conservation of animals both in native habitats and in human care.
    • Characterization of reproductive gene diversity in the endangered Tasmanian devil

      Brandies, Parice A.; Wright, Belinda R.; Hogg, Carolyn J.; Grueber, Catherine E.; Belov, Katherine (2020)
      ...We characterized single nucleotide polymorphisms (SNPs) at 214 genes involved in reproduction in 37 Tasmanian devils…. We will use this information in future to examine the interplay between reproductive gene variation and reproductive fitness in Tasmanian devil populations.
    • Chromosome painting in Tragulidae facilitates the reconstruction of Ruminantia ancestral karyotype

      Kulemzina, Anastasia I.; Yang, Fengtang; Trifonov, Vladimir A.; Ryder, Oliver A.; Ferguson-Smith, Malcolm A.; Graphodatsky, Alexander S. (2011)
      ...Here, we present the first genome-wide comparative map of the Java mouse deer (Tragulus javanicus, Tragulidae) revealed by chromosome painting with human and dromedary probes. Together with the published comparative maps of major representative cetartiodactyl species established with the same set of probes, our results allowed us to reconstruct a 2n = 48 Ruminantia ancestral karyotype, which is similar to the cetartiodactyl ancestral karyotype....
    • Comparative and demographic analysis of orang-utan genomes

      Locke, Devin P.; Hillier, LaDeana W.; Warren, Wesley C.; Worley, Kim C.; Nazareth, Lynne V.; Muzny, Donna M.; Yang, Shiaw-Pyng; Wang, Zhengyuan; Chinwalla, Asif T.; Minx, Pat; et al. (2011)
      ‘Orang-utan’ is derived from a Malay term meaning ‘man of the forest’ and aptly describes the southeast Asian great apes native to Sumatra and Borneo. The orang-utan species, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), are the most phylogenetically distant great apes from humans, thereby providing an informative perspective on hominid evolution. Here we present a Sumatran orang-utan draft genome assembly and short read sequence data from five Sumatran and five Bornean orang-utan genomes. Our analyses reveal that, compared to other primates, the orang-utan genome has many unique features. Structural evolution of the orang-utan genome has proceeded much more slowly than other great apes, evidenced by fewer rearrangements, less segmental duplication, a lower rate of gene family turnover and surprisingly quiescent Alu repeats, which have played a major role in restructuring other primate genomes. We also describe a primate polymorphic neocentromere, found in both Pongo species, emphasizing the gradual evolution of orang-utan genome structure. Orang-utans have extremely low energy usage for a eutherian mammal, far lower than their hominid relatives. Adding their genome to the repertoire of sequenced primates illuminates new signals of positive selection in several pathways including glycolipid metabolism. From the population perspective, both Pongo species are deeply diverse; however, Sumatran individuals possess greater diversity than their Bornean counterparts, and more species-specific variation. Our estimate of Bornean/Sumatran speciation time, 400,000 years ago, is more recent than most previous studies and underscores the complexity of the orang-utan speciation process. Despite a smaller modern census population size, the Sumatran effective population size (Ne) expanded exponentially relative to the ancestral Ne after the split, while Bornean Ne declined over the same period. Overall, the resources and analyses presented here offer new opportunities in evolutionary genomics, insights into hominid biology, and an extensive database of variation for conservation efforts.
    • Conservation genomics of threatened animal species

      Steiner, Cynthia C.; Putnam, Andrea S.; Hoeck, Paquita E. A.; Ryder, Oliver A. (2013)
      The genomics era has opened up exciting possibilities in the field of conservation biology by enabling genomic analyses of threatened species that previously were limited to model organisms. Next-generation sequencing (NGS) and the collection of genome-wide data allow for more robust studies of the demographic history of populations and adaptive variation associated with fitness and local adaptation.…
    • Conservation implications of inaccurate estimation of cryptic population size

      Katzner, T. E.; Ivy, Jamie A.; Bragin, E. A.; Milner-Gulland, E.J.; DeWoody, J. A. (2011)
      ...Estimating population size is central to species‐oriented conservation and management. However, in spite of recent development in monitoring protocols, there are gaps in our ability to accurately and quickly estimate numbers of individuals present, especially for the cryptic and often non‐breeding components of structured vertebrate populations. Yet knowing the size and growth trajectory of all stage classes of a population is critical for species conservation. Here we use data from 2 years of non‐invasive genetic sample collection from the cryptic, non‐breeding component of an endangered bird of prey population to evaluate the impact of variability in population estimates on demographic models that underpin conservation efforts....
    • Contemporary demographic reconstruction methods are robust to genome assembly quality: A case study in Tasmanian devils

      Patton, Austin H; Margres, Mark J; Stahlke, Amanda R; Hendricks, Sarah; Lewallen, Kevin; Hamede, Rodrigo K; Ruiz-Aravena, Manuel; Ryder, Oliver A.; McCallum, Hamish I; Jones, Menna E; et al. (2019)
      Reconstructing species’ demographic histories is a central focus of molecular ecology and evolution. Recently, an expanding suite of methods leveraging either the sequentially Markovian coalescent (SMC) or the site-frequency spectrum has been developed to reconstruct population size histories from genomic sequence data. However, few studies have investigated the robustness of these methods to genome assemblies of varying quality. In this study, we first present an improved genome assembly for the Tasmanian devil using the Chicago library method. Compared with the original reference genome, our new assembly reduces the number of scaffolds (from 35,975 to 10,010) and increases the scaffold N90 (from 0.101 to 2.164 Mb). Second, we assess the performance of four contemporary genomic methods for inferring population size history (PSMC, MSMC, SMC++, Stairway Plot), using the two devil genome assemblies as well as simulated, artificially fragmented genomes that approximate the hypothesized demographic history of Tasmanian devils. We demonstrate that each method is robust to assembly quality, producing similar estimates of Ne when simulated genomes were fragmented into up to 5,000 scaffolds. Overall, methods reliant on the SMC are most reliable between ~300 generations before present (gbp) and 100 kgbp, whereas methods exclusively reliant on the site-frequency spectrum are most reliable between the present and 30 gbp. Our results suggest that when used in concert, genomic methods for reconstructing species’ effective population size histories 1) can be applied to nonmodel organisms without highly contiguous reference genomes, and 2) are capable of detecting independently documented effects of historical geological events.
    • Copy number variation analysis in the great apes reveals species-specific patterns of structural variation

      Gazave, E.; Darre, F.; Morcillo-Suarez, C.; Petit-Marty, N.; Carreno, A.; Marigorta, U. M.; Ryder, Oliver A.; Blancher, A.; Rocchi, M.; Bosch, E.; et al. (2011)
      ...We performed intraspecific comparative genomic hybridizations to identify loci harboring copy number variants in each of the four great apes: bonobos, chimpanzees, gorillas, and orangutans. For the first time, we could analyze differences in CNV location and frequency in these four species, and compare them with human CNVs and primate segmental duplication (SD) maps....
    • Cryptic population size and conservation: consequences of making the unknown known

      Katzner, T. E.; Ivy, Jamie A.; Bragin, E. A.; Milner-Gulland, E. J.; DeWoody, J. A. (2011)
      ...Our two papers use non‐invasive genetic sampling and population modeling to highlight how far off our original estimate of imperial eagle Aquila heliaca population size was (Rudnick et al., 2008), and to allow us to begin to consider the consequences, for monitoring and for conservation, of changing a known unknown into a known known (Katzner et al., 2011)....
    • Evaluating the performance of captive breeding techniques for conservation hatcheries: A case study of the delta smelt captive breeding program

      Fisch, Kathleen M.; Ivy, Jamie A.; Burton, Ronald S.; May, Bernie (2013)
      The delta smelt, an endangered fish species endemic to the San Francisco Bay-Delta, California, United States, was recently brought into captivity for species preservation. This study retrospectively evaluates the implementation of a genetic management plan for the captive delta smelt population....
    • From reference genomes to population genomics: comparing three reference-aligned reduced-representation sequencing pipelines in two wildlife species

      Wright, Belinda R.; Farquharson, Katherine A.; McLennan, Elspeth A.; Belov, Katherine; Hogg, Carolyn J.; Grueber, Catherine E. (2019)
      Recent advances in genomics have greatly increased research opportunities for non-model species. For wildlife, a growing availability of reference genomes means that population genetics is no longer restricted to a small set of anonymous loci. When used in conjunction with a reference genome, reduced-representation sequencing (RRS) provides a cost-effective method for obtaining reliable diversity information for population genetics. Many software tools have been developed to process RRS data, though few studies of non-model species incorporate genome alignment in calling loci. A commonly-used RRS analysis pipeline, Stacks, has this capacity and so it is timely to compare its utility with existing software originally designed for alignment and analysis of whole genome sequencing data. Here we examine population genetic inferences from two species for which reference-aligned reduced-representation data have been collected. Our two study species are a threatened Australian marsupial (Tasmanian devil Sarcophilus harrisii; declining population) and an Arctic-circle migrant bird (pink-footed goose Anser brachyrhynchus; expanding population). Analyses of these data are compared using Stacks versus two widely-used genomics packages, SAMtools and GATK. We also introduce a custom R script to improve the reliability of single nucleotide polymorphism (SNP) calls in all pipelines and conduct population genetic inferences for non-model species with reference genomes.