      Genomes of the Mouse Collaborative Cross

      research-article


          Abstract

          The Collaborative Cross (CC) is a multiparent panel of recombinant inbred (RI) mouse strains derived from eight founder laboratory strains. RI panels are popular because of their long-term genetic stability, which enhances reproducibility and integration of data collected across time and conditions. Characterization of their genomes can be a community effort, reducing the burden on individual users. Here we present the genomes of the CC strains using two complementary approaches as a resource to improve power and interpretation of genetic experiments. Our study also provides a cautionary tale regarding the limitations imposed by such basic biological processes as mutation and selection. A distinct advantage of inbred panels is that genotyping only needs to be performed on the panel, not on each individual mouse. The initial CC genome data were haplotype reconstructions based on dense genotyping of the most recent common ancestors (MRCAs) of each strain followed by imputation from the genome sequence of the corresponding founder inbred strain. The MRCA resource captured segregating regions in strains that were not fully inbred, but it had limited resolution in the transition regions between founder haplotypes, and there was uncertainty about founder assignment in regions of limited diversity. Here we report the whole genome sequence of 69 CC strains generated by paired-end short reads at 30× coverage of a single male per strain. Sequencing leads to a substantial improvement in the fine structure and completeness of the genomes of the CC. Both MRCAs and sequenced samples show a significant reduction in the genome-wide haplotype frequencies from two wild-derived strains, CAST/EiJ and PWK/PhJ. In addition, analysis of the evolution of the patterns of heterozygosity indicates that selection against three wild-derived founder strains played a significant role in shaping the genomes of the CC. 
The sequencing resource provides the first description of tens of thousands of new genetic variants introduced by mutation and drift in the CC genomes. We estimate that new SNP mutations are accumulating in each CC strain at a rate of 2.4 ± 0.4 per gigabase per generation. The fixation of new mutations by genetic drift has introduced thousands of new variants into the CC strains. The majority of these mutations are novel compared to currently sequenced laboratory stocks and wild mice, and some are predicted to alter gene function. Approximately one-third of the CC inbred strains have acquired large deletions (>10 kb) many of which overlap known coding genes and functional elements. The sequence of these mice is a critical resource to CC users, increases threefold the number of mouse inbred strain genomes available publicly, and provides insight into the effect of mutation and drift on common resources.


                Author and article information

                Journal
                Genetics
                Genetics Society of America
                0016-6731
                1943-2631
                June 2017
                6 June 2017
                Volume: 206
                Issue: 2
                Pages: 537-556
                Affiliations
                [*] The Jackson Laboratory, Bar Harbor, Maine 04609
                [†] Department of Genetics, University of North Carolina, Chapel Hill, North Carolina 27599
                [‡] Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina 27599
                [§] Curriculum of Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, North Carolina 27599
                [**] Department of Computer Science, University of North Carolina, Chapel Hill, North Carolina 27599
                [††] Curriculum of Genetics and Molecular Biology, University of North Carolina, Chapel Hill, North Carolina 27599
                Author notes
                [1] These authors contributed equally to this work.

                [2] Corresponding authors: Department of Genetics, University of North Carolina, Chapel Hill, Rm 5046 GMB, 120 Mason Farm Rd., Chapel Hill, NC 27599. E-mail: fernando@med.unc.edu; and The Jackson Laboratory, Bar Harbor, Maine 04609. E-mail: gary.churchill@jax.org
                Author information
                http://orcid.org/0000-0003-1942-4543
                http://orcid.org/0000-0003-4732-5526
                http://orcid.org/0000-0003-1241-6268
                http://orcid.org/0000-0002-0781-7254
                http://orcid.org/0000-0002-5738-5795
                Article
                Article ID: 198838
                DOI: 10.1534/genetics.116.198838
                PMCID: PMC5499171
                PMID: 28592495
                Copyright © 2017 Srivastava et al.

                Available freely online through the author-supported open access option.

                This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 10 January 2017
                Accepted: 14 March 2017
                Page count
                Figures: 7, Tables: 3, Equations: 0, References: 82, Pages: 20
                Categories
                Multiparental Populations
                Investigations

                Genetics
                whole genome sequence, drift, selection, genetic variants, multiparental populations, mpp
