      A high-resolution map of human evolutionary constraint using 29 mammals

      research-article
      Broad Institute Sequencing Platform and Whole Genome Assembly Team, Baylor College of Medicine Human Genome Sequencing Center, Genome Institute at Washington University
      Nature

          Abstract

          Comparison of related genomes has emerged as a powerful lens for genome interpretation. Here, we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and report constrained elements covering ~4.2% of the genome. We use evolutionary signatures and comparison with experimental datasets to suggest candidate functions for ~60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events, and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements, and ~1,000 primate- and human-accelerated elements. Overlap with disease-associated variants suggests our findings will be relevant for studies of human biology and health.
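
The headline percentages in the abstract can be combined into two derived figures. The following is a minimal sketch, not part of the article: it assumes that the "~60% of constrained bases" refers to bases inside the ~4.2% of the genome covered by constrained elements, and the constant and function names are hypothetical. Because 5.5% is stated as a lower bound ("at least"), the first ratio is best read as an upper bound.

```python
# Approximate figures quoted in the abstract, as fractions of the human genome.
PURIFYING_SELECTION_FRACTION = 0.055  # "at least 5.5%" under purifying selection
CONSTRAINED_ELEMENT_FRACTION = 0.042  # "~4.2%" covered by constrained elements
CANDIDATE_FUNCTION_SHARE = 0.60       # "~60%" of constrained bases with a candidate function


def derived_fractions():
    """Return two ratios implied by the abstract's headline numbers."""
    # Share of the sequence under purifying selection that falls inside the
    # reported constrained elements (an upper bound, since 5.5% is a minimum).
    located_share = CONSTRAINED_ELEMENT_FRACTION / PURIFYING_SELECTION_FRACTION
    # Fraction of the whole genome with a suggested candidate function,
    # ASSUMING the ~60% applies to bases inside the constrained elements.
    annotated_fraction = CONSTRAINED_ELEMENT_FRACTION * CANDIDATE_FUNCTION_SHARE
    return located_share, annotated_fraction


located, annotated = derived_fractions()
print(f"Selected sequence located in constrained elements: {located:.1%}")
print(f"Genome with a candidate function assigned: {annotated:.2%}")
```

Under these assumptions, roughly three quarters of the sequence estimated to be under purifying selection is captured by the reported elements, and about 2.5% of the genome receives a candidate function.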

          Most cited references (35)


          Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals.

          Comprehensive identification of all functional elements encoded in the human genome is a fundamental need in biomedical research. Here, we present a comparative analysis of the human, mouse, rat and dog genomes to create a systematic catalogue of common regulatory motifs in promoters and 3' untranslated regions (3' UTRs). The promoter analysis yields 174 candidate motifs, including most previously known transcription-factor binding sites and 105 new motifs. The 3'-UTR analysis yields 106 motifs likely to be involved in post-transcriptional regulation. Nearly one-half are associated with microRNAs (miRNAs), leading to the discovery of many new miRNA genes and their likely target genes. Our results suggest that previous estimates of the number of human miRNA genes were low, and that miRNAs regulate at least 20% of human genes. The overall results provide a systematic view of gene regulation in the human, which will be refined as additional mammalian genomes become available.

            Sequencing and comparison of yeast species to identify genes and regulatory elements.

            Identifying the functional elements encoded in a genome is one of the principal challenges in modern biology. Comparative genomics should offer a powerful, general approach. Here, we present a comparative analysis of the yeast Saccharomyces cerevisiae based on high-quality draft sequences of three related species (S. paradoxus, S. mikatae and S. bayanus). We first aligned the genomes and characterized their evolution, defining the regions and mechanisms of change. We then developed methods for direct identification of genes and regulatory motifs. The gene analysis yielded a major revision to the yeast gene catalogue, affecting approximately 15% of all genes and reducing the total count by about 500 genes. The motif analysis automatically identified 72 genome-wide elements, including most known regulatory motifs and numerous new motifs. We inferred a putative function for most of these motifs, and provided insights into their combinatorial interactions. The results have implications for genome analysis of diverse organisms, including the human.

              Ab initio reconstruction of transcriptomes of pluripotent and lineage committed cells reveals gene structures of thousands of lincRNAs

              RNA-Seq provides an unbiased way to study a transcriptome, including both coding and non-coding genes. To date, most RNA-Seq studies have critically depended on existing annotations, and thus focused on expression levels and variation in known transcripts. Here, we present Scripture, a method to reconstruct the transcriptome of a mammalian cell using only RNA-Seq reads and the genome sequence. We apply it to mouse embryonic stem cells, neuronal precursor cells, and lung fibroblasts to accurately reconstruct the full-length gene structures for the vast majority of known expressed genes. We identify substantial variation in protein-coding genes, including thousands of novel 5′-start sites, 3′-ends, and internal coding exons. We then determine the gene structures of over a thousand lincRNA and antisense loci. Our results open the way to direct experimental manipulation of thousands of non-coding RNAs, and demonstrate the power of ab initio reconstruction to render a comprehensive picture of mammalian transcriptomes.

                Author and article information

                Journal
                0410462
                6011
                Nature
                ISSN: 0028-0836 (print), 1476-4687 (electronic)
                15 September 2011
                12 October 2011
                27 April 2012
                Volume: 478
                Issue: 7370
                Pages: 476-482
                Affiliations
                [1 ]Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), 7 Cambridge Center, Cambridge, Massachusetts 02142, USA
                [2 ]Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Box 582, SE-751 23 Uppsala, Sweden
                [3 ]MIT Computer Science and Artificial Intelligence Laboratory, 32 Vassar St. Cambridge MA 02139, USA
                [4 ]The Bioinformatics Centre, Department of Biology, University of Copenhagen, DK-2200 Copenhagen, Denmark
                [5 ]EMBL-EBI, Wellcome Trust Genome Campus, CB10 1SD Hinxton, UK
                [6 ]Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA 95064
                [7 ]Department of Developmental Biology, Stanford University, Stanford, CA 94305
                [8 ]Howard Hughes Medical Institute
                [9 ]Gladstone Institutes, University of California, 1650 Owens Street, San Francisco, CA 94158
                [10 ]BioTeam Inc, 7 Derosier Drive, Middleton, MA
                [11 ]Research Computing, Division of Science, Faculty of Arts and Sciences, Harvard University, Cambridge MA 02138
                [12 ]Dept. of Biological Statistics & Computational Biology, Cornell University, Ithaca, NY 14853
                [13 ]Research Institute of Molecular Pathology (IMP), A-1030 Vienna, Austria
                [14 ]Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77031, USA
                [15 ]Genome Institute at Washington University, Washington University School of Medicine, 1 Childrens Place, Saint Louis, MO 63110, USA
                [16 ]Genome Informatics Section, Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda MD 20892 USA
                [17 ]NISC Comparative Sequencing Program, Genome Technology Branch and NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda MD 20892 USA
                [18 ]Institute for Human Genetics, and Division of Biostatistics, University of California, 1650 Owens Street, San Francisco, CA 94158
                [19 ]Department of Molecular Medicine (MOMA), Aarhus University Hospital, Skejby, DK-8200 Aarhus N, Denmark
                Author notes
                Correspondence and requests for materials should be addressed to K. L. T. (kersli@broadinstitute.org), E. S. L. (lander@broadinstitute.org) and M. K. (manoli@mit.edu).
                [$ ]To whom correspondence should be addressed.
                [*] Contributed equally to the manuscript.
                [†] Full list of contributors and author affiliations appears at the end of the manuscript.

                Article
                nihpa322841
                DOI: 10.1038/nature10530
                PMC ID: 3207357
                PubMed ID: 21993624
                463ce3a4-3b11-4715-8d9d-e3383b101371

                Users may view, print, copy, download, and text- and data-mine the content in such documents for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
