
      Transcript Annotation in FANTOM3: Mouse Gene Catalog Based on Physical cDNAs


          Abstract

          The international FANTOM consortium aims to produce a comprehensive picture of the mammalian transcriptome, based upon an extensive cDNA collection and functional annotation of full-length enriched cDNAs. The previous dataset, FANTOM2, comprised 60,770 full-length enriched cDNAs. Functional annotation revealed that this cDNA dataset contained only about half of the estimated number of mouse protein-coding genes, indicating that a number of cDNAs still remained to be collected and identified. To pursue the complete gene catalog that covers all predicted mouse genes, cloning and sequencing of full-length enriched cDNAs has been continued since FANTOM2. In FANTOM3, 42,031 newly isolated cDNAs were subjected to functional annotation, and the annotation of 4,347 FANTOM2 cDNAs was updated. To accomplish accurate functional annotation, we improved our automated annotation pipeline by introducing new coding sequence prediction programs and developed a Web-based annotation interface that simplifies the annotation procedures and reduces manual annotation errors. Automated coding sequence and function prediction was followed by manual curation and review by expert curators. A total of 102,801 full-length enriched mouse cDNAs were annotated. Out of 102,801 transcripts, 56,722 were functionally annotated as protein coding (including partial or truncated transcripts), providing, to our knowledge, the greatest current coverage of the mouse proteome by full-length cDNAs. The total number of distinct non-protein-coding transcripts increased to 34,030. The FANTOM3 annotation system, consisting of automated computational prediction, manual curation, and final expert curation, facilitated the comprehensive characterization of the mouse transcriptome, and could be applied to the transcriptomes of other species.


                Author and article information

                Journal
                PLoS Genetics (PLoS Genet)
                Publisher: Public Library of Science (San Francisco, USA)
                ISSN: 1553-7390, 1553-7404
                Published: 28 April 2006
                Volume: 2
                Issue: 4
                Article number: e62
                Author notes
                * To whom correspondence should be addressed. E-mail: rgscerg@gsc.riken.jp
                ¤a Current address: Functional Genomics Subunit, Center for Developmental Biology, RIKEN, Kobe, Japan
                ¤b Current address: Dana-Farber Cancer Institute and Harvard School of Public Health, Boston, Massachusetts, United States of America

                N. Maeda, M. Itoh, K. Shibata, J. Kawai, P. Carninci, and Y. Hayashizaki are in the Genome Science Laboratory, Discovery Research Institute, RIKEN Wako Institute, Wako, Japan. T. Kasukawa, R. Oyama, J. Gough, M. Frith, M. Kanamori-Katayama, S. Katayama, T. Kawashima, C. Kai, J. Kawai, P. Carninci, and Y. Hayashizaki are in the Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Yokohama, Japan. T. Kasukawa is in the Broadband Communication Service Business Unit, Network Service Solution Business Group, NTT Software Corporation, Yokohama, Japan. M. Frith, R. N. Aturaliya, and R. D. Teasdale are at the Australian Research Council Center in Bioinformatics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland, Australia. P. G. Engström and B. Lenhard are in the Computational Biology Unit, Bergen Center for Computational Science, University of Bergen, Bergen, Norway, and the Programme for Genomics and Bioinformatics, Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden. S. Batalov and C. F. Fletcher are at the Genomics Institute of the Novartis Research Foundation, San Diego, California, United States of America. K. W. Beisel is in the Department of Biomedical Sciences, Creighton University School of Medicine, Omaha, Nebraska, United States of America. C. J. Bult, M. Furuno, D. Hill, and Y. Zhu are in the Mouse Genome Informatics Consortium, The Jackson Laboratory, Bar Harbor, Maine, United States of America. A. R. R. Forrest, T. Ravasi, and D. A. Hume are at the Australian Research Council Special Research Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia. M. Katoh is in the Genetics and Cell Biology Section, National Cancer Center Research Institute, Tokyo, Japan. J. Quackenbush is at the Institute for Genomic Research, Rockville, Maryland, United States of America. B. Z. Ring is at Applied Genomics, Sunnyvale, California, United States of America. K. Sugiura is at The Jackson Laboratory, Bar Harbor, Maine, United States of America. Y. Takenaka is in the Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University, Osaka, Japan. C. A. Wells is in the School of Biomolecular and Biomedical Science, Eskitis Institute for Cell and Molecular Therapies, Griffith University, Nathan, Queensland, Australia. Y. Hayashizaki is at Yokohama City University, Yokohama, Japan, and in the Graduate School of Comprehensive Human Science, University of Tsukuba, Tsukuba, Japan.

                Article
                Manuscript ID: 06-PLGE-SR-0063R1 plge-02-04-22
                DOI: 10.1371/journal.pgen.0020062
                PMC ID: 1449903
                PubMed ID: 16683036
                Copyright: © 2006 Maeda et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                Pages: 6
                Categories
                Technical Report
                Bioinformatics - Computational Biology
                Biotechnology
                Cell Biology
                Evolution
                Molecular Biology - Structural Biology
                Statistics
                Genetics/Genomics
                Genetics/Gene Discovery
                Genetics/Functional Genomics
                Genetics/Gene Expression
                Eukaryotes
                Animals
                Vertebrates
                Mammals
                Mus (Mouse)
                Homo (Human)
                Primates
                Custom metadata
                Maeda N, Kasukawa T, Oyama R, Gough J, Frith M, et al. (2006) Transcript annotation in FANTOM3: Mouse gene catalog based on physical cDNAs. PLoS Genet 2(4): e62. DOI: 10.1371/journal.pgen.0020062