      The Ensembl gene annotation system

      research-article


          Abstract

          The Ensembl gene annotation system has been used to annotate over 70 different vertebrate species across a wide range of genome projects. Furthermore, it generates the automatic alignment-based annotation for the human and mouse GENCODE gene sets. The system is based on the alignment of biological sequences, including cDNAs, proteins and RNA-seq reads, to the target genome in order to construct candidate transcript models. Careful assessment and filtering of these candidate transcripts ultimately leads to the final gene set, which is made available on the Ensembl website. Here, we describe the annotation process in detail.

          Database URL: http://www.ensembl.org/index.html

                Author and article information

                Journal
                Database: The Journal of Biological Databases and Curation (Database (Oxford))
                Publisher: Oxford University Press
                ISSN: 1758-0463
                Published: 23 June 2016
                Volume: 2016
                Article ID: baw093
                Affiliations
                1 European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
                2 Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
                Present addresses:
                3 The Genome Analysis Centre, Norwich Research Park, Norwich NR4 7UH, UK
                4 Eagle Genomics Ltd, Babraham Research Campus, Cambridge CB22 3AT, UK
                5 European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
                6 Pfizer Inc, 10646 Science Center Dr, San Diego, CA 92121, USA
                7 Institutionen för cell-och molekylärbiologi, Uppsala University, Husargatan 3, Uppsala 752 37, Sweden
                8 CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna a-1090, Austria
                9 Genentech Inc, 1 DNA Way, South San Francisco, CA 94080, USA
                10 The Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
                Author notes
                * Corresponding author: Tel: +44 (0) 1223 494 167, Fax: +44 (0) 1223 494 468, Email: bronwen.aken@ebi.ac.uk. Correspondence may also be addressed to Stephen M. J. Searle. Email: smjsearle@yahoo.co.uk

                Citation details: Aken, B.L., Ayling, S., Barrell, D. et al. The Ensembl gene annotation system. Database (2016) Vol. 2016: article ID baw093; doi: 10.1093/database/baw093

                Article
                Article ID: baw093
                DOI: 10.1093/database/baw093
                PMCID: PMC4919035
                PMID: 27337980
                1ddfe444-d569-400d-bd78-e753b33b653c
                © The Author(s) 2016. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 11 January 2016
                Revised: 09 May 2016
                Accepted: 09 May 2016
                Page count
                Pages: 19
                Categories
                Database Update

                Bioinformatics & Computational biology