The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4)

Abstract

The DOE-JGI Microbial Genome Annotation Pipeline (MGAP) performs structural and functional annotation of microbial genomes, which are then included in the Integrated Microbial Genomes (IMG) comparative analysis system. MGAP is applied to assembled nucleotide sequence datasets submitted via the IMG submission site. Before a dataset can be submitted for annotation, the project and its associated metadata must be described in GOLD. MGAP sequence data processing begins with feature prediction, which identifies protein-coding genes, non-coding RNAs, regulatory RNA features, and CRISPR elements. This structural annotation is followed by the assignment of protein product names and functions.

Most cited references 9

tRNAscan-SE: A Program for Improved Detection of Transfer RNA Genes in Genomic Sequence

T. M. Lowe, S. Eddy (1997)

IMG 4 version of the integrated microbial genomes comparative analysis system

Victor Markowitz, I-Min Chen, Krishna Palaniappan … (2013)

The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

CDD: a conserved domain database for interactive domain family analysis

Aron Marchler-Bauer, John B. Anderson, Myra Derbyshire … (2006)

The conserved domain database (CDD) is part of NCBI's Entrez database system and serves as a primary resource for the annotation of conserved domain footprints on protein sequences in Entrez. Entrez's global query interface searches CDD and many other databases. Domain annotation for proteins in Entrez has been pre-computed and is readily available in the form of ‘Conserved Domain’ links. Novel protein sequences can be scanned against CDD using the CD-Search service; this service searches databases of CDD-derived profile models with protein sequence queries using BLAST heuristics. Protein query sequences submitted to NCBI's protein BLAST search service are scanned for conserved domain signatures by default. The CDD collection contains models imported from Pfam, SMART and COG, as well as domain models curated at NCBI. NCBI curated models are organized into hierarchies of domains related by common descent. Here we report on the status of the curation effort and present a novel helper application, CDTree, which enables users of the CDD resource to examine curated hierarchies. More importantly, CDD and CDTree, used in concert, serve as a powerful tool in protein classification, as they allow users to analyze protein sequences in the context of domain family hierarchies.

Author and article information

Contributors

Marcel Huntemann: mhuntemann@lbl.gov

Journal

Journal ID (nlm-ta): Stand Genomic Sci

Journal ID (iso-abbrev): Stand Genomic Sci

Title: Standards in Genomic Sciences

Publisher: BioMed Central (London)

ISSN (Electronic): 1944-3277

Publication date (Electronic): 26 October 2015

Publication date PMC-release: 26 October 2015

Publication date Collection: 2015

Volume: 10

Electronic Location Identifier: 86

Affiliations

Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA

Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA

Present Address: Computational Biology Group, Celgene Corporation, Summit, USA

Article

Publisher ID: 77

DOI: 10.1186/s40793-015-0077-y

PMC ID: 4623924

PubMed ID: 26512311

SO-VID: 67d046d8-4bda-43f3-9844-514d5e7f9f26

License:

Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

History

Date received: 18 April 2015

Date accepted: 13 October 2015

Custom metadata

ScienceOpen disciplines: Genetics

Keywords: microbial genome annotation, sop, img, jgi
