GenBank


Abstract

GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 380 000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system that integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov.
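The abstract describes retrieval through Entrez only at the level of the web interface. As a rough sketch of what programmatic access can look like, the Python below builds a request for a single record in GenBank flat-file format. It assumes the NCBI E-utilities `efetch` endpoint and the `nuccore` database name, neither of which is mentioned in the article; the helper names `build_efetch_url` and `fetch_genbank_record` are hypothetical, and the accession in the usage note is purely illustrative.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public NCBI E-utilities efetch endpoint (not named in the article).
EUTILS_EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession: str) -> str:
    """Build an efetch URL asking for one nucleotide record as GenBank flat-file text."""
    params = {
        "db": "nuccore",      # assumed Entrez database holding GenBank nucleotide records
        "id": accession,      # accession number assigned to the submission
        "rettype": "gb",      # GenBank flat-file format
        "retmode": "text",    # plain text rather than XML
    }
    return f"{EUTILS_EFETCH}?{urlencode(params)}"


def fetch_genbank_record(accession: str, timeout: float = 30.0) -> str:
    """Download the record text; requires network access to NCBI."""
    with urlopen(build_efetch_url(accession), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_genbank_record("U49845")` should return the flat-file text for that accession, provided the service is reachable. Keeping URL construction separate from the network call makes the request easy to inspect or log before anything is sent.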

Most cited references

dbEST--database for "expressed sequence tags".

M. S. Boguski, T. M. Lowe, C. Tolstoshev (1993)

Cited 325 times

Functional annotation of a full-length mouse cDNA collection.

J. Kawai, A. Shinagawa, K. Shibata … (2001)

The RIKEN Mouse Gene Encyclopaedia Project, a systematic approach to determining the full coding potential of the mouse genome, involves collection and sequencing of full-length complementary DNAs and physical mapping of the corresponding genes to the mouse genome. We organized an international functional annotation meeting (FANTOM) to annotate the first 21,076 cDNAs to be analysed in this project. Here we describe the first RIKEN clone collection, which is one of the largest described for any organism. Analysis of these cDNAs extends known gene families and identifies new ones.

Cited 138 times

Protein sequence similarity searches using patterns as seeds.

E. V. Koonin, W. Miller, T. L. Madden … (1998)

Protein families often are characterized by conserved sequence patterns or motifs. A researcher frequently wishes to evaluate the significance of a specific pattern within a protein, or to exploit knowledge of known motifs to aid the recognition of greatly diverged but homologous family members. To assist in these efforts, the pattern-hit initiated BLAST (PHI-BLAST) program described here takes as input both a protein sequence and a pattern of interest that it contains. PHI-BLAST searches a protein database for other instances of the input pattern, and uses those found as seeds for the construction of local alignments to the query sequence. The random distribution of PHI-BLAST alignment scores is studied analytically and empirically. In many instances, the program is able to detect statistically significant similarity between homologous proteins that are not recognizably related using traditional single-pass database search methods. PHI-BLAST is applied to the analysis of CED4-like cell death regulators, HS90-type ATPase domains, archaeal tRNA nucleotidyltransferases and archaeal homologs of DnaG-type DNA primases.

Cited 70 times

Author and article information

Journal

Journal ID (nlm-ta): Nucleic Acids Res

Journal ID (publisher-id): nar

Journal ID (hwp): nar

Title: Nucleic Acids Research

Publisher: Oxford University Press

ISSN (Print): 0305-1048

ISSN (Electronic): 1362-4962

Publication date Collection: January 2011

Publication date (Print): January 2011

Publication date (Electronic): 10 November 2010

Publication date PMC-release: 10 November 2010

Volume: 39

Issue: Database issue

Pages: D32-D37

Affiliations

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA

Author notes

*To whom correspondence should be addressed. Tel: +1 301 496 2475; Fax: +1 301 480 9241; Email: sayers@ncbi.nlm.nih.gov

Article

Publisher ID: gkq1079

DOI: 10.1093/nar/gkq1079

PMC ID: 3013681

PubMed ID: 21071399

SO-VID: 15706013-f1fa-401c-a8d2-126b95158c69

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 17 September 2010

Date revision received: 14 October 2010

Date accepted: 14 October 2010
