SpliceMiner: a high-throughput database implementation of the NCBI Evidence Viewer for microarray splice variant analysis

Abstract

Background

The human genome contains far fewer genes than expressed transcripts, and alternative splicing accounts for the difference. Alternatively spliced transcripts are often specific to tissue type, developmental stage, environmental condition, or disease state. Accurate analysis of microarray expression data, and the design of new arrays that detect alternative splicing, therefore require assessment of probes at the sequence and exon levels.

Description

SpliceMiner is a web interface for querying the Evidence Viewer Database (EVDB), a comprehensive, non-redundant compendium of splice variant data for human genes. We constructed EVDB as a queryable implementation of the NCBI Evidence Viewer (EV), based on data obtained from NCBI Entrez Gene and EV. The automated EVDB build process uses only complete coding sequences, which may or may not include partial or complete 5' and 3' UTRs, and filters out redundant splice variants. Unlike EV, which supports only one-at-a-time queries, SpliceMiner supports high-throughput batch queries and returns results in an easily parsable format. SpliceMiner maps probes to splice variants, effectively delineating the variants identified by a probe.
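The two ideas in this description, filtering redundant variants and mapping a probe to the variants and exons it covers, can be illustrated with a minimal Python sketch. This is not the EVDB build process or SpliceMiner's algorithm: the `Variant` class, the function names, the toy sequences, and the accessions `TX1` to `TX3` are all hypothetical, and matching is assumed to be exact, sense-strand substring search that reports only the first occurrence per variant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    accession: str   # hypothetical transcript identifier
    gene: str
    exons: tuple     # exon sequences in transcript order (5' to 3')

    @property
    def sequence(self) -> str:
        # Spliced transcript sequence: exons joined in order.
        return "".join(self.exons)


def drop_redundant(variants):
    """Keep the first variant seen for each distinct (gene, exon structure)."""
    seen, unique = set(), []
    for v in variants:
        key = (v.gene, v.exons)
        if key not in seen:
            seen.add(key)
            unique.append(v)
    return unique


def map_probe(probe, variants):
    """Return (accession, start, exon numbers) for each variant containing the probe."""
    hits = []
    for v in variants:
        start = v.sequence.find(probe)   # first exact match only
        if start < 0:
            continue
        end = start + len(probe)
        covered, pos = [], 0
        for number, exon in enumerate(v.exons, start=1):
            exon_start, exon_end = pos, pos + len(exon)
            # Half-open interval overlap between probe and exon.
            if start < exon_end and end > exon_start:
                covered.append(number)
            pos = exon_end
        hits.append((v.accession, start, covered))
    return hits


# Toy data (made up for illustration): TX2 skips the middle exon, TX3 duplicates TX1.
variants = [
    Variant("TX1", "GENE_A", ("ATGGCC", "TTTAAA", "GGGTGA")),
    Variant("TX2", "GENE_A", ("ATGGCC", "GGGTGA")),
    Variant("TX3", "GENE_A", ("ATGGCC", "TTTAAA", "GGGTGA")),
]
unique = drop_redundant(variants)

# Batch mode: one tab-delimited line per (probe, variant) hit.
for probe in ["ATGGCC", "CCTTTA", "GCCGGG"]:
    for accession, start, exons in map_probe(probe, unique):
        print(probe, accession, start, ",".join(map(str, exons)), sep="\t")
```

In this toy set, the probe `ATGGCC` lies in the shared first exon and so cannot distinguish the two variants, whereas `CCTTTA` and `GCCGGG` span different exon junctions and each identify a single variant. That distinction is the kind of probe-level assessment the description refers to.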

Conclusion

EVDB can be queried by gene symbol, genomic coordinates, or probe sequence via a user-friendly web-based tool we call SpliceMiner (http://discover.nci.nih.gov/spliceminer). The EVDB/SpliceMiner combination provides an interface to human splice variant information and, going beyond the very valuable NCBI Evidence Viewer, supports fluent, high-throughput analysis. Integration of EVDB information into microarray analysis and design pipelines has the potential to improve the analysis and bioinformatic interpretation of gene expression data, for both batch and interactive processing. For example, whenever a gene expression value is recognized as important or appears anomalous in a microarray experiment, the interactive mode of SpliceMiner can be used quickly and easily to check for possible splice variant issues.
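As a rough illustration of what a query by genomic coordinates involves, the sketch below finds the variants that have an exon overlapping a requested region. It is an assumption-laden toy, not SpliceMiner's query interface: the `Exon` record, the `variants_overlapping` function, the coordinates, and the identifiers are hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
from typing import NamedTuple


class Exon(NamedTuple):
    variant: str   # hypothetical variant identifier
    chrom: str
    start: int     # 1-based, inclusive
    end: int       # 1-based, inclusive


def variants_overlapping(exons, chrom, start, end):
    """Return sorted ids of variants with an exon overlapping chrom:start-end."""
    return sorted({
        e.variant
        for e in exons
        if e.chrom == chrom and e.start <= end and e.end >= start
    })


# Toy exon table (made-up coordinates).
EXONS = [
    Exon("TX1", "chr1", 100, 200),
    Exon("TX1", "chr1", 300, 400),
    Exon("TX2", "chr1", 100, 200),
    Exon("TX2", "chr1", 500, 600),
    Exon("TX3", "chr2", 100, 200),
]

# Batch of region queries, answered one per line.
for region in [("chr1", 350, 360), ("chr1", 150, 550), ("chr1", 201, 299)]:
    print(region, variants_overlapping(EXONS, *region))
```

A region that falls entirely within an intron, such as the last query here, returns no variants. A linear scan is enough for a toy table; a real database would presumably index exons by chromosome and position.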

Author and article information

Journal

Journal ID (nlm-ta): BMC Bioinformatics

Title: BMC Bioinformatics

Publisher: BioMed Central (London)

ISSN (Electronic): 1471-2105

Publication date (Collection): 2007

Publication date (Electronic): 5 March 2007

Volume: 8

Page: 75

Affiliations

[1] Department of Bioinformatics, George Mason University, Fairfax, Virginia, USA

[2] Laboratory of Molecular Pharmacology, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, USA

[3] Tiger Team Consulting, Fairfax, VA, USA

[4] Georgetown University, Washington, DC, USA

[5] School of Informatics, Northern Kentucky University, Highland Heights, KY, USA

Article

Publisher ID: 1471-2105-8-75

DOI: 10.1186/1471-2105-8-75

PMC ID: 1839109

PubMed ID: 17338820

SO-VID: b244b1ef-286f-479f-b588-a419caf3cb5a

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
