Fragrep: An Efficient Search Tool for Fragmented Patterns in Genomic Sequences

Abstract

Many classes of non-coding RNAs (ncRNAs; including Y RNAs, vault RNAs, RNase P RNAs, and MRP RNAs, as well as a novel class recently discovered in Dictyostelium discoideum) can be characterized by a pattern of short but well-conserved sequence elements that are separated by poorly conserved regions of sometimes highly variable lengths. Local alignment algorithms such as BLAST are therefore ill-suited for the discovery of new homologs of such ncRNAs in genomic sequences. The Fragrep tool instead implements an efficient algorithm for detecting the pattern fragments that occur in a given order. For each pattern fragment, the mismatch tolerance and bounds on the length of the intervening sequences can be specified separately. Furthermore, matches can be ranked by a statistically well-motivated scoring scheme.
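
The search problem described in the abstract can be illustrated with a minimal Python sketch. This is not Fragrep's algorithm, which the abstract describes only as efficient; it is a naive enumeration under stated assumptions. The names `Fragment`, `fragment_hits`, and `search_fragmented` are hypothetical, mismatches are counted as simple Hamming distance, and the gap bounds of a fragment are taken to constrain the intervening sequence between it and the preceding fragment.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Fragment:
    """One conserved pattern fragment (hypothetical representation)."""
    motif: str           # conserved sequence element
    max_mismatches: int  # per-fragment mismatch tolerance
    min_gap: int = 0     # shortest allowed spacer before this fragment
    max_gap: int = 0     # longest allowed spacer before this fragment
    # The gap bounds of the first fragment are ignored.


def fragment_hits(seq: str, frag: Fragment) -> Set[int]:
    """Start positions where the motif matches within the mismatch tolerance."""
    m = len(frag.motif)
    hits = set()
    for i in range(len(seq) - m + 1):
        mismatches = sum(1 for a, b in zip(seq[i:i + m], frag.motif) if a != b)
        if mismatches <= frag.max_mismatches:
            hits.add(i)
    return hits


def search_fragmented(seq: str, fragments: List[Fragment]) -> List[Tuple[int, ...]]:
    """Return one tuple of start positions per match, fragments in the given order."""
    if not fragments:
        return []
    hit_sets = [fragment_hits(seq, f) for f in fragments]
    results: List[Tuple[int, ...]] = []

    def extend(idx: int, chain: List[int]) -> None:
        if idx == len(fragments):
            results.append(tuple(chain))
            return
        prev_end = chain[-1] + len(fragments[idx - 1].motif)
        frag = fragments[idx]
        # Try every spacer length allowed between the previous fragment and this one.
        for start in range(prev_end + frag.min_gap, prev_end + frag.max_gap + 1):
            if start in hit_sets[idx]:
                extend(idx + 1, chain + [start])

    for first in sorted(hit_sets[0]):
        extend(1, [first])
    return results


# Usage: two fragments separated by a spacer of 2 to 6 nucleotides.
pattern = [
    Fragment("ACGT", max_mismatches=0),
    Fragment("GGCC", max_mismatches=1, min_gap=2, max_gap=6),
]
matches = search_fragmented("TTACGTAAAAGGCATT", pattern)
```

In this usage example the second fragment occurs as `GGCA`, one mismatch away from `GGCC`, so the match is found only because that fragment tolerates a single mismatch. The sketch reports every admissible combination of positions and does no ranking; the scoring scheme mentioned in the abstract is not modelled here.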

Author and article information

Contributors

Axel Mosig

Journal

Journal ID (nlm-ta): Genomics Proteomics Bioinformatics

Journal ID (iso-abbrev): Genomics Proteomics Bioinformatics

Title: Genomics, Proteomics & Bioinformatics

Publisher: Elsevier

ISSN (Print): 1672-0229

ISSN (Electronic): 2210-3244

Publication date PMC-release: 18 April 2006

Publication date (Print): 2006

Publication date (Electronic): 18 April 2006

Volume: 4

Issue: 1

Pages: 56-60

Affiliations

[1] Department of Combinatorics and Geometry, CAS/MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China

[2] Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig D-04103, Germany

[3] Institute for Theoretical Chemistry, University of Vienna, Vienna A-1090, Austria

[4] The Santa Fe Institute, Santa Fe, NM 87501, USA

[5] Max Planck Institute for Mathematics in the Sciences, Leipzig D-04103, Germany

Author notes

[*] Corresponding author. mosig@sibs.ac.cn

Article

Publisher ID: S1672-0229(06)60017-X

DOI: 10.1016/S1672-0229(06)60017-X

PMC ID: 5054030

PubMed ID: 16689703

SO-VID: 7b84fc35-a469-4ae0-825d-deebcbde9de2

License:

This is an open access article under the CC BY-NC-SA license (http://creativecommons.org/licenses/by-nc-sa/3.0/).
