
      Evaluation of several lightweight stochastic context-free grammars for RNA secondary structure prediction

      research-article
      Dowell [1], Eddy [1]
      BMC Bioinformatics
      BioMed Central


          Abstract

          Background

          RNA secondary structure prediction methods based on probabilistic modeling can be developed using stochastic context-free grammars (SCFGs). Such methods can readily combine different sources of information that can be expressed probabilistically, such as an evolutionary model of comparative RNA sequence analysis and a biophysical model of structure plausibility. However, the number of free parameters in an integrated model for consensus RNA structure prediction can become untenable if the underlying SCFG design is too complex. Thus a key question is, what small, simple SCFG designs perform best for RNA secondary structure prediction?

          Results

          Nine different small SCFGs were implemented to explore the tradeoffs between model complexity and prediction accuracy. Each model was tested for single sequence structure prediction accuracy on a benchmark set of RNA secondary structures.
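Single-sequence prediction accuracy on a benchmark is commonly summarized by comparing predicted base pairs with those of a trusted reference structure. The sketch below shows one such comparison, assuming structures are written in dot-bracket notation without pseudoknots. The function names and the choice of sensitivity and positive predictive value (PPV) as the two measures are illustrative assumptions, not details taken from the abstract.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def sensitivity_ppv(predicted, reference):
    """Compare two dot-bracket structures for the same sequence.

    sensitivity = correctly predicted pairs / pairs in the reference
    PPV         = correctly predicted pairs / pairs in the prediction
    """
    pred, ref = base_pairs(predicted), base_pairs(reference)
    true_pos = len(pred & ref)
    sens = true_pos / len(ref) if ref else 0.0
    ppv = true_pos / len(pred) if pred else 0.0
    return sens, ppv


if __name__ == "__main__":
    # The prediction recovers one of the two reference pairs and adds no false ones.
    print(sensitivity_ppv("(....)..", "((..)).."))
```

Reporting both numbers matters because a model can trade one for the other: predicting very few pairs tends to raise PPV while lowering sensitivity.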

          Conclusions

          Four SCFG designs had prediction accuracies near the performance of current energy minimization programs. One of these designs, introduced by Knudsen and Hein in their PFOLD algorithm, has only 21 free parameters and is significantly simpler than the others.
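To make the figure of 21 free parameters concrete, here is a minimal sketch of a grammar of the Knudsen-Hein form with a CYK-style search for the most probable structure. The productions (S → LS | L, L → s | dFd', F → dFd' | LS) are quoted from memory rather than from the abstract, and the count assumes that both pairing rules share one pair-emission distribution. All probability values are hypothetical placeholders chosen for illustration, not trained parameters.

```python
import math

NEG_INF = float("-inf")
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}

# Hypothetical rule probabilities (each nonterminal's rules sum to 1).
LOG_T = {rule: math.log(p) for rule, p in {
    "S->LS": 0.8, "S->L": 0.2,
    "L->s": 0.7, "L->dFd": 0.3,
    "F->dFd": 0.8, "F->LS": 0.2,
}.items()}
LOG_SINGLE = math.log(0.25)  # hypothetical uniform single-base emission


def log_pair(a, b):
    # Hypothetical pair emissions: 6 * 0.16 + 10 * 0.004 = 1.0
    return math.log(0.16 if a + b in CANONICAL else 0.004)


def free_parameters():
    transitions = 3 * (2 - 1)  # three nonterminals, two rules each
    singles = 4 - 1            # one distribution over A, C, G, U
    pairs = 16 - 1             # one shared distribution over base pairs
    return transitions + singles + pairs


def best_structure(seq):
    """Return (dot-bracket structure, log probability) of the best parse."""
    n = len(seq)
    if n == 0:
        return "", 0.0
    empty = (NEG_INF, None)
    # tab[X][i][j]: best (log prob, traceback) for deriving seq[i:j] from X
    tab = {x: [[empty] * (n + 1) for _ in range(n + 1)] for x in "SLF"}
    score = lambda cand: cand[0]

    def bifurcations(i, j):
        # L derives seq[i:k], S derives seq[k:j]
        return [(tab["L"][i][k][0] + tab["S"][k][j][0], ("LS", k))
                for k in range(i + 1, j)]

    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            pair = []
            if length >= 3:
                inner = tab["F"][i + 1][j - 1][0]
                pair = [(log_pair(seq[i], seq[j - 1]) + inner, ("pair", None))]
            l_cands = [(LOG_T["L->dFd"] + s, t) for s, t in pair]
            if length == 1:
                l_cands.append((LOG_T["L->s"] + LOG_SINGLE, ("s", None)))
            f_cands = [(LOG_T["F->dFd"] + s, t) for s, t in pair]
            f_cands += [(LOG_T["F->LS"] + s, t) for s, t in bifurcations(i, j)]
            tab["L"][i][j] = max(l_cands, key=score, default=empty)
            tab["F"][i][j] = max(f_cands, key=score, default=empty)
            # S is filled last because S -> L uses L over the same span.
            s_cands = [(LOG_T["S->L"] + tab["L"][i][j][0], ("L", None))]
            s_cands += [(LOG_T["S->LS"] + s, t) for s, t in bifurcations(i, j)]
            tab["S"][i][j] = max(s_cands, key=score)

    structure = ["."] * n
    stack = [("S", 0, n)]
    while stack:
        x, i, j = stack.pop()
        kind, k = tab[x][i][j][1]
        if kind == "pair":
            structure[i], structure[j - 1] = "(", ")"
            stack.append(("F", i + 1, j - 1))
        elif kind == "LS":
            stack += [("L", i, k), ("S", k, j)]
        elif kind == "L":
            stack.append(("L", i, j))
        # kind == "s": a single unpaired base, already "."
    return "".join(structure), tab["S"][0][n][0]


if __name__ == "__main__":
    print(free_parameters())
    print(best_structure("GGGGAAAACCCC")[0])
```

Under these assumptions the count is 3 transition parameters plus 3 single-base and 15 base-pair emission parameters. The search takes time cubic in the sequence length because every span is tried at every split point.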



                Author and article information

                Journal
                BMC Bioinformatics
                BioMed Central (London)
                ISSN: 1471-2105
                Published: 4 June 2004
                Volume: 5
                Article number: 71
                Affiliations
                [1] Howard Hughes Medical Institute and Department of Genetics, Washington University School of Medicine, 4444 Forest Park Blvd., Box 8510, St. Louis, MO 63108, USA
                Article
                Publisher ID: 1471-2105-5-71
                DOI: 10.1186/1471-2105-5-71
                PMCID: PMC442121
                PMID: 15180907
                Copyright © 2004 Dowell and Eddy; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
                History
                19 April 2004
                4 June 2004
                Categories
                Research Article

                Bioinformatics & Computational biology