+1 Recommend
0 collections
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      OrthoReD: a rapid and accurate orthology prediction tool with low computational requirement


      Read this article at

          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.



          Identifying orthologous genes is an initial step required for phylogenetics, and it is also a common strategy employed in functional genetics to find candidates for functionally equivalent genes across multiple species. At the same time, in silico orthology prediction tools often require large computational resources only available on computing clusters. Here we present OrthoReD, an open-source orthology prediction tool with accuracy comparable to published tools that requires only a desktop computer. The low computational resource requirement of OrthoReD is achieved by repeating orthology searches on one gene of interest at a time, thereby generating a reduced dataset to limit the scope of orthology search for each gene of interest.


          The output of OrthoReD was highly similar to the outputs of two other published orthology prediction tools, OrthologID and/or OrthoDB, for the three dataset tested, which represented three phyla with different ranges of species diversity and different number of genomes included. Median CPU time for ortholog prediction per gene by OrthoReD executed on a desktop computer was <15 min even for the largest dataset tested, which included all coding sequences of 100 bacterial species.


          With high-throughput sequencing, unprecedented numbers of genes from non-model organisms are available with increasing need for clear information about their orthologies and/or functional equivalents in model organisms. OrthoReD is not only fast and accurate as an orthology prediction tool, but also gives researchers flexibility in the number of genes analyzed at a time, without requiring a high-performance computing cluster.

          Electronic supplementary material

          The online version of this article (doi:10.1186/s12859-017-1726-5) contains supplementary material, which is available to authorized users.

          Related collections

          Most cited references18

          • Record: found
          • Abstract: found
          • Article: found
          Is Open Access

          Demonstration of CRISPR/Cas9/sgRNA-mediated targeted gene modification in Arabidopsis, tobacco, sorghum and rice

          The type II CRISPR/Cas system from Streptococcus pyogenes and its simplified derivative, the Cas9/single guide RNA (sgRNA) system, have emerged as potent new tools for targeted gene knockout in bacteria, yeast, fruit fly, zebrafish and human cells. Here, we describe adaptations of these systems leading to successful expression of the Cas9/sgRNA system in two dicot plant species, Arabidopsis and tobacco, and two monocot crop species, rice and sorghum. Agrobacterium tumefaciens was used for delivery of genes encoding Cas9, sgRNA and a non-fuctional, mutant green fluorescence protein (GFP) to Arabidopsis and tobacco. The mutant GFP gene contained target sites in its 5′ coding regions that were successfully cleaved by a CAS9/sgRNA complex that, along with error-prone DNA repair, resulted in creation of functional GFP genes. DNA sequencing confirmed Cas9/sgRNA-mediated mutagenesis at the target site. Rice protoplast cells transformed with Cas9/sgRNA constructs targeting the promoter region of the bacterial blight susceptibility genes, OsSWEET14 and OsSWEET11, were confirmed by DNA sequencing to contain mutated DNA sequences at the target sites. Successful demonstration of the Cas9/sgRNA system in model plant and crop species bodes well for its near-term use as a facile and powerful means of plant genetic engineering for scientific and agricultural applications.
            • Record: found
            • Abstract: found
            • Article: found
            Is Open Access

            IMG 4 version of the integrated microbial genomes comparative analysis system

            The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).
              • Record: found
              • Abstract: found
              • Article: found
              Is Open Access

              InParanoid 7: new algorithms and tools for eukaryotic orthology analysis

              The InParanoid project gathers proteomes of completely sequenced eukaryotic species plus Escherichia coli and calculates pairwise ortholog relationships among them. The new release 7.0 of the database has grown by an order of magnitude over the previous version and now includes 100 species and their collective 1.3 million proteins organized into 42.7 million pairwise ortholog groups. The InParanoid algorithm itself has been revised and is now both more specific and sensitive. Based on results from our recent benchmarking of low-complexity filters in homology assignment, a two-pass BLAST approach was developed that makes use of high-precision compositional score matrix adjustment, but avoids the alignment truncation that sometimes follows. We have also updated the InParanoid web site (http://InParanoid.sbc.su.se). Several features have been added, the response times have been improved and the site now sports a new, clearer look. As the number of ortholog databases has grown, it has become difficult to compare among these resources due to a lack of standardized source data and incompatible representations of ortholog relationships. To facilitate data exchange and comparisons among ortholog databases, we have developed and are making available two XML schemas: SeqXML for the input sequences and OrthoXML for the output ortholog clusters.

                Author and article information

                BMC Bioinformatics
                BMC Bioinformatics
                BMC Bioinformatics
                BioMed Central (London )
                21 June 2017
                21 June 2017
                : 18
                : 310
                [1 ]ISNI 0000 0004 1936 9684, GRID grid.27860.3b, Department of Plant Sciences, , University of California, ; Davis, CA USA
                [2 ]ISNI 0000 0004 1936 9684, GRID grid.27860.3b, Department of Entomology and Nematology, , University of California, ; Davis, CA USA
                Author information
                © The Author(s). 2017

                Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                : 29 April 2017
                : 12 June 2017
                Funded by: Graduate studies, University of California, Davis
                Funded by: FundRef http://dx.doi.org/10.13039/100011080, Department of Plant Sciences, University of California, Davis;
                Funded by: University of California, Davis
                Funded by: National Institute of Food and Agriculture
                Award ID: USDA-NIFA-CA-D-PLS-2173-H
                Award Recipient :
                Custom metadata
                © The Author(s) 2017

                Bioinformatics & Computational biology
                gene orthology,phylogenetics,gene evolution,genome,transcriptome


                Comment on this article