16
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      On the computational complexity of the maximum parsimony reconciliation problem in the duplication-loss-coalescence model

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Background

          Phylogenetic tree reconciliation is a widely-used method for inferring the evolutionary histories of genes and species. In the duplication-loss-coalescence (DLC) model, we seek a reconciliation that explains the incongruence between a gene and species tree using gene duplication, loss, and deep coalescence events. In the maximum parsimony framework, costs are associated with these event types and a reconciliation is sought that minimizes the total cost of the events required to map the gene tree onto the species tree.

          Results

          We show that this problem is NP-hard even for the special case of minimizing the number of duplications. We then show that the problem is APX-hard when both duplications and losses are considered, implying that no polynomial-time approximation scheme can exist for the problem unless P = NP.

          Conclusions

          These intractability results are likely to guide future research on algorithmic aspects of the DLC-reconciliation problem.

          Related collections

          Most cited references7

          • Record: found
          • Abstract: found
          • Article: found
          Is Open Access

          Efficient algorithms for the reconciliation problem with gene duplication, horizontal transfer and loss

          Motivation: Gene family evolution is driven by evolutionary events such as speciation, gene duplication, horizontal gene transfer and gene loss, and inferring these events in the evolutionary history of a given gene family is a fundamental problem in comparative and evolutionary genomics with numerous important applications. Solving this problem requires the use of a reconciliation framework, where the input consists of a gene family phylogeny and the corresponding species phylogeny, and the goal is to reconcile the two by postulating speciation, gene duplication, horizontal gene transfer and gene loss events. This reconciliation problem is referred to as duplication-transfer-loss (DTL) reconciliation and has been extensively studied in the literature. Yet, even the fastest existing algorithms for DTL reconciliation are too slow for reconciling large gene families and for use in more sophisticated applications such as gene tree or species tree reconstruction. Results: We present two new algorithms for the DTL reconciliation problem that are dramatically faster than existing algorithms, both asymptotically and in practice. We also extend the standard DTL reconciliation model by considering distance-dependent transfer costs, which allow for more accurate reconciliation and give an efficient algorithm for DTL reconciliation under this extended model. We implemented our new algorithms and demonstrated up to 100 000-fold speed-up over existing methods, using both simulated and biological datasets. This dramatic improvement makes it possible to use DTL reconciliation for performing rigorous evolutionary analyses of large gene families and enables its use in advanced reconciliation-based gene and species tree reconstruction methods. Availability: Our programs can be freely downloaded from http://compbio.mit.edu/ranger-dtl/. Contact: mukul@csail.mit.edu; manoli@mit.edu Supplementary information: Supplementary data are available at Bioinformatics online.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: not found

            Unified modeling of gene duplication, loss, and coalescence using a locus tree.

            Gene phylogenies provide a rich source of information about the way evolution shapes genomes, populations, and phenotypes. In addition to substitutions, evolutionary events such as gene duplication and loss (as well as horizontal transfer) play a major role in gene evolution, and many phylogenetic models have been developed in order to reconstruct and study these events. However, these models typically make the simplifying assumption that population-related effects such as incomplete lineage sorting (ILS) are negligible. While this assumption may have been reasonable in some settings, it has become increasingly problematic as increased genome sequencing has led to denser phylogenies, where effects such as ILS are more prominent. To address this challenge, we present a new probabilistic model, DLCoal, that defines gene duplication and loss in a population setting, such that coalescence and ILS can be directly addressed. Interestingly, this model implies that in addition to the usual gene tree and species tree, there exists a third tree, the locus tree, which will likely have many applications. Using this model, we develop the first general reconciliation method that accurately infers gene duplications and losses in the presence of ILS, and we show its improved inference of orthologs, paralogs, duplications, and losses for a variety of clades, including flies, fungi, and primates. Also, our simulations show that gene duplications increase the frequency of ILS, further illustrating the importance of a joint model. Going forward, we believe that this unified model can offer insights to questions in both phylogenetics and population genetics.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found

              Simultaneous identification of duplications and lateral gene transfers.

              The incongruency between a gene tree and a corresponding species tree can be attributed to evolutionary events such as gene duplication and gene loss. This paper describes a combinatorial model where so-called DTL-scenarios are used to explain the differences between a gene tree and a corresponding species tree taking into account gene duplications, gene losses, and lateral gene transfers (also known as horizontal gene transfers). The reasonable biological constraint that a lateral gene transfer may only occur between contemporary species leads to the notion of acyclic DTL-scenarios. Parsimony methods are introduced by defining appropriate optimization problems. We show that finding most parsimonious acyclic DTL-scenarios is NP-hard. However, by dropping the condition of acyclicity, the problem becomes tractable, and we provide a dynamic programming algorithm as well as a fixed-parameter tractable algorithm for finding most parsimonious DTL-scenarios.
                Bookmark

                Author and article information

                Contributors
                dbork@g.hmc.edu
                ricsonc@andrew.cmu.edu
                jiwang@g.hmc.edu
                jsung@g.hmc.edu
                hadas@cs.hmc.edu
                Journal
                Algorithms Mol Biol
                Algorithms Mol Biol
                Algorithms for Molecular Biology : AMB
                BioMed Central (London )
                1748-7188
                14 March 2017
                14 March 2017
                2017
                : 12
                : 6
                Affiliations
                [1 ]ISNI 0000 0000 8935 1843, GRID grid.256859.5, Department of Computer Science, , Harvey Mudd College, ; Claremont, USA
                [2 ]ISNI 0000 0004 1936 9000, GRID grid.21925.3d, School of Medicine, , University of Pittsburgh, ; Pittsburgh, USA
                [3 ]ISNI 0000 0001 2097 0344, GRID grid.147455.6, School of Computer Science, , Carnegie Mellon University, ; Pittsburgh, USA
                Author information
                http://orcid.org/0000-0001-9120-1948
                Article
                98
                10.1186/s13015-017-0098-8
                5349084
                a690becc-2018-4dda-ad24-89e618ae58c9
                © The Author(s) 2017

                Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                : 22 August 2016
                : 5 February 2017
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/100000001, National Science Foundation;
                Award ID: IIS-1419739
                Award Recipient :
                Categories
                Research
                Custom metadata
                © The Author(s) 2017

                Molecular biology
                phylogenetic reconciliation,duplication-loss-coalescence model,np-hardness,apx-hardness

                Comments

                Comment on this article