
Proteins with Complex Architecture as Potential Targets for Drug Design: A Case Study of Mycobacterium tuberculosis



      Lengthy co-evolution of Homo sapiens and Mycobacterium tuberculosis, the main causative agent of tuberculosis, resulted in a dramatically successful pathogen species that presents a considerable challenge for modern medicine. The continuous and ever-increasing appearance of multi-drug resistant mycobacteria necessitates the identification of novel drug targets and drugs with new mechanisms of action. However, further insights are needed to establish automated protocols for target selection based on the available complete genome sequences. In the present study, we perform complete proteome-level comparisons between M. tuberculosis, other mycobacteria, other prokaryotes and available eukaryotes based on protein domains, local sequence similarities and protein disorder. We show that the enrichment of certain domains in the genome can indicate an important function specific to M. tuberculosis. We identified two families, termed pkn and PE/PPE, that stand out in this respect. The common property of these two protein families is a complex domain organization that combines species-specific regions, commonly occurring domains and disordered segments. Besides highlighting promising novel drug target candidates in M. tuberculosis, the presented analysis can also be viewed as a general protocol to identify proteins involved in species-specific functions in a given organism. We conclude that target selection protocols should be extended to include proteins with complex domain architectures instead of focusing only on sequentially unique and essential proteins.
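      The idea of domain enrichment mentioned in the abstract can be illustrated with a small calculation. The following is a minimal sketch, not the authors' protocol: it assumes that per-proteome domain counts are already available (for example from Pfam annotations), and it swaps in a one-sided hypergeometric test as the enrichment measure. The function names, the domain labels and all numbers are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) for X ~ Hypergeometric(population M, n marked items, N draws)."""
    upper = min(n, N)
    total = comb(M, N)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def domain_enrichment(target_counts, target_size, background_counts, background_size):
    """Return {domain: (fold_enrichment, p_value)} for each domain in the target.

    target_counts / background_counts map a domain name to the number of
    proteins carrying that domain (hypothetical input format).
    target_size / background_size are the total protein counts.
    """
    results = {}
    population = target_size + background_size
    for domain, k in target_counts.items():
        b = background_counts.get(domain, 0)
        # Fold enrichment: domain frequency in target vs. background.
        fold = (k / target_size) / (b / background_size) if b else float("inf")
        # Chance of seeing at least k carriers in the target proteome if the
        # k + b carriers were spread at random over all proteins.
        p_value = hypergeom_sf(k, population, k + b, target_size)
        results[domain] = (fold, p_value)
    return results


if __name__ == "__main__":
    # Toy numbers, not real proteome data.
    target = {"domain_A": 10, "domain_B": 2}
    background = {"domain_A": 10, "domain_B": 25}
    for name, (fold, p) in domain_enrichment(target, 100, background, 1000).items():
        print(f"{name}: fold={fold:.2f}, p={p:.3g}")
```

      A domain with a high fold enrichment and a small p-value would be a candidate for closer inspection; whether it marks a species-specific function still has to be judged from domain architecture and disorder, as the abstract describes.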

      Author Summary

      Mycobacterium tuberculosis (MTB), the causative agent of TB, is a dramatically successful pathogen that poses a considerable challenge for modern medicine. The increase in multi-drug resistant TB necessitates the identification of novel drug targets and drugs with new mechanisms of action. In this work, we developed a novel computational strategy based on comparative proteomic analysis that can highlight proteins involved in species-specific functions. Our analyses of the proteins encoded by the MTB genome identified two protein families that stand out in this respect. These proteins have a complex architecture combining various domains and disordered segments. They are also involved in vital functions, especially in host-pathogen interactions. Although these proteins generally do not fit into traditional drug design paradigms, several new strategies are emerging that can be used to target them during drug development. Our results challenge current target selection protocols that largely rely on the uniqueness and the essentiality of proteins. Instead, these findings emphasize the importance of complex evolutionary scenarios that can lead to the emergence of species-specific functions from more ancient building blocks of proteins. The experience gained from this work has important implications specifically for targeting MTB and, in broader terms, for improving current target selection protocols in drug development.


      Most cited references 99


      Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

      The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.

        Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence.

        Countless millions of people have died from tuberculosis, a chronic infectious disease caused by the tubercle bacillus. The complete genome sequence of the best-characterized strain of Mycobacterium tuberculosis, H37Rv, has been determined and analysed in order to improve our understanding of the biology of this slow-growing pathogen and to help the conception of new prophylactic and therapeutic interventions. The genome comprises 4,411,529 base pairs, contains around 4,000 genes, and has a very high guanine + cytosine content that is reflected in the biased amino-acid content of the proteins. M. tuberculosis differs radically from other bacteria in that a very large portion of its coding capacity is devoted to the production of enzymes involved in lipogenesis and lipolysis, and to two new families of glycine-rich proteins with a repetitive structure that may represent a source of antigenic variation.

          The Pfam protein families database

          Pfam is a widely used database of protein families and domains. This article describes a set of major updates that we have implemented in the latest release (version 24.0). The most important change is that we now use HMMER3, the latest version of the popular profile hidden Markov model package. This software is ∼100 times faster than HMMER2 and is more sensitive due to the routine use of the forward algorithm. The move to HMMER3 has necessitated numerous changes to Pfam that are described in detail. Pfam release 24.0 contains 11 912 families, of which a large number have been significantly updated during the past two years. Pfam is available via servers in the UK, the USA and Sweden.

            Author and article information

            [1 ]Institute of Enzymology, Hungarian Academy of Sciences, Budapest, Hungary
            [2 ]Department of Applied Biotechnology, Budapest University of Technology and Economics, Budapest, Hungary
            University of Heidelberg, Germany
            Author notes

            Conceived and designed the experiments: BM JT BGV ZD IS. Performed the experiments: BM ZD. Analyzed the data: BM ZD. Wrote the paper: BM JT BGV ZD IS.

            Role: Editor
            PLoS Computational Biology (PLoS Comput Biol)
            Public Library of Science (San Francisco, USA)
            Published: 21 July 2011
            Volume: 7
            Issue: 7
            Mészáros et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
            Pages: 14
            Research Article
            Computational Biology
            Comparative Genomics
            Sequence Analysis
            Systems Biology

            Quantitative & Systems biology

