      Information-driven protein–DNA docking using HADDOCK: it is a matter of flexibility

      research-article


          Abstract

          Intrinsic flexibility of DNA has hampered the development of efficient protein–DNA docking methods. In this study we extend HADDOCK (High Ambiguity Driven DOCKing) [C. Dominguez, R. Boelens and A. M. J. J. Bonvin (2003) J. Am. Chem. Soc. 125, 1731–1737] to explicitly deal with DNA flexibility. HADDOCK uses non-structural experimental data to drive the docking during a rigid-body energy minimization stage, followed by semi-flexible and water refinement stages. The two refinement stages allow for flexibility of all DNA nucleotides and of the protein residues at the predicted interface. We evaluated our approach on the monomeric repressor–DNA complexes formed by bacteriophage 434 Cro, the Escherichia coli Lac headpiece and bacteriophage P22 Arc. Starting from unbound proteins and canonical B-DNA, we predict the correct spatial disposition of the complexes and the specific conformation of the DNA in the published complexes. This information is subsequently used to generate a library of pre-bent and twisted DNA structures that serve as input for a second docking round. The resulting top-ranking solutions exhibit high similarity to the published complexes in terms of root mean square deviations, intermolecular contacts and DNA conformation. Our two-stage docking method is thus able to predict protein–DNA complexes from unbound constituents using non-structural experimental data to drive the docking.
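
          The abstract compares docked models with the published complexes by root mean square deviation (RMSD). As a rough illustration of that metric only, the following minimal sketch computes the RMSD between two sets of atomic coordinates. It assumes the two structures are already superimposed and that atoms are listed in the same order; the function name `rmsd` and the coordinate lists are hypothetical, and this is not the code used by HADDOCK or by the authors.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized coordinate lists.

    Each list holds (x, y, z) tuples, typically in angstroms. The function
    assumes the structures are already superimposed and the atoms are paired
    by position in the list; it performs no fitting itself.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")

    # Sum of squared distances between paired atoms.
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2

    # Mean over atoms, then square root.
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Hypothetical toy coordinates: a model shifted by 1 unit along x.
    reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    model = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (4.0, 0.0, 0.0)]
    print(rmsd(reference, model))
```

          In practice an optimal superposition (for example over the interface atoms) would normally be carried out before the deviation is measured; that step is deliberately left out of this sketch.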


                Author and article information

                Journal
                Nucleic Acids Research (Nucleic Acids Res)
                Oxford University Press
                ISSN: 0305-1048, 1362-4962
                4 July 2006
                Volume 34, Issue 11, Pages 3317-3325
                Affiliations
                NMR Spectroscopy Research Group, Bijvoet Center for Biomolecular Research, Faculty of Sciences, Utrecht University, The Netherlands
                1 Department of Biochemistry and Biophysics, Oregon State University, Corvallis, USA
                Author notes
                *To whom correspondence may be addressed. Tel: +31 30 2533859; Fax: +31 30 2537623; Email: a.m.j.j.bonvin@chem.uu.nl
                Article
                DOI: 10.1093/nar/gkl412
                PMC ID: 1500871
                PMID: 16820531
                bd00879d-8d0f-4376-9ea5-a5f6f75ba202
                © 2006 The Author(s)

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 27 April 2006
                : 16 May 2006
                : 18 May 2006
                Categories
                Article

                Genetics