      M-ORBIS: Mapping of mOleculaR Binding sItes and Surfaces

      research-article
      Nucleic Acids Research
      Oxford University Press

          Abstract

          M-ORBIS is a Molecular Cartography approach that performs integrative high-throughput analysis of structural data to localize all types of binding sites and associated partners by homology and to characterize their properties and behaviors in a systemic way. The robustness of our binding site inferences was compared to four curated datasets corresponding to protein heterodimers and homodimers and protein–DNA/RNA assemblies. The Molecular Cartographies of structurally well-detailed proteins show that 44% of their surfaces interact with non-solvent partners. Residue contact frequencies with water suggest that ∼86% of their surfaces are transiently solvated, whereas only 15% are specifically solvated. Our analysis also reveals the existence of two major binding site families: specific binding sites, which can only bind one type of molecule (protein, DNA, RNA, etc.), and polyvalent binding sites, which can bind several distinct types of molecule. Specific homodimer binding sites are, for instance, nearly twice as hydrophobic as previously described and more closely resemble the protein core, while polyvalent binding sites able to form homo- and heterodimers more closely resemble the surfaces involved in crystal packing. Similarly, the regions able to bind DNA and to alternatively form homodimers are more hydrophobic and less polar than previously described DNA binding sites.
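The distinction between specific and polyvalent binding sites can be made concrete with a minimal sketch. It only restates the definition given in the abstract and is not the M-ORBIS implementation: the function name, the partner-type strings, and the "unbound" label for sites with no non-solvent partner are all hypothetical.

```python
from typing import Iterable


def classify_binding_site(partner_types: Iterable[str]) -> str:
    """Label a site by the number of distinct partner types it contacts.

    Hypothetical helper: partner types are plain strings such as
    "protein", "DNA" or "RNA"; this is an assumption, not M-ORBIS's format.
    """
    # Solvent is set aside, since the abstract classifies sites by
    # their non-solvent partners.
    distinct = {p.lower() for p in partner_types} - {"water"}
    if not distinct:
        return "unbound"  # assumed label for sites with no non-solvent partner
    # One partner type -> specific; several distinct types -> polyvalent.
    return "specific" if len(distinct) == 1 else "polyvalent"
```

Under these assumptions, a site seen only in protein-protein contacts would be labelled specific, while a site seen with both DNA and protein partners would be labelled polyvalent.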

          Most cited references (38)

          Amino acid substitution matrices from protein blocks.

          Methods for alignment of protein sequences typically measure similarity by using a substitution matrix with scores for all possible exchanges of one amino acid with another. The most widely used matrices are based on the Dayhoff model of evolutionary rates. Using a different approach, we have derived substitution matrices from about 2000 blocks of aligned sequence segments characterizing more than 500 groups of related proteins. This led to marked improvements in alignments and in searches using queries from each of the groups.
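As a rough illustration of how a substitution matrix can be derived from aligned blocks, the sketch below counts residue pairs column by column and converts observed versus expected pair frequencies into rounded log-odds scores in half-bit units. It is a simplified sketch under stated assumptions, not the published procedure: it skips the clustering of similar sequences within blocks, and the function name and toy input are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def log_odds_scores(blocks):
    """Return {(a, b): score} in half-bit units from ungapped aligned blocks.

    Each block is a list of equal-length strings. Keys are sorted residue
    pairs, so ("A", "S") also stands for ("S", "A").
    """
    pair_counts = Counter()
    for block in blocks:
        for column in zip(*block):  # one alignment column at a time
            for a, b in combinations(column, 2):
                pair_counts[tuple(sorted((a, b)))] += 1

    total = sum(pair_counts.values())
    # Observed frequency of each unordered residue pair.
    q = {pair: n / total for pair, n in pair_counts.items()}

    # Frequency of each residue among the observed pairs.
    p = Counter()
    for (a, b), freq in q.items():
        if a == b:
            p[a] += freq
        else:
            p[a] += freq / 2
            p[b] += freq / 2

    scores = {}
    for (a, b), freq in q.items():
        # Expected pair frequency if residues paired independently.
        expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = round(2 * math.log2(freq / expected))
    return scores
```

For example, `log_odds_scores([["AC", "AC", "SC"]])` scores a toy block of three two-residue segments; pairs that never occur in the input are simply absent from the result.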

            UniRef: comprehensive and non-redundant UniProt reference clusters.

            Redundant protein sequences in biological databases hinder sequence similarity searches and make interpretation of search results difficult. Clustering of protein sequence space based on sequence similarity helps organize all sequences into manageable datasets and reduces sampling bias and overrepresentation of sequences. The UniRef (UniProt Reference Clusters) provide clustered sets of sequences from the UniProt Knowledgebase (UniProtKB) and selected UniProt Archive records to obtain complete coverage of sequence space at several resolutions while hiding redundant sequences. Currently covering >4 million source sequences, the UniRef100 database combines identical sequences and subfragments from any source organism into a single UniRef entry. UniRef90 and UniRef50 are built by clustering UniRef100 sequences at the 90 or 50% sequence identity levels. UniRef100, UniRef90 and UniRef50 yield a database size reduction of approximately 10, 40 and 70%, respectively, from the source sequence set. The reduced redundancy increases the speed of similarity searches and improves detection of distant relationships. UniRef entries contain summary cluster and membership information, including the sequence of a representative protein, member count and common taxonomy of the cluster, the accession numbers of all the merged entries and links to rich functional annotation in UniProtKB to facilitate biological discovery. UniRef has already been applied to broad research areas ranging from genome annotation to proteomics data analysis. UniRef is updated biweekly and is available for online search and retrieval at http://www.uniprot.org, as well as for download at ftp://ftp.uniprot.org/pub/databases/uniprot/uniref. Supplementary data are available at Bioinformatics online.

              Protein structure alignment by incremental combinatorial extension (CE) of the optimal path.

              A new algorithm is reported which builds an alignment between two protein structures. The algorithm involves a combinatorial extension (CE) of an alignment path defined by aligned fragment pairs (AFPs) rather than the more conventional techniques using dynamic programming and Monte Carlo optimization. AFPs, as the name suggests, are pairs of fragments, one from each protein, which confer structure similarity. AFPs are based on local geometry, rather than global features such as orientation of secondary structures and overall topology. Combinations of AFPs that represent possible continuous alignment paths are selectively extended or discarded thereby leading to a single optimal alignment. The algorithm is fast and accurate in finding an optimal structure alignment and hence suitable for database scanning and detailed analysis of large protein families. The method has been tested and compared with results from Dali and VAST using a representative sample of similar structures. Several new structural similarities not detected by these other methods are reported. Specific one-on-one alignments and searches against all structures as found in the Protein Data Bank (PDB) can be performed via the Web at http://cl.sdsc.edu/ce.html.

                Author and article information

                Journal
                Title: Nucleic Acids Research (Nucleic Acids Res)
                Publisher: Oxford University Press
                ISSN: 0305-1048 (print), 1362-4962 (online)
                Published: January 2011 (online 2 September 2010)
                Volume: 39
                Issue: 1
                Pages: 30-43
                Affiliations
                1 Department of Biology and Structural Genomics, IGBMC, Illkirch, 67404 and 2 Department of Structural Bioinformatics, BIONEXT, Boulogne Billancourt, 92100, France
                Author notes
                *To whom correspondence should be addressed. Tel: +33 3 88 65 32 94; Fax: +33 3 88 65 32 76; Email: poch@igbmc.fr
                Article
                Publisher ID: gkq736
                DOI: 10.1093/nar/gkq736
                PMCID: PMC3017595
                PMID: 20813758
                3f290a1b-6569-4b99-bcf4-aa1baf9a6f70
                © The Author(s) 2010. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 3 May 2010
                Revised: 31 July 2010
                Accepted: 3 August 2010
                Categories
                Computational Biology

                Genetics
