M-ORBIS: Mapping of mOleculaR Binding sItes and Surfaces

Abstract

M-ORBIS is a Molecular Cartography approach that performs integrative high-throughput analysis of structural data to localize all types of binding sites and associated partners by homology and to characterize their properties and behaviors in a systemic way. The robustness of our binding site inferences was evaluated against four curated datasets corresponding to protein heterodimers and homodimers and protein–DNA/RNA assemblies. The Molecular Cartographies of structurally well-detailed proteins show that 44% of their surfaces interact with non-solvent partners. Residue contact frequencies with water suggest that ∼86% of their surfaces are transiently solvated, whereas only 15% are specifically solvated. Our analysis also reveals the existence of two major binding site families: specific binding sites, which can bind only one type of molecule (protein, DNA, RNA, etc.), and polyvalent binding sites, which can bind several distinct types of molecule. Specific homodimer binding sites are, for instance, nearly twice as hydrophobic as previously described and more closely resemble the protein core, while polyvalent binding sites able to form homo- and heterodimers more closely resemble the surfaces involved in crystal packing. Similarly, the regions able to bind DNA and to alternatively form homodimers are more hydrophobic and less polar than previously described DNA binding sites.
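
The distinction between specific and polyvalent binding sites can be illustrated with a minimal sketch. It is not the M-ORBIS pipeline: the function name `classify_binding_residues`, the `(residue_id, partner_type)` input format and the choice to ignore water contacts are all assumptions made for illustration, and the example works on single residues rather than on whole binding sites.

```python
from collections import defaultdict


def classify_binding_residues(contacts):
    """Label residues as 'specific' or 'polyvalent' (hypothetical helper).

    contacts: iterable of (residue_id, partner_type) pairs, e.g. as might be
    gathered from homologous structures. Water contacts are skipped because,
    in this sketch, solvent is not counted as a binding partner.
    """
    partner_types = defaultdict(set)
    for residue_id, partner_type in contacts:
        if partner_type != "water":
            partner_types[residue_id].add(partner_type)

    # One distinct partner type -> specific; several -> polyvalent.
    return {
        residue_id: "specific" if len(types) == 1 else "polyvalent"
        for residue_id, types in partner_types.items()
    }


# Toy input: residue identifiers and partner types are invented.
example_contacts = [
    ("A:10", "protein"),
    ("A:10", "protein"),
    ("A:11", "protein"),
    ("A:11", "DNA"),
    ("A:12", "water"),
]
labels = classify_binding_residues(example_contacts)
```

In this toy input, residue `A:10` contacts only proteins and is labelled specific, `A:11` contacts both protein and DNA and is labelled polyvalent, and `A:12` contacts only water and receives no label.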

Author and article information

Journal

Journal ID (nlm-ta): Nucleic Acids Res

Journal ID (publisher-id): nar

Journal ID (hwp): nar

Title: Nucleic Acids Research

Publisher: Oxford University Press

ISSN (Print): 0305-1048

ISSN (Electronic): 1362-4962

Publication date Collection: January 2011

Publication date (Print): January 2011

Publication date (Electronic): 2 September 2010

Publication date PMC-release: 2 September 2010

Volume: 39

Issue: 1

Pages: 30-43

Affiliations

¹Department of Biology and Structural Genomics, IGBMC, Illkirch, 67404 and ²Department of Structural Bioinformatics, BIONEXT, Boulogne Billancourt, 92100, France

Author notes

*To whom correspondence should be addressed. Tel: +33 3 88 65 32 94; Fax: +33 3 88 65 32 76; Email: poch@igbmc.fr

Article

Publisher ID: gkq736

DOI: 10.1093/nar/gkq736

PMC ID: 3017595

PubMed ID: 20813758

SO-VID: 3f290a1b-6569-4b99-bcf4-aa1baf9a6f70

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 3 May 2010

Date revision received: 31 July 2010

Date accepted: 3 August 2010
