
      What makes species unique? The contribution of proteins with obscure features

      research-article


          Abstract

          An analysis of proteins with obscure features in ten eukaryotic genomes revealed that the majority are species-specific.

          Background

          Proteins with obscure features (POFs), which lack currently defined motifs or domains, represent between 18% and 38% of a typical eukaryotic proteome. To evaluate the contribution of this class of proteins to the diversity of eukaryotes, we performed a comparative analysis of the predicted proteomes derived from 10 different sequenced genomes, including budding and fission yeast, worm, fly, mosquito, Arabidopsis, rice, mouse, rat, and human.

          Results

          Only 1,650 protein groups were found to be conserved among these proteomes (BLAST E-value threshold of 10⁻⁶). Of these, only three were designated as POFs. Surprisingly, we found that, on average, 60% of the POFs identified in these 10 proteomes (44,236 in total) were species-specific. In contrast, only 7.5% of the proteins with defined features (PDFs) were species-specific (17,554 in total). As a group, POFs appear similar to PDFs in their relative contribution to biological functions, as indicated by their expression, participation in protein-protein interactions, and association with mutant phenotypes. However, POFs have more predicted disordered structure than PDFs, implying that they may exhibit preferential involvement in species-specific regulatory and signaling networks.
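
          The species-specificity call described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract gives only the BLAST E-value cutoff, so the function name, the input layout (a protein-to-species mapping plus query/subject/E-value tuples), and the toy data are all assumptions made for illustration. A protein is treated as species-specific when it has no hit in another species at or below the cutoff.

```python
# Hypothetical sketch: flag proteins with no cross-species BLAST hit
# at or below an E-value cutoff. Not the published method.

E_VALUE_THRESHOLD = 1e-6  # cutoff quoted in the abstract


def species_specific(proteins, hits, threshold=E_VALUE_THRESHOLD):
    """Return the set of protein IDs lacking a cross-species hit.

    proteins: dict mapping protein ID -> species name (assumed layout)
    hits: iterable of (query_id, subject_id, e_value) tuples (assumed layout)
    """
    conserved = set()
    for query, subject, e_value in hits:
        if e_value > threshold:
            continue  # hit too weak to count as conservation
        if query not in proteins or subject not in proteins:
            continue  # ignore hits involving unknown proteins
        if proteins[query] != proteins[subject]:
            conserved.add(query)  # query has a match in another species
    return set(proteins) - conserved


# Toy data, invented for illustration only.
proteins = {"y1": "yeast", "y2": "yeast", "h1": "human", "h2": "human"}
hits = [
    ("y1", "h1", 1e-30),  # strong cross-species hit
    ("h1", "y1", 1e-28),  # strong cross-species hit
    ("y2", "h2", 1e-3),   # cross-species, but above the cutoff
    ("h2", "h1", 1e-50),  # strong, but within the same species
]

specific = species_specific(proteins, hits)
print(sorted(specific))  # ['h2', 'y2']
```

          In this toy case "y2" is flagged because its only cross-species hit is weaker than the cutoff, and "h2" because its only hit is to a protein from the same species. Dividing the flagged count by the number of proteins in a class (POF or PDF) would give a per-class percentage of the kind reported in the abstract.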

          Conclusion

          Because the majority of eukaryotic POFs are not well conserved, and by definition do not have defined domains or motifs upon which to formulate a functional working hypothesis, understanding their biochemical and biological functions will require species-specific investigations.


                Author and article information

                Journal
                Genome Biol
                Genome Biology
                BioMed Central (London)
                1465-6906
                1465-6914
                2006
                19 July 2006
                Volume: 7
                Issue: 7
                Article: R57
                Affiliations
                [1] Department of Biochemistry and Molecular Biology, University of Nevada, Reno, NV 89557, USA
                [2] Center for Plant Cell Biology, University of California, Riverside, CA 92521, USA
                Article
                gb-2006-7-7-r57
                10.1186/gb-2006-7-7-r57
                1779552
                16859532
                Copyright © 2006 Gollery et al.; licensee BioMed Central Ltd.

                This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 6 March 2006
                Revised: 28 April 2006
                Accepted: 27 June 2006
                Categories
                Research

                Genetics
