Protein evolution of ANTP and PRD homeobox genes

Abstract

Background

Although homeobox genes have been the subject of many studies, little is known about the main amino acid changes that occurred early in the evolution of genes belonging to different classes.

Results

In this study, we report a method for the fast and efficient retrieval of sequences belonging to the ANTP (HOXL and NKL) and PRD classes. Furthermore, we look for diagnostic amino acid residues that can be used to distinguish HOXL, NKL and PRD genes.
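The idea of a diagnostic residue can be illustrated with a minimal sketch. It is not the authors' method: it assumes a strict rule (a residue is diagnostic for a class when it is invariant at an alignment position within that class and absent from that position in every other class), and the function name `diagnostic_positions` and the toy five-residue sequences are hypothetical, not real homeodomain data.

```python
from collections import defaultdict


def diagnostic_positions(alignment):
    """Find class-specific residues in a grouped alignment.

    alignment: dict mapping class name -> list of aligned sequences
               (all sequences must have the same length).
    Returns a dict mapping class name -> list of (position, residue),
    with 1-based positions.
    """
    length = len(next(iter(alignment.values()))[0])
    result = defaultdict(list)
    for pos in range(length):
        # Residues observed at this column, per class.
        column = {cls: {seq[pos] for seq in seqs} for cls, seqs in alignment.items()}
        for cls, residues in column.items():
            if len(residues) != 1:
                continue  # not invariant within the class
            (res,) = residues
            others = set().union(*(r for c, r in column.items() if c != cls))
            # Assumed rule: gaps are never diagnostic.
            if res != "-" and res not in others:
                result[cls].append((pos + 1, res))
    return dict(result)


# Toy, made-up sequences (hypothetical, for illustration only).
toy = {
    "HOXL": ["RKAYT", "RKAFT"],
    "NKL": ["RRAYS", "RRAFS"],
    "PRD": ["QRAHT", "QRAHT"],
}
found = diagnostic_positions(toy)
print(found)
```

A real analysis would likely relax the strict invariance rule, for example by tolerating rare exceptions or by grouping residues with similar physico-chemical properties.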

Conclusion

The reported protein features will facilitate the robust classification of homeobox genes from newly sequenced bilaterian genomes. Nevertheless, our findings must be applied cautiously to non-bilaterian genomes. In principle, as long as a good manually curated data set is available, the approach described here can be applied to non-bilaterian organisms as well. Our results help focus experimental studies on the biochemical functions of key homeodomain residues in different gene classes.

Most cited references 14

Protein sequence alignments: a strategy for the hierarchical analysis of residue conservation.

C Livingstone, G Barton (1993)

An algorithm is described for the systematic characterization of the physico-chemical properties seen at each position in a multiple protein sequence alignment. The new algorithm allows questions important in the design of mutagenesis experiments to be quickly answered since positions in the alignment that show unusual or interesting residue substitution patterns may be rapidly identified. The strategy is based on a flexible set-based description of amino acid properties, which is used to define the conservation between any group of amino acids. Sequences in the alignment are gathered into subgroups on the basis of sequence similarity, functional, evolutionary or other criteria. All pairs of subgroups are then compared to highlight positions that confer the unique features of each subgroup. The algorithm is encoded in the computer program AMAS (Analysis of Multiply Aligned Sequences) which provides a textual summary of the analysis and an annotated (boxed, shaded and/or coloured) multiple sequence alignment. The algorithm is illustrated by application to an alignment of 67 SH2 domains where patterns of conserved hydrophobic residues that constitute the protein core are highlighted. The analysis of charge conservation across annexin domains identifies the locations at which conserved charges change sign. The algorithm simplifies the analysis of multiple sequence data by condensing the mass of information present, and thus allows the rapid identification of substitutions of structural and functional importance.
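A set-based conservation measure of the kind described above can be sketched as follows. This is an illustration, not the AMAS implementation: the property table `PROPS` is a deliberately simplified assumption covering only nine amino acids and seven properties, not the published table, and the function name `conservation` is hypothetical. The score counts the properties on which all residues in a column agree, either all having the property or all lacking it.

```python
# Simplified, assumed property assignments (not the published AMAS table).
PROPS = {
    "I": {"hydrophobic", "aliphatic"},
    "L": {"hydrophobic", "aliphatic"},
    "V": {"hydrophobic", "aliphatic"},
    "F": {"hydrophobic", "aromatic"},
    "K": {"polar", "charged", "positive"},
    "R": {"polar", "charged", "positive"},
    "D": {"polar", "charged", "negative"},
    "E": {"polar", "charged", "negative"},
    "S": {"polar"},
}
ALL_PROPS = set().union(*PROPS.values())


def conservation(residues):
    """Count properties on which every residue in the column agrees."""
    sets = [PROPS[r] for r in residues]
    shared = set.intersection(*sets)   # properties every residue has
    present = set.union(*sets)         # properties at least one residue has
    disagreed = present - shared       # some have it, some do not
    return len(ALL_PROPS - disagreed)


for column in ("ILV", "KR", "KD", "IF", "IK"):
    print(column, conservation(column))
```

With this toy table, a column of K and D still agrees on being polar and charged but disagrees on the sign of the charge, which is the kind of change the abstract mentions for annexin domains.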

Rapid and sensitive sequence comparison with FASTP and FASTA.

W R Pearson (1990)

The FASTA program can search the NBRF protein sequence library (2.5 million residues) in less than 20 min on an IBM-PC microcomputer and unambiguously detect proteins that shared a common ancestor billions of years in the past. FASTA is both fast and selective because it initially considers only amino acid identities. Its sensitivity is increased not only by using the PAM250 matrix to score and rescore regions with large numbers of identities but also by joining initial regions. The results of searches with FASTA compare favorably with results using NWS-based programs that are 100 times slower. FASTA is slightly less sensitive but considerably more selective. It is not clear that NWS-based programs would be more successful in finding distantly related members of the G-protein-coupled receptor family. The joining step by FASTA to calculate the initn score is especially useful for sequences that share regions of sequence similarity that are separated by variable-length loops. FASTP and FASTA were designed to identify protein sequences that have descended from a common ancestor, and they have proved very useful for this task. In many cases, a FASTA sequence search will result in a list of high scoring library sequences that are homologous to the query sequence, or the search will result in a list of sequences with similarity scores that cannot be distinguished from the bulk of the library. In either case, the question of whether there are sequences in the library that are clearly related to the query sequence has been answered unambiguously. Unfortunately, the results often will not be so clear-cut, and careful analysis of similarity scores, statistical significance, the actual aligned residues, and the biological context are required. 
In the course of analyzing the G-protein-coupled receptor family, several proteins were found that, because of a high initn score and a low init1 score that increased almost 2-fold with optimization, appeared to be members of this family which were not previously recognized. RDF2 analysis showed borderline z values, and only a careful examination of the sequence alignments that focused on the conserved residues provided convincing evidence that the high scores were fortuitous. As sequence comparison methods become more powerful by becoming more sensitive, they become more likely to mislead, and even greater care is required.
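The first stage described above, which considers only identities, can be sketched as a word-lookup step that counts identical short words along each alignment diagonal. This is a simplified illustration under stated assumptions, not the FASTA source code: the function name `diagonal_hits` and the word-length parameter `ktup` are chosen here for illustration, and the later steps (rescoring with PAM250, joining regions into an initn score, optimization) are omitted.

```python
from collections import Counter, defaultdict


def diagonal_hits(query, subject, ktup=2):
    """Count identical words of length ktup on each diagonal.

    A diagonal is identified by (query offset - subject offset).
    Diagonals with many hits mark candidate regions of similarity.
    """
    # Index every word of the query by its start position.
    index = defaultdict(list)
    for i in range(len(query) - ktup + 1):
        index[query[i:i + ktup]].append(i)

    # Scan the subject and record which diagonal each word match falls on.
    hits = Counter()
    for j in range(len(subject) - ktup + 1):
        for i in index.get(subject[j:j + ktup], ()):
            hits[i - j] += 1
    return hits


# Made-up sequences: the subject contains the query shifted by two residues.
hits = diagonal_hits("ACDEFG", "XXACDEFG")
print(hits.most_common(1))
```

Because only exact word matches are looked up, this stage is fast; sensitivity comes from the later rescoring and joining steps that the abstract describes.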

Classification and nomenclature of all human homeobox genes

Peter WH Holland, H Booth, Elspeth Bruford (2007)

Background: The homeobox genes are a large and diverse group of genes, many of which play important roles in the embryonic development of animals. Increasingly, homeobox genes are being compared between genomes in an attempt to understand the evolution of animal development. Despite their importance, the full diversity of human homeobox genes has not previously been described. Results: We have identified all homeobox genes and pseudogenes in the euchromatic regions of the human genome, finding many unannotated, incorrectly annotated, unnamed, misnamed or misclassified genes and pseudogenes. We describe 300 human homeobox loci, which we divide into 235 probable functional genes and 65 probable pseudogenes. These totals include 3 genes with partial homeoboxes and 13 pseudogenes that lack homeoboxes but are clearly derived from homeobox genes. These figures exclude the repetitive DUX1 to DUX5 homeobox sequences of which we identified 35 probable pseudogenes, with many more expected in heterochromatic regions. Nomenclature is established for approximately 40 formerly unnamed loci, reflecting their evolutionary relationships to other loci in human and other species, and nomenclature revisions are proposed for around 30 other loci. We use a classification that recognizes 11 homeobox gene 'classes' subdivided into 102 homeobox gene 'families'. Conclusion: We have conducted a comprehensive survey of homeobox genes and pseudogenes in the human genome, described many new loci, and revised the classification and nomenclature of homeobox genes. The classification scheme may be widely applicable to homeobox genes in other animal genomes and will facilitate comparative genomics of this important gene superclass.

Author and article information

Journal

Journal ID (nlm-ta): BMC Evol Biol

Title: BMC Evolutionary Biology

Publisher: BioMed Central

ISSN (Electronic): 1471-2148

Publication date (Collection): 2008

Publication date (Electronic): 11 July 2008

Volume: 8

Page: 200

Affiliations

[1] Instituto de Biologia Molecular e Celular (IBMC), University of Porto, Rua do Campo Alegre 823, 4150-180 Porto, Portugal

[2] Department of Zoology, University of Oxford, South Parks Road, Oxford, OX1 3PS, UK

Article

Publisher ID: 1471-2148-8-200

DOI: 10.1186/1471-2148-8-200

PMC ID: 2491631

PubMed ID: 18620554

SO-VID: 65c9366b-b41a-4f6a-baca-66e846ce3498

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
