Ab Initio Prediction of Transcription Factor Targets Using Structural Knowledge

Abstract

Current approaches for identification and detection of transcription factor binding sites rely on an extensive set of known target genes. Here we describe a novel structure-based approach applicable to transcription factors with no prior binding data. Our approach combines sequence data and structural information to infer context-specific amino acid–nucleotide recognition preferences. These are used to predict binding sites for novel transcription factors from the same structural family. We demonstrate our approach on the Cys₂His₂ zinc finger protein family, and show that the learned DNA-recognition preferences are compatible with experimental results. We use these preferences to perform a genome-wide scan for direct targets of Drosophila melanogaster Cys₂His₂ transcription factors. By analyzing the predicted targets along with gene annotation and expression data we infer the function and activity of these proteins.
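The scanning step described in the abstract can be illustrated with a small sketch. This is not the authors' model or code: it assumes, for illustration only, that each DNA position is contacted by one residue, that the learned preferences are available as a table of nucleotide probabilities keyed by (context, amino acid), and that sites are scored by log-odds against a uniform background. The context labels (`pos_a`, `pos_b`, `pos_c`), the probability values, the function names, and the threshold are all hypothetical.

```python
import math

# Hypothetical, made-up recognition preferences: P(nucleotide | context, amino acid).
# The context labels and all numbers are placeholders, not learned values.
TOY_PREFERENCES = {
    ("pos_a", "R"): {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    ("pos_b", "N"): {"A": 0.70, "C": 0.10, "G": 0.10, "T": 0.10},
    ("pos_c", "D"): {"A": 0.10, "C": 0.70, "G": 0.10, "T": 0.10},
}


def build_pwm(contacts, preferences):
    """Return one probability column per DNA position.

    `contacts` lists, in DNA order, the (context, amino acid) pair assumed
    to contact each position of the binding site.
    """
    return [preferences[contact] for contact in contacts]


def log_odds(pwm, site, background=0.25):
    """Sum of per-position log2 odds against a uniform background."""
    return sum(math.log2(col[base] / background) for col, base in zip(pwm, site))


def scan(pwm, sequence, threshold):
    """Slide the matrix along the forward strand and keep windows that
    score at or above `threshold`. Expects uppercase A/C/G/T only."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = log_odds(pwm, window)
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Usage: a toy protein whose three contacting residues are R, N, and D.
pwm = build_pwm([("pos_a", "R"), ("pos_b", "N"), ("pos_c", "D")], TOY_PREFERENCES)
hits = scan(pwm, "TTGACTT", threshold=3.0)
```

A real genome-wide scan would also need to handle the reverse strand, ambiguous bases, and a principled score cutoff, all of which this sketch omits.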

Synopsis

Cells respond to dynamic changes in their environment by invoking various cellular processes, coordinated by a complex regulatory program. A main component of this program is the regulation of transcription, which is mainly accomplished by transcription factors that bind the DNA in the vicinity of genes. To better understand transcriptional regulation, advanced computational approaches are needed to link transcription factors to their targets. The authors describe a novel approach by which the binding site of a given transcription factor can be characterized without previous experimental binding data. This approach involves learning a set of context-specific amino acid–nucleotide recognition preferences that, when combined with the sequence and structure of the protein, can predict its specific binding preferences. Applied to the Cys₂His₂ zinc finger protein family, the approach showed its genome-wide potential by automatically predicting the direct targets of 29 regulators in the genome of the fruit fly Drosophila melanogaster. With many genome sequences now available, numerous proteins are annotated as transcription factors based on their sequence alone. This approach offers a promising direction for revealing the targets of these factors and for understanding their roles in the cellular network.

Author and article information

Contributors

: Role: Editor

Journal

Journal ID (nlm-ta): PLoS Comput Biol

Journal ID (publisher-id): pcbi

Title: PLoS Computational Biology

Publisher: Public Library of Science

ISSN (Print): 1553-734X

ISSN (Electronic): 1553-7358

Publication date (Print): June 2005

Publication date (Electronic): 24 June 2005

Volume: 1

Issue: 1

Electronic Location Identifier: e1

Affiliations

[1] School of Computer Science and Engineering, The Hebrew University, Jerusalem, Israel

[2] Department of Molecular Genetics and Biotechnology, Faculty of Medicine, The Hebrew University, Jerusalem, Israel

University of California at San Francisco, United States of America

Author notes

*To whom correspondence should be addressed. E-mail: nir@cs.huji.ac.il (NF), hanah@md.huji.ac.il (HM)

Article

Publisher ID: 05-PLCB-RA-0002

Serial Item and Contribution ID: plcb-01-01-07

DOI: 10.1371/journal.pcbi.0010001

PMC ID: 1183507

PubMed ID: 16103898

SO-VID: 698216a7-8ed4-4542-b2f7-fa1601d60537

Copyright: © 2005 Kaplan et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 10 January 2005

Date accepted: 11 February 2005

Page count

Pages: 9

Custom metadata

Citation: Kaplan T, Friedman N, Margalit H (2005) Ab initio prediction of transcription factor targets using structural knowledge. PLoS Comput Biol 1(1): e1.
