Peptidomic discovery of short open reading frame-encoded peptides in human cells

Abstract

The amount of the transcriptome that is translated into polypeptides is of fundamental importance. We developed a peptidomic strategy to detect short ORF (sORF)-encoded polypeptides (SEPs) in human cells. We identified 90 SEPs, 86 of which are novel, the largest number of human SEPs ever reported. SEP abundances range from 10 to 1,000 molecules per cell, the same range as known proteins. SEPs arise from sORFs in non-coding RNAs as well as multi-cistronic mRNAs, and many SEPs initiate with non-AUG start codons, indicating that non-canonical translation may be more widespread in mammals than previously thought. In addition, coding sORFs are present in a small fraction (8 of 1,866) of long intergenic non-coding RNAs (lincRNAs). Together, these results provide the strongest evidence to date that the human proteome is more complex than previously appreciated.
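To make the terms in the abstract concrete, the following is a minimal sketch of how short ORFs with AUG or non-AUG start codons could be enumerated in an RNA sequence. It is not the authors' peptidomic workflow, which detects the peptides themselves. The function name `find_sorfs`, the set of near-cognate start codons, and the 150-codon length cutoff are all assumptions made for illustration; the abstract does not specify any of them.

```python
# Illustrative sketch only; not the method used in the article.
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Assumed start set: AUG plus near-cognate codons that differ at the first base.
START_CODONS = {"AUG", "CUG", "GUG", "UUG"}

# Assumed cutoff for "short"; the abstract gives no threshold.
MAX_CODONS = 150


def find_sorfs(rna, starts=START_CODONS, max_codons=MAX_CODONS):
    """Return (start_index, start_codon, orf_sequence) for each short ORF.

    An ORF here runs from a start codon to the first in-frame stop codon,
    inclusive. Start codons with no in-frame stop are skipped.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    for i in range(len(rna) - 2):
        start = rna[i:i + 3]
        if start not in starts:
            continue
        # Walk codon by codon until the first in-frame stop codon.
        for j in range(i + 3, len(rna) - 2, 3):
            if rna[j:j + 3] in STOP_CODONS:
                n_codons = (j - i) // 3  # codons before the stop
                if n_codons <= max_codons:
                    hits.append((i, start, rna[i:j + 3]))
                break
    return hits


if __name__ == "__main__":
    # A CUG-initiated sORF is found only when non-AUG starts are allowed.
    print(find_sorfs("AACUGGCUUGACC"))
    print(find_sorfs("AACUGGCUUGACC", starts={"AUG"}))
```

Restricting `starts` to `{"AUG"}` shows why annotation pipelines that assume canonical initiation would miss ORFs of the kind the abstract describes.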

Author and article information

Journal

Journal ID (nlm-journal-id): 101231976

Journal ID (pubmed-jr-id): 32624

Journal ID (nlm-ta): Nat Chem Biol

Journal ID (iso-abbrev): Nat. Chem. Biol.

Title: Nature chemical biology

ISSN (Print): 1552-4450

ISSN (Electronic): 1552-4469

Publication date Nihms-submitted: 26 February 2013

Publication date (Electronic): 18 November 2012

Publication date (Print): January 2013

Publication date PMC-release: 01 July 2013

Volume: 9

Issue: 1

Pages: 59-64

Affiliations

[1] Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, USA

[2] Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138, USA

[3] Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA

[4] Department of Systems Biology, Harvard Medical School, Boston, Massachusetts 02115, USA

[5] Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, Massachusetts 02138, USA

[6] Genome Sequencing & Analysis Program, Broad Institute of MIT and Harvard, 320 Charles Street, Cambridge, Massachusetts 02141, USA

[7] Research Computing, Division of Science, Faculty of Arts and Sciences, Harvard University, 38 Oxford St, Room 211A, Cambridge, Massachusetts 02138, USA

[8] Center of Systems Biology, Mass Spectrometry and Proteomics Lab, Faculty of Arts and Sciences, Harvard University, 52 Oxford St, Northwest Labs, B243.20, Cambridge, Massachusetts 02138, USA

Author notes

[*] These authors contributed equally to this work.

[†] Correspondence to: saghatelian@chemistry.harvard.edu

Article

Manuscript ID: NIHMS415406

DOI: 10.1038/nchembio.1120

PMC ID: 3625679

PubMed ID: 23160002

SO-VID: fcd562e7-a39e-4237-811e-ed331f70a00d

License:

Users may view, print, copy, download and text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms

Funding

Funded by: National Human Genome Research Institute (NHGRI)

Award ID: U54 HG003067

Funded by: National Institute of General Medical Sciences (NIGMS)

Award ID: R01 GM102491

Funded by: Office of the Director, NIH

Award ID: DP2 OD002374
