      Directed mutational scanning reveals a balance between acidic and hydrophobic residues in strong human activation domains

      research-article


          SUMMARY

          Acidic activation domains are intrinsically disordered regions of transcription factors that bind coactivators. The intrinsic disorder and low evolutionary conservation of activation domains have made it difficult to identify the sequence features that control activity. To address this problem, we designed thousands of variants in seven acidic activation domains and measured their activities with a high-throughput assay in human cell culture. We found that strong activation domain activity requires a balance between the number of acidic residues and the number of aromatic and leucine residues. These findings motivated a predictor of acidic activation domains that scans the human proteome for clusters of aromatic and leucine residues embedded in regions of high acidity. This predictor identifies known activation domains and accurately predicts previously unidentified ones. Our results support a flexible acidic exposure model of activation domains in which the acidic residues solubilize hydrophobic motifs so that they can interact with coactivators. A record of this paper’s transparent peer review process is included in the supplemental information.
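
The scanning idea described in the summary can be illustrated with a minimal sliding-window sketch. This is not the authors' predictor: the function name `scan_for_candidates`, the window length, and both thresholds are hypothetical values chosen only for illustration, and net charge is simplified to counting K/R as +1 and D/E as -1.

```python
from typing import List, Tuple

ACIDIC = set("DE")
BASIC = set("KR")
HYDROPHOBIC = set("WFYL")  # aromatic residues (W, F, Y) plus leucine


def scan_for_candidates(
    seq: str,
    window: int = 40,            # hypothetical window length
    min_hydrophobic: int = 6,    # hypothetical threshold
    max_net_charge: int = -8,    # hypothetical threshold (more negative = more acidic)
) -> List[Tuple[int, int, int]]:
    """Return (start, n_hydrophobic, net_charge) for every window that is
    both rich in W/F/Y/L and strongly acidic under the simplified charge rule."""
    seq = seq.upper()
    hits = []
    # range() is empty when the sequence is shorter than the window
    for start in range(len(seq) - window + 1):
        tile = seq[start:start + window]
        n_hyd = sum(aa in HYDROPHOBIC for aa in tile)
        net = sum(aa in BASIC for aa in tile) - sum(aa in ACIDIC for aa in tile)
        if n_hyd >= min_hydrophobic and net <= max_net_charge:
            hits.append((start, n_hyd, net))
    return hits


if __name__ == "__main__":
    # Toy sequence, not a real protein: an acidic, W/F/Y/L-rich stretch
    # flanked by glycines.
    toy = "G" * 10 + "DDWEELFDDLEEYDDWEE" + "G" * 10
    for hit in scan_for_candidates(toy, window=18, min_hydrophobic=5, max_net_charge=-8):
        print(hit)
```

Requiring both conditions in the same window reflects the balance the summary describes: hydrophobic residues alone or acidity alone would not flag a region. Real thresholds would need to be fit to measured activities.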

          In brief

          Transcriptional activation domains are poorly conserved, intrinsically disordered regions of transcription factors that remain difficult to predict from protein sequences. A high-throughput method reveals that strong activation domains require a balance between acidic and hydrophobic residues. This balance powers an accurate predictor of activation domains in human transcription factors.



                Author and article information

                Journal
                101656080
                43733
                Cell Systems (Cell Syst)
                ISSN: 2405-4712, 2405-4720
                24 June 2022
                20 April 2022
                03 February 2022
                29 June 2022
                Volume 13, Issue 4, Pages 334-345.e5
                Affiliations
                [1] Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA
                [2] Department of Genetics, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA
                [3] Center for Computational Biology, University of California Berkeley, Berkeley, CA 94720, USA
                [4] Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA
                [5] Center for Science and Engineering of Living Systems, Washington University in St. Louis, St. Louis, MO 63130, USA
                [6] Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO 63130, USA
                [7] Lead contact
                Author notes

                AUTHOR CONTRIBUTIONS

                M.V.S. and B.A.C. designed the project and wrote the manuscript. M.V.S. and E.R. collected the data. M.V.S., E.R., S.R.K., and A.S.H. analyzed the data. R.V.P. and B.A.C. interpreted the data. All authors edited the manuscript.

                [*] Correspondence: mstaller@berkeley.edu (M.V.S.), cohen@wustl.edu (B.A.C.)
                Article
                NIHMS ID: NIHMS1814597
                DOI: 10.1016/j.cels.2022.01.002
                PMCID: PMC9241528
                PMID: 35120642
                e8cf5bc2-d0da-410e-9833-8089122be9e6

                This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
