
      MiST 4.0: a new release of the microbial signal transduction database, now with a metagenomic component

      research-article
      Nucleic Acids Research
      Oxford University Press


          Abstract

          Signal transduction systems in bacteria and archaea link environmental stimuli to specific adaptive cellular responses. They control gene expression, motility, biofilm formation, development and other processes that are vital to survival. The microbial signal transduction (MiST) database is an online resource that stores tens of thousands of genomes and allows users to explore their signal transduction profiles, analyze genomes in bulk using the database application programming interface (API) and make testable hypotheses about the functions of newly identified signaling systems. However, signal transduction in metagenomes remained completely unexplored. To lay the foundation for research in metagenomic signal transduction, we have prepared a new release of the MiST database, MiST 4.0, which features over 10 000 metagenome-assembled genomes (MAGs), a scaled representation of proteins and detailed BioSample information. In addition, several thousand new genomes have been processed and stored in the database. A new interface has been developed that allows users to switch seamlessly between genomes and MAGs. MiST 4.0 is freely available at https://mistdb.com; metagenomes and MAGs can also be explored using the API available on the same page.
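
          The bulk analysis through the API mentioned in the abstract might look roughly like the following minimal sketch. It is not the MiST API itself: the base URL, the `genomes` route, the `page`, `per_page` and `search` parameters and the `kind` field are all hypothetical placeholders, and the real routes should be taken from the documentation linked at https://mistdb.com. The sketch only shows the general pattern of building paginated request URLs and tallying the decoded JSON records, and it makes no network calls.

```python
from urllib.parse import urlencode, urljoin

# ASSUMPTION: placeholder base URL, not the real MiST endpoint.
# Look up the actual API routes at https://mistdb.com before use.
API_BASE = "https://api.example.org/v1/"


def build_page_url(resource, page, per_page=100, **filters):
    """Build the URL for one page of a paginated listing.

    The parameter names (page, per_page, and any filters) are hypothetical.
    """
    params = {"page": page, "per_page": per_page, **filters}
    # Sort the parameters so the resulting URL is deterministic.
    return urljoin(API_BASE, resource) + "?" + urlencode(sorted(params.items()))


def count_by_field(records, field):
    """Tally records (dicts decoded from a JSON response) by one field."""
    counts = {}
    for record in records:
        key = record.get(field, "unknown")
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    # Hypothetical request URL for the second page of a genome search.
    print(build_page_url("genomes", 2, per_page=50, search="Escherichia coli"))
    # Hypothetical records, as they might look after JSON decoding.
    sample = [{"kind": "MAG"}, {"kind": "genome"}, {"kind": "MAG"}]
    print(count_by_field(sample, "kind"))
```

          Keeping URL construction separate from the request itself lets the same helper serve genomes and MAGs alike once the real route names are substituted.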

          Graphical Abstract



                Author and article information

                Contributors
                Journal
                Nucleic Acids Research (Nucleic Acids Res)
                Publisher: Oxford University Press
                ISSN: 0305-1048, 1362-4962
                05 January 2024
                04 October 2023
                Volume: 52
                Issue: D1
                Pages: D647-D653
                Affiliations
                Department of Microbiology and Translational Data Analytics Institute, The Ohio State University, Columbus, OH 43210, USA
                Ulritech, LLC, Mount Pleasant, SC 29466, USA
                Author notes
                To whom correspondence should be addressed. Tel: +1 614 292 4860; Email: gumerov.1@osu.edu
                Correspondence may also be addressed to Igor B. Zhulin. Tel: +1 614 292 4860; Email: jouline.1@osu.edu
                Author information
                https://orcid.org/0000-0003-1670-7679
                https://orcid.org/0000-0002-6708-5323
                Article
                Publisher ID: gkad847
                DOI: 10.1093/nar/gkad847
                PMCID: PMC10767990
                PMID: 37791884
                © The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Accepted: 21 September 2023
                Revised: 15 September 2023
                Received: 29 July 2023
                Page count
                Pages: 7
                Funding
                Funded by: National Institutes of Health, DOI 10.13039/100000002
                Award ID: R35GM131760
                Categories
                AcademicSubjects/SCI00010
                Database Issue

                Genetics
