The 2022 Nucleic Acids Research database issue and the online molecular biology database collection

Abstract

The 2022 Nucleic Acids Research Database Issue contains 185 papers, including 87 papers reporting on new databases and 85 updates from resources previously published in the Issue. Thirteen additional manuscripts provide updates on databases most recently published elsewhere. Seven new databases focus specifically on COVID-19 and SARS-CoV-2, including SCoV2-MD, the first of the Issue's Breakthrough Articles. Major nucleic acid databases reporting updates include MODOMICS, JASPAR and miRTarBase. The AlphaFold Protein Structure Database, described in the second Breakthrough Article, is the stand-out in the protein section, where the Human Proteoform Atlas and GproteinDb are other notable new arrivals. Updates from DisProt, FuzDB and ELM comprehensively cover disordered proteins. Under the metabolism and signalling section, Reactome, ConsensusPathDB, HMDB and CAZy are major returning resources. In microbial and viral genomes, taxonomy and systematics are well covered by LPSN, TYGS and GTDB. Genomics resources include Ensembl, Ensembl Genomes and UCSC Genome Browser. Major returning pharmacology resources include the IUPHAR/BPS guide and the Therapeutic Target Database. New plant databases include PlantGSAD for gene lists and qPTMplants for post-translational modifications. The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). Our latest update to the NAR online Molecular Biology Database Collection brings the total number of entries to 1645. Following last year's major cleanup, we have updated 317 entries, listing 89 new resources and trimming 80 discontinued URLs. The current release is available at http://www.oxfordjournals.org/nar/database/c/.

Most cited references 110

Highly accurate protein structure prediction with AlphaFold

John Jumper, Richard Evans, Alexander Pritzel … (2021)

Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous experimental effort [1–4], the structures of around 100,000 unique proteins have been determined [5], but this represents a small fraction of the billions of known protein sequences [6, 7]. Structural coverage is bottlenecked by the months to years of painstaking effort required to determine a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence, the structure prediction component of the ‘protein folding problem’ [8], has been an important open research problem for more than 50 years [9]. Despite recent progress [10–14], existing methods fall far short of atomic accuracy, especially when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Critical Assessment of protein Structure Prediction (CASP14) [15], demonstrating accuracy competitive with experimental structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates physical and biological knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm. AlphaFold predicts protein structures with an accuracy competitive with experimental structures in the majority of cases using a novel deep learning architecture.

The FAIR Guiding Principles for scientific data management and stewardship

Mark D Wilkinson, Michel Dumontier, IJsbrand Jan Aalbersberg … (2016)

There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders, representing academia, industry, funding agencies, and scholarly publishers, have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community.

UniProt: the universal protein knowledgebase in 2021

The UniProt Consortium (2020)

The aim of the UniProt Knowledgebase is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this article, we describe significant updates that we have made over the last two years to the resource. The number of sequences in UniProtKB has risen to approximately 190 million, despite continued work to reduce sequence redundancy at the proteome level. We have adopted new methods of assessing proteome completeness and quality. We continue to extract detailed annotations from the literature to add to reviewed entries and supplement these in unreviewed entries with annotations provided by automated systems such as the newly implemented Association-Rule-Based Annotator (ARBA). We have developed a credit-based publication submission interface to allow the community to contribute publications and annotations to UniProt entries. We describe how UniProtKB responded to the COVID-19 pandemic through expert curation of relevant entries that were rapidly made available to the research community through a dedicated portal. UniProt resources are available under a CC-BY (4.0) license via the web at https://www.uniprot.org/.

Author and article information

Contributors

Daniel J Rigden

Xosé M Fernández

Journal

Journal ID (nlm-ta): Nucleic Acids Res

Journal ID (iso-abbrev): Nucleic Acids Res

Journal ID (publisher-id): nar

Title: Nucleic Acids Research

Publisher: Oxford University Press

ISSN (Print): 0305-1048

ISSN (Electronic): 1362-4962

Publication date (Collection): 07 January 2022

Publication date (Electronic): 27 December 2021

Publication date (PMC release): 27 December 2021

Volume: 50

Issue: D1

Pages: D1-D10

Affiliations

Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK

Institut Curie, 25 rue d’Ulm, 75005 Paris, France

Author notes

To whom correspondence should be addressed. Tel: +44 151 795 4467; Email: nardatabase@gmail.com

Article

Publisher ID: gkab1195

DOI: 10.1093/nar/gkab1195

PMC ID: 8728296

PubMed ID: 34986604

SO-VID: 54244bd4-3f4e-400d-ac1a-0087b6d6a598

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Page count

Pages: 10

Funding

Funded by: Oxford University Press, DOI 10.13039/501100007723
