Is Open Access

The FAIR Guiding Principles for scientific data management and stewardship


Scientific Data

Nature Publishing Group

Research data, Publication characteristics


      Abstract

      There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders (representing academia, industry, funding agencies, and scholarly publishers) have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them and some exemplar implementations in the community.


            Author and article information

            Affiliations
            [1] Center for Plant Biotechnology and Genomics, Universidad Politécnica de Madrid, Madrid 28223, Spain
            [2] Stanford University, Stanford 94305-5411, USA
            [3] Elsevier, Amsterdam 1043 NX, The Netherlands
            [4] Nature Genetics, New York 10004-1562, USA
            [5] Euretos and Phortos Consultants, Rotterdam 2741 CA, The Netherlands
            [6] ELIXIR, Wellcome Genome Campus, Hinxton CB10 1SA, UK
            [7] Lygature, Eindhoven 5656 AG, The Netherlands
            [8] Vrije Universiteit Amsterdam, Dutch Techcenter for Life Sciences, Amsterdam 1081 HV, The Netherlands
            [9] Office of the Director, National Institutes of Health, Rockville 20892, USA
            [10] TNO, Zeist 3700 AJ, The Netherlands
            [11] Department of Genetics, University of Leicester, Leicester LE1 7RH, UK
            [12] Harvard Medical School, Boston, Massachusetts MA 02115, USA
            [13] Harvard University, Cambridge, Massachusetts MA 02138, USA
            [14] Data Archiving and Networked Services (DANS), The Hague 2593 HW, The Netherlands
            [15] GigaScience, Beijing Genomics Institute, Shenzhen 518083, China
            [16] Department of Bioinformatics, Maastricht University, Maastricht 6200 MD, The Netherlands
            [17] Wageningen UR Plant Breeding, Wageningen 6708 PB, The Netherlands
            [18] Oxford e-Research Center, University of Oxford, Oxford OX1 3QG, UK
            [19] Heriot-Watt University, Edinburgh EH14 4AS, UK
            [20] School of Computer Science, University of Manchester, Manchester M13 9PL, UK
            [21] Center for Research in Biological Systems, School of Medicine, University of California San Diego, La Jolla, California 92093-0446, USA
            [22] Dutch Techcenter for the Life Sciences, Utrecht 3501 DE, The Netherlands
            [23] Department of Human Genetics, Leiden University Medical Center, Dutch Techcenter for the Life Sciences, Leiden 2300 RC, The Netherlands
            [24] Dutch TechCenter for Life Sciences and ELIXIR-NL, Utrecht 3501 DE, The Netherlands
            [25] VU University Amsterdam, Amsterdam 1081 HV, The Netherlands
            [26] Leiden Center of Data Science, Leiden University, Leiden 2300 RA, The Netherlands
            [27] Netherlands eScience Center, Amsterdam 1098 XG, The Netherlands
            [28] National Center for Microscopy and Imaging Research, UCSD, San Diego 92103, USA
            [29] Phortos Consultants, San Diego 92011, USA
            [30] SciELO/FAPESP Program, UNIFESP Foundation, São Paulo 05468-901, Brazil
            [31] Bioinformatics Infrastructure for Life Sciences (BILS), Science for Life Laboratory, Dept of Cell and Molecular Biology, Uppsala University, S-751 24, Uppsala, Sweden
            [32] Leiden University Medical Center, Leiden 2333 ZA, The Netherlands
            [33] Bayer CropScience, Gent Area 1831, Belgium
            [34] Leiden Institute for Advanced Computer Science, Leiden University Medical Center, Leiden 2300 RA, The Netherlands
            [35] Swiss Institute of Bioinformatics and University of Basel, Basel 4056, Switzerland
            [36] Cray, Inc., Seattle 98164, USA
            [37] Unaffiliated
            [38] University Medical Center Groningen (UMCG), University of Groningen, Groningen 9713 GZ, The Netherlands
            [39] Erasmus MC, Rotterdam 3015 CE, The Netherlands
            [40] Independent Open Access and Open Science Advocate, Guildford GU1 3PW, UK
            [41] Micelio, Antwerp 2180, Belgium
            [42] Max Planck Compute and Data Facility, MPS, Garching 85748, Germany
            [43] Leiden Institute of Advanced Computer Science, Leiden University, Leiden 2333 CA, The Netherlands
            [44] Department of Computer Science, Oxford University, Oxford OX1 3QD, UK
            [45] Leiden University Medical Center, Leiden and Dutch TechCenter for Life Sciences, Utrecht 2333 ZA, The Netherlands
            [46] Netherlands eScience Center, Amsterdam 1098 XG, The Netherlands
            [47] Erasmus MC, Rotterdam 3015 CE, The Netherlands
            Author notes
            [a] B.M. (email: barend.mons@dtls.nl)

            M.W. was the primary author of the manuscript, and participated extensively in the drafting and editing of the FAIR Principles. M.D. was significantly involved in the drafting of the FAIR Principles. B.M. conceived of the FAIR Data Initiative, contributed extensively to the drafting of the principles, and to this manuscript text. All other authors are listed alphabetically, and contributed to the manuscript either by their participation in the initial workshop and/or by editing or commenting on the manuscript text.

            Journal
            Scientific Data (Sci Data)
            Publisher: Nature Publishing Group
            ISSN: 2052-4463
            Published: 15 March 2016
            Volume: 3
            Article ID: sdata201618
            DOI: 10.1038/sdata.2016.18
            PMID: 26978244
            PMCID: 4792175
            Copyright © 2016, Macmillan Publishers Limited

            This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0

            Metadata associated with this Data Descriptor is available at http://www.nature.com/sdata/ and is released under the CC0 waiver to maximize reuse.

            Categories
            Comment
