Is Open Access

The FAIR Guiding Principles for scientific data management and stewardship.


Scientific Data

Springer Nature



      There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders, representing academia, industry, funding agencies, and scholarly publishers, have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them and some exemplar implementations in the community.


            Author and article information

            [1] Center for Plant Biotechnology and Genomics, Universidad Politécnica de Madrid, Madrid 28223, Spain.
            [2] Stanford University, Stanford 94305-5411, USA.
            [3] Nature Genetics, New York 10004-1562, USA.
            [4] Euretos and Phortos Consultants, Rotterdam 2741 CA, The Netherlands.
            [5] ELIXIR, Wellcome Genome Campus, Hinxton CB10 1SA, UK.
            [6] Lygature, Eindhoven 5656 AG, The Netherlands.
            [7] Vrije Universiteit Amsterdam, Dutch Techcenter for Life Sciences, Amsterdam 1081 HV, The Netherlands.
            [8] Office of the Director, National Institutes of Health, Rockville 20892, USA.
            [9] TNO, Zeist 3700 AJ, The Netherlands.
            [10] Department of Genetics, University of Leicester, Leicester LE1 7RH, UK.
            [11] Harvard Medical School, Boston, Massachusetts MA 02115, USA.
            [12] Harvard University, Cambridge, Massachusetts MA 02138, USA.
            [13] Data Archiving and Networked Services (DANS), The Hague 2593 HW, The Netherlands.
            [14] GigaScience, Beijing Genomics Institute, Shenzhen 518083, China.
            [15] Department of Bioinformatics, Maastricht University, Maastricht 6200 MD, The Netherlands.
            [16] Wageningen UR Plant Breeding, Wageningen 6708 PB, The Netherlands.
            [17] Oxford e-Research Center, University of Oxford, Oxford OX1 3QG, UK.
            [18] Heriot-Watt University, Edinburgh EH14 4AS, UK.
            [19] School of Computer Science, University of Manchester, Manchester M13 9PL, UK.
            [20] Center for Research in Biological Systems, School of Medicine, University of California San Diego, La Jolla, California 92093-0446, USA.
            [21] Dutch Techcenter for the Life Sciences, Utrecht 3501 DE, The Netherlands.
            [22] Department of Human Genetics, Leiden University Medical Center, Dutch Techcenter for the Life Sciences, Leiden 2300 RC, The Netherlands.
            [23] Dutch TechCenter for Life Sciences and ELIXIR-NL, Utrecht 3501 DE, The Netherlands.
            [24] VU University Amsterdam, Amsterdam 1081 HV, The Netherlands.
            [25] Leiden Center of Data Science, Leiden University, Leiden 2300 RA, The Netherlands.
            [26] Netherlands eScience Center, Amsterdam 1098 XG, The Netherlands.
            [27] National Center for Microscopy and Imaging Research, UCSD, San Diego 92103, USA.
            [28] Phortos Consultants, San Diego 92011, USA.
            [29] SciELO/FAPESP Program, UNIFESP Foundation, São Paulo 05468-901, Brazil.
            [30] Bioinformatics Infrastructure for Life Sciences (BILS), Science for Life Laboratory, Dept of Cell and Molecular Biology, Uppsala University, S-751 24, Uppsala, Sweden.
            [31] Leiden University Medical Center, Leiden 2333 ZA, The Netherlands.
            [32] Bayer CropScience, Gent Area 1831, Belgium.
            [33] Leiden Institute for Advanced Computer Science, Leiden University Medical Center, Leiden 2300 RA, The Netherlands.
            [34] Swiss Institute of Bioinformatics and University of Basel, Basel 4056, Switzerland.
            [35] Cray, Inc., Seattle 98164, USA.
            [36] University Medical Center Groningen (UMCG), University of Groningen, Groningen 9713 GZ, The Netherlands.
            [37] Erasmus MC, Rotterdam 3015 CE, The Netherlands.
            [38] Independent Open Access and Open Science Advocate, Guildford GU1 3PW, UK.
            [39] Micelio, Antwerp 2180, Belgium.
            [40] Max Planck Compute and Data Facility, MPS, Garching 85748, Germany.
            [41] Leiden Institute of Advanced Computer Science, Leiden University, Leiden 2333 CA, The Netherlands.
            [42] Department of Computer Science, Oxford University, Oxford OX1 3QD, UK.
            [43] Leiden University Medical Center, Leiden and Dutch TechCenter for Life Sciences, Utrecht 2333 ZA, The Netherlands.
            Journal: Sci Data
            Published: Mar 15 2016
            Volume: 3
            PMID: 26978244
            Publisher ID: sdata201618
            DOI: 10.1038/sdata.2016.18
            PMCID: 4792175

