      Is Open Access

      SeqDB: Biological Collection Management with Integrated DNA Sequence Tracking 


          Abstract

          Agriculture and Agri-Food Canada (AAFC) is home to a world-class taxonomy program based on Canada’s national agricultural collections for Botany, Mycology and Entomology. These collections contain valuable resources, such as type specimens for authoritative identification using approaches that include phenotyping, DNA barcoding, and whole genome sequencing. These authoritative references allow for accurate identification of the taxonomic biodiversity found in environmental samples in fields such as metagenomics. AAFC’s internally developed web application, termed SeqDB, tracks the complete workflow and provenance chain from source specimen information through DNA extractions, PCR reactions, and sequencing, leading to binary DNA sequence files. In the context of Next Generation Sequencing (NGS) of environmental samples, SeqDB tracks sampling metadata, DNA extractions, and the library preparation workflow leading to demultiplexed sequence files. SeqDB implements the Taxonomic Databases Working Group (TDWG) Darwin Core standard (Wieczorek et al. 2012) for Biodiversity Occurrence Data, as well as the Genomic Standards Consortium (GSC) Minimum Information about any (x) Sequence (MIxS) specification (Yilmaz et al. 2011). Coupled with the built-in data standards validation system, this makes it possible to search consistent metadata across multiple studies. Furthermore, the application tracks the physical storage of these specimens and their derivative molecular extracts using an integrated barcode printing and reading system. All the information is presented through a graphical user interface that features intuitive molecular workflows, as well as a RESTful API that facilitates integration with external applications and programmatic access to the data.
SeqDB's success stems from close collaboration with the scientists and technicians undertaking molecular research involving the national collection, and from centralizing their data sets in an access-controlled relational database that implements internationally recognized standards. We will describe the overall system and some of the lessons learned in building it.
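
The provenance chain described in the abstract (source specimen, DNA extraction, PCR reaction, sequence file) can be pictured as a series of records that each point back to the record they were derived from. The following is a minimal sketch under that assumption only. It is not SeqDB's actual data model or API; the `Record` class, the `kind` labels, and the identifiers are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """One workflow step; `parent` is the step it was derived from (hypothetical model)."""
    kind: str                           # e.g. "specimen", "dna_extraction"
    identifier: str                     # illustrative ID, not a real SeqDB identifier
    parent: Optional["Record"] = None   # None marks the source of the chain


def provenance_chain(record: Record) -> list[Record]:
    """Follow parent links back to the source and return the chain source-first."""
    chain = []
    current: Optional[Record] = record
    while current is not None:
        chain.append(current)
        current = current.parent
    chain.reverse()
    return chain


# Build one illustrative chain, from specimen to sequence file.
specimen = Record("specimen", "SPEC-0001")
extraction = Record("dna_extraction", "EXT-0001", parent=specimen)
pcr = Record("pcr_reaction", "PCR-0001", parent=extraction)
sequence = Record("sequence_file", "SEQ-0001", parent=pcr)

# Trace the sequence file back to its source specimen.
for step in provenance_chain(sequence):
    print(f"{step.kind}: {step.identifier}")
```

Storing only a link to the immediate parent keeps each record small while still letting any sequence file be traced back to its specimen. In a relational database, the same idea would typically be expressed with foreign keys between tables.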

          Most cited references (2)

          Is Open Access

          Darwin Core: An Evolving Community-Developed Biodiversity Data Standard

          Biodiversity data derive from myriad sources stored in various formats on many distinct hardware and software platforms. An essential step towards understanding global patterns of biodiversity is to provide a standardized view of these heterogeneous data sources to improve interoperability. Fundamental to this advance are definitions of common terms. This paper describes the evolution and development of Darwin Core, a data standard for publishing and integrating biodiversity information. We focus on the categories of terms that define the standard, differences between simple and relational Darwin Core, how the standard has been implemented, and the community processes that are essential for maintenance and growth of the standard. We present case-study extensions of the Darwin Core into new research communities, including metagenomics and genetic resources. We close by showing how Darwin Core records are integrated to create new knowledge products documenting species distributions and changes due to environmental perturbations.

            Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications.

            Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences--the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The 'environmental packages' apply to any genome sequence of known origin and can be used in combination with MIMARKS and other GSC checklists. Finally, to establish a unified standard for describing sequence data and to provide a single point of entry for the scientific community to access and learn about GSC checklists, we present the minimum information about any (x) sequence (MIxS). Adoption of MIxS will enhance our ability to analyze natural genetic diversity documented by massive DNA sequencing efforts from myriad ecosystems in our ever-changing biosphere.

              Author and article information

              Journal: Proceedings of TDWG (TDWGProc)
              Publisher: Pensoft Publishers
              ISSN: 2535-0897
              Date: August 26 2017
              Volume: 1
              Article: e20608
              DOI: 10.3897/tdwgproceedings.1.20608
              © 2017

              http://creativecommons.org/licenses/by/4.0/
