      Crowdsourcing the identification of organisms: A case-study of iSpot

          Abstract

          Accurate species identification is fundamental to biodiversity science, but the natural history skills required for this are neglected in formal education at all levels. In this paper we describe how the web application ispotnature.org and its sister site ispot.org.za (collectively, “iSpot”) are helping to solve this problem by combining learning technology with crowdsourcing to connect beginners with experts. Over 94% of observations submitted to iSpot receive a determination. External checking of a sample of 3,287 iSpot records verified > 92% of them. To mid 2014, iSpot crowdsourced the identification of 30,000 taxa (>80% at species level) in > 390,000 observations with a global community numbering > 42,000 registered participants. More than half the observations on ispotnature.org were named within an hour of submission. iSpot uses a unique, 9-dimensional reputation system to motivate and reward participants and to verify determinations. Taxon-specific reputation points are earned when a participant proposes an identification that achieves agreement from other participants, weighted by the agreers’ own reputation scores for the taxon. This system is able to discriminate effectively between competing determinations when two or more are proposed for the same observation. In 57% of such cases the reputation system improved the accuracy of the determination, while in the remainder it either improved precision (e.g. by adding a species name to a genus) or revealed false precision, for example where a determination to species level was not supported by the available evidence. We propose that the success of iSpot arises from the structure of its social network that efficiently connects beginners and experts, overcoming the social as well as geographic barriers that normally separate the two.
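The abstract describes agreement that is weighted by each agreer's taxon-specific reputation, but it does not give the scoring formula. The following is therefore only a minimal sketch, assuming that the weight of a proposed name is the simple sum of the reputation scores of the participants who agree with it. The function names, the participant names, the group label "invertebrates" and the numeric scores are all hypothetical and are not taken from iSpot.

```python
from collections import defaultdict


def score_determinations(agreements, reputation, taxon_group):
    """Sum, for each proposed name, the reputation of everyone who agreed with it.

    agreements  -- iterable of (participant, proposed_name) pairs
    reputation  -- dict mapping (participant, taxon_group) to a non-negative score
    taxon_group -- the group the observation belongs to (assumed label)
    """
    totals = defaultdict(float)
    for participant, name in agreements:
        # Participants with no recorded reputation in this group add no weight.
        totals[name] += reputation.get((participant, taxon_group), 0.0)
    return dict(totals)


def likely_determination(agreements, reputation, taxon_group):
    """Return the proposed name with the highest weighted agreement, or None."""
    totals = score_determinations(agreements, reputation, taxon_group)
    if not totals:
        return None
    return max(totals, key=totals.get)


# Hypothetical observation with two competing determinations.
agreements = [
    ("ann", "Bombus terrestris"),
    ("bob", "Bombus lucorum"),
    ("cat", "Bombus lucorum"),
    ("dan", "Bombus terrestris"),
]
reputation = {
    ("ann", "invertebrates"): 5.0,  # experienced in this group
    ("bob", "invertebrates"): 1.0,
    ("cat", "invertebrates"): 1.0,
    ("dan", "invertebrates"): 0.5,
}

scores = score_determinations(agreements, reputation, "invertebrates")
best = likely_determination(agreements, reputation, "invertebrates")
```

In this made-up case each name has two agreers, yet the name backed by the higher-reputation participant wins (5.5 against 2.0). That is the kind of discrimination between competing determinations the abstract refers to, under the summation assumption stated above.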

                Author and article information

                Journal
                ZooKeys
                Pensoft Publishers
                ISSN: 1313-2989, 1313-2970
                2 February 2015
                480: 125-146
                Affiliations
                [1] Department of Environment, Earth and Ecosystems, The Open University, Milton Keynes, MK7 6AA, UK
                [2] Institute of Educational Technology, The Open University, Milton Keynes, MK7 6AA, UK
                [3] Faculty of Maths, Computing and Technology, The Open University, Milton Keynes, MK7 6AA, UK
                [4] South African National Biodiversity Institute, Kirstenbosch, Claremont, Cape Town, South Africa
                [5] Current address: Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Charlotte Auerbach Road, Edinburgh EH9 3FL, Scotland, UK
                Author notes
                Corresponding author: Jonathan Silvertown (Jonathan.Silvertown@ed.ac.uk)

                Academic editor: V. Smith

                Article
                DOI: 10.3897/zookeys.480.8803
                PMCID: PMC4319112
                Jonathan Silvertown, Martin Harvey, Richard Greenwood, Mike Dodd, Jon Rosewell, Tony Rebelo, Janice Ansine, Kevin McConway

                This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                Categories
                Research Article
