
      A generalized approach for producing, quantifying, and validating citizen science data from wildlife images




          Citizen science has the potential to expand the scope and scale of research in ecology and conservation, but many professional researchers remain skeptical of data produced by nonexperts. We devised an approach for producing accurate, reliable data from untrained, nonexpert volunteers. On the citizen science website www.snapshotserengeti.org, more than 28,000 volunteers classified 1.51 million images taken in a large‐scale camera‐trap survey in Serengeti National Park, Tanzania. Each image was circulated to, on average, 27 volunteers, and their classifications were aggregated using a simple plurality algorithm. We validated the aggregated answers against a data set of 3829 images verified by experts and calculated 3 certainty metrics—level of agreement among classifications (evenness), fraction of classifications supporting the aggregated answer (fraction support), and fraction of classifiers who reported “nothing here” for an image that was ultimately classified as containing an animal (fraction blank)—to measure confidence that an aggregated answer was correct. Overall, aggregated volunteer answers agreed with the expert‐verified data on 98% of images, but accuracy differed by species commonness such that rare species had higher rates of false positives and false negatives. Easily calculated analysis of variance and post‐hoc Tukey tests indicated that the certainty metrics were significant indicators of whether each image was correctly classified or classifiable. Thus, the certainty metrics can be used to identify images for expert review. Bootstrapping analyses further indicated that 90% of images were correctly classified with just 5 volunteers per image. Species classifications based on the plurality vote of multiple citizen scientists can provide a reliable foundation for large‐scale monitoring of African wildlife.
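          The plurality aggregation and the three certainty metrics described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: each volunteer is assumed to contribute a single label per image, evenness is assumed to be Pielou's index (Shannon entropy divided by the log of the number of distinct labels), and the function name `aggregate` and the constant `BLANK` are hypothetical.

```python
from collections import Counter
from math import log

# Assumed label for a volunteer reporting an empty image.
BLANK = "nothing here"


def aggregate(votes):
    """Return the plurality answer and three certainty metrics for one image.

    `votes` is a list of labels, one per volunteer (a simplifying assumption).
    """
    if not votes:
        raise ValueError("need at least one classification")

    counts = Counter(votes)
    n = len(votes)

    # Plurality: the most frequent label wins. Ties go to the label
    # that was encountered first in `votes`.
    answer, top = counts.most_common(1)[0]

    # Evenness, assumed here to be Pielou's J = H / ln(S), where H is the
    # Shannon entropy of the vote distribution and S the number of distinct
    # labels. Defined as 0 when every volunteer agrees (S == 1).
    s = len(counts)
    if s > 1:
        h = -sum((c / n) * log(c / n) for c in counts.values())
        evenness = h / log(s)
    else:
        evenness = 0.0

    return {
        "answer": answer,
        "evenness": evenness,
        # Share of all votes that back the aggregated answer.
        "fraction_support": top / n,
        # Share of volunteers who saw nothing; most informative when
        # the aggregated answer is an animal rather than BLANK.
        "fraction_blank": counts[BLANK] / n,
    }


if __name__ == "__main__":
    example = ["zebra"] * 7 + ["wildebeest"] * 2 + [BLANK]
    print(aggregate(example))
```

          Under these assumptions, low evenness together with high fraction support indicates strong agreement, so images with high evenness, low fraction support, or high fraction blank would be the natural candidates to route to expert review.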

          Translated abstract

          Una Estrategia Generalizada para la Producción, Cuantificación y Validación de los Datos de Ciencia Ciudadana a partir de Imágenes de Vida Silvestre


          La ciencia ciudadana tiene el potencial de expandir el alcance y la escala de la investigación en la ecología y la conservación, pero muchos investigadores profesionales permanecen escépticos sobre los datos producidos por quienes no son expertos. Diseñamos una estrategia para generar datos precisos y fiables a partir de voluntarios no expertos y sin entrenamiento. En el sitio web de ciencia ciudadana www.snapshotserengeti.org más de 28,000 voluntarios clasificaron 1.51 millones de imágenes que fueron tomadas en un censo a gran escala de cámaras trampa en el Parque Nacional Serengueti, Tanzania. Cada imagen llegó, en promedio, hasta 27 voluntarios, cuyas clasificaciones se conjuntaron mediante el uso de un algoritmo de pluralidad simple. Validamos el conjunto de respuestas frente a un juego de datos de 3,829 imágenes verificadas por expertos y calculamos tres medidas de certeza: nivel de concordancia entre las clasificaciones (uniformidad), fracción de clasificaciones que apoyan al conjunto de respuestas (fracción de apoyo) y fracción de clasificadores que reportaron “nada aquí” en una imagen que al final se clasificó como que sí tenía un animal (fracción en blanco). Estas medidas se usaron para estimar la confianza de que un conjunto de respuestas estuviera en lo correcto. En general, el conjunto de respuestas de los voluntarios estuvo de acuerdo con los datos verificados por los expertos en un 98 % de las imágenes, pero la certeza varió según la preponderancia de la especie, de tal forma que las especies raras tuvieron una tasa más alta de falsos positivos y falsos negativos. El análisis de varianza calculado fácilmente y las pruebas post‐hoc de Tukey indicaron que las medidas de certeza fueron indicadores significativos de si cada imagen estuvo clasificada correctamente o si era clasificable. Por esto, las medidas de certeza pueden utilizarse para identificar imágenes para una revisión de expertos. Los análisis de bootstrapping indicaron más a fondo que el 90 % de las imágenes estuvieron clasificadas correctamente con sólo cinco voluntarios por imagen. Las clasificaciones de especies basadas en el voto de pluralidad de múltiples científicos ciudadanos pueden proporcionar un fundamento fiable para un monitoreo a gran escala de la vida silvestre africana.


              Author and article information

              Conservation Biology
              John Wiley and Sons Inc. (Hoboken)
              25 April 2016
              June 2016
              Volume: 30
              Issue: 3 (doiID: 10.1111/cobi.2016.30.issue-3)
              Pages: 520-531
              [1] Department of Ecology, Evolution and Behavior, University of Minnesota, Saint Paul, MN 55108, U.S.A.
              [2] Department of Physics, University of Oxford, Denys Wilkinson Building, Oxford OX1 3RH, U.K.
              [3] Current address: Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, U.S.A.
              Author notes
              [*] Address for correspondence: Department of Physics, University of Oxford, Denys Wilkinson Building, Oxford OX1 3RH, U.K. email ali@ 123456zooniverse.org
              © 2016 The Authors. Conservation Biology published by Wiley Periodicals, Inc. on behalf of Society for Conservation Biology.

              This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

              Page count
              Pages: 12
              Special Section: Moving from Citizen to Civic Science to Address Wicked Conservation Problems

              big data, camera traps, crowdsourcing, data aggregation, data validation, image processing, snapshot serengeti, zooniverse, cámaras trampa, conjunto de datos, datos grandes, procesamiento de imágenes, validación de datos

