      LightningBug ONE: An experiment in high-throughput digitization of pinned insects

      Biodiversity Information Science and Standards
      Pensoft Publishers


          Abstract

          Digital technology presents new and compelling opportunities for discovery when focused on the world's natural history collections. The outstanding barrier to applying existing and forthcoming computational methods to large-scale study of this important resource is that it is largely not yet in the digital realm. Without new and much faster methods for digitizing the objects in these collections, it will be a long time before these data are available in digital form. For example, the methods currently employed for capturing, cataloguing, and indexing pinned insect specimen data will require many tens of years or more to process collections with millions of dry specimens, so a much faster pipeline is needed. In this paper we describe a capture system capable of collecting and archiving the imagery necessary to digitize a collection of circa 4.5 million specimens in one or two years of production operation. To minimize the time required to digitize each specimen, we have proposed (Hereld et al. 2017) developing multi-camera systems that capture the pinned insect and its accompanying labels from many angles in a single exposure. Using a sample (21 randomly drawn drawers, totalling 5178 insects) of the 4.5 million specimens in the collection at the Field Museum of Natural History, we estimated that a large fraction of that collection (97.6% ± 2.2%) consists of pinned insects with labels that are visible from one angle or another without requiring adjustment or removal of elements on the pin. In this situation, a multi-camera system with enough angular coverage could provide imagery for reconstructing virtual labels from fragmentary views taken from different directions. Agarwal et al. (2018) demonstrated a method for combining these multiple views into a virtual label that could be transcribed by automated optical character recognition software.
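The abstract does not say how the ± 2.2% uncertainty on the drawer sample was computed. As a minimal sketch of one common approach, the code below treats each sampled drawer as a cluster and applies a ratio estimator to per-drawer counts. The function name `cluster_proportion` and the counts in the usage example are hypothetical illustrations, not data from the study.

```python
import math


def cluster_proportion(drawers):
    """Estimate a proportion from a cluster sample of drawers.

    drawers: list of (visible_count, total_count) tuples, one per drawer.
    Returns (p, se): the pooled proportion and a ratio-estimator standard
    error that treats drawers, not individual insects, as sampling units.
    Assumption: the finite-population correction is ignored.
    """
    n = len(drawers)
    if n < 2:
        raise ValueError("need at least two drawers to estimate a variance")
    total = sum(m for _, m in drawers)
    visible = sum(v for v, _ in drawers)
    p = visible / total
    mean_size = total / n
    # Residual of each drawer around the pooled proportion.
    ss = sum((v - p * m) ** 2 for v, m in drawers)
    se = math.sqrt(ss / (n * (n - 1))) / mean_size
    return p, se


if __name__ == "__main__":
    # Hypothetical per-drawer counts, for illustration only.
    example = [(90, 100), (180, 200), (95, 100)]
    p, se = cluster_proportion(example)
    print(f"p = {p:.4f}, se = {se:.4f}")
```

Treating drawers as the sampling unit matters because specimens within a drawer tend to share preparation style, so a simple binomial error on 5178 insects would likely understate the uncertainty.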
We have now designed, built, and tested a prototype snapshot 3D digitization station that allows rapid capture of multi-view imagery for automated capture of pinned insect specimens and labels. It consists of twelve very small and light 8-megapixel cameras (Fig. 1), each controlled by a small dedicated computer. The cameras are arrayed around the target volume, six on each side of the sample feed path. Their positions and orientations are fixed by a 3D-printed scaffolding designed for the purpose. The twelve camera controllers and a master computer are connected to a dedicated high-speed data network over which all coordinating control signals and returning images and metadata are passed. The system is integrated with a high-performance object store that includes a database for metadata and the archived images comprising each snapshot. The system is designed so that it can be readily extended to include additional or different sensors. The station is meant to be fed with specimens by a conveyor belt whose motion is coordinated with the exposure of the multi-view snapshots. To test the performance of the system, we added a recirculating specimen feeder designed expressly for this experiment. With it integrated into the system in place of a conventional conveyor belt, we are able to provide a continuous stream of targets for the digitization system, which facilitates long tests of its performance and robustness. We demonstrated the ability to capture data at a peak rate of 1400 specimens per hour and an average rate of 1000 specimens per hour over the course of a sustained 6-hour run. The dataset (Hereld and Ferrier 2018) collected in this experiment provides material for the further development of algorithms for the offline reconstruction and automatic transcription of the label contents.
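The reported rates can be related to the goal of digitizing circa 4.5 million specimens with a short back-of-envelope calculation. The sketch below uses only figures from the abstract (collection size, average and peak rates, twelve cameras) plus one loudly hypothetical input: the number of production hours a station runs per year, which the abstract does not state. The function names are illustrative.

```python
SPECIMENS = 4_500_000   # collection size, from the abstract
AVG_RATE = 1000         # specimens per hour, sustained average
PEAK_RATE = 1400        # specimens per hour, peak
CAMERAS = 12            # cameras per snapshot


def station_hours(specimens, rate_per_hour):
    """Hours of capture time for one station at a constant rate."""
    return specimens / rate_per_hour


def calendar_years(specimens, rate_per_hour, hours_per_year):
    """Years of operation given an assumed yearly production schedule."""
    return station_hours(specimens, rate_per_hour) / hours_per_year


def images_total(specimens, cameras):
    """Images archived if every specimen gets one multi-view snapshot."""
    return specimens * cameras


if __name__ == "__main__":
    # Hypothetical schedule: 8 hours a day, 250 days a year (an assumption).
    hours_per_year = 8 * 250
    print(station_hours(SPECIMENS, AVG_RATE))                    # 4500.0
    print(calendar_years(SPECIMENS, AVG_RATE, hours_per_year))   # 2.25
    print(calendar_years(SPECIMENS, PEAK_RATE, hours_per_year))
    print(images_total(SPECIMENS, CAMERAS))                      # 54000000
```

Under this assumed schedule, a single station at the sustained average rate would need a little over two years, so reaching the one-to-two-year range would depend on longer daily operation, rates closer to the peak, or more than one station.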

          Most cited references (2)

          • Designing a High-Throughput Pipeline for Digitizing Pinned Insects
          • Towards Automated Transcription of Label Text from Pinned Insect Collections

              Author and article information

              Journal
              Biodiversity Information Science and Standards
              BISS
              Pensoft Publishers
              ISSN: 2535-0897
              June 19 2019
              Volume: 3
              Article
              DOI: 10.3897/biss.3.37228
              © 2019

              http://creativecommons.org/licenses/by/4.0/
