      Applications of deep convolutional neural networks to digitized natural history collections

      research-article


          Abstract

          Natural history collections contain data that are critical for many scientific endeavors. Recent efforts in mass digitization are generating large datasets from these collections that can provide unprecedented insight. Here, we present examples of how deep convolutional neural networks can be applied in analyses of imaged herbarium specimens. We first demonstrate that a convolutional neural network can detect mercury-stained specimens across a collection with 90% accuracy. We then show that such a network can correctly distinguish two morphologically similar plant families 96% of the time. Discarding the most challenging specimen images increases accuracy to 94% and 99%, respectively. These results highlight the importance of mass digitization and deep learning approaches and reveal how they can together deliver powerful new investigative tools.
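          The gain from discarding the most challenging images can be illustrated with a small sketch. The abstract does not say how challenging images were identified; the code below assumes, purely for illustration, that they are the images on which the network's predicted-class confidence is lowest. The function name `accuracy_with_rejection` and the toy data are hypothetical and are not taken from the article.

```python
def accuracy_with_rejection(confidences, predictions, labels, discard_fraction):
    """Accuracy after discarding the least-confident fraction of predictions.

    Assumption (not from the article): "challenging" images are those with
    the lowest predicted-class confidence.
    """
    if not labels:
        raise ValueError("need at least one labelled prediction")
    if not 0.0 <= discard_fraction < 1.0:
        raise ValueError("discard_fraction must be in [0, 1)")

    # Rank image indices from most to least confident.
    order = sorted(range(len(labels)), key=lambda i: confidences[i], reverse=True)

    # Keep the most confident share of images (always at least one).
    n_keep = max(1, round(len(order) * (1.0 - discard_fraction)))
    kept = order[:n_keep]

    correct = sum(predictions[i] == labels[i] for i in kept)
    return correct / len(kept)


# Toy data: ten images, with the single error on the least-confident one.
confidences = [0.99, 0.97, 0.95, 0.93, 0.90, 0.88, 0.85, 0.80, 0.55, 0.51]
labels      = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predictions = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]

full = accuracy_with_rejection(confidences, predictions, labels, 0.0)
trimmed = accuracy_with_rejection(confidences, predictions, labels, 0.2)
print(full, trimmed)
```

          On this toy data the error sits among the low-confidence images, so trimming them raises accuracy. That is the general pattern the abstract reports, although the actual selection criterion used by the authors may differ.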


          Most cited references (7)


          The Value of Museum Collections for Research and Society


            Automated species identification: why not?

            Where possible, automation has been a common response of humankind to many activities that have to be repeated numerous times. The routine identification of specimens of previously described species has many of the characteristics of other activities that have been automated, and poses a major constraint on studies in many areas of both pure and applied biology. In this paper, we consider some of the reasons why automated species identification has not become widely employed, and whether it is a realistic option, addressing the notions that it is too difficult, too threatening, too different or too costly. Although recognizing that there are some very real technical obstacles yet to be overcome, we argue that progress in the development of automated species identification is extremely encouraging and that such an approach has the potential to make a valuable contribution to reducing the burden of routine identifications. Vision and enterprise are perhaps more limiting at present than practical constraints on what might possibly be achieved.

              Going deeper in the automated identification of Herbarium specimens

              Background: Hundreds of herbarium collections have accumulated a valuable heritage and knowledge of plants over several centuries. Recent initiatives started ambitious preservation plans to digitize this information and make it available to botanists and the general public through web portals. However, thousands of sheets are still unidentified at the species level while numerous sheets should be reviewed and updated following more recent taxonomic knowledge. These annotations and revisions require an unrealistic amount of work for botanists to carry out in a reasonable time. Computer vision and machine learning approaches applied to herbarium sheets are promising but are still not well studied compared to automated species identification from leaf scans or pictures of plants in the field.

              Results: In this work, we propose to study and evaluate the accuracy with which herbarium images can be potentially exploited for species identification with deep learning technology. In addition, we propose to study if the combination of herbarium sheets with photos of plants in the field is relevant in terms of accuracy, and finally, we explore if herbarium images from one region that has one specific flora can be used to do transfer learning to another region with other species; for example, on a region under-represented in terms of collected data.

              Conclusions: This is, to our knowledge, the first study that uses deep learning to analyze a big dataset with thousands of species from herbaria. Results show the potential of deep learning on herbarium species identification, particularly by training and testing across different datasets from different herbaria. This could potentially lead to the creation of a semi-automated or even fully automated system to help taxonomists and experts with their annotation, classification, and revision work.

                Author and article information

                Contributors
                Journal
                Biodiversity Data Journal (Biodivers Data J)
                Pensoft Publishers
                ISSN: 1314-2828
                Published: 02 November 2017
                Volume: 5
                Affiliations
                [1] National Museum of Natural History, Smithsonian Institution, Washington, DC, United States of America
                [2] Office of the Chief Information Officer, Smithsonian Institution, Washington, DC, United States of America
                [3] NVIDIA, Santa Clara, CA, United States of America
                Author notes
                Corresponding author: Eric Schuettpelz (schuettpelze@si.edu).

                Academic editor: Vincent Smith

                Article
                Biodiversity Data Journal 8292
                DOI: 10.3897/BDJ.5.e21139
                PMCID: PMC5680669
                Eric Schuettpelz, Paul B. Frandsen, Rebecca B. Dikow, Abel Brown, Sylvia Orli, Melinda Peters, Adam Metallo, Vicki A. Funk, Laurence J. Dorr

                This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0 (CC-BY), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                Page count
                Figures: 2, Tables: 2, References: 9
                Categories
                Research Article

                Keywords: convolutional neural networks, deep learning, machine learning, mass digitization, natural history collections
