
      Random Forests for Global and Regional Crop Yield Predictions

      research-article


          Abstract

          Accurate predictions of crop yield are critical for developing effective agricultural and food policies at the regional and global scales. We evaluated a machine-learning method, Random Forests (RF), for its ability to predict crop yield responses to climate and biophysical variables at global and regional scales in wheat, maize, and potato in comparison with multiple linear regressions (MLR) serving as a benchmark. We used crop yield data from various sources and regions for model training and testing: 1) gridded global wheat grain yield, 2) maize grain yield from US counties over thirty years, and 3) potato tuber and maize silage yield from the northeastern seaboard region. RF was found highly capable of predicting crop yields and outperformed MLR benchmarks in all performance statistics that were compared. For example, the root mean square errors (RMSE) ranged between 6 and 14% of the average observed yield with RF models in all test cases whereas these values ranged from 14% to 49% for MLR models. Our results show that RF is an effective and versatile machine-learning method for crop yield predictions at regional and global scales for its high accuracy and precision, ease of use, and utility in data analysis. RF may result in a loss of accuracy when predicting the extreme ends or responses beyond the boundaries of the training data.
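The abstract reports RMSE as a percentage of the average observed yield. A minimal sketch of that statistic follows, assuming it is computed as RMSE divided by the mean of the observed values. The function name `rmse_percent` and the yield numbers are hypothetical and invented for illustration; they are not data from the study.

```python
import math


def rmse_percent(observed, predicted):
    """Return RMSE as a percentage of the mean observed value."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    # Mean squared error over all paired observations.
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mean_observed = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_observed


# Hypothetical yields in t/ha (illustrative only, not from the paper).
observed = [8.0, 10.0, 12.0, 10.0]
model_a = [9.0, 9.0, 11.0, 11.0]   # every prediction is off by 1 t/ha
model_b = [11.0, 7.0, 15.0, 7.0]   # every prediction is off by 3 t/ha

print(rmse_percent(observed, model_a))
print(rmse_percent(observed, model_b))
```

Expressing the error relative to the mean observed yield makes values comparable across crops and regions whose absolute yields differ widely, which is presumably why the abstract reports it this way.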

          Most cited references


          Random forests for classification in ecology.

          Classification procedures are some of the most widely used statistical methods in ecology. Random forests (RF) is a new and powerful statistical classifier that is well established in other disciplines but is relatively unknown in ecology. Advantages of RF compared to other statistical classifiers include (1) very high classification accuracy; (2) a novel method of determining variable importance; (3) ability to model complex interactions among predictor variables; (4) flexibility to perform several types of statistical data analysis, including regression, classification, survival analysis, and unsupervised learning; and (5) an algorithm for imputing missing values. We compared the accuracies of RF and four other commonly used statistical classifiers using data on invasive plant species presence in Lava Beds National Monument, California, USA, rare lichen species presence in the Pacific Northwest, USA, and nest sites for cavity nesting birds in the Uinta Mountains, Utah, USA. We observed high classification accuracy in all applications as measured by cross-validation and, in the case of the lichen data, by independent test data, when comparing RF to other common classification methods. We also observed that the variables that RF identified as most important for classifying invasive plant species coincided with expectations based on the literature.

            Machine learning methods without tears: a primer for ecologists.

            Machine learning methods, a family of statistical techniques with origins in the field of artificial intelligence, are recognized as holding great promise for the advancement of understanding and prediction about ecological phenomena. These modeling techniques are flexible enough to handle complex problems with multiple interacting elements and typically outcompete traditional approaches (e.g., generalized linear models), making them ideal for modeling ecological systems. Despite their inherent advantages, a review of the literature reveals only a modest use of these approaches in ecology as compared to other disciplines. One potential explanation for this lack of interest is that machine learning techniques do not fall neatly into the class of statistical modeling approaches with which most ecologists are familiar. In this paper, we provide an introduction to three machine learning approaches that can be broadly used by ecologists: classification and regression trees, artificial neural networks, and evolutionary computation. For each approach, we provide a brief background to the methodology, give examples of its application in ecology, describe model development and implementation, discuss strengths and weaknesses, explore the availability of statistical software, and provide an illustrative example. Although the ecological application of machine learning approaches has increased, there remains considerable skepticism with respect to the role of these techniques in ecology. Our review encourages a greater understanding of machine learning approaches and promotes their future application and utilization, while also providing a basis from which ecologists can make informed decisions about whether to select or avoid these approaches in their future modeling endeavors.

              Prediction of protein-protein interactions using random decision forest framework.

              Protein interactions are of biological interest because they orchestrate a number of cellular processes such as metabolic pathways and immunological recognition. Domains are the building blocks of proteins; therefore, proteins are assumed to interact as a result of their interacting domains. Many domain-based models for protein interaction prediction have been developed, and preliminary results have demonstrated their feasibility. Most of the existing domain-based methods, however, consider only single-domain pairs (one domain from one protein) and assume independence between domain-domain interactions. In this paper, we introduce a domain-based random forest of decision trees to infer protein interactions. Our proposed method is capable of exploring all possible domain interactions and making predictions based on all the protein domains. Experimental results on a Saccharomyces cerevisiae dataset demonstrate that our approach can predict protein-protein interactions with higher sensitivity (79.78%) and specificity (64.38%) than the maximum likelihood approach. Furthermore, our model can be used to infer interactions not only for single-domain pairs but also for multiple domain pairs.
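The sensitivity and specificity quoted in this abstract are standard binary-classification rates. The sketch below shows how they are conventionally computed from true and predicted labels. The function name `sensitivity_specificity` and the labels are hypothetical; this is not the cited authors' evaluation code.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for paired boolean labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # true positives
    fn = sum(1 for a, p in pairs if a and not p)      # false negatives
    tn = sum(1 for a, p in pairs if not a and not p)  # true negatives
    fp = sum(1 for a, p in pairs if not a and p)      # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels: True means "interacting pair" (illustrative only).
actual = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, False, False, True, True]

sens, spec = sensitivity_specificity(actual, predicted)
print(sens, spec)
```

Sensitivity is the fraction of true interactions that are recovered, and specificity is the fraction of non-interactions that are correctly rejected, so the two together describe both kinds of error.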

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS ONE
                Public Library of Science (San Francisco, CA USA)
                ISSN: 1932-6203
                Published: 3 June 2016
                Volume: 11
                Issue: 6
                Article: e0156571
                Affiliations
                [1] School of Environmental and Forest Sciences, College of the Environment, University of Washington, Box 354115, Seattle, WA 98195, United States of America
                [2] Department of Geographical Sciences, University of Maryland, College Park, MD, United States of America
                [3] Crop Systems and Global Change Laboratory, USDA-ARS, Beltsville, MD 20705, United States of America
                [4] Department of Earth and Planetary Sciences, Harvard University, Cambridge, MA 02138, United States of America
                [5] Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, United States of America
                [6] Department of Forest Resources, University of Minnesota, St. Paul, MN 55108, United States of America
                [7] Climate Change & Agroecology Division, National Institute of Agricultural Science, RDA, Suwon, Korea
                [8] Institute on the Environment, University of Minnesota, St. Paul, MN 55108, United States of America
                Instituto de Agricultura Sostenible (CSIC), SPAIN
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Conceived and designed the experiments: SHK JHJ. Performed the experiments: JHJ SHK JPR NDM EEB DHF. Analyzed the data: JHJ SHK JPR NDM DHF DJT KY EEB. Contributed reagents/materials/analysis tools: SHK KMS DHF DJT JSG VRR. Wrote the paper: JHJ SHK JPR NDM DHF EEB JSG KY DJT KMS VRR.

                Author information
                http://orcid.org/0000-0003-3879-4080
                Article
                Manuscript number: PONE-D-15-24117
                DOI: 10.1371/journal.pone.0156571
                PMCID: PMC4892571
                PMID: 27257967
                2bf0433c-f328-4cf2-b582-3c1689710c03

                This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

                History
                Received: 3 June 2015
                Accepted: 17 May 2016
                Page count
                Figures: 3, Tables: 2, Pages: 15
                Funding
                Funded by: Rural Development Administration (funder-id http://dx.doi.org/10.13039/501100003627); Award ID: PJ01000707
                Funded by: U.S. Department of Agriculture (funder-id http://dx.doi.org/10.13039/100000199); Award ID: 58-1265-1-074
                Funded by: National Institute of Food and Agriculture; Award ID: 2011-68004-30057
                Funded by: USDA-ARS Headquarters Postdoctoral Research Associate Program
                Funded by: National Institute of Food and Agriculture (funder-id http://dx.doi.org/10.13039/100005825); Award ID: 2016-67012-25208
                Funded by: Directorate for Geosciences (funder-id http://dx.doi.org/10.13039/100000085); Award ID: 1521210
                Funded by: David and Lucile Packard Foundation (funder-id http://dx.doi.org/10.13039/100000008)
                This study was supported by a Cooperative Research Program for Agricultural Science and Technology Development (Project No. PJ01000707), Rural Development Administration, Republic of Korea (SHK; KMS). Additional support was provided in part by a Specific Cooperative Agreement: 58-1265-1-074 between University of Washington and USDA-ARS (SHK; VRR), the USDA-ARS Headquarters Postdoctoral Research Associate Program (DHF), the USDA-NIFA-AFRI Grant no. 2011-68004-30057: Enhancing Food Security of Underserved Populations in the Northeast through Sustainable Regional Food Systems (DHF), the USDA AFRI fellowship 2016-67012-25208 (NDM), the NSF Hydrological Sciences grant 1521210 (NDM), and the Packard Foundation (EEB).
                Categories
                Research Article
                Biology and Life Sciences > Agriculture > Crop Science > Crops > Cereal Crops > Maize
                Biology and Life Sciences > Organisms > Plants > Grasses > Maize
                Research and Analysis Methods > Model Organisms > Plant and Algal Models > Maize
                Biology and Life Sciences > Agriculture > Crop Science > Crops > Cereal Crops > Wheat
                Biology and Life Sciences > Organisms > Plants > Grasses > Wheat
                Biology and Life Sciences > Organisms > Plants > Solanum > Potato
                Biology and Life Sciences > Agriculture > Crop Science > Crops > Vegetables > Potato
                Biology and Life Sciences > Organisms > Plants > Vegetables > Potato
                Biology and Life Sciences > Plant Science > Plant Anatomy > Tubers
                Biology and Life Sciences > Agriculture > Crop Science > Crops > Cereal Crops
                Biology and Life Sciences > Agriculture > Agrochemicals > Fertilizers
                Research and Analysis Methods > Mathematical and Statistical Techniques > Statistical Methods > Regression Analysis > Linear Regression Analysis
                Physical Sciences > Mathematics > Statistics (Mathematics) > Statistical Methods > Regression Analysis > Linear Regression Analysis
                Biology and Life Sciences > Agriculture > Agricultural Soil Science
                Ecology and Environmental Sciences > Soil Science > Agricultural Soil Science
                Custom metadata
                All relevant data and information are within the paper or cited therein.
