      A Tale of Two “Forests”: Random Forest Machine Learning Aids Tropical Forest Carbon Mapping


          Accurate and spatially explicit maps of tropical forest carbon stocks are needed to implement carbon offset mechanisms such as REDD+ (Reduced Deforestation and Degradation Plus). The Random Forest machine learning algorithm may aid carbon mapping applications using remotely sensed data. However, Random Forest has never been compared to traditional and potentially more reliable techniques such as regionally stratified sampling and upscaling, and it has rarely been employed with spatial data. Here, we evaluated the performance of Random Forest in upscaling airborne LiDAR (Light Detection and Ranging)-based carbon estimates compared to the stratification approach over a 16-million-hectare focal area of the Western Amazon. We considered two runs of Random Forest, with and without spatial contextual modeling; in the former case, x and y position were included directly in the model. In each case, we set aside 8 million hectares (i.e., half of the focal area) for validation; this rigorous test of Random Forest went above and beyond the internal validation normally compiled by the algorithm (called “out-of-bag”), which proved insufficient for this spatial application. In this heterogeneous region of Northern Peru, the model with spatial context was the best-performing run of Random Forest and explained 59% of the variation in LiDAR-based carbon estimates within the validation area, compared to 37% for stratification and 43% for Random Forest without spatial context. With the 60% improvement in explained variation, RMSE against validation LiDAR samples improved from 33 to 26 Mg C ha⁻¹ when using Random Forest with spatial context. Our results suggest that spatial context should be considered when using Random Forest, and that doing so may result in substantially improved carbon stock modeling for purposes of climate change mitigation.
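
The figures quoted in the abstract can be checked with a few lines of arithmetic. The Python sketch below is illustrative only and is not the authors' code. It assumes that "explained variation" means the usual coefficient of determination (1 - SSE/SST) and that the "60% improvement" is measured relative to the stratification baseline (37% to 59%); neither assumption is confirmed by the abstract. The function names and the toy carbon values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between paired observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Fraction of variance in `observed` explained by `predicted` (1 - SSE/SST)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def relative_change(baseline, new_value):
    """Change of `new_value` relative to `baseline`, as a fraction."""
    return (new_value - baseline) / baseline


# Figures quoted in the abstract (percent of variation explained, RMSE in Mg C/ha).
gain_vs_stratification = relative_change(37.0, 59.0)  # 22/37, roughly 0.59
gain_vs_plain_rf = relative_change(43.0, 59.0)        # 16/43, roughly 0.37
rmse_change = relative_change(33.0, 26.0)             # -7/33, roughly -0.21

# Hypothetical toy validation samples (Mg C/ha), only to exercise the metrics.
toy_observed = [80.0, 95.0, 110.0, 120.0]
toy_predicted = [85.0, 90.0, 105.0, 125.0]
toy_rmse = rmse(toy_observed, toy_predicted)
toy_r2 = r_squared(toy_observed, toy_predicted)
```

Under these assumptions, the gain over stratification comes to about 59.5%, which is consistent with the abstract's rounded "60%", while the gain over Random Forest without spatial context is smaller, at about 37%.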


                Author and article information

                Role: Editor
                PLoS ONE
                Public Library of Science (San Francisco, USA)
                28 January 2014
                Volume: 9
                Issue: 1
                Department of Global Ecology, Carnegie Institution for Science, Stanford, California, United States of America
                DOE Pacific Northwest National Laboratory, United States of America
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Conceived and designed the experiments: JM GPA DEK. Performed the experiments: JM GPA DEK. Analyzed the data: JM GPA DEK TK REM CA MH KDC. Wrote the paper: JM GPA.


                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                Page count
                Pages: 9
                This study was supported by the John D. and Catherine T. MacArthur Foundation and the endowment of the Carnegie Institution for Science. The Carnegie Airborne Observatory is made possible by the Avatar Alliance Foundation, Gordon and Betty Moore Foundation, W. M. Keck Foundation, Margaret A. Cargill Foundation, Grantham Foundation for the Protection of the Environment, Mary Anne Nyburg Baker and G. Leonard Baker Jr., and William R. Hearst III. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Research Article
                Conservation Science
                Global Change Ecology
                Spatial and Landscape Ecology
                Computer Science
                Earth Sciences
                Atmospheric Science
                Climate Change
                Applied Mathematics


