      Is Open Access

      Technical note: Flagging inconsistencies in flux tower data


          Abstract

          Global collections of synthesized flux tower data such as FLUXNET have accelerated scientific progress beyond the eddy covariance community. However, remaining data issues in FLUXNET data pose challenges for users, particularly for multi-site synthesis and modelling activities. Here, we present complementary consistency flags (C2Fs) for flux tower data, which rely on multiple indications of inconsistency among variables, along with a methodology to detect discontinuities in time series. The C2F relates to carbon and energy fluxes, as well as to core meteorological variables, and consists of the following: (1) flags for daily data values, (2) flags for entire-site variables, and (3) flags at time stamps that mark large discontinuities in the time series. The flagging is primarily based on combining outlier scores from a set of predefined relationships among variables. The methodology to detect break points in the time series is based on a non-parametric test for the difference in distributions of model residuals. Applying C2F to the FLUXNET 2015 dataset reveals the following: (1) Among the considered variables, gross primary productivity and ecosystem respiration data were flagged most frequently, in particular during rain pulses under dry and hot conditions. This information is useful for modelling and analysing ecohydrological responses. (2) There are elevated flagging frequencies for radiation variables (shortwave, photosynthetically active, and net). This information can improve the interpretation and modelling of ecosystem fluxes with respect to issues in the driver. (3) The majority of long-term sites show temporal discontinuities in the time series of latent energy, net ecosystem exchange, and radiation variables. This should be useful for carefully assessing results on interannual variations and trends in ecosystem fluxes. The C2F methodology can be customized and allows the desired strictness of consistency to be varied. We discuss limitations of the approach that can serve as starting points for future developments.
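
          The break-point idea described in the abstract (a non-parametric test for a difference in the distributions of model residuals) can be illustrated with a minimal sketch. The abstract does not name the specific test, so the two-sample Kolmogorov-Smirnov statistic is used here purely as an assumed stand-in. The function names `ks_statistic` and `find_break`, the `min_segment` parameter, and the synthetic residuals are all hypothetical and are not taken from the article. The sketch only locates the split with the largest distributional difference; it does not assess significance.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs.

    Assumed stand-in for the (unnamed) non-parametric test in the abstract.
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The gap between step functions can only change at observed values.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def find_break(residuals, min_segment=30):
    """Scan candidate split points and return (index, statistic) of the split
    whose two segments differ most in distribution.

    min_segment (hypothetical parameter) keeps both segments large enough
    for the empirical CDFs to be meaningful.
    """
    best_index, best_stat = None, 0.0
    for i in range(min_segment, len(residuals) - min_segment + 1):
        stat = ks_statistic(residuals[:i], residuals[i:])
        if stat > best_stat:
            best_index, best_stat = i, stat
    return best_index, best_stat


# Synthetic model residuals (illustrative only): 100 values in [-0.5, 0.5],
# followed by 100 values shifted upwards by 2 to mimic a discontinuity.
base = [((i * 37) % 11 - 5) / 10 for i in range(100)]
residuals = base + [r + 2.0 for r in base]

break_index, break_stat = find_break(residuals)
print(break_index, break_stat)
```

          In this toy series the two segments do not overlap at all, so the scan recovers the shift location exactly. Real residuals would overlap, and a threshold on the statistic (or a p-value) would be needed to decide whether a discontinuity is large enough to flag.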

                Author and article information

                Journal: Biogeosciences
                Publisher: Copernicus GmbH
                ISSN: 1726-4189
                Published: April 15 2024
                Volume: 21
                Issue: 7
                Pages: 1827-1846
                DOI: 10.5194/bg-21-1827-2024
                © 2024
                License: https://creativecommons.org/licenses/by/4.0/
