      A comparison of multiple imputation methods for handling missing values in longitudinal data in the presence of a time-varying covariate with a non-linear association with time: a simulation study

      research-article

          Abstract

          Background

          Missing data is a common problem in epidemiological studies, and is particularly prominent in longitudinal data, which involve multiple waves of data collection. Traditional multiple imputation (MI) methods (fully conditional specification (FCS) and multivariate normal imputation (MVNI)) treat repeated measurements of the same time-dependent variable as just another ‘distinct’ variable for imputation and therefore do not make the most of the longitudinal structure of the data. Only a few studies have explored extensions to the standard approaches to account for the temporal structure of longitudinal data. One suggestion is the two-fold fully conditional specification (two-fold FCS) algorithm, which restricts the imputation of a time-dependent variable to time blocks where the imputation model includes measurements taken at the specified and adjacent times. To date, no study has investigated the performance of two-fold FCS and standard MI methods for handling missing data in a time-varying covariate with a non-linear trajectory over time – a commonly encountered scenario in epidemiological studies.
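The time-block restriction described above can be illustrated with a minimal sketch. It only shows which columns would enter the imputation model for one variable at one wave. It is not the authors' implementation, and the function name, the variable names (`bmiz`, `sleep`, `sex`) and the `_w<wave>` column naming are hypothetical choices made for illustration.

```python
def twofold_predictors(target, t, n_waves, time_varying, time_fixed, width=1):
    """List the predictor columns for imputing `target` at wave `t`.

    Only waves within `width` of `t` contribute time-varying predictors;
    time-fixed variables are always included. All names are illustrative.
    """
    first = max(1, t - width)        # clip the window at the first wave
    last = min(n_waves, t + width)   # clip the window at the last wave
    predictors = list(time_fixed)
    for wave in range(first, last + 1):
        for var in time_varying:
            if var == target and wave == t:
                continue  # the value being imputed cannot predict itself
            predictors.append(f"{var}_w{wave}")
    return predictors


# Imputing a hypothetical BMI z-score at wave 3 of 5, window widths 1 and 2
preds_w1 = twofold_predictors("bmiz", 3, 5, ["bmiz", "sleep"], ["sex"], width=1)
preds_w2 = twofold_predictors("bmiz", 3, 5, ["bmiz", "sleep"], ["sex"], width=2)
print(preds_w1)
print(preds_w2)
```

A wider window brings in more of each variable's repeated measurements, at the cost of a larger imputation model.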

          Methods

          We simulated 1000 datasets of 5000 individuals based on the Longitudinal Study of Australian Children (LSAC). Three missing data mechanisms (missing completely at random (MCAR), and weak and strong missing at random (MAR) scenarios) were used to impose missingness on body mass index (BMI)-for-age z-scores, a continuous time-varying exposure variable with a non-linear trajectory over time. We evaluated the performance of FCS, MVNI, and two-fold FCS for handling up to 50% missing data when assessing the association between childhood obesity and sleep problems.
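The difference between the MCAR and MAR mechanisms can be sketched with a logistic missingness model. This is a generic illustration rather than the simulation design used in the study: the covariate, the coefficients and the variable names below are assumptions. A slope of zero gives MCAR, and a non-zero slope on a fully observed covariate gives MAR, with a larger slope giving a stronger dependence.

```python
import math
import random


def impose_missingness(values, covariate, intercept, slope, rng):
    """Replace values with None with probability expit(intercept + slope * x)."""
    out = []
    for v, x in zip(values, covariate):
        p_missing = 1.0 / (1.0 + math.exp(-(intercept + slope * x)))
        out.append(None if rng.random() < p_missing else v)
    return out


def mean_gap(incomplete, covariate):
    """Mean covariate among missing cases minus mean among observed cases."""
    miss = [x for v, x in zip(incomplete, covariate) if v is None]
    obs = [x for v, x in zip(incomplete, covariate) if v is not None]
    return sum(miss) / len(miss) - sum(obs) / len(obs)


rng = random.Random(2017)                       # fixed seed for reproducibility
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]         # fully observed covariate (hypothetical)
bmiz = [0.3 * xi + rng.gauss(0, 1) for xi in x]  # illustrative exposure values

mcar = impose_missingness(bmiz, x, intercept=-1.0, slope=0.0, rng=rng)
mar_strong = impose_missingness(bmiz, x, intercept=-1.0, slope=1.5, rng=rng)

print("MCAR covariate gap:", round(mean_gap(mcar, x), 3))
print("MAR covariate gap:", round(mean_gap(mar_strong, x), 3))
```

Under MCAR the covariate looks similar in cases with and without the exposure, whereas under MAR the two groups differ systematically, which is why the imputation model needs to include the variables that drive missingness.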

          Results

          The standard two-fold FCS produced slightly more biased and less precise estimates than FCS and MVNI. We observed slight improvements in bias and precision when using a time window width of two for the two-fold FCS algorithm compared to the standard width of one.

          Conclusion

          We recommend the use of FCS or MVNI in similar longitudinal settings. When convergence issues arise because of a large number of time points or variables with missing values, we recommend two-fold FCS, with exploration of a suitable time window width.

          Electronic supplementary material

          The online version of this article (doi:10.1186/s12874-017-0372-y) contains supplementary material, which is available to authorized users.

                Author and article information

                Contributors
                anurikad@student.unimelb.edu.au
                margarita.moreno@mcri.edu.au
                alyshad@unimelb.edu.au
                katherine.lee@mcri.edu.au
                +61 3 8344 0732, julieas@unimelb.edu.au
                Journal
                BMC Medical Research Methodology (BMC Med Res Methodol)
                BioMed Central (London)
                ISSN: 1471-2288
                Published: 25 July 2017
                Volume: 17
                Article number: 114
                Affiliations
                [1] Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, Melbourne, VIC, Australia (ISNI 0000 0001 2179 088X; GRID grid.1008.9)
                [2] Clinical Epidemiology and Biostatistics Unit, Murdoch Childrens Research Institute, Royal Children’s Hospital, Melbourne, VIC, Australia
                [3] Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, VIC, Australia (ISNI 0000 0004 1936 7857; GRID grid.1002.3)
                [4] Department of Paediatrics, University of Melbourne, Melbourne, VIC, Australia (ISNI 0000 0001 2179 088X; GRID grid.1008.9)
                Article
                Article ID: 372
                DOI: 10.1186/s12874-017-0372-y
                PMC ID: 5526258
                PMID: 28743256
                ed87ee3e-76e3-473f-99de-fd3fe93d19a1
                © The Author(s) 2017

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 3 November 2016
                Accepted: 26 June 2017
                Funding
                Funded by: National Health and Medical Research Council (FundRef http://dx.doi.org/10.13039/501100000925)
                Award IDs: 1035261, 1104975, 1053609
                Funded by: Victorian International Research Scholarship
                Funded by: Melbourne International Fee Remission Scholarship
                Categories
                Research Article
                Medicine
                fully conditional specification, longitudinal data, missing data, multiple imputation, multivariate normal imputation, non-linear trajectory, time-dependent covariate
