      Introduction to computational causal inference using reproducible Stata, R, and Python code: A tutorial

      research-article


          Abstract

          The main purpose of many medical studies is to estimate the effect of a treatment or exposure on an outcome. However, it is not always possible to randomize study participants to a particular treatment, so observational study designs may be used instead. Observational studies pose major challenges, one of which is confounding. Confounding is commonly controlled by direct adjustment for measured confounders, although this approach is sometimes suboptimal because of modeling assumptions and model misspecification. Recent advances in causal inference have dealt with confounding by building on classical standardization methods. However, these advances have progressed quickly, with a relative paucity of computationally oriented applied tutorials, which has contributed to some confusion about the use of these methods among applied researchers. In this tutorial, we show the computational implementation of different causal inference estimators from a historical perspective, in which new estimators were developed to overcome the limitations of earlier ones (ie, nonparametric and parametric g-formula, inverse probability weighting, double-robust, and data-adaptive estimators). We illustrate the methods using an empirical example from the Connors study in intensive care medicine and, most importantly, provide reproducible and commented code in Stata, R, and Python for researchers to adapt to their own observational studies. The code can be accessed at https://github.com/migariane/Tutorial_Computational_Causal_Inference_Estimators.
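
          To make the estimators named in the abstract concrete, the following is a minimal sketch, not the authors' code (their code is in the linked repository). It assumes a simulated data set with one binary confounder `w`, a binary treatment `a`, and a binary outcome `y`; these variable names, the simulation parameters, and the helper functions are all hypothetical and are unrelated to the Connors study. It compares a naive risk difference with the nonparametric g-formula, inverse probability weighting (IPW), and an augmented IPW (double-robust) estimator of the average treatment effect (ATE).

```python
import numpy as np

# Hypothetical simulated data: W confounds the effect of A on Y.
rng = np.random.default_rng(2022)
n = 20_000
w = rng.binomial(1, 0.4, size=n)               # binary confounder
a = rng.binomial(1, 0.3 + 0.4 * w)             # treatment depends on W
y = rng.binomial(1, 0.2 + 0.1 * a + 0.3 * w)   # true ATE is 0.10 by construction


def stratum_means(values, w, mask):
    """Mean of `values` among rows where `mask` holds, computed within each level of W."""
    out = np.empty(len(w), dtype=float)
    for level in np.unique(w):
        stratum = w == level
        out[stratum] = values[stratum & mask].mean()
    return out


def g_formula(y, a, w):
    """Nonparametric g-formula: average the stratum-specific risk difference over P(W)."""
    m1 = stratum_means(y, w, a == 1)   # E[Y | A=1, W]
    m0 = stratum_means(y, w, a == 0)   # E[Y | A=0, W]
    return np.mean(m1 - m0)


def ipw(y, a, w):
    """Inverse probability weighting with a nonparametric propensity score."""
    ps = stratum_means(a, w, np.ones(len(a), dtype=bool))   # P(A=1 | W)
    return np.mean(a * y / ps) - np.mean((1 - a) * y / (1 - ps))


def aipw(y, a, w):
    """Augmented IPW (double-robust): outcome model plus weighted residual correction."""
    ps = stratum_means(a, w, np.ones(len(a), dtype=bool))
    m1 = stratum_means(y, w, a == 1)
    m0 = stratum_means(y, w, a == 0)
    return np.mean(m1 - m0 + a * (y - m1) / ps - (1 - a) * (y - m0) / (1 - ps))


naive = y[a == 1].mean() - y[a == 0].mean()   # confounded comparison
ate_g = g_formula(y, a, w)
ate_ipw = ipw(y, a, w)
ate_aipw = aipw(y, a, w)

print(f"naive={naive:.3f}  g-formula={ate_g:.3f}  IPW={ate_ipw:.3f}  AIPW={ate_aipw:.3f}")
```

          In this toy setting both the outcome model and the propensity score are saturated in the single binary confounder, so the three adjusted estimators coincide up to floating-point error, while the naive difference is biased upward. With continuous or high-dimensional confounders the models would have to be parametric or data-adaptive, which is the situation the tutorial addresses.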


                Author and article information

                Journal
                8215016
                7188
                Stat Med
                Statistics in medicine
                0277-6715
                1097-0258
                18 January 2025
                30 January 2022
                28 October 2021
                05 February 2025
                Volume: 41
                Issue: 2
                Pages: 407-432
                Affiliations
                [1 ]Inequalities in Cancer Outcomes Network, Department of Non-communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, UK
                [2 ]Department of Epidemiology and Biostatistics, Tehran University of Medical Sciences, Tehran, Iran
                [3 ]Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
                [4 ]Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
                [5 ]Non-communicable Disease and Cancer Epidemiology Group, Instituto de Investigacion Biosanitaria de Granada (ibs.GRANADA), Andalusian School of Public Health, University of Granada, Granada, Spain
                [6 ]Biomedical Network Research Centers of Epidemiology and Public Health (CIBERESP), Madrid, Spain
                Author notes

                AUTHOR CONTRIBUTIONS

                The article arises from the motivation to disseminate the principles of modern epidemiology among clinicians and applied researchers. Miguel A. Luque-Fernandez developed the concept, designed the first draft of the article and the computing code. All authors interpreted and reviewed the code and the data, drafted and revised the article. All authors read and approved the final version of the article. Miguel A. Luque-Fernandez is the guarantor of the article.

                Correspondence: Miguel A. Luque-Fernandez, Inequalities in Cancer Outcomes Network, Department of Non-communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel Street, Bloomsbury, London WC1E 7HT, UK. Email: miguel-angel.luque@lshtm.ac.uk
                Author information
                http://orcid.org/0000-0002-8739-9565
                http://orcid.org/0000-0002-9932-1095
                http://orcid.org/0000-0001-6683-5164
                Article
                NIHMSID: NIHMS1890988
                DOI: 10.1002/sim.9234
                PMCID: PMC11795351
                PMID: 34713468

                This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

                Categories
                Biostatistics

                Keywords: causal inference, double-robust methods, g-formula, g-methods, inverse probability weighting, machine learning, propensity score, regression adjustment, targeted maximum likelihood estimation
