      Is Open Access

      A Primer on Causality in Data Science

      Preprint

          Abstract

          Many questions in Data Science are fundamentally causal in that our objective is to learn the effect of some exposure (randomized or not) on an outcome of interest. Even studies that are seemingly non-causal (e.g. prediction or prevalence estimation) have causal elements, such as differential censoring or measurement. As a result, we, as Data Scientists, need to consider the underlying causal mechanisms that gave rise to the data, rather than simply the pattern or association observed in the data. In this work, we review the "Causal Roadmap", a formal framework to augment our traditional statistical analyses in an effort to answer the causal questions driving our research. Specific steps of the Roadmap include clearly stating the scientific question, defining the causal model, translating the scientific question into a causal parameter, assessing the assumptions needed to translate the causal parameter into a statistical estimand, implementing statistical estimators including parametric and semi-parametric methods, and interpreting our findings. Throughout we focus on the effect of an exposure occurring at a single time point and provide extensions to more advanced settings.

          Most cited references (29)

          Marginal Structural Models and Causal Inference in Epidemiology

            Estimating causal effects from epidemiological data.

            In ideal randomised experiments, association is causation: association measures can be interpreted as effect measures because randomisation ensures that the exposed and the unexposed are exchangeable. On the other hand, in observational studies, association is not generally causation: association measures cannot be interpreted as effect measures because the exposed and the unexposed are not generally exchangeable. However, observational research is often the only alternative for causal inference. This article reviews a condition that permits the estimation of causal effects from observational data, and two methods -- standardisation and inverse probability weighting -- to estimate population causal effects under that condition. For simplicity, the main description is restricted to dichotomous variables and assumes that no random error attributable to sampling variability exists. The appendix provides a generalisation of inverse probability weighting.
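
The two methods named in this abstract, standardisation and inverse probability weighting, can be illustrated with a minimal sketch for dichotomous variables. Everything below is an assumption made for illustration and is not taken from the cited article: the counts in `COUNTS` are invented, and the names `L` (a single confounder), `A` (exposure) and `Y` (outcome) are hypothetical. The sketch also assumes that `L` is the only confounder and that every stratum of `L` contains both exposed and unexposed people.

```python
from collections import Counter

# Hypothetical counts, keyed by (L, A, Y). Invented for illustration only.
COUNTS = {
    (0, 0, 1): 10, (0, 0, 0): 30,   # L=0, unexposed: 40 people, 10 events
    (0, 1, 1): 5,  (0, 1, 0): 5,    # L=0, exposed:   10 people,  5 events
    (1, 0, 1): 5,  (1, 0, 0): 5,    # L=1, unexposed: 10 people,  5 events
    (1, 1, 1): 30, (1, 1, 0): 10,   # L=1, exposed:   40 people, 30 events
}

# One (L, A, Y) tuple per person.
records = [key for key, n in COUNTS.items() for _ in range(n)]


def crude_risk(records, a):
    """Observed risk among people with A == a (no adjustment)."""
    outcomes = [y for _, a_i, y in records if a_i == a]
    return sum(outcomes) / len(outcomes)


def standardised_risk(records, a):
    """Average the stratum-specific risks under A == a, weighted by P(L)."""
    n = len(records)
    total = 0.0
    for level in {l_i for l_i, _, _ in records}:
        stratum = [r for r in records if r[0] == level]
        exposed = [y for _, a_i, y in stratum if a_i == a]
        risk = sum(exposed) / len(exposed)  # fails if the stratum has no A == a
        total += risk * len(stratum) / n
    return total


def ipw_risk(records, a):
    """Weight each person with A == a by 1 / P(A = a | L), then average over n."""
    n = len(records)
    n_l = Counter(l_i for l_i, _, _ in records)
    n_la = Counter((l_i, a_i) for l_i, a_i, _ in records)
    total = 0.0
    for l_i, a_i, y in records:
        if a_i == a:
            propensity = n_la[(l_i, a)] / n_l[l_i]
            total += y / propensity
    return total / n


crude_rd = crude_risk(records, 1) - crude_risk(records, 0)
std_rd = standardised_risk(records, 1) - standardised_risk(records, 0)
ipw_rd = ipw_risk(records, 1) - ipw_risk(records, 0)
print(f"crude: {crude_rd:.3f}  standardised: {std_rd:.3f}  IPW: {ipw_rd:.3f}")
```

With these invented counts the crude risk difference is larger than the adjusted ones, and the two adjusted estimates agree with each other, which is expected when both are computed nonparametrically from the same fully stratified table.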

              Marginal structural models for the estimation of direct and indirect effects.

              The estimation of controlled direct effects can be carried out by fitting a marginal structural model and using inverse probability of treatment weighting. To use marginal structural models to estimate natural direct and indirect effects, 2 marginal structural models can be used: 1 for the effects of the treatment and mediator on the outcome and 1 for the effect of the treatment on the mediator. Unlike marginal structural models typically used in epidemiologic research, the marginal structural models used to estimate natural direct and indirect effects are made conditional on the covariates.

                Author and article information

                Date: 07 September 2018
                arXiv: 1809.02408
                Record ID: 9cafb04c-0fe3-4cb7-a49e-a2e0200a198d
                License: http://creativecommons.org/licenses/by-nc-sa/4.0/
                Length: 17 pages (with references); 4 figures
                arXiv categories: stat.AP stat.ME stat.ML

                Applications, Machine learning, Methodology
