
      Evaluating large-scale propensity score performance through real-world and synthetic data experiments

      International Journal of Epidemiology
      Oxford University Press (OUP)


          Abstract

          Background

          Propensity score adjustment is a popular approach for confounding control in observational studies. Reliable frameworks are needed to determine relative propensity score performance in large-scale studies, and to establish optimal propensity score model selection methods.

          Methods

          We detail a propensity score evaluation framework that includes synthetic and real-world data experiments. Our synthetic experimental design extends the ‘plasmode’ framework and simulates survival data under known effect sizes, and our real-world experiments use a set of negative control outcomes with presumed null effect sizes. In reproductions of two published cohort studies, we compare two propensity score estimation methods that contrast in their model selection approach: L1-regularized regression that conducts a penalized likelihood regression, and the ‘high-dimensional propensity score’ (hdPS) that employs a univariate covariate screen. We evaluate methods on a range of outcome-dependent and outcome-independent metrics.

          Results

          L1-regularization propensity score methods achieve superior model fit, covariate balance and negative control bias reduction compared with the hdPS. Simulation results are mixed and fluctuate with simulation parameters, revealing a limitation of simulation under the proportional hazards framework. Including regularization with the hdPS reduces commonly reported non-convergence issues but has little effect on propensity score performance.

          Conclusions

          L1-regularization incorporates all covariates simultaneously into the propensity score model and offers propensity score performance superior to the hdPS marginal screen.
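
To make the idea of an L1-regularized propensity score concrete, here is a minimal sketch in pure Python. It is not the authors' software or their fitting procedure. It fits a penalized logistic model of treatment on all covariates at once, using proximal gradient descent (ISTA) as a simple stand-in solver. The function names, the toy cohort, the penalty `lam` and the step size are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    # Proximal operator of t * |v|: shrink toward zero, clipping at zero.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit_l1_propensity(X, treated, lam, step=1.0, n_iter=1000):
    """Fit P(treated | x) by L1-penalized logistic regression (ISTA).

    The intercept is left unpenalized. step=1.0 is an assumption that
    suits this tiny 0/1 example; larger problems need a smaller step.
    """
    n, p = len(X), len(X[0])
    intercept, coefs = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad_b, grad_w = 0.0, [0.0] * p
        for row, t in zip(X, treated):
            linear = intercept + sum(w * x for w, x in zip(coefs, row))
            resid = sigmoid(linear) - t  # gradient of the log-loss
            grad_b += resid / n
            for j, x in enumerate(row):
                grad_w[j] += resid * x / n
        intercept -= step * grad_b
        # Gradient step followed by soft-thresholding (the L1 penalty).
        coefs = [soft_threshold(w - step * g, step * lam)
                 for w, g in zip(coefs, grad_w)]
    return intercept, coefs

def propensity_scores(X, intercept, coefs):
    # Predicted probability of treatment for each subject.
    return [sigmoid(intercept + sum(w * x for w, x in zip(coefs, row)))
            for row in X]

def make_toy_cohort():
    # Hypothetical data: x0 is associated with treatment, while x1 is
    # balanced across every (x0, treatment) cell by construction.
    X, treated = [], []
    for x0, t, count in [(1, 1, 16), (1, 0, 4), (0, 1, 4), (0, 0, 16)]:
        for i in range(count):
            X.append([x0, i % 2])
            treated.append(t)
    return X, treated

X, treated = make_toy_cohort()
intercept, coefs = fit_l1_propensity(X, treated, lam=0.01)
scores = propensity_scores(X, intercept, coefs)
print("coefficients:", [round(c, 3) for c in coefs])
```

In this toy cohort the penalty keeps the treatment-related covariate `x0` and shrinks the balanced covariate `x1` toward zero. Both covariates enter the model together, whereas a marginal screen ranks each covariate separately before any model is fitted.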


                Author and article information

                Journal: International Journal of Epidemiology
                Publisher: Oxford University Press (OUP)
                ISSN: 0300-5771, 1464-3685
                Dates: December 2018, December 01 2018, June 22 2018
                Volume: 47
                Issue: 6
                Pages: 2005-2014
                Affiliations
                [1] Department of Biomathematics, David Geffen School of Medicine at UCLA, University of California, Los Angeles, CA, USA
                [2] Epidemiology Department, Janssen Research and Development LLC, Titusville, NJ, USA
                [3] Department of Biostatistics, UCLA Fielding School of Public Health, University of California, Los Angeles, CA, USA
                [4] Department of Human Genetics, David Geffen School of Medicine at UCLA, University of California, Los Angeles, CA, USA
                Article
                DOI: 10.1093/ije/dyy120
                PMCID: 6280944
                PMID: 29939268
                © 2018

                https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model
