In psychosocial and behavioral studies, count outcomes that record how often a health or behavioral event occurs (such as the number of unprotected sexual acts during a period of time) often contain a preponderance of zeroes. This excess arises from ‘structural zeroes’, which occur when some subjects are not at risk for the behavior of interest. Unlike random zeroes (responses that could have been greater than zero but are zero because of sampling variability), structural zeroes usually carry a different meaning, both statistically and clinically. Ignoring the difference between the two types of zeroes can lead to false interpretations of results and study findings. In practice, however, whether a given zero is structural is often not observed, and this latent nature complicates the data analysis. In this article, we focus on one model that is commonly used to address zero-inflated data, the zero-inflated Poisson (ZIP) regression model. We first give a brief overview of the issues raised by structural zeroes and of the ZIP model. We then illustrate ZIP with data from a study on HIV-risk sexual behaviors among adolescent girls. Sample code in SAS and Stata is also included to help perform and explain ZIP analyses.
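The distinction between structural and random zeroes can be made concrete with a small simulation. The sketch below is illustrative only and is not the SAS or Stata code referred to above; it assumes the standard ZIP mixture without covariates, in which a subject is a structural zero with probability `pi` and otherwise contributes a Poisson count with mean `lam`, so that P(Y = 0) = pi + (1 - pi) exp(-lam). The function names and the parameter values (`pi = 0.3`, `lam = 2.0`) are hypothetical choices for illustration, not estimates from the study.

```python
import math
import random


def zip_zero_probability(pi, lam):
    """P(Y = 0) under a ZIP model: structural zeroes plus random (Poisson) zeroes."""
    return pi + (1.0 - pi) * math.exp(-lam)


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_zip(pi, lam, n, seed=0):
    """Simulate n ZIP counts: a structural zero with probability pi, else a Poisson draw."""
    rng = random.Random(seed)
    return [0 if rng.random() < pi else sample_poisson(lam, rng) for _ in range(n)]


if __name__ == "__main__":
    pi, lam = 0.3, 2.0  # hypothetical values chosen for illustration
    counts = sample_zip(pi, lam, n=20000)
    observed = counts.count(0) / len(counts)
    print(f"observed zero fraction: {observed:.3f}")
    print(f"ZIP-implied P(Y = 0):   {zip_zero_probability(pi, lam):.3f}")
    print(f"plain Poisson P(Y = 0): {math.exp(-lam):.3f}")
```

Comparing the observed zero fraction with the plain Poisson value shows the excess that an ordinary Poisson model cannot account for; in a ZIP regression, `pi` and `lam` would each be modeled as functions of covariates rather than held fixed as they are here.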