Analysis of Tobacco Consumption in Australia Using a Zero-Inflated Ordered Probit Model

Posted: 11 Jun 2007

Abstract

Health-related data with discrete ordered dependent variables are often characterised by excessive 'zero' observations that may relate to two distinct data generating processes. For example, there may be a large proportion of 'no visit' responses in modeling discrete ordered levels of health service use (such as doctor or hospital visits) over, say, the past twelve months, or a large number of non-participants in discrete ordered levels of private health insurance cover. Similarly, respondents who report 'excellent' in self-reported global health status grades may include people in genuinely good health as well as those whose health is not necessarily good in the medical sense but who have a positive attitude, a good job or a good circle of friends and prefer to report excellent health.

Traditional ordered probit/logit models have limited capacity in explaining this preponderance of zero observations. In a manner analogous to the zero-inflated/augmented Poisson (ZIP/ZAP) models in the count data literature and double-hurdle models in the limited dependent variable literature, we propose a zero-inflated ordered probit (ZIOP) model using a double-hurdle combination of a split probit model and an ordered probit model (which relate to potentially different sets of covariates and have correlated errors) to account for the possibility that the zeros can arise from two different aspects of individual behaviour. We also present some Monte Carlo experiment results on the finite sample performance of the ML estimator and some suggested model selection strategies including Likelihood Ratio and Hausman tests for nested models, Vuong's test for non-nested models, as well as some traditional information criteria.
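As a rough illustration of the double-hurdle structure described above, the following Python sketch computes outcome probabilities and a log-likelihood for a simplified version of the model. It assumes independent errors between the split probit and the ordered probit (the abstract's model allows them to be correlated, which would require a bivariate normal CDF instead). The function names, the symbols x, z, beta, gamma and cuts, and the threshold parameterisation are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.stats import norm


def ziop_probs(x, z, beta, gamma, cuts):
    """Outcome probabilities for a simplified zero-inflated ordered probit.

    Assumptions (not from the paper): independent errors, a split probit
    with index x @ beta, and an ordered probit with index z @ gamma and
    strictly increasing cutpoints `cuts` (J cutpoints give outcomes 0..J).
    Returns an (n, J + 1) array whose rows sum to one.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    z = np.atleast_2d(np.asarray(z, dtype=float))
    cuts = np.asarray(cuts, dtype=float)

    # Hurdle 1: probability of being a participant (split probit).
    p_part = norm.cdf(x @ np.asarray(beta, dtype=float))

    # Hurdle 2: ordered probit probabilities, conditional on participation.
    index = z @ np.asarray(gamma, dtype=float)
    inner = norm.cdf(cuts[None, :] - index[:, None])
    n = inner.shape[0]
    cum = np.hstack([np.zeros((n, 1)), inner, np.ones((n, 1))])
    ordered = np.diff(cum, axis=1)

    # Observed zeros mix non-participants with zero-consuming participants.
    probs = p_part[:, None] * ordered
    probs[:, 0] += 1.0 - p_part
    return probs


def ziop_loglik(y, x, z, beta, gamma, cuts):
    """Log-likelihood of integer outcomes y (coded 0..J) under the sketch above."""
    probs = ziop_probs(x, z, beta, gamma, cuts)
    y = np.asarray(y, dtype=int)
    return float(np.sum(np.log(probs[np.arange(len(y)), y])))
```

In practice the negative of `ziop_loglik` could be passed to a numerical optimiser to obtain ML estimates; that step, and any constraint keeping the cutpoints ordered, is omitted here.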

The model is applied to discrete data on levels of tobacco consumption from a large-scale Australian survey on recreational drug use. The empirical application demonstrates the advantages of the ZIOP model in separating the different behavioural schemes for participants and non-participants. In particular, we allow the observed non-users (zeros) to be split into two groups: non-participants who choose not to smoke due to health concerns or other non-economic factors, and potential users whose zero consumption may result from a demand schedule 'corner' solution and who are therefore responsive to economic factors such as prices and income. Our example shows that the use of a conventional ordered probit model would confuse the effects of some important explanatory variables that may have opposing impacts on the two schemes. One such example is the effect of income. While higher income, acting as an indicator for social class and health awareness, has a positive effect on genuine non-participation 'zeros', it decreases the chance of zero consumption for participants, because of the positive income effect on the level-of-consumption decision implied by standard consumer demand theory. The application indicates that policy recommendations could be misleading if the splitting process is ignored.
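The opposing income effects described above can be illustrated with a small numerical sketch. All coefficient values below are hypothetical and chosen only for illustration; they are not estimates from the paper. Errors in the two equations are assumed independent, and the zero threshold of the ordered probit is normalised to zero.

```python
from scipy.stats import norm


def zero_components(income, b0=0.5, b_inc=-0.4, g0=0.3, g_inc=0.5):
    """Decompose P(y = 0) at a given (standardised) income level.

    Hypothetical coefficients: b_inc < 0 means higher income lowers the
    chance of participating; g_inc > 0 means higher income raises
    consumption among participants.
    Returns (P(non-participant), P(zero | participant), P(y = 0)).
    """
    p_part = norm.cdf(b0 + b_inc * income)
    nonpart_zero = 1.0 - p_part
    # Probability that a participant sits at the zero 'corner'.
    zero_given_part = norm.cdf(-(g0 + g_inc * income))
    total_zero = nonpart_zero + p_part * zero_given_part
    return nonpart_zero, zero_given_part, total_zero


low = zero_components(income=0.0)
high = zero_components(income=1.0)
print("low income :", low)
print("high income:", high)
```

With coefficients of opposite sign like these, the two components move in opposite directions as income rises, so the change in the overall share of zeros can be much smaller than either component's change. A single-equation ordered probit has no way to separate the two channels.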

Keywords: Ordered discrete data, zero-inflated responses, double hurdle model, two-part model, tobacco consumption

JEL Classification: C3, D1, I1

Suggested Citation

Zhao, Xueyan and Harris, Mark N., Analysis of Tobacco Consumption in Australia Using a Zero-Inflated Ordered Probit Model. Journal of Econometrics, Forthcoming, iHEA 2007 6th World Congress: Explorations in Health Economics Paper, Available at SSRN: https://ssrn.com/abstract=992657

Xueyan Zhao (Contact Author)

Monash University

PO Box 11E
Monash University,
Melbourne, VIC 3800, Victoria
Australia
+61 3 9905 2415 (Phone)
+61 3 9905 5474 (Fax)

Mark N. Harris

Curtin University

Kent Street
Bentley
Perth, WA 6102
Australia

HOME PAGE: http://business.curtin.edu.au/contact/staff_directory/?profile=Mark-Harris
