Modeling Discrete Choice: Categorical Dependent Variables, Logistic Regression, and Maximum Likelihood Estimation

9 Pages Posted: 2 Jun 2017

Anton Ovchinnikov

Smith School of Business - Queen's University; INSEAD - Decision Sciences

Abstract

This technical note introduces business students to modeling discrete choice (e.g., a consumer purchasing brand A versus brand B) using logistic regression and maximum-likelihood estimation. It draws an analogy between modeling discrete choice and building a regression model with a dummy dependent variable, and it uses an example to illustrate the need to estimate the probability of a choice rather than the choice itself, which leads to a special kind of regression: logistic regression. The note presents the concepts of utility and of a random utility choice model, of which the logistic regression model is the most commonly used. It shows how choice probabilities can be constructed from utilities, leading to the logit model. It then presents the maximum-likelihood estimation (MLE) method of fitting the logit model to choice data. Working through a detailed example using Solver and an accompanying spreadsheet model, the note gives students a deep understanding of how MLE works and how it resembles and differs from standard least-squares estimation in linear regression. The note concludes by presenting the results of estimation using StatTools, a commercial statistical software package. The note avoids heavy mathematical machinery but still requires rudimentary knowledge of exponential and logarithmic functions, probability, and optimization with Solver, as well as familiarity with “standard” linear regression. Applications include building models of consumer choice, estimating price elasticity, price optimization, product versioning, product line design, and conjoint analysis.
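As a rough illustration of the two steps the abstract names (building choice probabilities from utilities, then fitting by maximum likelihood), the sketch below fits a two-alternative logit model in plain Python. It is not the note's Solver spreadsheet: the prices and choices are made up, the utility of brand B is normalized to zero as an assumption, and simple gradient ascent stands in for Solver as the optimizer. All function and variable names are hypothetical.

```python
import math

def choice_probability(utility_a, utility_b):
    """Logit probability of choosing A over B, given the two utilities."""
    exp_a = math.exp(utility_a)
    exp_b = math.exp(utility_b)
    return exp_a / (exp_a + exp_b)

def log_likelihood(b0, b1, prices, chose_a):
    """Sum of the log-probabilities the model assigns to the observed choices."""
    total = 0.0
    for price, y in zip(prices, chose_a):
        # Assumption: utility of A is linear in price; utility of B is fixed at 0.
        p = choice_probability(b0 + b1 * price, 0.0)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total

def fit_logit(prices, chose_a, step=0.01, iterations=20000):
    """Maximize the log-likelihood by gradient ascent (a stand-in for Solver)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0
        for price, y in zip(prices, chose_a):
            p = choice_probability(b0 + b1 * price, 0.0)
            g0 += y - p              # derivative with respect to the intercept
            g1 += (y - p) * price    # derivative with respect to the price coefficient
        b0 += step * g0
        b1 += step * g1
    return b0, b1

# Made-up data: price of brand A and whether the shopper chose A (1) or B (0).
prices = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
chose_a = [1, 1, 1, 0, 1, 0, 0, 0]

b0, b1 = fit_logit(prices, chose_a)
print("intercept:", b0, "price coefficient:", b1)
print("log-likelihood at the fit:", log_likelihood(b0, b1, prices, chose_a))
```

Where least squares minimizes squared errors, this procedure maximizes the joint probability of the observed choices, which is the contrast the abstract draws between the two estimation methods.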

Excerpt

UVA-QA-0779

Nov. 7, 2011

MODELING DISCRETE CHOICE: CATEGORICAL DEPENDENT VARIABLES, LOGISTIC REGRESSION, AND MAXIMUM LIKELIHOOD ESTIMATION

Consider an individual choosing between two or more discrete alternatives: a shopper in a grocery store deciding between apple and orange juice, or a prospective student determining which of several university offers to accept. For the juice manufacturer and the university, the ability to predict the outcome of such choices is of vital importance.

In this note, we will discuss how this might be done. The process we will follow bears some similarity to a regular linear regression but also has substantial differences, primarily due to the fact that the choices are discrete; that is, they correspond to a categorical dependent variable in regression.
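One way to see why the discrete dependent variable matters is to run ordinary least squares on a 0/1 choice variable and look at the fitted values. The sketch below does this in plain Python with made-up prices and choices (not data from the note); the helper name `least_squares_line` is hypothetical. With these numbers the fitted line goes above 1 at the lowest price and below 0 at the highest, so its output cannot be read as a probability.

```python
def least_squares_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up data: price of brand A and a dummy variable for choosing A (1) or B (0).
prices = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
chose_a = [1, 1, 1, 0, 1, 0, 0, 0]

intercept, slope = least_squares_line(prices, chose_a)

def predict(price):
    """Fitted value of the linear model at a given price."""
    return intercept + slope * price

for price in (1.0, 2.75, 4.5):
    print(price, predict(price))
```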

The Concept of Utility

. . .

Keywords: Logit, logistic, multinomial, MNL, regression, discrete, consumer, choice, model, modeling, models, random, utility, maximum-likelihood estimation, MLE, conjoint, analysis, likelihood, log-likelihood, exponent, logarithm, marketing, analytics, price, e

Suggested Citation

Ovchinnikov, Anton, Modeling Discrete Choice: Categorical Dependent Variables, Logistic Regression, and Maximum Likelihood Estimation. Darden Case No. UVA-QA-0779, Available at SSRN: https://ssrn.com/abstract=2975146 or http://dx.doi.org/10.2139/ssrn.2975146

Anton Ovchinnikov (Contact Author)

Smith School of Business - Queen's University

143 Union Str. West
Kingston, ON K7L3N6
Canada

INSEAD - Decision Sciences

United States
