12: Logistic Regression


In many regression applications the response variable has only two outcomes: an event either did or did not occur (for example, having pain or having no pain). Such a variable is often called a binary or binomial variable because its behavior is related to the binomial distribution. A regression model with this type of response can be interpreted as a model that estimates the effect of the independent variable(s) on the probability of the event occurring.
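As a rough illustration of "estimating the effect of an independent variable on the probability of the event," the sketch below assumes the usual logistic form with a single predictor. The function name `event_probability` and the coefficient values are hypothetical and chosen only for illustration; they are not estimates from any data set.

```python
import math

def event_probability(x, intercept, slope):
    """Probability of the event under a one-predictor logistic model (assumed form)."""
    linear = intercept + slope * x          # linear predictor
    return 1.0 / (1.0 + math.exp(-linear))  # maps any real value into (0, 1)

# Hypothetical coefficients, for illustration only.
for x in (0, 1, 2, 3, 4):
    print(x, round(event_probability(x, intercept=-2.0, slope=1.0), 3))
```

With a positive slope the probability rises as the predictor increases, but it always stays strictly between 0 and 1, which is what makes this form suitable for a binary response.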

Binary response data typically appear in one of two ways:

  • When observations represent individual subjects, the response is represented by a dummy or indicator variable that can take any two values. The most commonly used values are 0 if the event does not occur and 1 if it does.
  • When observations summarize the occurrence of events for each unique combination of values of the independent variables, the response variable is x/n, where x is the number of occurrences and n is the number of observations in the set.
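The two layouts carry the same information, and a short sketch may make the relationship concrete. The records, the predictor levels ("low", "high"), and the helper `summarize` below are all hypothetical, made up for illustration; the code simply collapses individual 0/1 responses into x and n for each unique predictor value.

```python
from collections import defaultdict

# Hypothetical individual-level records: (predictor level, response),
# where the response is 1 if the event occurred and 0 otherwise.
individual = [
    ("low", 0), ("low", 1), ("low", 0),
    ("high", 1), ("high", 1), ("high", 0), ("high", 1),
]

def summarize(records):
    """Collapse 0/1 records into (x, n) for each unique predictor value."""
    counts = defaultdict(lambda: [0, 0])  # level -> [x, n]
    for level, y in records:
        counts[level][0] += y  # x: number of occurrences
        counts[level][1] += 1  # n: number of observations
    return {level: (x, n) for level, (x, n) in counts.items()}

grouped = summarize(individual)
for level, (x, n) in grouped.items():
    print(level, x, n, x / n)  # grouped response is the proportion x/n
```

Going from individual to grouped form loses nothing needed for the proportions, provided the grouping uses every independent variable in the model.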

Let's take a look at an example...