A common misunderstanding of linear and logistic regression is that the intercept encodes the unconditional mean of the outcome, or the training data prevalence.
This is easily seen not to be the case. Consider the following example in R.
We set up our example data.
And let’s fit a logistic regression.
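The original data frame is not preserved in this copy of the post, so the following setup and fit are a minimal sketch with assumed values, chosen only so that the y prevalence is 2/7 and every row with x1 = x2 = 0 has y = FALSE; the coefficients it produces will not match the output quoted below, which came from the post's original data.

```r
# Assumed example data (the original values are not shown here); the only
# properties matched to the text are mean(y) = 2/7 and y = FALSE whenever
# x1 = x2 = 0.
d <- data.frame(
  x1 = c(0, 0, 1, 1, 0, 0, 1),
  x2 = c(0, 0, 0, 0, 1, 1, 1),
  y  = c(FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE)
)
mean(d$y)  # 0.2857143, the training prevalence

# Fit the logistic regression and inspect the coefficients.
model <- glm(y ~ x1 + x2, data = d, family = binomial)
coef(model)
```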
## (Intercept)          x1          x2 
##  -1.2055937  -0.3129307   1.3620590
The probability encoded in the intercept term is given as follows.
##         1 
## 0.2304816
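Concretely, that value is the inverse logit (sigmoid) of the fitted intercept reported above, which is what the model predicts for a row with x1 = 0, x2 = 0:

```r
# The prediction at x1 = 0, x2 = 0 uses only the intercept term.
# plogis() is R's inverse logit: 1 / (1 + exp(-x)).
plogis(-1.2055937)
## [1] 0.2304816
```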
Notice the prediction 0.2304816 is neither the training outcome (y) prevalence (0.2857143) nor the observed y-rate for rows that have x1 = x2 = 0 (which is 0).
The non-intercept coefficients do have an interpretation: each is the expected change in log-odds implied by a unit change in the corresponding variable (assuming all other variables are held constant, which may not be a property of the data!).
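For example, using the x2 coefficient reported above: a unit increase in x2, with x1 held fixed, adds that coefficient to the predicted log-odds, which multiplies the odds of y by its exponential.

```r
# x2 coefficient from the fit reported above.
b_x2 <- 1.3620590
# A unit increase in x2 (x1 held constant) adds b_x2 to the log-odds,
# so it multiplies the odds of y by exp(b_x2).
exp(b_x2)  # about 3.9
```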
Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.