## Solving for Hidden Data

Introduction: Let’s continue along the lines discussed in Omitted Variable Effects in Logistic Regression. The issue is as follows. For logistic regression, omitted variables cause parameter estimation bias. This is true even when the omitted variables are independent of the included ones, which is not the case for the more familiar linear regression. This is a known problem […]

## Omitted Variable Effects in Logistic Regression

Introduction: I would like to illustrate a way in which omitted variables interfere with logistic regression inference (or coefficient estimation). These effects are different from what is seen in linear regression, and possibly different from some expectations or intuitions. Our Example Data: Let’s start with a data example in R. # […]

## An Example of a Calibrated Model that is not Fully Calibrated

Our group has written a lot on calibration of models and even conditional calibration of models. In our last note we mentioned the possibility of “fully calibrated models.” This note is an example of a probability model that is calibrated in the traditional sense, but not fully calibrated in a […]

## The Shift and Balance Fallacies

Two related fallacies I see in machine learning practice are the shift and balance fallacies (for an earlier simple fallacy, please see here). They involve thinking logistic regression has a somewhat simpler structure than it actually does, and also thinking logistic regression is a bit less powerful than it actually […]

## Tailored Models are Not The Same as Simple Corrections

Let’s take a stab at our first note on a topic that pre-establishing the definitions of probability model homotopy makes much easier to write. In this note we will discuss tailored probability models. These are models deliberately fit to training data that has an outcome prevalence equal to the expected […]

## The Intercept Fallacy

A common misunderstanding of linear regression and logistic regression is that the intercept is thought to encode the unconditional mean or the training data prevalence. This is easily seen to not be the case. Consider the following example in R. `library(wrapr)` We set up our example data. # build our […]
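As a rough illustration of the linear regression half of this claim, here is a minimal Python sketch. It is not the post's R example: the toy data, `fit_line`, and `mean` are hypothetical names introduced only for this illustration, and the logistic case is not covered.

```python
# Hypothetical toy data (not the post's example): y = 2*x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def fit_line(xs, ys):
    """Ordinary least squares with one explanatory variable.

    Returns (intercept, slope).
    """
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


intercept, slope = fit_line(xs, ys)
outcome_mean = mean(ys)

# The fitted intercept and the unconditional mean of y are different numbers.
print("intercept:", intercept)
print("mean of y:", outcome_mean)
```

On this toy data the fitted intercept is 1 while the mean of the outcome is 6. In the one-variable least squares fit, the two coincide only when the explanatory variable has mean zero.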

## An Example Where Square Loss of a Sigmoid Prediction is not Convex in the Parameters

I’ve added a worked R example of the non-convexity, with respect to model parameters, of square loss of a sigmoid-derived prediction here. This is finishing an example for our Python note “Why not Square Error for Classification?”. Reading that note will give a usable context and background for this diagram. […]
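The non-convexity can be checked numerically with a small Python sketch. This is not the worked R example the post refers to: it assumes a single hypothetical training point (`x = 1`, `y = 1`) and a one-parameter model `sigmoid(w * x)`, and it tests the midpoint form of the convexity inequality.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def square_loss(w, x=1.0, y=1.0):
    """Square loss of a sigmoid prediction on one hypothetical data point."""
    return (sigmoid(w * x) - y) ** 2


# Two parameter values and their midpoint (chosen for illustration).
w_a, w_b = -10.0, 0.0
w_mid = 0.5 * (w_a + w_b)

# A convex function satisfies f(midpoint) <= average of f at the endpoints.
chord = 0.5 * (square_loss(w_a) + square_loss(w_b))
mid = square_loss(w_mid)
violates_convexity = mid > chord

print("loss at midpoint:", mid)
print("average of endpoint losses:", chord)
print("convexity violated:", violates_convexity)
```

The loss at the midpoint lies above the chord between the two endpoints, which a convex function cannot do. The flat region for very negative `w` is what produces the violation in this setup.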

## Why not Square Error for Classification?

Win Vector LLC has been developing and delivering a lot of “statistics, machine learning, and data science for engineers” intensives in the past few years. These are bootcamps, or workshops, designed to help software engineers become more comfortable with machine learning and artificial intelligence tools. The current thinking is: not […]

## Don’t Use Classification Rules for Classification Problems

There’s a common, yet easy to fix, mistake that I often see in machine learning and data science projects and teaching: using classification rules for classification problems. This statement is a bit of word-play which I will need to unroll a bit. However, the concrete advice is that you often […]

## Linear and Logistic Regression in Practical Data Science with R 2nd Edition

One of the chapters that we are especially proud of in Practical Data Science with R is Chapter 7, “Linear and Logistic Regression.” We worked really hard to explain the fundamental principles behind both methods in a clear and easy-to-understand form, and to document diagnostics returned by the R implementations […]