I’d like to address how and why I am making the recent light-board video lectures (please check them out: A/B testing and Simpson’s Paradox, and Bayes’s Law and Odds). How is the easy part. There are a number of tutorials on how to do this. The one I found […]

Estimated reading time: 5 minutes

We’ve been writing on the distribution density shapes expected for probability models in ROC (receiver operating characteristic) plots, double density plots, and normal/logit-normal density frameworks. I thought I would re-approach the issue with a specific family of examples.

Estimated reading time: 12 minutes

Our group has written a lot on calibration of models and even conditional calibration of models. In our last note we mentioned the possibility of “fully calibrated models.” This note is an example of a probability model that is calibrated in the traditional sense, but not fully calibrated in a […]

Estimated reading time: 4 minutes

The double density plot contains a lot of useful information. This is a plot that shows the distribution of a continuous model score, conditioned on the binary categorical outcome to be predicted. As with most density plots: the y-axis is an abstract quantity called density picked such that the area […]

Estimated reading time: 2 minutes

For classification problems I argue one of the biggest steps you can take to improve the quality and utility of your models is to prefer models that return scores or return probabilities instead of classification rules. Doing this also opens a second large opportunity for improvement: working with your domain […]

Estimated reading time: 19 minutes

Two related fallacies I see in machine learning practice are the shift and balance fallacies (for an earlier simple fallacy, please see here). They involve thinking logistic regression has a bit simpler structure than it actually does, and also thinking logistic regression is a bit less powerful than it actually […]

Estimated reading time: 7 minutes

This note is a little break from our model homotopy series. I have a neat example where one combines two classifiers to get a better classifier using a method I am calling “ROC surgery.” In ROC surgery we look at multiple ROC plots and decide we want to cut out […]

Estimated reading time: 40 seconds