
The Shift and Balance Fallacies

Two related fallacies I see in machine learning practice are the shift and balance fallacies (for an earlier simple fallacy, please see here). They involve thinking logistic regression has a bit simpler structure than it actually does, and also thinking logistic regression is a bit less powerful than it actually […]

Surgery on ROC Plots

This note is a little break from our model homotopy series. I have a neat example where one combines two classifiers to get a better classifier using a method I am calling “ROC surgery.” In ROC surgery we look at multiple ROC plots and decide we want to cut out […]
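The excerpt above cuts off before the method itself, but the natural starting point for any such comparison is overlaying the ROC curves of the candidate classifiers. Here is a minimal sketch of that step (assuming scikit-learn and synthetic data; the classifiers and data are stand-ins, not the post's example):

```python
# Minimal sketch: overlay the ROC curves of two classifiers on one plot.
# The data and classifiers are synthetic stand-ins, not the post's example.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, model in [("logistic", LogisticRegression(max_iter=1000)),
                    ("forest", RandomForestClassifier(random_state=0))]:
    score = model.fit(X_train, y_train).predict_proba(X_test)[:, 1]
    fpr, tpr, _ = roc_curve(y_test, score)
    plt.plot(fpr, tpr, label=f"{name} (AUC={auc(fpr, tpr):.2f})")

plt.plot([0, 1], [0, 1], "k--", linewidth=0.5)  # chance diagonal
plt.xlabel("false positive rate")
plt.ylabel("true positive rate")
plt.legend()
plt.show()
```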

Estimating Uncertainty of Utility Curves

Recently, we showed how to use utility estimates to pick good classifier thresholds. In that article, we used model performance on an evaluation set, combined with estimates of rewards and penalties for correct and incorrect classifications, to find a threshold that optimized model utility. In this article, we will show […]
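As a rough illustration of the idea (not the article's code), here is a minimal sketch that scores an evaluation set, sweeps candidate thresholds, and keeps the one with the highest total utility; the reward and penalty values are hypothetical placeholders:

```python
# Minimal sketch: pick a classifier threshold by maximizing estimated utility
# on an evaluation set. Reward/penalty values are hypothetical placeholders.
import numpy as np

def utility_curve(y_true, scores, thresholds,
                  reward_tp=10.0, reward_tn=0.0,
                  penalty_fp=-5.0, penalty_fn=-15.0):
    """Total utility on the evaluation set at each candidate threshold."""
    utilities = []
    for t in thresholds:
        pred = scores >= t
        tp = np.sum(pred & (y_true == 1))
        tn = np.sum(~pred & (y_true == 0))
        fp = np.sum(pred & (y_true == 0))
        fn = np.sum(~pred & (y_true == 1))
        utilities.append(tp * reward_tp + tn * reward_tn
                         + fp * penalty_fp + fn * penalty_fn)
    return np.array(utilities)

# toy evaluation data standing in for real model scores
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
scores = np.clip(y_true * 0.3 + rng.normal(0.35, 0.25, size=1000), 0, 1)

thresholds = np.linspace(0, 1, 101)
utilities = utility_curve(y_true, scores, thresholds)
best = thresholds[np.argmax(utilities)]
print(f"best threshold ~ {best:.2f}, utility = {utilities.max():.1f}")
```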

Unrolling the ROC

In our data science teaching, we present the ROC plot (and the area under the curve of the plot, or AUC) as a useful tool for evaluating score-based classifier models, as well as for comparing multiple such models. The ROC is informative and useful, but it’s also perhaps overly concise […]
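For readers who want the mechanics, here is a minimal sketch of how the ROC points and the AUC fall out of a score-based classifier by sweeping a threshold down the sorted scores (synthetic data; not the article's example):

```python
# Minimal sketch: compute ROC points and AUC "by hand" by sweeping a threshold
# from the highest score to the lowest. Data is synthetic.
import numpy as np

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=500)
scores = y + rng.normal(0, 0.8, size=500)   # positives tend to score higher, with noise

order = np.argsort(-scores)                 # rank examples from highest to lowest score
y_sorted = y[order]

# as the threshold drops past each example, it is flagged positive:
# true positive rate and false positive rate accumulate accordingly
tpr = np.concatenate(([0.0], np.cumsum(y_sorted) / y_sorted.sum()))
fpr = np.concatenate(([0.0], np.cumsum(1 - y_sorted) / (1 - y_sorted).sum()))

# area under the curve via the trapezoid rule
auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)
print(f"AUC ~ {auc:.3f}")
```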

Soft margin is not as good as hard margin

This note is a link to an excerpt from my upcoming monster support vector machine article (where I work through a number of sections of [Vapnik, 1998]: Vapnik, V. N. (1998), Statistical Learning Theory, Wiley). I try to run down how the original theoretical support vector machine claims are precisely […]
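For the terminology only (this is not the article's theoretical argument): a soft-margin SVM tolerates margin violations at a cost set by the parameter C, while a very large C approximates a hard margin. A minimal sketch of the two regimes using scikit-learn's SVC:

```python
# Minimal sketch: contrast a soft-margin SVM (small C, tolerates margin violations)
# with an approximately hard-margin SVM (very large C) on easy blob data.
# Illustrates the terminology only, not the article's theoretical argument.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=200, centers=2, cluster_std=1.0, random_state=0)

for C in (0.01, 1e6):   # small C = soft margin; huge C approximates a hard margin
    clf = SVC(kernel="linear", C=C).fit(X, y)
    margin = 2.0 / np.linalg.norm(clf.coef_)   # geometric margin width
    print(f"C={C:g}: {clf.n_support_.sum()} support vectors, margin width {margin:.3f}")
```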

The Geometry of Classifiers

As John mentioned in his last post, we have been quite interested in the recent study by Fernandez-Delgado et al., “Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?” (the “DWN study” for short), which evaluated 179 popular implementations of common classification algorithms over 120 or so data […]