
Why not Square Error for Classification?

Win Vector LLC has been developing and delivering a lot of “statistics, machine learning, and data science for engineers” intensives in the past few years.

These are bootcamps, or workshops, designed to help software engineers become more comfortable with machine learning and artificial intelligence tools. The current thinking is: not every engineer is going to become a machine learning scientist, but most engineers are going to be working on projects with machine learning scientists. There are a great number of quality *passive* courses already out there, and we have found that engineers still want the chance to work with an expert consultant on an example project and to ask critical questions about what they are learning.

Our workshops have been transformative for engineers, and have taught *us* a lot about what the basic critical questions in our field are.

The teacher moves from having an *opinion* about which concepts and alternatives are needed to master the material rapidly, to having seen what works and which open questions cause discomfort.

For example, we teach, as we also do in our book Practical Data Science with R, that statistical deviance (or equivalently cross-entropy style methods) is an excellent tool for evaluating probability models. In *late* evaluation we find it more useful than AUC (which is very useful in *early* model assessment). Deviance, like AUC, doesn’t require the needless and wasteful conversion of a probability model to a mere decision rule, as evaluation metrics such as precision require.
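To make the idea concrete, here is a minimal sketch of a deviance calculation for a binary outcome. The function name `deviance`, the clipping constant `eps`, and the toy data are our own illustrative assumptions, not code from the book or the workshops. Deviance is taken here as minus two times the log-likelihood of the observed outcomes under the predicted probabilities.

```python
import math


def deviance(y_true, p_pred, eps=1e-12):
    """Binomial deviance: -2 * sum of log-likelihoods of the observed 0/1 outcomes.

    eps is an illustrative guard that keeps log() away from exactly 0 or 1.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -2.0 * total


# Toy data (made up for illustration): two models scoring the same five examples.
y = [1, 0, 1, 1, 0]
p_sharp = [0.9, 0.1, 0.8, 0.7, 0.2]  # confident and correct
p_vague = [0.6, 0.4, 0.6, 0.6, 0.4]  # correct side of 0.5, but barely

print("sharp model deviance:", deviance(y, p_sharp))
print("vague model deviance:", deviance(y, p_vague))
```

Both toy models rank every positive above every negative and make identical decisions at a 0.5 threshold, so a ranking score or a thresholded metric cannot tell them apart on this data. Deviance, which works directly on the probabilities, scores the sharper model as better (lower).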

When teaching something seemingly exotic like deviance, we often get asked the following reasonable question:

Why can’t you just use square error?

Frankly, that is a brilliant question, reflecting experience from ordinary regression. It deserves a crisp, prepared answer.

And we think we have that discussion in our Python JupyterLab notebook: Why not Square Error for Classification? Please check it out!
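As one small, hedged illustration of how the two losses can differ (a sketch of a commonly noted property, and not a substitute for the notebook's fuller discussion), the snippet below compares the per-example penalty each loss gives to a positive example as the predicted probability shrinks toward zero. The helper names `square_error` and `deviance_term` are hypothetical, chosen only for this example.

```python
import math


def square_error(y, p):
    """Per-example square error between a 0/1 outcome and a predicted probability."""
    return (y - p) ** 2


def deviance_term(y, p):
    """Per-example deviance contribution: -2 * log-likelihood of the outcome."""
    return -2.0 * (y * math.log(p) + (1 - y) * math.log(1.0 - p))


# A true positive (y = 1) scored with increasingly confident wrong probabilities.
for p in [0.5, 0.1, 0.01, 0.001]:
    print(f"p={p}: square error={square_error(1, p):.4f}, "
          f"deviance term={deviance_term(1, p):.4f}")
```

On a 0/1 outcome the square error of a single prediction can never exceed 1, so it flattens out as the prediction becomes more confidently wrong. The deviance term keeps growing without bound, so it punishes confident mistakes far more heavily.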

Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.
