
Why not Square Error for Classification?

Win Vector LLC has been developing and delivering a lot of “statistics, machine learning, and data science for engineers” intensives in the past few years.

These are bootcamps, or workshops, designed to help software engineers become more comfortable with machine learning and artificial intelligence tools. The current thinking is: not every engineer is going to become a machine learning scientist, but most engineers are going to be working on projects with machine learning scientists. There are a great number of quality *passive* courses already out there, and we have found that the engineers are left desiring the ability to work with an expert consultant on an example project, and to be able to ask critical questions about what they are learning.

Our workshops have been transformative for engineers, and have taught *us* a lot about what the basic critical questions in our field are.

Teaching moves the instructor from merely having an *opinion* about which concepts and alternatives are needed to master the material quickly, to having seen what works and which open questions cause discomfort.

For example, we teach (as we also do in our book Practical Data Science with R) that statistical deviance (or, equivalently, cross-entropy-style methods) is an excellent tool for evaluating probability models. In *late* evaluation we find it more useful than AUC (which is very useful in *early* model assessment). Deviance, like AUC, doesn’t require the needless and wasteful conversion of a probability model to a mere decision rule, as evaluation metrics such as precision do.
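To make the point about decision rules concrete, here is a minimal sketch in plain Python. The data, the two hypothetical models, and the helper names `deviance` and `precision_at` are made up for illustration and do not come from the book or the workshops. Both models make identical decisions at a 0.5 threshold, so precision cannot tell them apart, while deviance still sees that one model's probabilities are much sharper.

```python
import math

def deviance(y_true, p_pred, eps=1e-12):
    """Binomial deviance: -2 times the summed log-likelihood of the observed 0/1 outcomes."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so log(0) can never occur
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return -2.0 * total

def precision_at(y_true, p_pred, threshold=0.5):
    """Precision after converting probabilities to decisions at the given threshold."""
    called_positive = [y for y, p in zip(y_true, p_pred) if p >= threshold]
    if not called_positive:
        return float("nan")
    return sum(called_positive) / len(called_positive)

# Illustrative (made-up) outcomes and two hypothetical probability models.
y = [1, 1, 0, 0, 1, 0]
model_a = [0.90, 0.80, 0.20, 0.10, 0.70, 0.60]  # confident and mostly right
model_b = [0.60, 0.55, 0.45, 0.40, 0.51, 0.52]  # same decisions at 0.5, but hedged

for name, p in [("model_a", model_a), ("model_b", model_b)]:
    print(name,
          "precision@0.5 =", round(precision_at(y, p), 3),
          "deviance =", round(deviance(y, p), 3))
```

The two precision values agree, while model_a has the lower (better) deviance. Dividing the deviance by twice the number of rows gives the mean cross entropy in nats, which is the sense in which the two are equivalent here.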

When teaching something seemingly exotic like deviance we often get asked the following reasonable question:

Why can’t you just use square error?

Frankly, that is a brilliant question, reflecting experience from ordinary regression. It deserves a prepared, crisp answer.

We think we have that answer in our Python JupyterLab notebook, Why not Square Error for Classification? Please check it out!
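As a small taste of how the two criteria differ, and not as a summary of the notebook's argument, the sketch below compares the per-row penalty each one assigns when the true outcome is positive and the model's predicted probability shrinks toward zero. The function names `square_error_term` and `deviance_term` are hypothetical helpers chosen for this illustration.

```python
import math

def square_error_term(y, p):
    """Per-row square error between the 0/1 outcome and the predicted probability."""
    return (y - p) ** 2

def deviance_term(y, p):
    """Per-row deviance: -2 * log of the probability assigned to the observed outcome."""
    return -2.0 * math.log(p if y == 1 else 1.0 - p)

# The true outcome is 1; the prediction becomes more and more confidently wrong.
for p in [0.5, 0.1, 0.01, 0.001]:
    print("p =", p,
          "square error =", round(square_error_term(1, p), 4),
          "deviance =", round(deviance_term(1, p), 4))
```

The square error penalty can never exceed 1 per row, while the deviance penalty grows without bound as the model becomes confidently wrong. This is only one commonly noted difference between the two criteria.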



jmount

Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.
