Menu Home

Variable Utility is not Intrinsic

There is much ado about variable selection or variable utility valuation in supervised machine learning. In this note we will try to disarm some possibly common fallacies, and to set reasonable expectations about how variable valuation can work. Introduction In general variable valuation is estimating the utility that a column […]

Don’t Feel Guilty About Selecting Variables

We have an exciting new article to share: Don’t Feel Guilty About Selecting Variables. If you are at all interested in the probabilistic justification of important data science techniques, such as variable selection or pruning, this should be an informative and fun read. “Data Science” is often criticized with the […]

A Theory of Nested Cross Simulation

[Reader’s Note. Some of our articles are applied and some of our articles are more theoretical. The following article is more theoretical, and requires fairly formal notation to even work through. However, it should be of interest as it touches on some of the fine points of cross-validation that are […]