
A bit more on testing

If you liked Nina Zumel’s article on the limitations of Random Test/Train splits you might want to check out her recent article on predictive analytics product evaluation hosted by our friends at Fliptop.

The related concepts from the two articles are:

  • limitations of Random Test/Train splits: a randomized split of data into test and training sets is generally a good idea. However, in the presence of omitted variables, time-dependent effects, serial correlation, concept changes, or data grouping, it can fail to estimate your classifier’s performance correctly. The point is: randomly splitting data from a retrospective study is nowhere near as powerful as prior randomized test design (though some seem to intentionally conflate the two situations for their own benefit).
  • predictive analytics product evaluation: If your end goal is to predict well only in a back-testing environment, then you could in fact use simple black-box testing as your only evaluation step. If your actual goal is to work well on unknown future data, then you may need to take some additional steps to try to correctly estimate how a product would perform in such a new situation.
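
To make the data-grouping failure concrete, here is a minimal Python sketch (not taken from either article). Everything in it is a made-up illustration: the synthetic rows, the "memorizer" model, and the helper names `make_rows`, `random_split`, and `group_split` are all assumptions. Each group (think: customer) contributes several rows that share one label, and the label is pure noise at the group level. A row-level random split lets the model see each test group during training, so it looks nearly perfect; holding out whole groups shows it has learned nothing that transfers.

```python
import random

def make_rows(n_groups=200, rows_per_group=5, seed=0):
    """Synthetic data: every row in a group shares one coin-flip label."""
    rng = random.Random(seed)
    rows = []
    for g in range(n_groups):
        label = rng.random() < 0.5  # group-level noise, nothing to generalize from
        rows.extend((g, label) for _ in range(rows_per_group))
    return rows

def memorizer_accuracy(train, test):
    """'Model' that memorizes each training group's label; guesses False otherwise."""
    seen = {g: label for g, label in train}
    correct = sum(seen.get(g, False) == label for g, label in test)
    return correct / len(test)

def random_split(rows, test_frac=0.3, seed=1):
    """Row-level random split: rows from one group can land on both sides."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_frac))
    return shuffled[:cut], shuffled[cut:]

def group_split(rows, test_frac=0.3, seed=1):
    """Group-level split: every group is entirely in train or entirely in test."""
    rng = random.Random(seed)
    groups = sorted({g for g, _ in rows})
    rng.shuffle(groups)
    test_groups = set(groups[:int(len(groups) * test_frac)])
    train = [r for r in rows if r[0] not in test_groups]
    test = [r for r in rows if r[0] in test_groups]
    return train, test

rows = make_rows()
acc_random = memorizer_accuracy(*random_split(rows))
acc_group = memorizer_accuracy(*group_split(rows))
print(f"row-level random split accuracy: {acc_random:.3f}")
print(f"group-level split accuracy:      {acc_group:.3f}")
```

The random-split score here measures how well the model recognizes groups it has already seen, which is not the same question as how it will do on new groups.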

The reason these issues don’t usually get commented on is that we usually exhaust our allotted time trying to get beginning analysts to even implement randomized retrospective testing (a great good, but not a complete panacea). Moving on to proper prior experimental design, or structured simulations of good prior experimental design, often seems like a bridge too far.
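
One simple form a structured simulation can take is an out-of-time holdout: train only on the earlier records and score on the later ones, imitating how the model would actually be deployed. The Python sketch below is a hypothetical illustration, not a method from either article; the synthetic stream, the deliberately abrupt concept change at `change_at`, and the helper names are all assumptions chosen to make the effect easy to see.

```python
import random

def make_stream(n=1000, change_at=700, seed=0):
    """Synthetic time-ordered rows (t, x, y); the x-to-y rule flips at change_at."""
    rng = random.Random(seed)
    rows = []
    for t in range(n):
        x = rng.random()
        y = (x > 0.5) if t < change_at else (x <= 0.5)  # concept change
        rows.append((t, x, y))
    return rows

def fit_rule(train):
    """Pick whichever of 'predict x > 0.5' or its negation fits training better."""
    agree = sum((x > 0.5) == y for _, x, y in train)
    return agree >= len(train) - agree  # True means: predict x > 0.5

def accuracy(rule_positive, test):
    """Fraction of test rows where the chosen rule matches the label."""
    hits = sum(((x > 0.5) == rule_positive) == y for _, x, y in test)
    return hits / len(test)

def random_split(rows, test_frac=0.3, seed=1):
    """Ignores time: past and future rows are mixed on both sides."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_frac))
    return shuffled[:cut], shuffled[cut:]

def time_split(rows, test_frac=0.3):
    """Train on the earliest rows, test on the latest ones."""
    ordered = sorted(rows, key=lambda r: r[0])
    cut = int(len(ordered) * (1 - test_frac))
    return ordered[:cut], ordered[cut:]

stream = make_stream()
train_r, test_r = random_split(stream)
train_t, test_t = time_split(stream)
acc_random_split = accuracy(fit_rule(train_r), test_r)
acc_time_split = accuracy(fit_rule(train_t), test_t)
print(f"random split accuracy:      {acc_random_split:.3f}")
print(f"out-of-time split accuracy: {acc_time_split:.3f}")
```

In this toy setup the random split reports a respectable score, while the out-of-time split reveals that the rule learned from the past fails on the future rows. Real drift is rarely this clean, but the direction of the bias is the point.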

Categories: Administrativia


Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.
