The core of our “statistics to English translation” series is Nina Zumel’s sequence of articles: “I don’t think that means what you think it means;” Statistics to English Translation, Part 1: Accuracy Measures; Statistics to English Translation, Part 2a: ‘Significant’ Doesn’t Always Mean ‘Important’; Statistics to English Translation, Part 2b: […]
0.83 (or more precisely 5/6) is a special Area Under the Curve (AUC), which we will show in this note.
I would like to re-share links to our free vtreat data preparation system introduction videos, which show you what sort of machine learning problems vtreat can help you with. Python vtreat introduction video (PyData LA 2019), slides here. R vtreat introduction video (Why R? Foundation). The idea is: instead of […]
Win Vector LLC’s Dr. Nina Zumel has had great success applying y-aware methods to machine learning problems, and working out the detailed cross-validation methods needed to make y-aware procedures safe. I thought I would try my hand at y-aware neural net or deep learning methods here.
We have a new Win Vector data science article to share: “Cross-Methods are a Leak/Variance Trade-Off,” by John Mount (Win Vector LLC) and Nina Zumel (Win Vector LLC), March 10, 2020. We work some exciting examples of when cross-methods (cross validation, and also cross-frames) work, and when they do not work. Abstract […]
We had such a positive reception to our last Introduction to Data Science promotion that we are going to try to make the course available to more people by lowering the base price to $29.99. We are also creating a one-month promotional price of $20.99. To get a permanent subscription […]
To celebrate the new year and the recent release of Practical Data Science with R 2nd Edition, we are offering a free coupon for our video course “Introduction to Data Science.” The following URL and code should get you permanent free access to the video course, if used between now […]
Video of our PyData Los Angeles 2019 talk “Preparing Messy Real World Data for Supervised Machine Learning” is now available. In this talk we describe how to use vtreat, a package available in R and in Python, to correctly re-code real world data for supervised machine learning tasks. Please check it […]
Slides for PyData LA 2019 vtreat Talk are here!
Practical Data Science with R, 2nd Edition author Dr. Nina Zumel, with a fresh author’s copy of her book!