Nina and I are proud to share our lecture, “Prepping Data for Analysis using R” (Nina Zumel and John Mount), from ODSC West 2015. It is about 90 minutes long and covers much of the theory behind the vtreat data preparation library. We also have a GitHub […]
Nina Zumel and I are proud to announce that our R variable treatment library, vtreat, has just been accepted by CRAN!
Image by Liz Sullivan, Creative Commons. Source: Wikimedia.
An all-too-common approach to modeling in data science is to throw all possible variables at a modeling procedure and “let the algorithm sort it out.” This is tempting when you are not sure what are the true causes or predictors […]
In our previous post in this series, we introduced sessionization, or converting log data into a form that’s suitable for analysis. We looked at basic considerations, like dealing with time, choosing an appropriate dataset for training models, and choosing appropriate (and achievable) business goals. In that previous example, we sessionized […]