Introduction: I would like to talk about the nature of supervised machine learning and overfitting. One of the cornerstones of our data science intensives is giving participants the experience of being a data scientist in a safe, controlled environment. We hope that by working examples they can quickly get to the […]
The double density plot contains a lot of useful information. This is a plot that shows the distribution of a continuous model score, conditioned on the binary categorical outcome to be predicted. As with most density plots, the y-axis is an abstract quantity called density, picked such that the area […]
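To make "score distribution conditioned on outcome" concrete, here is a minimal sketch in plain Python. It is not the WVPlots implementation: `histogram_density` is a hypothetical helper, a crude histogram stands in for a smoothed density curve, and the scores and outcomes are made up for illustration. It relies on the standard property that a density is scaled so the area under it is 1.

```python
def histogram_density(values, lo, hi, n_bins):
    """Crude density estimate: bin heights scaled so the total area is 1.

    Hypothetical helper for illustration; assumes every value lies in [lo, hi].
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp the index so that v == hi falls into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [c / (len(values) * width) for c in counts], width


# Made-up model scores and true outcomes, for illustration only.
scores = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90]
outcomes = [False, False, False, True, False, True, True, True]

# Condition on the outcome: one set of scores per class.
pos_scores = [s for s, y in zip(scores, outcomes) if y]
neg_scores = [s for s, y in zip(scores, outcomes) if not y]

# One density per class; plotting both on shared axes gives the "double" plot.
pos_density, width = histogram_density(pos_scores, 0.0, 1.0, 5)
neg_density, _ = histogram_density(neg_scores, 0.0, 1.0, 5)
```

Because each class is normalized separately, the two curves have the same area even when the classes are very unbalanced, so the plot compares the shapes of the two score distributions rather than their sizes.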
In our data science teaching, we present the ROC plot (and the area under the curve of the plot, or AUC) as a useful tool for evaluating score-based classifier models, as well as for comparing multiple such models. The ROC is informative and useful, but it’s also perhaps overly concise […]
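As a rough illustration of what the ROC curve and AUC summarize, the following plain-Python sketch sweeps a threshold over the scores and integrates the resulting curve with the trapezoid rule. The function names `roc_curve` and `auc` and the example data are assumptions made for this sketch; it is not the code behind the WVPlots ROC plot.

```python
def roc_curve(scores, labels):
    """Return (false positive rate, true positive rate) points.

    Illustrative sketch; assumes both classes are present in `labels`.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(1 for _, y in pairs if y)
    n_neg = len(pairs) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every example tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve, by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up scores and outcomes, for illustration only.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
example_labels = [1, 1, 0, 1, 0, 0]
example_auc = auc(roc_curve(example_scores, example_labels))
```

For this toy data, 8 of the 9 (positive, negative) pairs are ordered correctly by score, and `example_auc` works out to 8/9. That matches the usual reading of AUC as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one.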
I have put a new release of the WVPlots package up on CRAN. This release adds palette and/or color controls to most of the plotting functions in the package. WVPlots was originally a catch-all package of ggplot2 visualizations that we at Win-Vector tended to use repeatedly, and wanted to turn […]
Nina Zumel and I have been working on packaging our favorite graphing techniques in a more reusable way that emphasizes the analysis task at hand over the steps needed to produce a good visualization. We are excited to announce that WVPlots is now at version 1.0.0 on CRAN!
A while back Simon Jackson and Kara Woo shared some great ideas and graphs on grouped bar charts and density plots (link). Win-Vector LLC's Nina Zumel just added a graph of this type to the development version of WVPlots. Nina has, as usual, some great documentation here.
Many data scientists (and even statisticians) suffer under one of the following misapprehensions: They believe a technique doesn’t work in their current situation (when in fact it does), leading to useless precautions and missed opportunities. They believe a technique does work in their current situation (when in fact it […]
The Win-Vector public R packages now all have new pkgdown documentation sites! (And, a thank-you to Hadley Wickham for developing the pkgdown tool.) Please check them out (hint: vtreat is our favorite).
Recently Dirk Eddelbuettel pointed out that our R function debugging wrappers would be more convenient if they were available in a low-dependency micro package dedicated to little else. Dirk is a very smart person, and like most R users we are deeply in his debt; so we (Nina Zumel and […]