I recently came across the thoughtful article “On Moving from Statistics to Machine Learning, the Final Stage of Grief”. It makes some good points and is worth the read. However, it also reminded me of the unexamined claim “data science is statistics done wrong.” Frankly, this is not the case, […]
Nina Zumel has updated our training page to describe the Python data science intensive for software engineers that we have been conducting for a couple of years. This private group training is offered in addition to our usual R training for scientists and our consulting offerings. Please check it out.
I would like to share a video where we show how to use the vtreat data transformer in the KNIME data science platform.
Allison Horst, Alison Hill, and Kristen Gorman are working to make a neat new example data set available to R users: the Palmer penguins. It is a nice alternative to the overused Iris data set as it has more rows, some missing values, nicer examples of Simpson’s Paradox, and more […]
Chapter 8, “Advanced Data Preparation,” of Practical Data Science with R is a study in using the R vtreat package for advanced data preparation and in cross-validated data preparation. It is the professionally edited, ready-to-cite version of an important data preparation methodology. An advantage being: a number of well documented […]
One of the chapters that we are especially proud of in Practical Data Science with R is Chapter 7, “Linear and Logistic Regression.” We worked really hard to explain the fundamental principles behind both methods in a clear and easy-to-understand form, and to document diagnostics returned by the R implementations […]
We have an exciting new article to share: Don’t Feel Guilty About Selecting Variables. If you are at all interested in the probabilistic justification of important data science techniques, such as variable selection or pruning, this should be an informative and fun read. “Data Science” is often criticized with the […]
Thank you very much, Why R?, for being awesome hosts. We are really pleased with how your virtual MeetUp went. For those who missed it, here is a video link.
Here is a fun combinatorial puzzle. I’ve probably seen this used to teach before, but let’s try to define or work this one from memory. I would love to hear more solutions/analyses of this problem. Suppose you have n kettles of soup labeled 0 through n-1. For our problem we […]
We have a discount on Manning Books, including our own Practical Data Science with R 2nd Edition!