I’d like some feedback on a possible article or series. I am thinking about writing and/or recording videos on the measure-theoretic foundations of probability. The idea is: empirical probability (probabilities of coin flips, dice rolls, and finite sequences) is fairly well taught and approachable. However, theoretical probability (the type […]
I recently came across the thoughtful article “On Moving from Statistics to Machine Learning, the Final Stage of Grief”. It makes some good points, and is worth the read. However, it also reminded me of the unexamined claim “data science is statistics done wrong.” Frankly this is not the case, […]
Nina Zumel has updated our training page to describe the Python data science intensive for software engineers that we have been conducting for a couple of years. This is private group training, offered in addition to our usual R training for scientists and our consulting offerings. Please check it out.
I would like to share a video where we show how to use the vtreat data transformer in the KNIME data science platform.
Allison Horst, Alison Hill, and Kristen Gorman are working to make a neat new example data set available to R users: the Palmer penguins. It is a nice alternative to the over-used Iris data set as it has more rows, some missing values, nicer examples of Simpson’s Paradox, and more […]
Nina and I are cleaning up websites, links, and projects. I would like to take the opportunity to re-share my old genetic art project through a short demonstration video. Read more about the Genetic Art Project here.
I’d like to share a video of my old knot editing software in action. Read more about KnotEd here.
Chapter 8 “Advanced Data Preparation” of Practical Data Science with R is a study in: Using the R vtreat package for advanced data preparation. Cross-validated data preparation. It is the professionally edited, ready to cite version of an important data preparation methodology. An advantage being: a number of well documented […]
Just a heads-up, Nina and I are working on re-structuring and updating the website. In particular we are finally moving to https. Please don’t be alarmed if things are in flux, and some links break. We are managing all of https://www.winvector.com, http://www.win-vector.com/, and https://winvector.wordpress.com. The new no-dash URL is not […]
One of the chapters that we are especially proud of in Practical Data Science with R is Chapter 7, “Linear and Logistic Regression.” We worked really hard to explain the fundamental principles behind both methods in a clear and easy-to-understand form, and to document diagnostics returned by the R implementations […]