Chapter 8 “Advanced Data Preparation” of Practical Data Science with R is a study in:
- Using the R vtreat package for advanced data preparation.
- Cross-validated data preparation.
It is the professionally edited, ready-to-cite version of an important data preparation methodology. One advantage: a number of well-documented, result-improving transforms are added to your predictive analytics work in a single documented step.
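To give a feel for what "cross-validated data preparation" can mean, here is a minimal sketch in plain Python. It is not the vtreat implementation or API. The function names (`impact_code`, `cross_frame`), the toy data, and the simple round-robin fold assignment are all assumptions made for illustration. The idea shown is that a variable encoding that depends on the outcome is estimated on one set of rows and applied to different rows, so no row is encoded using its own outcome.

```python
from collections import defaultdict


def impact_code(fit_levels, fit_y, apply_levels):
    """Encode each level as (mean of y for that level) - (overall mean of y).

    The means are estimated on the fit rows only and then applied to
    apply_levels. Levels never seen in the fit rows are encoded as 0.0.
    """
    grand_mean = sum(fit_y) / len(fit_y)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for level, y in zip(fit_levels, fit_y):
        sums[level] += y
        counts[level] += 1
    return [
        sums[lv] / counts[lv] - grand_mean if lv in counts else 0.0
        for lv in apply_levels
    ]


def cross_frame(levels, y, n_folds=3):
    """Encode every row using estimates fit only on rows from other folds."""
    n = len(levels)
    encoded = [0.0] * n
    for fold in range(n_folds):
        # Hypothetical fold scheme: row i belongs to fold i % n_folds.
        held_out = [i for i in range(n) if i % n_folds == fold]
        fit_rows = [i for i in range(n) if i % n_folds != fold]
        codes = impact_code(
            [levels[i] for i in fit_rows],
            [y[i] for i in fit_rows],
            [levels[i] for i in held_out],
        )
        for i, code in zip(held_out, codes):
            encoded[i] = code
    return encoded


# Toy data (made up for this sketch): level "c" appears exactly once.
levels = ["a", "b", "a", "b", "a", "b", "c"]
y = [1, 0, 1, 0, 1, 0, 1]

naive = impact_code(levels, y, levels)  # fit and applied on the same rows
cross = cross_frame(levels, y)          # each row encoded out-of-fold
```

In the naive encoding, the single "c" row is scored using its own outcome, so the encoded value looks more informative than it really is. In the cross-validated version that row is encoded without reference to its own outcome, which is the kind of leakage this style of data preparation is meant to avoid.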
We also have a number of free data preparation resources (for both the R vtreat package and the Python vtreat package; notice we believe in cross-language data science tools and practice):
- Free R video lecture on advanced variable preparation.
- Free Python video lecture on advanced variable preparation.
- Task oriented (and cross-linked) examples in R and Python.
jmount
Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.