Chapter 8 “Advanced Data Preparation” of Practical Data Science with R is a study in:
- Using the R vtreat package for advanced data preparation.
- Cross-validated data preparation.
It is the professionally edited, ready-to-cite version of an important data preparation methodology. One advantage: a number of well-documented, result-improving transforms are added to your predictive analytics work in one documented step.
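To make "cross-validated data preparation" concrete, here is a minimal sketch of the general idea using only the Python standard library. It is not the vtreat implementation, and it does not use the vtreat API; the function names `impact_code` and `cross_frame`, the smoothing weight, and the toy data are all assumptions made for illustration. The point it shows: when a categorical variable is re-encoded using the outcome, each training row should be encoded from a fit that did not see that row.

```python
import random
from collections import defaultdict


def impact_code(levels, outcomes, prior_weight=1.0):
    """Map each level to its smoothed mean outcome minus the grand mean.

    Hypothetical helper for illustration; prior_weight is an assumed
    smoothing constant, not a vtreat parameter.
    """
    grand_mean = sum(outcomes) / len(outcomes)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lvl, y in zip(levels, outcomes):
        sums[lvl] += y
        counts[lvl] += 1
    return {
        lvl: (sums[lvl] + prior_weight * grand_mean) / (counts[lvl] + prior_weight)
        - grand_mean
        for lvl in counts
    }


def cross_frame(levels, outcomes, n_folds=3, seed=0):
    """Encode each row with codes estimated from the other folds only."""
    n = len(levels)
    order = list(range(n))
    random.Random(seed).shuffle(order)
    # Assign each row index to a fold based on its shuffled position.
    fold_of = {row: pos % n_folds for pos, row in enumerate(order)}
    encoded = [0.0] * n
    for fold in range(n_folds):
        train = [i for i in range(n) if fold_of[i] != fold]
        codes = impact_code(
            [levels[i] for i in train], [outcomes[i] for i in train]
        )
        for i in range(n):
            if fold_of[i] == fold:
                # A level never seen in the other folds gets no effect (0.0).
                encoded[i] = codes.get(levels[i], 0.0)
    return encoded


# Toy data (made up): level "c" appears exactly once, at row 6.
levels = ["a", "a", "b", "b", "a", "b", "c", "a"]
outcomes = [1, 1, 0, 0, 1, 0, 1, 0]

# Out-of-fold encoding: use this to train a downstream model.
encoded = cross_frame(levels, outcomes)

# Codes fit on all the data: use these only to encode future, unseen rows.
final_codes = impact_code(levels, outcomes)
```

In this toy data the single "c" row receives a positive code when the encoding is fit on all rows, because the encoding has memorized that row's own outcome. The out-of-fold encoding gives the same row no effect, which is the more honest input for training a downstream model.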
We also have a number of free data preparation resources (for both the R vtreat package and the Python vtreat package; notice we believe in cross-language data science tools and practice):
- Free R video lecture on advanced variable preparation.
- Free Python video lecture on advanced variable preparation.
- Task-oriented (and cross-linked) examples in R and Python.
jmount
Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.