Chapter 8 “Advanced Data Preparation” of Practical Data Science with R is a study in:
- Using the R vtreat package for advanced data preparation.
- Cross-validated data preparation.
It is the professionally edited, ready-to-cite version of an important data preparation methodology. One advantage: a number of well-documented, result-improving transforms are added to your predictive analytics work in one documented step.
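To give a feel for what "cross-validated data preparation" means, here is a minimal sketch in plain Python. It is not the vtreat API or its implementation; the function names `impact_code` and `cross_validated_impact_codes` and the index-based fold assignment are assumptions made only for illustration. The idea shown is that when a categorical variable is re-encoded using the outcome, each row's encoding is computed from the other folds, so a row never sees its own outcome.

```python
from collections import defaultdict


def impact_code(train_rows, global_mean):
    """Map each level to (mean outcome for that level) - (global mean).

    train_rows is an iterable of (level, outcome) pairs.
    Hypothetical helper for illustration; not part of vtreat.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for level, y in train_rows:
        sums[level] += y
        counts[level] += 1
    return {level: sums[level] / counts[level] - global_mean for level in sums}


def cross_validated_impact_codes(levels, outcomes, n_folds=3):
    """Encode each row using only the rows that fall in the other folds.

    Folds are assigned by row index modulo n_folds (an assumption made
    to keep the sketch deterministic; real tools usually randomize).
    """
    n = len(levels)
    encoded = [0.0] * n
    for fold in range(n_folds):
        # Rows outside the current fold are used to estimate the codes.
        train = [(levels[i], outcomes[i]) for i in range(n) if i % n_folds != fold]
        if not train:
            continue
        train_mean = sum(y for _, y in train) / len(train)
        codes = impact_code(train, train_mean)
        for i in range(n):
            if i % n_folds == fold:
                # A level unseen in the other folds gets 0.0 (no evidence).
                encoded[i] = codes.get(levels[i], 0.0)
    return encoded
```

With three rows that each have a distinct level and three folds, every row is encoded as 0.0: a level that appears only once carries no out-of-fold evidence, which is the kind of leak this style of preparation is meant to avoid.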
We also have a number of free data preparation resources (for both the R vtreat package and the Python vtreat package; notice we believe in cross-language data science tools and practice):
- Free R video lecture on advanced variable preparation.
- Free Python video lecture on advanced variable preparation.
- Task-oriented (and cross-linked) examples in R and Python.
jmount
Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.