Nina Zumel finished some great new documentation showing how to use
vtreat to prepare data for multinomial classification mode. And I have finally finished porting the documentation to
vtreat. So we now have good introductions on how to use
vtreat to prepare data for the common tasks of:
- Unsupervised data preparation:
- Multinomial classification:
R multinomial classification example,
Python multinomial classification example.
That is now 8 introductions to start with. To use
vtreat you only have to work through one introduction (the one helping with the task you have at hand in the language you are using).
As I have said before:
vtreat helps with project-blocking issues commonly seen in real-world data: missing values, re-coding categorical variables, and dealing with high-cardinality categorical variables.
- If you aren’t using a tool like
vtreat in your data science projects: you are really missing out (and making more work for yourself).
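To make those issues concrete, here is a minimal pure-Python sketch of the kind of repairs involved: filling missing numeric values while recording that they were missing, and re-coding a categorical column as indicator columns for its common levels only. This is not vtreat's API or implementation. The function names `prepare_numeric` and `prepare_categorical`, the `min_count` threshold, and the `x_lev_` column prefix are all hypothetical choices made for illustration.

```python
from statistics import mean


def prepare_numeric(values):
    """Fill missing (None) entries with the mean of the observed values
    and return a parallel 0/1 column marking which entries were missing."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if observed else 0.0
    cleaned = [fill if v is None else v for v in values]
    is_missing = [1 if v is None else 0 for v in values]
    return cleaned, is_missing


def prepare_categorical(values, min_count=2):
    """Build 0/1 indicator columns for levels seen at least min_count times.
    Rare levels and missing values get zeros in every indicator column."""
    counts = {}
    for v in values:
        if v is not None:
            counts[v] = counts.get(v, 0) + 1
    levels = sorted(lev for lev, c in counts.items() if c >= min_count)
    return {
        f"x_lev_{lev}": [1 if v == lev else 0 for v in values]
        for lev in levels
    }


# Toy data (made up for this sketch): one numeric and one categorical column.
x_num = [1.0, None, 3.0, 5.0]
x_cat = ["a", "b", None, "a"]

cleaned, is_missing = prepare_numeric(x_num)
indicators = prepare_categorical(x_cat)
print(cleaned, is_missing, indicators)
```

Even this toy version has to make decisions (what to fill with, which levels count as rare) that are easy to get wrong by hand, and it does nothing about the statistical care needed for high-cardinality variables. That is the work a dedicated tool takes off your plate.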
Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.