Nina Zumel and John Mount will be speaking at the online University of San Francisco Seminar Series in Data Science! Talk: "How and why to use probability models to outperform decision rules". Friday April 30, 2021, 12:30pm – 2pm Pacific Time. See here for full details and to RSVP. In this […]
Estimated reading time: 58 seconds
A good fraction of R users use Apple computers. Apple machines historically have sat at a sweet spot of convenience, power, and utility: Convenience: Apple machines are available at retail stores, come with purchasable support, and can run a lot of common commercial software. Power: R packages such as parallel […]
Estimated reading time: 7 minutes
For more and more clients we have been using a nice coding pattern taught to us by Garrett Grolemund in his book Hands-On Programming with R: make a function that returns a list of functions. This turns out to be a classic functional programming technique: use closures to implement objects […]
Estimated reading time: 19 minutes
Win-Vector LLC’s Nina Zumel has a great new article on the issue of taste in design and problem solving: Design, Problem Solving, and Good Taste. I think it is a big issue: how can you expect good work if you can’t even discuss how to tell good from bad?
Estimated reading time: 35 seconds
Any practicing data scientist is eventually going to have to work with data stored in a Microsoft Excel spreadsheet. A lot of analysts use this format, so if you work with others you are going to run into it. We have already written how we don’t recommend using Excel-like […]
Estimated reading time: 9 minutes
This was originally posted at ninazumel.com. I’m re-blogging it here. Photo: John Mount I came across a post from Emily Willingham the other day: “Is a PhD required for Good Science Writing?”. As a science writer with a science PhD, her answer is: it is not required, and it can […]
Estimated reading time: 14 minutes
There is no excuse for a digital creative person to not use some sort of version control or source control. In the past disk space was too dear, version control systems were too expensive and software was not powerful enough; this is no longer the case. Unless your work is […]
Estimated reading time: 10 minutes
I tend to prefer command line Linux and full window OSX for my work. The development and data handling tool chain is a bit better in Linux and the user interface reliability of the complete vertical stack is a bit better in OSX. I repeat here a couple of tips […]
Estimated reading time: 5 minutes
I think I have been pretty productive on technical tasks lately and the method is (at least to me) interesting. The effect was accidental but I think one can explain it and reproduce it by synthesizing three important observations on human behavior.
Estimated reading time: 2 minutes
How do you get access to current and historical research articles if you are not affiliated with a university or large research organization? Our second public service article discusses some useful online research archives.
Estimated reading time: 13 minutes