Must Have Software

Having worked with Unix (BSD, HPUX, IRIX, Linux and OSX) and Windows (NT4, 2000, XP, Vista and 7) for quite a while, I have seen a lot of different software tools. I would like to quickly present my “must have” list. These are the packages that I find to be the single “must have” offering in each of a number of categories. I have avoided some categories (such as editors, email programs, programming languages, IDEs, photo editors, backup solutions, databases, database tools and web tools) where I do not feel I have seen a single absolute best offering.

The spirit of the list is to pick items such that: if you disagree with an item in this list then either you are wrong or you know something I would really like to hear about.

Encryption, disk images: TrueCrypt (open source: Linux, Windows, OSX)
TrueCrypt can create portable encrypted virtual disks (files that can be mounted as a disk on any operating system).

On 28 May 2014, the TrueCrypt website announced that the project was no longer maintained and recommended that users find alternative solutions (Wikipedia).

Encryption, files: GnuPG (open source: Linux, Windows, OSX)
GnuPG is the tool to use to encrypt files for email.

Presentation: Apple Keynote (commercial: OSX)
Keynote is not quite as friendly as Microsoft PowerPoint, but it quickly produces beautiful presentations.

Reference Library: Papers (commercial: OSX)
“iTunes for PDF.” Manage thousands of PDFs and references, annotate with meta-data, place papers into multiple project folders.

An interesting alternative is BibDesk (open source: OSX, not an in-PDF searcher).
Since Papers was acquired by Springer, it has added tons of flaky “social sharing” buttons (on many of the user interface paths you can no longer rate a paper without sharing the rating), and increasing OSX indexing flakiness has greatly reduced the reliability of search on Apple OSX.

Spreadsheet: Microsoft Excel (commercial: Windows, OSX)
Open Office and Google Docs are getting better every day, but neither comes close to Microsoft Excel in functionality and versatility of user interface. If you are on a platform that supports Excel, work regularly with spreadsheets and use something other than Excel, it really means that you do not value your time.

Statistics Software: R (open source: Linux, Windows, OSX)
R is rapidly becoming the platform of choice for statisticians and is (with the addition of lattice and ggplot2) the best way to produce graphs. R has a fairly nasty programming language, but it has so many statistical operations available that it cannot be avoided.

Technical Documentation: LaTeX (open source: Linux, Windows, OSX)
It may seem antiquated, but TeX/LaTeX is still far more powerful than the “WYSIWYG” pretenders. The separation of presentation from specification, the automatic management of references and tables of contents, and the ability to include PDFs from external files (which get refreshed when you re-build the document) are all lifesavers.

Version Control: git (open source: Linux, Windows, OSX)
Just about the only version control system that: doesn’t damage the data you are trying to manage by adding dot-files into all of the directories, can routinely handle large files and can work productively without a network connection. Perforce is a powerful central-server commercial option (with the ability to have central policies, control and review).

I look forward to learning which of my choices are considered poor and what your must-haves are.

Categories: Opinion Tutorials

Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.

2 replies

  1. Didn’t know about _Papers_, downloading it now. Thanks!

    I would add:

    Statistics Software: *RPy2* – Open source Python to R bridge. Do away with R’s obtuse language and enjoy nearly seamless integration of R and Python.

    Technical Documentation: *Sphinx* – Open source documentation system. I never could grok LaTeX. Instead I reach for Sphinx when creating cross-referenced HTML and PDF output. But of course, equations are few and far between in my documents.

    Spreadsheet: *Resolver One* – Commercial. “The spreadsheet is the program.” An Excel-compatible spreadsheet programmed in IronPython instead of VBA. Simple to integrate with almost any .NET component. Supports live Bloomberg and Reuters data feeds.

  2. Thanks for the info about TrueCrypt.
    I agree on the R, Excel, and LaTeX recommendations.
    JabRef (open-source based on BibTeX) is quite good for bibliography management.
