# Neglected R Super Functions

R has a lot of under-appreciated, super-powerful functions. I list a few of my favorites below.

Atlas, carrying the sky. Royal Palace (Paleis op de Dam), Amsterdam.

Photo: Dominik Bartsch, CC some rights reserved.

• stats::approx(): approximate a curve/function.
• base::cumsum(): cumulative ordered sum.
• stats::ecdf(): estimate the cumulative distribution function.
• base::findInterval(): assign values to bins.
• base::match(): bulk computation of first match. Can look up and sort data, and even find non-duplicate data.
• base::Reduce(): nifty functional method to combine multiple function evaluations.
• base::tapply(): grouped summary function.
• base::unlist(): build arrays of atomic values from more complicated nested structures.
• base::Vectorize(): convert scalar functions into functions ready to operate on arrays.
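To make the list concrete, here is a minimal sketch with one call per function. All data values and variable names are made up for illustration; the expected results in the comments follow from these toy inputs only.

```r
# Toy data; every value and name below is hypothetical.
x <- c(1, 2, 4)
y <- c(10, 20, 40)

# stats::approx(): linear interpolation of y at new x positions
interp <- approx(x, y, xout = c(1.5, 3))$y               # 15 30

# base::cumsum(): running total in the order given
running <- cumsum(c(3, 1, 4))                            # 3 4 8

# stats::ecdf(): returns a function giving the fraction of data <= its argument
Fn <- ecdf(c(1, 2, 2, 5))
frac <- Fn(2)                                            # 0.75

# base::findInterval(): index of the bin each value falls into
bins <- findInterval(c(0.5, 1.5, 9), vec = c(0, 1, 2))   # 1 2 3

# base::match(): position of the first match of each value in a table
keys <- c("b", "c", "a")
idx <- match(c("a", "b", "z"), keys)                     # 3 1 NA

# base::Reduce(): fold a two-argument function over a sequence
total <- Reduce(`+`, 1:4)                                # 10
partial <- Reduce(`+`, 1:4, accumulate = TRUE)           # 1 3 6 10

# base::tapply(): summarize values by group
means <- tapply(c(1, 3, 10), c("a", "a", "b"), mean)     # a = 2, b = 10

# base::unlist(): flatten a nested list into an atomic vector
flat <- unlist(list(1, list(2, 3), 4))                   # 1 2 3 4

# base::Vectorize(): wrap a scalar-only function so it accepts vectors
scalar_sign <- function(v) if (v < 0) "neg" else "non-neg"
vec_sign <- Vectorize(scalar_sign)
signs <- vec_sign(c(-1, 0, 2))                           # "neg" "non-neg" "non-neg"
```

Calling scalar_sign() directly on a vector would fail in if(), which is the situation Vectorize() is meant to handle.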

Categories: Opinion Programming Statistics

### jmount

Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.

### 14 replies

1. Nathan says:

Not sure if they’re “neglected,” but I use setdiff and intersect a lot. They’re both just calls to match with a little extra logic, but they definitely improve readability.
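A rough sketch of that relationship, assuming made-up vectors a and b. This illustrates the idea of building the set operations on match(); it is not the actual base R source, which may differ between R versions.

```r
a <- c("x", "y", "z", "y")   # hypothetical inputs
b <- c("y", "w")

# match(a, b, nomatch = 0L) is 0 wherever an element of a is absent from b
in_b <- match(a, b, nomatch = 0L) > 0L

by_hand_intersect <- unique(a[in_b])    # "y"
by_hand_setdiff   <- unique(a[!in_b])   # "x" "z"

# The hand-rolled versions agree with the readable ones on this input
identical(by_hand_intersect, intersect(a, b))   # TRUE
identical(by_hand_setdiff, setdiff(a, b))       # TRUE
```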

1. Good point: setdiff(), intersect(), and unique() are good to keep in mind.

2. Jeffrey Magouirk says:

table() is a function I use a great deal

3. Anonymous says:

base::aggregate() works similarly to tapply() but returns a data frame
base::mapply() can substitute some nested for loops by taking multiple arguments
base::assign() is very useful when creating several variables from a for loop
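A small sketch of the three suggestions, using a made-up data frame df and hypothetical variable names. The mapply() call shows the simplest case, stepping through two vectors in parallel instead of writing an index loop.

```r
df <- data.frame(g = c("a", "a", "b"), v = c(1, 3, 10))   # toy data

# tapply() returns a named array; aggregate() returns a data frame
by_array <- tapply(df$v, df$g, mean)
by_frame <- aggregate(v ~ g, data = df, FUN = mean)

# mapply() applies a function to paired elements of several vectors
powers <- mapply(function(base, exp) base^exp, 2:4, 3:1)   # 8 9 4

# assign() creates variables whose names are built at run time
for (i in 1:3) {
  assign(paste0("var_", i), i * 10)
}
var_2   # 20
```

When the generated variables are only going to be looped over again, a named list is often easier to work with than assign().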

4. Bill Venables says:

Why “neglected”? I use all of these on a regular basis. The ones you may have missed are stats::approxfun() and stats::splinefun(). I find the functional versions of approx() and spline() much easier to get my head around.
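As a minimal illustration of the functional versions, assuming a few made-up sample points: approxfun() and splinefun() return functions that can be called later at any position, rather than interpolated values.

```r
x <- c(0, 1, 2, 3)   # hypothetical sample points
y <- x^2

f_lin <- approxfun(x, y)                       # piecewise-linear interpolant
f_spl <- splinefun(x, y, method = "natural")   # cubic spline interpolant

mid_lin  <- f_lin(1.5)   # 2.5, halfway between y = 1 and y = 4
out_lin  <- f_lin(5)     # NA: outside the data range under the default rule = 1
at_knots <- f_spl(x)     # reproduces y at the sample points
slope    <- f_spl(1.5, deriv = 1)   # splinefun() closures can also return derivatives

# The same linear value via the non-functional form
same <- approx(x, y, xout = 1.5)$y
```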

1. “Neglected” may be a stretch, but these functions are so great they definitely deserve an extra call-out.

5. My favorites not in your list are the hdquantile() function from the Hmisc package, sapply() from base, and probably Matrix() from the Matrix package, with its compressed matrix formats.

As general techniques, splines seem under-mentioned, whether Akima interpolation via the akima package and its aspline() function, or the pspline package and its smooth.Pspline() function.

1. Bill Venables says:

You mean I don’t need to use match(x, sort(x)) any more?
Next you’ll be telling me there’s a function for match(sort(x), x) called ‘order(x)’ or something…
Sheesh!
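The two idioms in the joke can be checked directly. A small sketch with made-up values: when there are no ties, the match() expressions agree with rank() and order(); with ties they diverge, because match() always reports the first matching position.

```r
x <- c(30, 10, 20)   # hypothetical values with no ties

via_match_rank  <- match(x, sort(x))   # position of each x[i] in the sorted vector
via_match_order <- match(sort(x), x)   # where each sorted value sits in x

all(via_match_rank == rank(x))         # TRUE
identical(via_match_order, order(x))   # TRUE

# With ties the idioms no longer agree
tied <- c(2, 1, 2)
match(tied, sort(tied))   # 2 1 2
rank(tied)                # 2.5 1.0 2.5 (ties averaged by default)
match(sort(tied), tied)   # 2 1 1
order(tied)               # 2 1 3
```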

1. Again, more in the spirit of: I remembered an odd sorting application that rank() is convenient for (in contrast to rank()’s obvious utility in ranking things).

6. Alan Haynes says:

For plotting, I find graphics::grconvertX() and graphics::grconvertY() very useful, particularly with boxplots. graphics::layout() is also very handy.

7. I like pretty(), especially in base R plots. Setting ylim = range(pretty(X)) makes plots (boxplots, barplots, scatterplots)… prettier :)
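A short sketch of that tip, assuming a made-up numeric vector X: pretty() picks round tick positions that cover the data, so their range makes tidy axis limits.

```r
X <- c(3.2, 7.9, 17.8)   # hypothetical data

ticks <- pretty(X)       # round, evenly spaced values covering range(X)
lims  <- range(ticks)    # axis limits that end on round numbers

# The limits always enclose the data
lims[1] <= min(X) && lims[2] >= max(X)   # TRUE

# Typical use in a base R plot (left commented out so the sketch opens no graphics device):
# boxplot(X, ylim = lims)
```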

8. Scott Locklin says:

It’s funny, all of these are essentially base primitives in APL languages. It’s kind of amazing Iverson thought of everything before he even had a REPL to work with.
