R tip: force the use of named arguments when designing function signatures.
R’s named function argument binding is a great aid in writing correct programs. It is a good idea, if practical, to force optional arguments to only be usable by name. To do this, declare the additional arguments after “...” and enforce that none got lost in the “... trap” by using a checker such as wrapr::stop_if_dot_args().
Example:
#' Increment x by inc.
#'
#' @param x item to add to
#' @param ... not used for values, forces later arguments to bind by name
#' @param inc (optional) value to add
#' @return x+inc
#'
#' @examples
#'
#' f(7) # returns 8
#'
f <- function(x, ..., inc = 1) {
  wrapr::stop_if_dot_args(substitute(list(...)), "f")
  x + inc
}

f(7)
#> [1] 8

f(7, inc = 2)
#> [1] 9

f(7, q = mtcars)
#> Error: f unexpected arguments: q = mtcars

f(7, 2)
#> Error: f unexpected arguments: 2
By R function evaluation rules, any unexpected or undeclared arguments are captured by the “...” argument. Then “wrapr::stop_if_dot_args()” inspects for such values and reports an error if there are any. The "f" string is included in the error message; in this case I chose the name of the function. The “substitute(list(...))” part is R’s way of making the contents of “...” available for inspection.
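To make the mechanism concrete, here is a minimal base-R sketch of what such a checker could look like. The names check_no_dots and g are hypothetical and used only for illustration; this is not wrapr's actual implementation, just one way to inspect the captured call and signal an error.

```r
# Hypothetical stand-in for a dots checker (not wrapr's code).
check_no_dots <- function(dot_args, fn_name) {
  # dot_args is the unevaluated call list(...); drop the leading `list` symbol
  extras <- as.list(dot_args)[-1]
  if (length(extras) > 0) {
    nms <- names(extras)
    if (is.null(nms)) {
      nms <- rep("", length(extras))
    }
    # Render each extra argument as text, keeping its name if it has one
    shown <- vapply(seq_along(extras), function(i) {
      txt <- paste(deparse(extras[[i]]), collapse = " ")
      if (nzchar(nms[[i]])) paste(nms[[i]], "=", txt) else txt
    }, character(1))
    stop(fn_name, " unexpected arguments: ",
         paste(shown, collapse = ", "), call. = FALSE)
  }
  invisible(NULL)
}

# Hypothetical function guarded by the sketch above
g <- function(x, ..., inc = 1) {
  check_no_dots(substitute(list(...)), "g")
  x + inc
}

g(7, inc = 2)
#> [1] 9
```

With this sketch, a call such as g(7, 2) stops with an error instead of silently ignoring the 2.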
You can also use the technique on required arguments. wrapr::stop_if_dot_args() is a simple low-dependency helper function intended to make writing code such as the above easier. This is under the rubric that hidden errors are worse than thrown exceptions. It is best to find and signal problems early, and near the cause.
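As a small illustration of the hidden-error case, consider what happens when “...” is declared but never checked. The function name f_unguarded below is hypothetical; the sketch uses only base R.

```r
# Hypothetical unguarded function: extra arguments vanish into "..." without complaint.
f_unguarded <- function(x, ..., inc = 1) {
  x + inc
}

r_positional <- f_unguarded(7, 2)    # caller meant inc = 2; the 2 is swallowed by "..."
r_typo <- f_unguarded(7, icn = 2)    # a misspelled argument name is swallowed as well
print(c(r_positional, r_typo))
#> [1] 8 8
```

Both calls return 8 with no warning, which is the kind of silent mistake a dots check is meant to turn into an immediate error.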
The idea is that you should not expect a user to remember the positions of more than 1 to 3 arguments; the rest should only be referable by name. Do not make your users count along large sequences of arguments; the human brain may have special cases for small sequences.
If you have a procedure with 10 parameters, you probably missed some.
Alan Perlis, “Epigrams on Programming”, ACM SIGPLAN Notices 17 (9), September 1982, pp. 7–13.
Note that the “substitute(list(...))” part is the R idiom for capturing the unevaluated contents of “...”. I felt it best to use standard R as much as possible rather than introduce any additional magic invocations.
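A short base-R sketch of the difference between capturing and evaluating “...” may help here. The helper names capture_dots and force_dots are hypothetical and exist only for this illustration.

```r
# Hypothetical helpers for illustration only.
capture_dots <- function(...) substitute(list(...))  # returns an unevaluated call
force_dots <- function(...) list(...)                # evaluates every argument

# The stop() below is never run, because the arguments are only captured.
cl <- capture_dots(stop("never run"), a = 1 + 2)
n_extra <- length(cl) - 1  # drop the leading `list` symbol
n_extra
#> [1] 2

# Evaluating the same kind of argument does trigger the error.
res <- tryCatch(force_dots(stop("evaluated")),
                error = function(e) "error raised")
res
#> [1] "error raised"
```

Because the captured call is never evaluated, a checker can report on unexpected arguments without running whatever expressions the caller passed.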
jmount
Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.
The epigrams on programming are new to me, and amazing! Thank you for that.
Wow, thanks!
Tipping my hand a bit about content I am going to use in future articles, but here are some more amazing sources:
Akin’s Laws of Spacecraft Design
Augustine’s laws
Edsger W. Dijkstra quotes
(I use one of Akin’s, “(Mar’s Law) Everything is linear if plotted log-log with a fat magic marker”, in the LogLogPlot documentation.)
What happens when you have an unknown number of arguments, like dplyr select?
That is a case where it is not practical to use the method.
However, I prefer for things like dplyr select to pass the column names in as a single string vector. Example: select(mtcars, one_of(qc(mpg, cyl, disp))) (qc coming from here).
I totally agree that naming arguments is often the best way to write readable code. But this is preventing a behavior specifically allowed by a language by adding an unused parameter to exploit a related language detail. That doesn’t sit well with me.
A better solution is to tell people why naming arguments is good. Then let them decide for themselves in each scenario. Unless they’re coworkers, in which case you can smack them with the style guide.
Nathan, thanks for your comment and thinking on this. You have a number of good points, and I would like to hear more from you on this.
Let me try to get to common ground on this.
If it were a rule it would indeed be bad, as it tries to steer the language away from a legitimate good feature.
I am very glad you brought up style guides. Most of the current ones for R are fairly abusive. I do not mean to write one in this series. And confusion on that may explain some reactions. I need to find a way to get the following out. I mean for my tips to be tricks to use when you want a given effect or extra precautions to take when you want extra safety in some situation. None of these are supposed to be the only way.
I feel that, for a given proposed function that has no use for ... values, it is okay to use ... to check for other errors as long as you document that. The idea is to get the programming system to check what it can for you. I come from a more statically typed programming background, so I do like it when invariants can be enforced at the function interface.
So definitely I agree with your point. In particular I agree with every word of your second paragraph. I didn’t realize I seemed to be tacking against it.
Just browsing Brett Slatkin’s “Effective Python.” Item 21: “Enforce Clarity with Keyword-Only Arguments.” Yey!