Let’s consider piping in `R` both using the `magrittr` package and using the `wrapr` package.

## `magrittr` pipelines

The `magrittr` pipe glyph “`%>%`” is the most popular piping symbol in `R`.

`magrittr` documentation describes `%>%` as follows.

Basic piping:

- `x %>% f` is equivalent to `f(x)`
- `x %>% f(y)` is equivalent to `f(x, y)`
- `x %>% f %>% g %>% h` is equivalent to `h(g(f(x)))`

The argument placeholder

- `x %>% f(y, .)` is equivalent to `f(y, x)`
- `x %>% f(y, z = .)` is equivalent to `f(y, z = x)`

Re-using the placeholder for attributes

It is straight-forward to use the placeholder several times in a right-hand side expression. However, when the placeholder only appears in a nested expressions magrittr will still apply the first-argument rule. The reason is that in most cases this results more clean code.

- `x %>% f(y = nrow(.), z = ncol(.))` is equivalent to `f(x, y = nrow(x), z = ncol(x))`

The behavior can be overruled by enclosing the right-hand side in braces:

- `x %>% {f(y = nrow(.), z = ncol(.))}` is equivalent to `f(y = nrow(x), z = ncol(x))`

That is a bit of a simplification, but it is the taught mental model.

Grolemund, Wickham, *R for Data Science*, O’Reilly Media, 2017; “Pipes” describes the `magrittr` pipe as follows.

`foo_foo %>% hop(through = forest) %>% scoop(up = field_mouse) %>% bop(on = head)`

[…]

The pipe works by performing a “lexical transformation”: behind the scenes, magrittr reassembles the code in the pipe to a form that works by overwriting an intermediate object. When you run a pipe like the one above, magrittr does something like this:

`my_pipe <- function(.) { . <- hop(., through = forest); . <- scoop(., up = field_mice); bop(., on = head) }; my_pipe(foo_foo)`

Roughly they are saying `x %>% f(ARGS)` can be considered shorthand for `{ . <- x; f(., ARGS) }` where the evaluation in question happens in a temporary environment.
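To make the “temporary environment” reading concrete, here is a minimal base-`R` sketch of that mental model. The operator name `%toy%` is hypothetical and is not part of `magrittr` or `wrapr`; it only mimics the shorthand `{ . <- x; f(., ARGS) }` and performs none of the argument insertion that the real pipes do.

```r
# Hypothetical toy operator (an illustration only, not magrittr or wrapr):
# bind "." to the left value in a throw-away environment, then evaluate
# the still-unevaluated right-hand side expression in that environment.
`%toy%` <- function(lhs, rhs_expr) {
  expr <- substitute(rhs_expr)                 # capture the code, do not run it yet
  tmp_env <- new.env(parent = parent.frame())  # the temporary environment
  assign(".", lhs, envir = tmp_env)            # plays the role of ". <- x"
  eval(expr, envir = tmp_env)                  # plays the role of "f(., ARGS)"
}

res <- 5 %toy% sin(.)
stopifnot(isTRUE(all.equal(res, sin(5))))

# "." was assigned only inside the temporary environment
dot_leaked <- exists(".", envir = globalenv(), inherits = FALSE)
```

Because this sketch does no rewriting of the right-hand side, it only works with an explicit dot; `5 %toy% sin` would simply return the `sin` function.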

## A Mental Model for Planning Pipelines

To safely and confidently use piping one must eventually know what all of the commonly used related notations mean. For example it is important to know what each of the following evaluate to:

- `5 %>% sin`: the notation demonstrated in the `magrittr` excerpt.
- `5 %>% sin()`: possibly the notation one would abstract from the *R for Data Science* excerpt.
- `5 %>% sin(.)`: the notation we recommend (especially for the part time `R` user).

Also, there are questions of how one pipes into general expressions (instead of names, functions, or partially specified function evaluation signatures).

These may seem like details, but they are the steps required to move from copying code from examples and hoping it works (a state of learned helplessness, especially when simple variations fail) to having an effective (even if approximate) mental model for the operators one has decided to work with and plan over.
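As a quick check of the three notations listed above, the following sketch (assuming the `magrittr` package is installed) confirms that each one agrees with the direct call `sin(5)`. The variable names are illustrative only.

```r
library(magrittr)  # assumption: the magrittr package is installed

# three common spellings of "apply sin to 5"
r_name  <- 5 %>% sin      # bare function name
r_empty <- 5 %>% sin()    # call with an empty argument list
r_dot   <- 5 %>% sin(.)   # explicit dot placeholder

# each one agrees with the direct call
stopifnot(
  isTRUE(all.equal(r_name, sin(5))),
  isTRUE(all.equal(r_empty, sin(5))),
  isTRUE(all.equal(r_dot, sin(5)))
)
```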

## `wrapr` pipelines

`wrapr` supplies its own piping glyph: “dot pipe” `%.>%`. `wrapr`’s goal is to supply an operator that is regular and safe, with `a %.>% b` being *approximately* syntactic sugar for `{ . <- a; b }` (with visible side-effects, i.e. we can actually see the “`.`” assignment happen).

```
library("wrapr")
# calculate sin(5)
5 %.>% sin(.)
```

`## [1] -0.9589243`

```
# 5 left in dot, a visible side-effect
print(.)
```

`## [1] 5`

```
# clear dot, so no later failing example
# falsely appears to work
rm(list = ".")
```

We think `wrapr` piping is a very comprehensible (non-magic) expression-oriented pipe with a few rules and additional admonitions:

- Use explicit dots, i.e. write `5 %.>% sin(.)` and not `5 %.>% sin()` or `5 %.>% sin`. It is good to make it obvious to the reader that “`.`” is a free name in the right-hand side expression, allowing the easy application of the convention of treating the right-hand side expression as an implicit function of “`.`”.
- You get some free de-referencing such as in `5 %.>% sin` and function application as in `5 %.>% function(x) { sin(x) }`.
- Outer parentheses do not change meaning (as is commonly the case outside pipelines, modulo `R`’s visibility controls).
- Outer braces treat contents as raw statements, turning off `wrapr` convenience transforms and safety checking. This is compatible with the subtle `R` convention that brace-blocks `{}` are considered more opaque and not as eagerly looked into as parenthesized expressions (one such example can be found here).
- `wrapr` is a grammar in the sense that some statements are deliberately not part of the accepted notation. Some of the “errors” in the next set of examples are in fact `wrapr` refusing certain pipelines.
- Advanced users can extend `wrapr` by using `R` `S3` methodology to specify their own rules for various classes (such as building pipable `ggplot2` code). Technical details can be found here.

## Examples

Let’s consider the following attempts at writing piped variations of `sin(5)` in both `magrittr` and `wrapr` notations.

```
exprs <- c(
  "5 PIPE_GLYPH sin",
  "5 PIPE_GLYPH sin()",
  "5 PIPE_GLYPH sin(.)",
  "5 PIPE_GLYPH base::sin",
  "5 PIPE_GLYPH base::sin()",
  "5 PIPE_GLYPH base::sin(.)",
  "5 PIPE_GLYPH ( sin )",
  "5 PIPE_GLYPH ( sin() )",
  "5 PIPE_GLYPH ( sin(.) )",
  "5 PIPE_GLYPH { sin }",
  "5 PIPE_GLYPH { sin() }",
  "5 PIPE_GLYPH { sin(.) }",
  "5 PIPE_GLYPH function(x) { sin(x) }",
  "5 PIPE_GLYPH ( function(x) { sin(x) } )",
  "5 PIPE_GLYPH { function(x) { sin(x) } }",
  "f <- function(x) { sin(x) }; 5 PIPE_GLYPH f"
)
```

The point is that, in a room full of students in a lab setting, if you show them “`5 %>% sin`” some of them are going to try variations or have variations from their work that are important to them. This possibly includes: package-qualifying the function name, wrapping expressions in parentheses, altering arguments, building functions, and retrieving functions from data structures. The pipeline (for convenience) tries to lower the distinctions between expressions, functions, and function names. However the pipeline notation does not completely eliminate the differences.

A non-expert `magrittr`/`dplyr` user might expect all the pipe examples we are about to discuss to evaluate to `sin(5)` = -0.9589243. As `R` is routinely used by self-described non-programmers (such as scientists, analysts, and statisticians) the non-expert or part time `R` user is a very important class of `R` users (and in fact distinct from beginning `R` users). So how a system meets or misses simplified expectations is quite important in `R`.

To run our examples we will use a fairly involved function `work_examples()` that takes the vector of examples and returns an annotated `data.frame` of evaluation results. For completeness this code is given here, but can be safely skipped when reading this article.
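The following is not the article’s `work_examples()`, only a minimal sketch of the same idea, assuming the `magrittr` and `wrapr` packages are installed: substitute each pipe glyph for `PIPE_GLYPH`, evaluate the resulting code, and record either the value or the error message. The helper names `run_one` and `compare_pipes` are hypothetical, and the formatting is much simpler than in the tables shown here.

```r
library(magrittr)  # assumption: both packages are installed
library(wrapr)

# Hypothetical helper: evaluate one template with a concrete pipe glyph and
# return the result (or the error message) as a single string.
run_one <- function(template, glyph) {
  code <- gsub("PIPE_GLYPH", glyph, template, fixed = TRUE)
  env <- new.env(parent = globalenv())  # keep "." side effects out of the workspace
  tryCatch({
    value <- eval(parse(text = code), envir = env)
    if (is.numeric(value)) {
      paste(format(value, digits = 3), collapse = " ")
    } else {
      paste(deparse(value), collapse = " ")
    }
  }, error = function(e) paste("error:", conditionMessage(e)))
}

# Hypothetical helper: one row per template, one result column per pipe.
compare_pipes <- function(templates) {
  data.frame(
    expr = templates,
    magrittr_res = vapply(templates, run_one, character(1), glyph = "%>%"),
    wrapr_res = vapply(templates, run_one, character(1), glyph = "%.>%"),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

res <- compare_pipes(c("5 PIPE_GLYPH sin(.)", "5 PIPE_GLYPH { sin() }"))
print(res)
```

Catching errors with `tryCatch()` is what lets a single run report both successful values and refused pipelines side by side.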

Now we can work our examples, and return the comparison in tabular format.

```
library("kableExtra")  # supplies column_spec() and kable_styling()
work_examples(exprs, sin(5)) %.>%
  knitr::kable(., format = "html", escape = FALSE) %.>%
  column_spec(., 1:4, width = "1.75in") %.>%
  kable_styling(., "striped", full_width = FALSE)
```

magrittr expr | magrittr res | wrapr expr | wrapr res |
---|---|---|---|
5 %>% sin | -0.959 | 5 %.>% sin | -0.959 |
5 %>% sin() | -0.959 | 5 %.>% sin() | wrapr::pipe_step.default does not allow direct piping into a no-argument function call expression (such as “sin()”, please use sin(.)). |
5 %>% sin(.) | -0.959 | 5 %.>% sin(.) | -0.959 |
5 %>% base::sin | unused argument (sin) | 5 %.>% base::sin | -0.959 |
5 %>% base::sin() | -0.959 | 5 %.>% base::sin() | wrapr::pipe_step.default does not allow direct piping into a no-argument function call expression (such as “base::sin()”, please use base::sin(.)). |
5 %>% base::sin(.) | -0.959 | 5 %.>% base::sin(.) | -0.959 |
5 %>% ( sin ) | -0.959 | 5 %.>% ( sin ) | -0.959 |
5 %>% ( sin() ) | 0 arguments passed to ‘sin’ which requires 1 | 5 %.>% ( sin() ) | wrapr::pipe_step.default does not allow direct piping into a no-argument function call expression (such as “sin()”, please use sin(.)). |
5 %>% ( sin(.) ) | object ‘.’ not found | 5 %.>% ( sin(.) ) | -0.959 |
5 %>% { sin } | .Primitive("sin") | 5 %.>% { sin } | .Primitive("sin") |
5 %>% { sin() } | 0 arguments passed to ‘sin’ which requires 1 | 5 %.>% { sin() } | 0 arguments passed to ‘sin’ which requires 1 |
5 %>% { sin(.) } | -0.959 | 5 %.>% { sin(.) } | -0.959 |
5 %>% function(x) { sin(x) } | Anonymous functions myst be parenthesized | 5 %.>% function(x) { sin(x) } | -0.959 |
5 %>% ( function(x) { sin(x) } ) | -0.959 | 5 %.>% ( function(x) { sin(x) } ) | -0.959 |
5 %>% { function(x) { sin(x) } } | function (x) { sin(x) } | 5 %.>% { function(x) { sin(x) } } | function (x) { sin(x) } |
f <- function(x) { sin(x) }; 5 %>% f | -0.959 | f <- function(x) { sin(x) }; 5 %.>% f | -0.959 |

As we can now see, some statements were not roughly equivalent to `sin(5)`.

One related case to consider is the following (which we run by hand, as it seems to defeat `knitr` or `kableExtra` `html` styling; note: the “‘[’” and other formatting errors are artifacts of `HTML` quoting/rendering, and not part of the expressions):

```
c("lst <- list(h = sin); 5 PIPE_GLYPH lst$h",
  "lst <- list(h = sin); 5 PIPE_GLYPH lst$h()",
  "lst <- list(h = sin); 5 PIPE_GLYPH lst$h(.)",
  "lst <- list(h = sin); 5 PIPE_GLYPH lst[['h']]",
  "lst <- list(h = sin); 5 PIPE_GLYPH lst[['h']]()",
  "lst <- list(h = sin); 5 PIPE_GLYPH lst[['h']](.)") %.>%
  work_examples(., sin(5)) %.>%
  knitr::kable(., format = "html", escape = FALSE)
```

magrittr expr | magrittr res | wrapr expr | wrapr res |
---|---|---|---|
lst <- list(h = sin); 5 %>% lst$h | 3 arguments passed to ‘$’ which requires 2 | lst <- list(h = sin); 5 %.>% lst$h | -0.959 |
lst <- list(h = sin); 5 %>% lst$h() | -0.959 | lst <- list(h = sin); 5 %.>% lst$h() | wrapr::pipe_step.default does not allow direct piping into a no-argument function call expression (such as “lst$h()”, please use lst$h(.)). |
lst <- list(h = sin); 5 %>% lst$h(.) | -0.959 | lst <- list(h = sin); 5 %.>% lst$h(.) | -0.959 |
lst <- list(h = sin); 5 %>% lst[[‘h’]] | incorrect number of subscripts | lst <- list(h = sin); 5 %.>% lst[[‘h’]] | -0.959 |
lst <- list(h = sin); 5 %>% lst[[‘h’]]() | -0.959 | lst <- list(h = sin); 5 %.>% lst[[‘h’]]() | wrapr::pipe_step.default does not allow direct piping into a no-argument function call expression (such as “lst[[”h“]]()”, please use lst[[“h”]](.)). |
lst <- list(h = sin); 5 %>% lst[[‘h’]](.) | -0.959 | lst <- list(h = sin); 5 %.>% lst[[‘h’]](.) | -0.959 |

## Analysis

### `magrittr` Results

The `magrittr` exceptions include the following.

- `::` is a function, as so many things are in `R`. So `base::sin` is not really the package qualified name for `sin()`, it is actually shorthand for `` `::`("base", "sin") `` which is a function evaluation that performs the look-up. So `5 %>% base::sin` expands to an analogue of `` . <- 5; `::`(., "base", "sin") ``, leading to the observed error message.
- `()` is `magrittr`’s “evaluate before piping into” notation, so `5 %>% ( sin() )` and `5 %>% ( sin(.) )` both throw an error as evaluation is attempted before any alteration of arguments is attempted.
- `{}` is `magrittr`’s “treat the contents as raw statements” notation (which is not in fact `magrittr`’s default behavior). Thus `magrittr`’s function evaluation signature alteration transforms are not applied to `5 %>% { sin }` or `5 %>% { sin() }`.

Again, the above are not `magrittr` bugs, they are just how `magrittr`’s behavior differs from a very regular or naive internalization of `magrittr` rules. Notice that neither “`()`” nor “`{}`” is a neutral notation in `magrittr` (the first adds an extra evaluation, and the second switches to an expression mode with fewer substitutions). Also note the above is an argument for preferring “`sin(.)`” to “`sin()`” or “`sin`”, as “`sin(.)`” had the most regular `magrittr` behavior (not changing with the introduction of “`{}`” or “`base::`”, although, as the table shows, even it fails inside “`()`”).
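The claim that `base::sin` is really a call to the function `::` can be checked in base `R` alone. The sketch below inspects the parsed expression and then calls `::` directly. The variable names are illustrative, and the last step only imitates the effect of inserting an extra first argument, not `magrittr`’s actual internals.

```r
e <- quote(base::sin)

is_call <- is.call(e)                               # a call, not a plain name
head_is_colons <- identical(e[[1]], as.name("::"))  # the function being called is `::`

# calling `::` as an ordinary function performs the look-up
f <- `::`("base", "sin")
same_fn <- identical(f, sin)
res <- f(5)

# imitating an inserted first argument: `::` takes two arguments, so this fails
err <- tryCatch(`::`(5, "base", "sin"),
                error = function(e) conditionMessage(e))
```

Here `is_call`, `head_is_colons`, and `same_fn` are all `TRUE`, and `err` holds the error message from the three-argument call.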

Regularity is especially important for part time users, as you want reasonable variations of what is taught to work so that experimentation is positive and not an exercise in learned helplessness. It is convenient when your tools happen to work the way you might remember.

### `wrapr` Results

The `wrapr` error messages and non-numeric returns are driven by the following:

- `5 %.>% sin()` is not an allowed `wrapr` notation. The `wrapr` philosophy is not to alter evaluation signatures. The error message is signalling that the statement is not valid `wrapr` grammar (not well formed in terms of `wrapr` rules). Notice the error message suggests the alternate notation `sin(.)`. Similar rules apply for `base::sin()`. The intent is that outer parentheses are non-semantic: they do not change `wrapr` pipe behavior.
- `5 %.>% { sin }` returns just the `sin` function. This is because `{}` triggers `wrapr`’s “leave the contents alone” behavior.

The user only encounters two exceptions in the above variations. The first is “don’t write `sin()`”, which comes with a clear error message and help (“try `sin(.)`”). The second is “outer `{}` treats its contents as raw statements, turning off transforms and checking.”

`wrapr` is hoping to stay close to the principle of least surprise.

The hope is that `wrapr` piping is easy, powerful, useful, and not *too* different from `a %.>% b` being treated as almost syntactic sugar for `{ . <- a; b }`.

#### Aesthetics

An obvious down-side of `wrapr` piping is the excess dots both in the operator and in the evaluation arguments. We *strongly* feel the extra dots in the evaluation arguments are actually a good trade: losing some conciseness in exchange for useful explicitness. We do not consider the extra dot in the pipe operator to be a problem (especially if you bind the operator to a keyboard shortcut). If the extra dot in the pipe operator is such a deal-breaker, consider that it could be gotten rid of by copying the pipe operator to your notation of choice (such as executing `` `%>%` <- wrapr::`%.>%` `` or `` `%.%` <- wrapr::`%.>%` `` at the top of your work). However such re-mappings are needlessly confusing and it is best to use the operator glyph that `wrapr` directly supplies.

## Non-function examples

We can also try a few simpler expressions that do not have an explicit function marker such as `sin(.)`.

```
c("5 PIPE_GLYPH 1 + .",
  "5 PIPE_GLYPH (1 + .)",
  "5 PIPE_GLYPH {1 + .}") %.>%
  work_examples(., 6) %.>%
  knitr::kable(., format = "html", escape = FALSE) %.>%
  column_spec(., 1:4, width = "1.75in") %.>%
  kable_styling(., "striped", full_width = FALSE)
```

magrittr expr | magrittr res | wrapr expr | wrapr res |
---|---|---|---|
5 %>% 1 + . | attempt to apply non-function | 5 %.>% 1 + . | wrapr::pipe_step.default does not allow direct piping into simple values such as class:numeric, type:double. |
5 %>% (1 + .) | non-numeric argument to binary operator | 5 %.>% (1 + .) | 6 |
5 %>% {1 + .} | 6 | 5 %.>% {1 + .} | 6 |

Some of what caused exceptions above is that “`5 %ANYTHING% 1 + .`” is parsed (due to `R`’s operator precedence rules) as “`(5 %ANYTHING% 1) + .`”. So without extra grouping notations (“`()`” or “`{}`”) this is not a well-formed pipeline. With `wrapr` it is safe to add in parentheses, with `magrittr` one must use `{}` (though this can not be used with `5 %>% {sin}`).
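The precedence point can be inspected without loading any pipe package, since `quote()` parses the code without evaluating it. A small sketch (variable names are illustrative):

```r
# parse only; nothing here is evaluated, so no pipe package is needed
e_bare   <- quote(5 %>% 1 + .)
e_parens <- quote(5 %>% (1 + .))

top_bare   <- as.character(e_bare[[1]])    # outermost operator of the bare form
left_bare  <- deparse(e_bare[[2]])         # left operand of that operator
top_parens <- as.character(e_parens[[1]])  # outermost operator once grouped

print(c(top_bare = top_bare, left_bare = left_bare, top_parens = top_parens))
```

In the bare form the outermost call is `+` with the pipe step `5 %>% 1` as its left operand; with the grouping in place the outermost call is the pipe itself.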

## The Importance of Strictness

For some operations that are unlikely to work close to reasonable user intent `wrapr` includes checks to warn-off the user. The following shows a few more examples of this “defense of grammar.”

`5 %.>% 7`

`## Error in pipe_step.default(pipe_left_arg, pipe_right_arg, pipe_environment, : wrapr::pipe_step.default does not allow direct piping into simple values such as class:numeric, type:double.`

```
# magrittr's error message for the above is something of the form:
# "Error in function_list[[k]](value) : attempt to apply non-function"
5 %.>% .
```

`## Error in pipe_step.default(pipe_left_arg, pipe_right_arg, pipe_environment = pipe_environment, : wrapr::pipe_step.default does not allow direct piping into simple values such as class:numeric, type:double.`

```
# note: the above error message is improved to:
# "wrapr::pipe does not allow direct piping into '.'"
# in wrapr 1.4.1
5 %.>% return(.)
```

`## Error in pipe_step.default(pipe_left_arg, pipe_right_arg, pipe_environment, : wrapr::pipe_step.default does not allow direct piping into certain reserved words or control structures (such as "return").`

Throwing errors in these situations is based on the principle that non-signalling errors (often leading to result corruption) are much worse than signalling errors. The “`return`” example is an interesting case in point.

Let’s first take a look at the effect with `magrittr`. Suppose we were writing a simple function that, for a positive integer, returns the smallest non-trivial (greater than `1` *and* less than the value in question) positive integer divisor of the value in question (returning `NA` if there is none such). Such a function might work like the following.

```
f_base <- function(x) {
  # candidate divisors run from 2 up to about sqrt(x), staying below x
  u <- min(ceiling(sqrt(x)), x-1L)
  i <- 2L
  while(i<=u) {
    if((x %% i)==0) {
      return(i)
    }
    i <- i + 1L
  }
  NA_integer_
}
f_base(37)
```

`## [1] NA`

`f_base(35)`

`## [1] 5`

Now suppose we try to get fancy and use “`i %>% return`” instead of “`return(i)`”. This produces a function that thinks all integers are prime. The reason is: `magrittr` can call the `return()` function, but in this situation `return()` can’t manage the control path of the original function.

```
f_magrittr <- function(x) {
  u <- min(ceiling(sqrt(x)), x-1L)
  i <- 2L
  while(i<=u) {
    if((x %% i)==0) {
      i %>% return
    }
    i <- i + 1L
  }
  NA_integer_
}
f_magrittr(37)
```

`## [1] NA`

`f_magrittr(35)`

`## [1] NA`
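The underlying effect can be reproduced in base `R` without any pipe package. In the sketch below, `pipe_like()` is a hypothetical stand-in for “an operator that calls its right-hand side as a function”; it is not how `magrittr` is implemented, but it shows why `return()` invoked from inside another function only exits that inner function.

```r
# Hypothetical stand-in for a pipe: apply the function f to value
pipe_like <- function(value, f) f(value)

g_direct <- function() {
  return("early")                # exits g_direct immediately
  "fell through"
}

g_piped <- function() {
  pipe_like("early", return)     # return() only exits pipe_like()
  "fell through"                 # so execution continues to here
}

res_direct <- g_direct()
res_piped <- g_piped()
print(c(direct = res_direct, piped = res_piped))
```

`g_direct()` yields `"early"`, while `g_piped()` yields `"fell through"`: the same silent loss of control flow that makes `f_magrittr(35)` report `NA`.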

Now suppose we tried the same thing with the `wrapr` pipe and wrote `i %.>% return(.)`.

```
f_wrapr <- function(x) {
  u <- min(ceiling(sqrt(x)), x-1L)
  i <- 2L
  while(i<=u) {
    if((x %% i)==0) {
      i %.>% return(.)
    }
    i <- i + 1L
  }
  NA_integer_
}
f_wrapr(37)
```

`## [1] NA`

`f_wrapr(35)`

`## Error in pipe_step.default(pipe_left_arg, pipe_right_arg, pipe_environment, : wrapr::pipe_step.default does not allow direct piping into certain reserved words or control structures (such as "return").`

`wrapr` also can not handle `return()` control flow correctly, however it (helpfully) throws an exception to indicate the problem.

## Conclusion

`R` usually has more than one good way to perform tasks. In this case we talked about two methods of building pipelines in `R`: `magrittr` and `wrapr`. There are more methods (some of which are listed here). Our preferred pipe is the `wrapr` dot-pipe, and in the style of academic priority we try to credit alternatives and share fair comparisons (as we have done here). Priority is important to respect (as in: `magrittr` is powerful, popular, came well before, and greatly influences the `wrapr` dot-pipe), but it is not monopoly rights (for example: the public CRAN release/announcement of `let()`, our popular and still preferred substitution methodology and originally part of `replyr`, predates the public CRAN release/announcement of `rlang`/`tidyeval` code re-writing methods). In client work we use whatever style is most compatible with the client’s work and needs; for example, we feel it does not make sense to take a legacy `dplyr` project and attempt to switch the pipe notation late in the game (and one does not want to needlessly mix notations).

I just wanted to make you aware of another piping operator: %>>% from the package pipeR (documentation: https://renkun-ken.github.io/pipeR-tutorial/index.html). It is similar to the pipe operator from the magrittr package, but more flexible and principled. It immediately became my preferred piping operator once I started using it!


I had read the pipeR documentation and appreciate the design. I haven’t used pipeR in a project yet. Thanks for your note, it encourages me to get around to trying it.


I’ll have to have a look at pipeR. It may be the choice of example, but I can’t imagine where I would use anything other than sin() or sin(.). I am aware the latter is better practice, but old habits die hard. Can you give an example when I might want to use some of the later versions?

Of course the magrittr pipe does have a shortcut which I use automatically – another habit.

PS I saw this on RBloggers


Hi Julia,

A few things I like about pipeR:

One is being able to pipe an argument to multiple commands using %>>% {…}. For example I use the data.table package for most of my data manipulation, and so if I want to rename multiple variables I can do something like:

    dt %>>% {
      setnames(., 'old1', 'new1')
      setnames(., 'old2', 'new2')
    }

Or with a for loop, if I want to loop over just the numeric columns of a data table without repeating the name of the data table three times, I can do something like:

    dt %>>% {
      for (.j in names(.)[sapply(., is.numeric)]) {
        .[get(.j) < 0, (.j) := NA]
      }
    }

Another useful feature is piping for side effects — especially in combination with the data.table package because data.table updates values by reference, so with piping for side effect you can update a value and then keep the updated data table moving through the pipeline.

Piping to dot (or a custom-names placeholder) is also useful when you’re piping to an argument other than the first. I know magrittr has the ability to pipe to dot as well, but with pipeR you can remove the syntax ambiguity that comes up in some cases (e.g. when piping to the data argument of lm() while also using a period in the regression formula) by specifying a different placeholder instead of a period to pipe to.

The documentation (https://renkun-ken.github.io/pipeR-tutorial/index.html) has more details on the other features of pipeR, although the ones I described here are the ones I use the most.
