`wrapr` `1.6.2` is now up on CRAN. We have some neat new features for `R` users to try (in addition to many earlier `wrapr` goodies).

The first is the `%in_block%` alternate notation for `let()`.

The `wrapr` `let()`-block allows easy replacement of names in name-capturing interfaces (such as `transform()`), as we show below.

```
library("wrapr")

column_mapping <- qc(
  AREA_COL = Sepal.Area,
  LENGTH_COL = Sepal.Length,
  WIDTH_COL = Sepal.Width
)

# let-block notation
let(
  alias = column_mapping,
  iris %.>%
    transform(.,
              AREA_COL = (pi/4)*LENGTH_COL*WIDTH_COL) %.>%
    subset(.,
           select = qc(Species, AREA_COL)) %.>%
    head(.)
)
```

```
## Species Sepal.Area
## 1 setosa 14.01936
## 2 setosa 11.54535
## 3 setosa 11.81239
## 4 setosa 11.19978
## 5 setosa 14.13717
## 6 setosa 16.54049
```

The `qc()` notation allowed us to specify a named `vector` without quotes. `qc(a = b)` is equivalent to `c("a" = "b")`.
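
To make the equivalence concrete, here is a minimal base-`R` sketch of how a quote-free constructor of this kind could capture its arguments. The helper name `my_qc` is hypothetical, and this is not `wrapr`'s implementation (the real `qc()` handles more cases); it only illustrates the idea of capturing unevaluated names and converting them to strings.

```r
# Hypothetical toy helper (an assumption for illustration, not wrapr::qc()):
# capture the unevaluated arguments and turn each one into a string.
my_qc <- function(...) {
  args <- as.list(substitute(list(...)))[-1]  # drop the leading `list` symbol
  vapply(args, function(a) paste(deparse(a), collapse = ""), character(1))
}

named <- my_qc(a = b)                # argument names become vector names
unnamed <- my_qc(Species, AREA_COL)  # no argument names, so a plain vector

print(named)
print(unnamed)
```

Because the arguments are never evaluated, neither `b` nor `Species` needs to exist as a variable when the helper is called.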

With the `%in_block%` operator notation one writes the `let()`-block as an in-line operator supplying the mapping into a code block. The above example can now be rewritten as the following.

```
# %in_block% notation
column_mapping %in_block% {
  iris %.>%
    transform(.,
              AREA_COL = (pi/4)*LENGTH_COL*WIDTH_COL) %.>%
    subset(.,
           select = qc(Species, AREA_COL)) %.>%
    head(.)
}
```

```
## Species Sepal.Area
## 1 setosa 14.01936
## 2 setosa 11.54535
## 3 setosa 11.81239
## 4 setosa 11.19978
## 5 setosa 14.13717
## 6 setosa 16.54049
```

This notation can be handy for defining functions.

```
compute_area <- function(
  .data,
  area_col,
  length_col,
  width_col) c( # End of function argument definition
    AREA_COL = area_col,
    LENGTH_COL = length_col,
    WIDTH_COL = width_col
  ) %in_block% { # End of argument mapping block
    .data %.>%
      transform(.,
                AREA_COL = (pi/4)*LENGTH_COL*WIDTH_COL)
  } # End of function body block

iris %.>%
  compute_area(.,
               'Sepal.Area', 'Sepal.Length', 'Sepal.Width') %.>%
  compute_area(.,
               'Petal.Area', 'Petal.Length', 'Petal.Width') %.>%
  subset(.,
         select = c("Species", "Sepal.Area", "Petal.Area")) %.>%
  head(.)
```

```
## Species Sepal.Area Petal.Area
## 1 setosa 14.01936 0.2199115
## 2 setosa 11.54535 0.2199115
## 3 setosa 11.81239 0.2042035
## 4 setosa 11.19978 0.2356194
## 5 setosa 14.13717 0.2199115
## 6 setosa 16.54049 0.5340708
```

We can think of the above function definition notation as having two blocks: the alias-defining block (the portion before `%in_block%`) and the templated function body (the portion after `%in_block%`). Notice how easy it is to use this notation to convert a non-standard (name- or code-capturing) interface into a value-oriented interface. The point is that value-oriented interfaces are much more reusable and easier to program over (use in for-loops, applies, and functions).

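As a rough illustration of what "easier to program over" means, the sketch below drives a value-oriented function from a loop. To keep it free of package dependencies it uses base-`R` `[[ ]]` indexing in place of the `let()`/`%in_block%` mapping; the names `compute_area_base` and `specs` are assumptions introduced only for this example.

```r
# Base-R sketch: bracket indexing stands in for the let()/%in_block% mapping.
# Column names arrive as ordinary string values, so the function is easy to
# call from loops and apply-style code.
compute_area_base <- function(.data, area_col, length_col, width_col) {
  .data[[area_col]] <- (pi / 4) * .data[[length_col]] * .data[[width_col]]
  .data
}

# Hypothetical specification table: one row per area column to compute.
specs <- data.frame(
  area   = c("Sepal.Area", "Petal.Area"),
  length = c("Sepal.Length", "Petal.Length"),
  width  = c("Sepal.Width", "Petal.Width"),
  stringsAsFactors = FALSE
)

result <- iris
for (i in seq_len(nrow(specs))) {
  result <- compute_area_base(result,
                              specs$area[[i]],
                              specs$length[[i]],
                              specs$width[[i]])
}

head(result[, c("Species", specs$area)])
```

Since the column names are plain data, adding another area column means adding a row to `specs` rather than writing new code.
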
The second new feature is the `orderv()` function, a value-oriented adapter for `base::order()`. `orderv()` uses a vector of column names to compute an ordering permutation for a `data.frame`. We can use it as we show below.

```
library("wrapr")

sort_columns <- qc(mpg, hp, gear)
ordering <- orderv(mtcars[ , sort_columns, drop = FALSE],
                   decreasing = TRUE,
                   method = "radix")
head(mtcars[ordering, , drop = FALSE])
```

```
## mpg cyl disp hp drat wt qsec vs am gear carb
## Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
```

Of course we also have all the steps wrapped in a convenient function: `sortv()`.

```
mtcars %.>%
  sortv(.,
        sort_columns,
        decreasing = TRUE,
        method = "radix") %.>%
  head(.)
```

```
## mpg cyl disp hp drat wt qsec vs am gear carb
## Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
```
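
For readers curious how a value-oriented adapter over `base::order()` can work, here is a minimal base-`R` sketch. The helper name `order_by_columns` is hypothetical, and this is not presented as the `wrapr` implementation of `orderv()`; it simply shows one way to hand a set of columns to `order()` as separate arguments using `do.call()`.

```r
# Hypothetical helper (an assumption for illustration, not wrapr::orderv()):
# pass the columns of a data.frame to base::order() as separate arguments.
order_by_columns <- function(columns, decreasing = FALSE, method = "radix") {
  # unname() keeps column names from being matched to order()'s own arguments
  cols <- unname(as.list(columns))
  do.call(base::order,
          c(cols, list(decreasing = decreasing, method = method)))
}

sort_columns <- c("mpg", "hp", "gear")
ordering <- order_by_columns(mtcars[, sort_columns, drop = FALSE],
                             decreasing = TRUE)
head(mtcars[ordering, sort_columns, drop = FALSE])
```

The key design point is that the sort keys are supplied as a value (a character vector of column names), so the same call works for any set of columns chosen at run time.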

For details on `method = "radix"` please see our earlier tip here.

A third new feature is `mk_formula()`. `mk_formula()` is used to build simple formulas for modeling tasks (which may have a large number of variables) without any string processing or parsing.

Our usual advice for building simple formulas has been to use the `paste()`-based methods exhibited in "R Tip: How to Pass a formula to lm". This remains good advice. However, `mk_formula()` is a more concise and more hygienic alternative. An example is given below.

```
# specifications of how to model,
# coming from somewhere else
outcome <- "mpg"
variables <- c("cyl", "disp", "hp", "carb")
# our modeling effort,
# fully parameterized!
f <- wrapr::mk_formula(outcome, variables)
print(f)
```

`## mpg ~ cyl + disp + hp + carb`

```
model <- lm(f, data = mtcars)
print(model)
```

```
##
## Call:
## lm(formula = f, data = mtcars)
##
## Coefficients:
## (Intercept) cyl disp hp carb
## 34.021595 -1.048523 -0.026906 0.009349 -0.926863
```

The above notation is good for programming over modeling tasks.

Edit: `mk_formula()` duplicates some functionality of `stats::reformulate()`. The current implementation of `stats::reformulate()` appears to use the `paste()` pattern (which I actually like). However, we get “cluck-clucked” when we use `paste()` to build up formulas, so our code is in terms of `stats::update.formula()` (which appears to use terms and not pasting, though that is not confirmed).
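
For comparison, the base-`R` sketch below builds the same formula three ways: the `paste()` pattern, `stats::reformulate()`, and direct construction from language objects with no string parsing. The third variant is only an illustration of a parse-free approach; it is an assumption of this example and not a description of how `mk_formula()` is implemented.

```r
outcome <- "mpg"
variables <- c("cyl", "disp", "hp", "carb")

# 1. The paste()-based pattern: build a string, then parse it.
f_paste <- as.formula(
  paste(outcome, paste(variables, collapse = " + "), sep = " ~ "))

# 2. stats::reformulate() from base R.
f_reformulate <- stats::reformulate(variables, response = outcome)

# 3. Build the formula from language objects, with no string parsing.
#    (Illustrative only; not a claim about mk_formula() internals.)
rhs <- Reduce(function(a, b) call("+", a, b), lapply(variables, as.name))
f_language <- eval(call("~", as.name(outcome), rhs))

# Show the text form of each formula.
formula_text <- vapply(list(f_paste, f_reformulate, f_language),
                       function(f) paste(deparse(f), collapse = ""),
                       character(1))
print(formula_text)
```

Any of the three results can be passed to `lm(f, data = mtcars)`; they differ in how the formula is assembled, not in the model they specify.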


### jmount

Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.

I like the idea of the `%in_block%` operator. Is `wrapr::mk_formula` different from `reformulate`? It looks very similar.

Thanks!

Looks like `stats::reformulate()` appears to do the job. I’ll update the note and package to reflect that (goof on my part not researching that further). However, inside, its code is:

And we have been nagged that “building up formulas using paste” is wrong/dangerous as it causes later risky parsing (not a problem if one uses reasonable column names). We’ve been using `paste()`-type solutions here.

`wrapr::mk_formula()` uses only `stats::update.formula()`, which appears to work only through `terms`-structures (possibly no pasting and re-parsing). So those that are squeamish about strings don’t have grounds to complain.

Good to know John, thank you!


Maybe you will like this conversation : https://github.com/tidyverse/glue/issues/108


Ugh. That was a task that `paste(, collapse = " + ")` is already good at.

I have to admit I only wrote `mk_formula()` as I was getting tired of getting cluck-clucked at when I wrote `as.formula(paste(outcome, paste(variables, collapse = " + "), sep = " ~ "))` (which turns out to be what `reformulate` just does).

It isn’t a hard problem.