While going over some of the discussion related to my last post I came up with a really neat way to use `wrapr::let()` and `rlang`/`tidyeval` together. Please read on to see the situation and example.

Suppose we want to parameterize over a couple of names, one denoting a variable coming from the current environment and one denoting a column name. Further suppose we are worried the two names may be the same.
We can actually handle this quite neatly, using `tidyeval` to denote intent (in this case using “!!” to specify “take from environment instead of the data frame”) and allowing `wrapr::let()` to perform the substitutions.

    suppressPackageStartupMessages(library("dplyr"))
    library("wrapr")

    mass_col_name = 'mass'
    mass_const_name = 'mass'

    mass <- 100

    let(
      c(MASS_COL = mass_col_name,
        MASS_CONST = mass_const_name),
      starwars %>%
        transmute(height,
                  (!! MASS_CONST), # `mass` from environment
                  MASS_COL, # `mass` from data.frame
                  h100 = height * (!! MASS_CONST), # env
                  hm = height * MASS_COL # data
        ) %>%
        head()
    )
    #> # A tibble: 6 x 5
    #>   height `(100)`  mass  h100    hm
    #>
    #> 1    172     100    77 17200 13244
    #> 2    167     100    75 16700 12525
    #> 3     96     100    32  9600  3072
    #> 4    202     100   136 20200 27472
    #> 5    150     100    49 15000  7350
    #> 6    178     100   120 17800 21360

All in all, that is pretty neat.
(`tidyeval` uses the “(!! )” dereference notation in a number of ways; here we are only using it to specify environment, not for substitution.)
Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.
Some `wrapr::let()` features (found in the development version) include the `eval=FALSE` and `debugPrint=TRUE` modes. With these options you can see what would be executed or what is being executed. These are great for learning how `wrapr::let()` works.
For example, with our example above we could run:
This results in:
which is exactly the re-written code that `wrapr::let()` has prepared for execution (one can even pass it to `eval()` for execution). This is an excellent way to see what `wrapr::let()` does, and to work out whether it does what you want. The “(!(!mass))” is just how R prints “(!! mass)”, and as you see it executes the same.
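To see why the printed form differs, it may help to recall that base R has no single `!!` operator: the parser reads it as two nested negations, and `tidyeval` gives that pattern its special meaning. The following is a minimal base R sketch of that parsing behavior; the value bound to `mass` is an arbitrary illustration, and no `tidyeval` machinery is involved.

```r
# Base R parses `!!mass` as a call to `!` whose argument is another call to `!`.
e <- quote(!!mass)

outer_fn <- as.character(e[[1]])       # function name of the outer call
inner_fn <- as.character(e[[2]][[1]])  # function name of the inner call

# The same tree built by hand: `!` applied to (`!` applied to the symbol mass).
by_hand <- call("!", call("!", as.name("mass")))
same_tree <- identical(e, by_hand)

print(outer_fn)
print(inner_fn)
print(same_tree)

# Outside of tidyeval, evaluating the expression is plain double negation.
mass <- 100
plain_value <- eval(e)
print(plain_value)
```

How the expression is printed back can vary with the R version, which is why the sketch inspects the parse tree rather than the printed text.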
With `debugPrint=TRUE`, `wrapr::let()` both prints the replaced expression and then executes as usual.
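As a rough sketch of the two modes (not the original listing), the snippet below uses a small hypothetical `data.frame` named `d` and base R column access rather than `dplyr`. It assumes an installed `wrapr` version whose `let()` accepts the `eval` and `debugPrint` arguments discussed above.

```r
library("wrapr")

# Hypothetical toy data, chosen only for illustration.
d <- data.frame(height = c(172, 167), mass = c(77, 75))
mass_col_name <- "mass"

# eval = FALSE: return the re-written expression instead of running it.
prepared <- let(
  c(MASS_COL = mass_col_name),
  d$height * d$MASS_COL,
  eval = FALSE
)
print(prepared)

# The prepared expression can be handed to eval() later.
hm_later <- eval(prepared)

# debugPrint = TRUE: print the re-written expression, then execute as usual.
hm_now <- let(
  c(MASS_COL = mass_col_name),
  d$height * d$MASS_COL,
  debugPrint = TRUE
)
print(hm_now)
```

Inspecting `prepared` before evaluating it is a cheap way to confirm that only the intended placeholder was replaced.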
If you want to be very strict (and completely unambiguous) you can use the `.data$` pronoun form to force references to the `data.frame`. We show this below.
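A minimal sketch of the pronoun form, assuming a `dplyr` release of 0.7.0 or later and a small hypothetical `data.frame` (this is an illustration, not the original listing):

```r
suppressPackageStartupMessages(library("dplyr"))

# Hypothetical toy data; `mass` exists both as a column and in the environment.
d <- data.frame(height = c(172, 167), mass = c(77, 75))
mass <- 100

res <- d %>%
  transmute(
    h100 = height * (!! mass),   # `mass` taken from the environment
    hm   = height * .data$mass   # `mass` taken from the data.frame
  )
print(res)
```

Here `.data$mass` can only refer to the column, so the name collision with the environment variable no longer matters.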
We do not currently recommend using the pronoun in the form `.data[[my_var]]`. If you use `rlang`/`tidyeval` to perform substitutions *always* write something such as `.data[[!!my_var]]` (some details here). This is due to complications described in `dplyr` issues 2904 and 2916.
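The recommended spelling can be sketched as follows, using a hypothetical toy `data.frame` and a column name held in `my_var`. This only illustrates the syntax; the behavior of the bare `.data[[my_var]]` form depends on the `dplyr` and `rlang` versions in use and may differ from what the issues above describe.

```r
suppressPackageStartupMessages(library("dplyr"))

# Hypothetical toy data, chosen only for illustration.
d <- data.frame(height = c(172, 167), mass = c(77, 75))
my_var <- "mass"  # column name held in a variable

res <- d %>%
  mutate(doubled = 2 * .data[[!! my_var]])  # `!!` inserts the value of my_var
print(res)
```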
This is one of the reasons we advise using `wrapr::let()` for substitution, even if you are using `rlang`/`tidyeval` (hence why you might end up using them together).
The `rlang`/`tidyeval` substitution issues can be subtle and are possibly why the data-pronoun example in the actual `June 13, 2017 dplyr 0.7.0` announcement is not correct even using the development version of `dplyr` and `rlang`/`tidyeval` as of June 30, 2017.
Notice that when we re-run the start of the example, the `data.frame` is altered in an unexpected way (an extra column named “my_var” is added) and the data is grouped by the column “my_var”, not by the column “homeworld” as in the earlier non-pronoun example (which presumably this example was supposed to match). This will be an issue if one tries to use or join this data after a `summarize()` step, as only named variables and the grouping variable survive `summarize()` (so the “homeworld” column will not be present for downstream code expecting to use it).
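The point about `summarize()` can be checked on a small hypothetical table (not the `starwars` data): after grouping and summarizing, only the grouping column and the explicitly named summaries remain.

```r
suppressPackageStartupMessages(library("dplyr"))

# Hypothetical toy data, chosen only for illustration.
d <- data.frame(
  homeworld = c("Tatooine", "Tatooine", "Naboo"),
  species   = c("Human", "Droid", "Droid"),
  mass      = c(77, 75, 32),
  stringsAsFactors = FALSE
)

s <- d %>%
  group_by(homeworld) %>%
  summarize(mean_mass = mean(mass))

# Only the grouping column and the named summary survive.
print(colnames(s))
```

If the grouping column had ended up under another name such as “my_var”, a later join keyed on “homeworld” would have nothing to match against.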