
help(let, package='replyr')

A bit more on the let wrapper from our replyr R package.

  library("replyr")
  help(let, package="replyr")

(Edit: this has been updated to the `0.2.0` version of `replyr` which eliminates some of the `()` notation).

let {replyr} R Documentation

Execute expr with name substitutions specified in alias.

Description

let implements a mapping from desired names (names used directly in the expr code) to names used in the data. Mnemonic: "expr code symbols are on the left, external data and function argument names are on the right."

Usage

let(alias, expr)

Arguments

alias

mapping from free names in expr to target names to use.

expr

block to prepare for execution

Details

Code adapted from gtools::strmacro by Gregory R. Warnes (License: GPL-2, this portion also available GPL-2 to respect gtools license). Please see the replyr vignette for some discussion of let and crossing function call boundaries: vignette('replyr','replyr'). Transformation is performed by substitution on the expression parse tree, so be wary of name collisions or aliasing.

Something like let is only useful to get control of a function that is parameterized (in the sense that it takes column names) but non-standard (in that it takes column names from non-standard evaluation argument name capture, and not as simple variables or parameters). So replyr::let is not useful for non-parameterized functions (functions that work only over values, such as base::sum), and not useful for functions that take parameters in a straightforward way (such as base::merge’s "by" argument). dplyr::mutate is an example where we can use a let helper. dplyr::mutate is parameterized (in the sense that it can work over user-supplied columns and expressions), but column names are captured through non-standard evaluation (and it rapidly becomes unwieldy to use complex formulas with the standard evaluation equivalent dplyr::mutate_). alias cannot include the symbol ".".
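
To make the idea of substituting names in an expression concrete, here is a minimal base-R sketch. It is not replyr's implementation: `let_sketch` is a hypothetical helper, assumed only for illustration, that captures `expr` unevaluated, swaps symbols according to `alias` using `substitute()`, and evaluates the result in the caller's environment.

```r
# Hypothetical helper (not part of replyr): rewrite symbols in expr
# according to alias, then evaluate the rewritten expression in the caller.
let_sketch <- function(alias, expr) {
  caller <- parent.frame()                 # where the result is evaluated
  captured <- substitute(expr)             # unevaluated parse tree of expr
  replacements <- lapply(alias, as.name)   # e.g. "rank" becomes the symbol rank
  rewritten <- do.call(substitute, list(captured, replacements))
  eval(rewritten, envir = caller)
}

d <- data.frame(rank = c(1, 2), Species = "setosa")

# GroupColumn is rewritten to Species before evaluation.
let_sketch(list(GroupColumn = "Species"), unique(d$GroupColumn))

# RankColumn is rewritten to rank before evaluation.
let_sketch(list(RankColumn = "rank"), d$RankColumn - 1)

# Limitation of this sketch: substitute() rewrites symbols only, so
# argument names (the left side of something like f(RankColumn = ...))
# are left unchanged.
```

The sketch evaluates in `parent.frame()` to mirror the documented behavior that the result of expr is computed in the calling environment.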

Value

result of expr executed in calling environment

See Also

replyr_mapRestrictCols, letp

Examples


library('dplyr')
d <- data.frame(Sepal_Length=c(5.8,5.7),
                Sepal_Width=c(4.0,4.4),
                Species='setosa',
                rank=c(1,2))

mapping = list(RankColumn='rank',GroupColumn='Species')
let(alias=mapping,
    expr={
       # Notice code here can be written in terms of
       # known or concrete names "RankColumn" and
       # "GroupColumn", but executes as if we
       # had written mapping specified columns
       # "rank" and "Species".

       # restart ranks at zero.
       d %>% mutate(RankColumn=RankColumn-1) -> dres

       # confirm set of groups.
       unique(d$GroupColumn) -> groups
    })
print(groups)
print(length(groups))
print(dres)

# It is also possible to pipe into let-blocks, but it takes some extra
# notation (notice the extra ". %>%" at the beginning and the extra
# "()" at the end, to signal %>% to treat the let-block as a
# function to evaluate).

d %>% let(alias=mapping,
         expr={
           . %>% mutate(RankColumn=RankColumn-1)
         })()

# Or:

d %>% letp(alias=mapping,
         expr={
           . %>% mutate(RankColumn=RankColumn-1)
         })

# Or:

f <- let(mapping,
         . %>% mutate(RankColumn=RankColumn-1)
         )
d %>% f

# Be wary of using any assignment to attempt
# side-effects in these "delayed pipelines", as
# the assignment tends to happen during the
# let dereference and not (as one would hope) during
# the later pipeline application.  Example:

g <- let(alias=mapping,
         expr={
           . %>% mutate(RankColumn=RankColumn-1) -> ZZZ
         })
print(ZZZ)
# Notice ZZZ has captured a copy of the sub-pipeline
# and not waited for application of g.  Applying g
# performs a calculation, but does not overwrite ZZZ.

g(d)
print(ZZZ)
# Notice ZZZ is not a copy of g(d), but instead
# still the pipeline fragment.

# let works by string substitution aligning on
# word boundaries, so it does (unfortunately) also
# re-write strings.
let(list(x='y'),'x')


[Package replyr version 0.2.0 Index]
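
The last example on the help page notes that strings are rewritten too. The base-R sketch below shows why that follows from string substitution on word boundaries. It is an illustration under stated assumptions, not replyr's code: `let_string_sketch` is a hypothetical helper that deparses `expr` to text, applies `gsub()` with `\\b` word-boundary patterns, then parses and evaluates the result in the caller's environment.

```r
# Hypothetical helper (not part of replyr): rewrite the deparsed text of
# expr with word-boundary regular expressions, then parse and evaluate it.
let_string_sketch <- function(alias, expr) {
  caller <- parent.frame()
  body_text <- paste(deparse(substitute(expr)), collapse = "\n")
  for (from in names(alias)) {
    pattern <- paste0("\\b", from, "\\b")   # match whole words only
    body_text <- gsub(pattern, alias[[from]], body_text)
  }
  eval(parse(text = body_text), envir = caller)
}

y <- 10
xx <- 5

# The quoted "x" is a whole word inside the deparsed text, so it is rewritten.
let_string_sketch(list(x = "y"), "x")     # returns "y"

# Word boundaries protect longer names: xx is untouched, x becomes y.
let_string_sketch(list(x = "y"), xx + x)  # returns 15
```

Because the substitution operates on text, it cannot tell a variable name from the contents of a string literal, which is the behavior the help page warns about.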

Categories: Coding


jmount

Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.

3 replies

  1. Are the `_` functions from dplyr (like mutate_, filter_) really not enough, so that we need a whole new package?

    What are the `()` and `()()` after the `let` call in the examples for? It looks hilarious.

    1. (Edit: this comment and `replyr` were both revised after feedback from Marcin Kosiński (and others) to work with `replyr` version 0.2.0, which eliminates one of the `()` notations that many people had a problem with. I used to think I had a good reason for it, but now, with less wrapping, I am happy to see it gone. I need to practice being thankful for feedback and learning as much from it as fast as practical.)

      Having had to use mutate_ a bit for various projects, I’ve found mutate_ to be not at all convenient. The required stats::setNames or lazyeval::interp forms are hard to read (let alone write or remember). As they stand right now, I think they are not enough. Some changes are coming (please see here for an April 2016 note on lazyeval capabilities, but also note the comment at the end: “Currently neither ggplot2 nor dplyr actually use these tools since I’ve only just figured it out. But I’ll be working hard to make sure all my packages are consistent in the near future.”).

      If replyr::let isn’t to your tastes, then it isn’t something you should try. As far as needing one more package, replyr adds some useful functionality (let, gapply, and other functions) and brings in a moderate number of dependencies.

      If you are interested, consider the simple problem of trying to create a column which indicates which rows of another column are NA when both the column to be tested and where to land the result are not known until later (i.e. we have to take the column names from variables).

      # set up problem:
      # for a data.frame build a column
      # indicating where another column
      # is NA. both variable names provided
      # outside the code in variables.
      library("dplyr")
      d <- data.frame(x = c(1, NA, 3))
      cname <- "x"      # column we are examining
      rname <- "x_isNA" # where to land results
      
      # dplyr mutate_ stats::setNames solution
      d %>% mutate_(.dots =
          stats::setNames(paste0('is.na(', cname, ')'),
                          rname))
      
      # dplyr mutate_ lazyeval::interp solution
      d %>% mutate_(RCOL = 
          lazyeval::interp("is.na(cname)",
                           cname = as.name(cname))) %>%
        rename_(.dots = setNames('RCOL', rname))
      
      # replyr::let solution
      replyr::let(list(cname = "x", rname = "x_isNA"),
                  d %>% mutate(rname = is.na(cname))
                  )
      

      Of the three solutions, I dislike my own replyr::let solution the least.

      I’ve expanded the above into a vignette.

  2. Experimenting with operator notation versions of let (in the GitHub version of the package):

    # %:% pronounced "let in" is a
    # mapping on left operator alias for let.
    mapping %:% { d %>%  mutate(RankColumn=RankColumn-1) }
    
    # %//% pronounced "eval over" or "eval where" is a
    # mapping on right operator alias for let.
    { d %>%  mutate(RankColumn=RankColumn-1) } %//% mapping
    