Is dplyr Easily Comprehensible?

dplyr is one of the most popular R packages. It is powerful and important. But is it in fact easily comprehensible?

dplyr makes sense to those of us who use it a lot. And we can teach part-time R users many of the common good usage patterns.

But is it an easy task to study and characterize dplyr itself?

Please take our advanced dplyr quiz to test your dplyr mettle.

“Pop dplyr quiz, hot-shot! There is data in a pipe. What does each verb do?”

Categories: Opinion

jmount

Data Scientist and trainer at Win-Vector LLC. One of the authors of Practical Data Science with R.

2 replies

  1. As far as I can tell, the “starwars” example in the quiz has been wrong since it was introduced in the dplyr 0.7.0 announcement of 2017-06-13. The workaround, I think, is to add extra “!!”s to the code (meaning that perhaps dplyr is fine and the example is what was wrong). But frankly, that is just me trying things until something works the way the documentation claims it should.

    Below is the issue: dplyr “pronoun mode” renames columns, which means that if you try to use those columns you run into trouble (unless you add the extra “!!”s, which don’t seem to be part of the example or the pronoun documentation).

    suppressPackageStartupMessages(library("dplyr"))
    
    # explicit code, works
    starwars %>%
      select(homeworld) %>%
      group_by(homeworld) %>%
      select(homeworld) %>%
      summarize(count = n()) %>%
      head()
    #> # A tibble: 6 x 2
    #>        homeworld count
    #>            <chr> <int>
    #> 1       Alderaan     3
    #> 2    Aleen Minor     1
    #> 3         Bespin     1
    #> 4     Bestine IV     1
    #> 5 Cato Neimoidia     1
    #> 6          Cerea     1
    
    grouping_column <- "homeworld"
    
    # augmented (extra "!!"s) pronoun code
    # seems to work.
    starwars %>%
      select(.data[[!!grouping_column]]) %>%
      group_by(.data[[!!grouping_column]]) %>%
      select(.data[[!!grouping_column]]) %>%
      summarize(count = n()) %>%
      head()
    #> # A tibble: 6 x 2
    #>        homeworld count
    #>            <chr> <int>
    #> 1       Alderaan     3
    #> 2    Aleen Minor     1
    #> 3         Bespin     1
    #> 4     Bestine IV     1
    #> 5 Cato Neimoidia     1
    #> 6          Cerea     1
    
    # "pronoun code" as described in announcement,
    # does not work.
    starwars %>%
      select(.data[[grouping_column]]) %>%
      group_by(.data[[grouping_column]]) %>%
      select(.data[[grouping_column]]) %>%
      summarize(count = n()) %>%
      head()
    #> Error: Must subset with a string
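
    For comparison, here is a minimal sketch of another way to pass the column name as a string, assuming rlang's sym() is available (rlang is installed as a dplyr dependency). This is not the approach from the announcement. The names grouping_sym and counts are introduced here only for illustration, and behavior may differ across dplyr versions.

    ```r
    suppressPackageStartupMessages(library("dplyr"))

    grouping_column <- "homeworld"

    # Convert the string into a symbol once (hypothetical helper name),
    # then unquote it with "!!" wherever a bare column name is expected.
    grouping_sym <- rlang::sym(grouping_column)

    counts <- starwars %>%
      select(!!grouping_sym) %>%
      group_by(!!grouping_sym) %>%
      summarize(count = n())

    head(counts)
    ```

    The design choice here is to do the string-to-symbol conversion in one place, so each verb sees what looks like an ordinary column reference rather than a .data[[...]] expression.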
    