In a lot of our R writing we casually say “install from CRAN using install.packages("PKGNAME")” or “update your packages by using update.packages(ask = FALSE, checkBuilt = TRUE) (and answering ‘no’ to all questions about compiling).” We recently became aware that for some users this isn’t complete advice.
Estimated reading time: 4 minutes
Here is a simple modeling problem in R. We want to fit a linear model where the names of the data columns carrying the outcome to predict (y), the explanatory variables (x1, x2), and per-example row weights (wt) are given to us as string values in variables.
Estimated reading time: 7 minutes
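For the linear-model problem above, here is one possible base-R sketch. It is not necessarily the approach the full post takes. The function name fit_weighted_lm and the example data are hypothetical, and we assume the column names are syntactic R names. The idea is to build the formula with reformulate() and to splice the weights column into the lm() call as a symbol using bquote(), so that lm() looks it up in the data frame.

```r
# Hypothetical helper: fit a weighted linear model when all column
# names arrive as strings. Assumes syntactic column names.
fit_weighted_lm <- function(d, y_name, x_names, wt_name) {
  # Build the formula y ~ x1 + x2 from the strings.
  f <- reformulate(x_names, response = y_name)
  # Splice the formula and the weights column (as a symbol) into the
  # call, then evaluate it; lm() finds the weights column inside d.
  eval(bquote(lm(.(f), data = d, weights = .(as.name(wt_name)))))
}

# Made-up example data with an exact linear relationship.
d <- data.frame(
  x1 = c(1, 2, 3, 4, 5),
  x2 = c(2, 1, 4, 3, 6),
  wt = c(1, 2, 1, 2, 1)
)
d$y <- 1 + 2 * d$x1 + 3 * d$x2

model <- fit_weighted_lm(d, y_name = "y", x_names = c("x1", "x2"), wt_name = "wt")
print(coef(model))
```

Splicing the column name in as a symbol avoids depending on where lm() searches for a separately named weights vector, which is one of the places this kind of code tends to go wrong.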
Let’s try some "ugly corner cases" for data manipulation in R. Corner cases are examples where the user might be running to the edge of where the package developer intended their package to work, and thus often where things can go wrong. Let’s see what happens when we try to […]
Estimated reading time: 8 minutes
Our group has done a lot of work with non-standard calling conventions in R. Our tools work hard to eliminate non-standard calling (as is the purpose of wrapr::let()), or at least make it cleaner and more controllable (as is done in the wrapr dot pipe). And even so, we still […]
Estimated reading time: 4 minutes
R Tip: be wary of “…”. The following code example contains an easy error in using the R function unique().
vec1 <- c("a", "b", "c")
vec2 <- c("c", "d")
unique(vec1, vec2)
# [1] "a" "b" "c"
Notice none of the novel values from vec2 are present in the result. Our […]
Estimated reading time: 8 minutes
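For the unique() example above, a minimal sketch of what the caller probably intended is to combine the vectors first and pass a single argument. The excerpt is truncated, so this may not be the exact remedy the full post recommends. As we understand the base R signature, a second positional argument to unique() is matched to the incomparables parameter rather than treated as more data.

```r
vec1 <- c("a", "b", "c")
vec2 <- c("c", "d")

# Combine explicitly, then de-duplicate the single combined vector.
combined <- unique(c(vec1, vec2))
print(combined)
# [1] "a" "b" "c" "d"
```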
R has a lot of under-appreciated, super-powerful functions. I list a few of our favorites below. Atlas, carrying the sky. Royal Palace (Paleis op de Dam), Amsterdam. Photo: Dominik Bartsch, CC some rights reserved.
Estimated reading time: 1 minute
R tip: use stringsAsFactors = FALSE. R often uses a concept of factors to re-encode strings. This can be too early and too aggressive. Sometimes a string is just a string. It is often claimed Sigmund Freud said “Sometimes a cigar is just a cigar.”
Estimated reading time: 1 minute
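A short sketch of the stringsAsFactors tip above, using made-up data. Note that since R 4.0.0 the default for data.frame() is already stringsAsFactors = FALSE, so setting it explicitly matters most for older R versions and for code that must behave the same across versions.

```r
# Same data, with and without conversion of strings to factors.
d_chr <- data.frame(x = c("10", "20"), stringsAsFactors = FALSE)
d_fct <- data.frame(x = c("10", "20"), stringsAsFactors = TRUE)

class(d_chr$x)  # "character"
class(d_fct$x)  # "factor"

# A classic hazard: converting a factor to numeric yields the
# internal level codes, not the values the strings spell out.
codes  <- as.numeric(d_fct$x)  # 1 2
values <- as.numeric(d_chr$x)  # 10 20
```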
Take care if trying the new RPostgres database connection package. By default it returns some non-standard types that code developed against other database drivers may not expect, and may not be ready to defend against. Danger, Will Robinson!
Estimated reading time: 2 minutes
Some days I see R as an eclectic programming language preferred by scientists. “Programming languages as people.” From Leftover Salad (David Marino). Other days I see it more like the following.
Estimated reading time: 38 seconds
Here is an R tip. Use loop indices to avoid for()-loops damaging classes. Below is an R annoyance that occurs again and again: vectors lose class attributes when you iterate over them in a for()-loop.
d <- c(Sys.time(), Sys.time())
print(d)
#> [1] "2018-02-18 10:16:16 PST" "2018-02-18 10:16:16 PST"
for(di in […]
Estimated reading time: 2 minutes
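A minimal sketch of the loop-index tip above, using fixed example times (our own, chosen so the result is reproducible). It records the class seen inside each style of loop. Iterating over the vector directly hands the loop body bare numbers, while iterating over indices and extracting with [[ ]] keeps the POSIXct class.

```r
d <- as.POSIXct(c("2018-02-18 10:16:16", "2018-02-18 10:16:17"), tz = "UTC")

# Iterating over values: the class attribute is dropped.
classes_direct <- character(0)
for (di in d) {
  classes_direct <- c(classes_direct, class(di)[1])
}

# Iterating over indices: each extracted element keeps its class.
classes_indexed <- character(0)
for (i in seq_along(d)) {
  di <- d[[i]]
  classes_indexed <- c(classes_indexed, class(di)[1])
}

print(classes_direct)   # "numeric" "numeric"
print(classes_indexed)  # "POSIXct" "POSIXct"
```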