Another R tip. Use vector(mode = "list") to pre-allocate lists.
result <- vector(mode = "list", 3)
print(result)
#> [[1]]
#> NULL
#>
#> [[2]]
#> NULL
#>
#> [[3]]
#> NULL
The above used to be critical for writing performant R code (R seems to have greatly improved incremental list growth over the years). It remains a convenient thing to know.
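As a rough sketch of the two styles contrasted above, the following builds the same list once by growing it one element at a time and once by filling a pre-allocated list. The names n, grown, and prealloc are illustrative and not from the original tip, and no timings are claimed, since those depend on the R version.

```r
n <- 5  # illustrative size

# Style 1: start empty and grow the list one element at a time
grown <- list()
for (i in seq_len(n)) {
  grown[[length(grown) + 1]] <- i * i
}

# Style 2: pre-allocate all n slots, then fill each slot in place
prealloc <- vector(mode = "list", length = n)
for (i in seq_len(n)) {
  prealloc[[i]] <- i * i
}

# Both styles end with the same contents
print(identical(grown, prealloc))
#> [1] TRUE
```

The pre-allocated version also states the intended length up front, which can make the loop easier to read.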
Pre-allocation is particularly useful when using for-loops.
for(i in seq_along(result)) {
  result[[i]] <- i
}
print(result)
# [[1]]
# [1] 1
#
# [[2]]
# [1] 2
#
# [[3]]
# [1] 3
seq_along() is a handy function similar to what we discussed in R Tip: Use seq_len() to Avoid The Backwards List Trap. For “[[ ]]” please see R Tip: Use [[ ]] Wherever You Can.
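To illustrate why seq_along() is the safer way to index a loop, here is a minimal sketch comparing it with 1:length() on a zero-length list. The variable names empty and visits are made up for this example.

```r
empty <- vector(mode = "list", 0)  # a zero-length list

# 1:length(x) counts backwards when the list is empty
print(1:length(empty))
#> [1] 1 0

# seq_along(x) gives an empty index, so a loop body never runs
print(seq_along(empty))
#> integer(0)

visits <- 0
for (i in seq_along(empty)) {
  visits <- visits + 1
}
print(visits)
#> [1] 0
```

Had the loop been written over 1:length(empty), its body would have run twice, with i equal to 1 and then 0.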
Note: for-loops are not in fact always a bad thing (even in R). for-loops can be easier to debug, are the right solution when you are carrying state from iteration to iteration, and with proper pre-allocation can be as performant as map/apply methods. Mostly one should not use them where better vectorized operations can be used. For example: in R it is usually wrong to try to iterate over rows in a data.frame, as usually there are vectorized operators that can more efficiently express the same process in terms of column operations.
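A small sketch of both points, using a made-up data.frame d: the first half shows a row-by-row loop next to the equivalent column operation, and the second half shows a loop that carries state between iterations, where each value depends on the previous one. The recurrence and the starting value of 100 are hypothetical, chosen only for illustration.

```r
d <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# Row-by-row version: works, but does one small operation per row
row_sum <- numeric(nrow(d))  # pre-allocated result vector
for (i in seq_len(nrow(d))) {
  row_sum[[i]] <- d$x[[i]] + d$y[[i]]
}

# Vectorized version: one operation over whole columns
col_sum <- d$x + d$y

print(identical(row_sum, col_sum))
#> [1] TRUE

# A loop that carries state: each step depends on the previous result
balance <- numeric(nrow(d))  # pre-allocated result vector
current <- 100               # hypothetical starting value
for (i in seq_len(nrow(d))) {
  current <- current * 1.5 - d$x[[i]]
  balance[[i]] <- current
}
print(balance)
```

The column version is the one to prefer for the sum. The second loop is the kind of case where a for-loop over a pre-allocated vector is a reasonable choice.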
jmount
Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.