Suppose we have data in two data frames, and both of these data frames have a common row-identifying column called “idx”.
    library("wrapr")

    d1 <- build_frame(
      "idx", "x" |
      3    , "a" |
      1    , "b" |
      2    , "c" )

    d2 <- build_frame(
      "idx", "y" |
      2    , "D" |
      1    , "E" |
      3    , "F" )

    print(d1)
    #>   idx x
    #> 1   3 a
    #> 2   1 b
    #> 3   2 c

    print(d2)
    #>   idx y
    #> 1   2 D
    #> 2   1 E
    #> 3   3 F
(Please see R Tip: Think in Terms of Values for build_frame() and other value-capturing tools.)
Often we wish to work with such data aligned so that each row of d2 has the same idx value as the corresponding row (by row order) of d1. This is an important data wrangling task, so there are many ways to achieve it in R, such as dplyr::left_join(), or sorting both tables into the same order and then combining them column by column.
However, if you wish to preserve the order of the first table (which may not be sorted), you need one more trick.
You can add a row-id column, sort by the joining id, combine and then re-sort by the row-id column.
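The row-id trick just described can be sketched in base R as follows. This is only an illustration: the data frames are re-created with data.frame() so the block runs without wrapr, the helper column name row_id is made up for the example, and the sketch assumes both tables carry exactly the same set of unique idx values.

```r
# Example data, re-created in base R so the sketch is self-contained.
d1 <- data.frame(idx = c(3, 1, 2), x = c("a", "b", "c"), stringsAsFactors = FALSE)
d2 <- data.frame(idx = c(2, 1, 3), y = c("D", "E", "F"), stringsAsFactors = FALSE)

# 1. Record the original row order of d1 (row_id is a name chosen for this sketch).
d1$row_id <- seq_len(nrow(d1))

# 2. Sort both tables by the joining id.
d1s <- d1[order(d1$idx), , drop = FALSE]
d2s <- d2[order(d2$idx), , drop = FALSE]

# 3. Combine column-wise, after checking that the sorted ids really line up.
stopifnot(all(d1s$idx == d2s$idx))
combined <- cbind(d1s, y = d2s$y)

# 4. Re-sort by the row-id column to restore d1's order, then drop the helper.
combined <- combined[order(combined$row_id), , drop = FALSE]
combined$row_id <- NULL
rownames(combined) <- NULL

print(combined)
#>   idx x y
#> 1   3 a F
#> 2   1 b E
#> 3   2 c D
```

The result keeps the row order of d1, which a plain sort-then-combine would have lost.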
Or you can match the orders in one step using wrapr's match_order():
    p <- match_order(d2$idx, d1$idx)
    print(d2[p, , drop = FALSE])
    #>   idx y
    #> 3   3 F
    #> 2   1 E
    #> 1   2 D
match_order() merely wraps the sort and re-sort tricks we mentioned above; the theory, however, is based on the absolute magic of associative array indexing.
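To make the indexing idea concrete, here is a minimal base R sketch of how such a permutation can be built from two calls to order(). It is not claimed to be how wrapr implements match_order(); the variable names are invented for the example, and it assumes the two id vectors contain the same unique values.

```r
# Example data, re-created in base R so the sketch is self-contained.
d1 <- data.frame(idx = c(3, 1, 2), x = c("a", "b", "c"), stringsAsFactors = FALSE)
d2 <- data.frame(idx = c(2, 1, 3), y = c("D", "E", "F"), stringsAsFactors = FALSE)

# p2 lists the rows of d2 in sorted-idx order.
p2 <- order(d2$idx)

# rank1[i] is the position of d1$idx[i] in sorted order.
# Indexed assignment inverts the permutation order(d1$idx).
p1 <- order(d1$idx)
rank1 <- integer(length(p1))
rank1[p1] <- seq_along(p1)

# Compose: entry i picks the d2 row holding the same sorted position as d1's row i.
p <- p2[rank1]

# With unique ids, base R's match() gives the same permutation directly.
stopifnot(identical(p, match(d1$idx, d2$idx)))

# Every row of the re-ordered d2 now agrees with d1 on idx.
stopifnot(all(d2$idx[p] == d1$idx))
print(d2[p, , drop = FALSE])
```

The indexed assignment rank1[p1] <- seq_along(p1) is the step that does the "re-sort" without any extra row-id column.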
Please see R Tip: Use drop = FALSE with data.frames for why one should get in the habit of writing drop = FALSE.
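As a small illustration of the habit (using a made-up single-column data frame, not one from the example above): when only one column is involved, row indexing without drop = FALSE silently returns a plain vector instead of a data.frame.

```r
# A single-column data frame and a row permutation.
d <- data.frame(y = c("D", "E", "F"), stringsAsFactors = FALSE)
p <- c(3, 2, 1)

v <- d[p, ]                 # the one remaining column is dropped to a vector
f <- d[p, , drop = FALSE]   # the result stays a data.frame

is.data.frame(v)
#> [1] FALSE
is.data.frame(f)
#> [1] TRUE
```

Code that later expects columns (for example f$y) keeps working only in the second form.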
Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.