This is a joyn wrapper that works similarly to base::merge and data.table::merge, which is why joyn::merge masks the other two.
Arguments
- x, y
data tables. y is coerced to a data.table if it isn't one already.
- by
A vector of shared column names in x and y to merge on. This defaults to the shared key columns between the two tables. If y has no key columns, this defaults to the key of x.
- by.x, by.y
Vectors of column names in x and y to merge on.
- all
logical; all = TRUE is shorthand to save setting both all.x = TRUE and all.y = TRUE.
- all.x
logical; if TRUE, rows from x which have no matching row in y are included. These rows will have 'NA's in the columns that are usually filled with values from y. The default is FALSE so that only rows with data from both x and y are included in the output.
- all.y
logical; analogous to all.x above.
- sort
logical. If TRUE (default), the rows of the merged data.table are sorted by setting the key to the by / by.x columns. If FALSE, unlike base R's merge for which row order is unspecified, the row order in x is retained (including retaining the position of missing entries when all.x=TRUE), followed by y rows that don't match x (when all.y=TRUE) retaining the order those appear in y.
- suffixes
A character(2) specifying the suffixes to be used for making non-by column names unique. The suffix behaviour works in a similar fashion as the merge.data.frame method does.
- no.dups
logical indicating that suffixes are also appended to non-by.y column names in y when they have the same column name as any by.x.
- allow.cartesian
See allow.cartesian in [.data.table.
- match_type
character: one of "m:m", "m:1", "1:m", "1:1". Default is "1:1" since this is the most restrictive. However, following Stata's recommendation, it is better to be explicit and use any of the other three match types (see the match types section for details).
- keep_common_vars
logical: If TRUE, it will keep the original variable from y when both tables have common variable names. The prefix "y." will then be added to the original name to distinguish it from the resulting variable in the joined table.
- ...
Arguments passed on to joyn
  - y_vars_to_keep
  character: Vector of variable names in y that will be kept after the merge. If TRUE (the default), it brings all the variables in y into x. If FALSE or NULL, it does not bring any variable into x, but a report will be generated.
  - reportvar
  character: Name of reporting variable. Default is ".joyn". This is the same as variable "_merge" in Stata after performing a merge. If FALSE or NULL, the reporting variable will be excluded from the final table, though a summary of the join will be displayed after concluding.
  - update_NAs
  logical: If TRUE, it will update NA values of all variables in x with actual values of variables in y that have the same name as the ones in x. If FALSE, NA values won't be updated, even if update_values is TRUE.
  - update_values
  logical: If TRUE, it will update all values of variables in x with the actual values of variables in y with the same name as the ones in x. NAs from y won't be used to update actual values in x. Yet, by default, NAs in x will be updated with values in y. To avoid this, make sure to set update_NAs = FALSE.
  - verbose
  logical: if FALSE, it won't display any message (programmer's option). Default is TRUE.
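The all, all.x and all.y arguments described above follow the usual merge semantics. The sketch below is a minimal illustration using base::merge only (an assumption made so it runs without joyn installed); the data frames x and y are made up for this sketch, and joyn::merge would additionally print its report and add the .joyn reporting variable.

```r
# Illustration with base R's merge(); toy data invented for this sketch.
x <- data.frame(id = c(1L, 2L, 3L), x = c(11L, 12L, 13L))
y <- data.frame(id = c(1L, 2L, 4L), y = c(21L, 22L, 24L))

# Default (all.x = FALSE, all.y = FALSE): only ids present in both tables.
inner <- merge(x, y, by = "id")

# all.x = TRUE: every row of x is kept; id 3 has no match, so its y is NA.
left <- merge(x, y, by = "id", all.x = TRUE)

# all = TRUE: shorthand for all.x = TRUE and all.y = TRUE.
full <- merge(x, y, by = "id", all = TRUE)

print(inner)
print(left)
print(full)
```

Here inner keeps ids 1 and 2, left adds id 3 with a missing y, and full also adds id 4 with a missing x.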
Examples
x1 = data.frame(id = c(1L, 1L, 2L, 3L, NA_integer_),
                t  = c(1L, 2L, 1L, 2L, NA_integer_),
                x  = 11:15)
y1 = data.frame(id = c(1,2, 4),
                y  = c(11L, 15L, 16))
joyn::merge(x1, y1, by = "id")
#>
#> ── JOYn Report ──
#>
#> .joyn n percent
#> 1 x 2 66.7%
#> 2 y 1 33.3%
#> 3 total 3 100%
#> ────────────────────────────────────────────────────────── End of JOYn report ──
#> ℹ Note: Joyn's report available in variable .joyn
#> ℹ Note: Removing key variables id from id and y
#> ⚠ Warning: Supplied both by and by.x/by.y. by argument will be ignored.
#> ⚠ Warning: The keys supplied uniquely identify y, therefore a m:1 join is
#> executed
#> id t x y .joyn
#> 1 1 1 11 11 x & y
#> 2 1 2 12 11 x & y
#> 3 2 1 13 15 x & y
# example of using by.x and by.y
x2 = data.frame(id1 = c(1, 1, 2, 3, 3),
                id2 = c(1, 1, 2, 3, 4),
                t   = c(1L, 2L, 1L, 2L, NA_integer_),
                x   = c(16, 12, NA, NA, 15))
y2 = data.frame(id  = c(1, 2, 5, 6, 3),
                id2 = c(1, 1, 2, 3, 4),
                y   = c(11L, 15L, 20L, 13L, 10L),
                x   = c(16:20))
jn <- joyn::merge(x2,
                  y2,
                  match_type = "m:m",
                  all.x = TRUE,
                  by.x = "id1",
                  by.y = "id2")
#>
#> ── JOYn Report ──
#>
#> .joyn n percent
#> 1 y 1 14.3%
#> 2 x & y 6 85.7%
#> 3 total 7 100%
#> ────────────────────────────────────────────────────────── End of JOYn report ──
#> ℹ Note: Joyn's report available in variable .joyn
#> ℹ Note: Removing key variables keyby1 from id, keyby1, y, and x
#> ⚠ Warning: Supplied both by and by.x/by.y. by argument will be ignored.
# example with all = TRUE
jn <- joyn::merge(x2,
                  y2,
                  match_type = "m:m",
                  by.x = "id1",
                  by.y = "id2",
                  all = TRUE)
#>
#> ── JOYn Report ──
#>
#> .joyn n percent
#> 1 y 1 12.5%
#> 2 x & y 7 87.5%
#> 3 total 8 100%
#> ────────────────────────────────────────────────────────── End of JOYn report ──
#> ℹ Note: Joyn's report available in variable .joyn
#> ℹ Note: Removing key variables keyby1 from id, keyby1, y, and x
#> ⚠ Warning: Supplied both by and by.x/by.y. by argument will be ignored.
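The first example warns that the keys uniquely identify y, so an m:1 join is executed. One way to reason about which match_type applies is to check whether the by columns identify rows uniquely in each table. The sketch below does this in base R; is_unique_key and guess_match_type are hypothetical helpers written for this illustration and are not part of joyn, whose internal checks may differ.

```r
# Hypothetical helper (not part of joyn): TRUE when the `by` columns
# uniquely identify the rows of `df`.
is_unique_key <- function(df, by) {
  anyDuplicated(df[, by, drop = FALSE]) == 0L
}

# Hypothetical helper (not part of joyn): label each side "1" when its
# keys are unique and "m" otherwise, then combine as "<x side>:<y side>".
guess_match_type <- function(x, y, by) {
  left  <- if (is_unique_key(x, by)) "1" else "m"
  right <- if (is_unique_key(y, by)) "1" else "m"
  paste(left, right, sep = ":")
}

# Same data as the first example: id 1 appears twice in x1, once in y1.
x1 <- data.frame(id = c(1L, 1L, 2L, 3L, NA_integer_),
                 t  = c(1L, 2L, 1L, 2L, NA_integer_),
                 x  = 11:15)
y1 <- data.frame(id = c(1, 2, 4),
                 y  = c(11L, 15L, 16))

guess_match_type(x1, y1, "id")  # "m:1"
```

Passing the resulting type explicitly as match_type follows the recommendation in the argument description to state the expected relationship rather than rely on the default.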