data.table, apply function, and return results in a data.table.
For each subset of a data.table, apply a function, then combine the results into a data.table.
dt_ddply(
.data,
.variables,
.f = NULL,
...,
.progress = "none",
.drop = TRUE,
.parallel = FALSE
)
dt_ldply(
.data,
.f = NULL,
...,
.progress = "none",
.parallel = FALSE,
.id = NA
)
dt_dlply(
.data,
.variables,
.f = NULL,
...,
.progress = "none",
.drop = TRUE,
.parallel = FALSE
)
.data: data frame to be processed
.variables: variables to split the data frame by, as as.quoted variables, a formula or character vector
.f: A function, specified in one of the following ways:
  A named function, e.g. mean.
  An anonymous function, e.g. \(x) x + 1 or function(x) x + 1.
  A formula, e.g. ~ .x + 1. You must use .x to refer to the first argument. No longer recommended.
  A string, integer, or list, e.g. "idx", 1, or list("idx", 1), which are shorthand for \(x) pluck(x, "idx"), \(x) pluck(x, 1), and \(x) pluck(x, "idx", 1) respectively. Optionally supply .default to set a default value if the indexed element is NULL or does not exist.
  Wrap a function with in_parallel() to declare that it should be performed in parallel. See in_parallel() for more details. Use of ... is not permitted in this context.
...: other arguments passed on to .f
.progress: name of the progress bar to use, see create_progress_bar
.drop: should combinations of variables that do not appear in the input data be preserved (FALSE) or dropped (TRUE, default)
.parallel: if TRUE, apply the function in parallel, using the parallel backend provided by foreach
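The forms accepted by .f read like purrr's function-conversion rules. Assuming that is the mechanism here (this page does not say so), purrr::as_mapper() offers a rough way to see what each shorthand expands to. The sample inputs below are made up for illustration.

```r
library(purrr)

# Assumption: .f is converted the way purrr::as_mapper() converts its input.
add_one_lambda  <- as_mapper(\(x) x + 1)          # anonymous function (R >= 4.1 syntax)
add_one_formula <- as_mapper(~ .x + 1)            # formula; .x is the first argument
get_idx         <- as_mapper("idx")               # like \(x) pluck(x, "idx")
get_first       <- as_mapper(1)                   # like \(x) pluck(x, 1)
get_nested      <- as_mapper(list("idx", 1))      # like \(x) pluck(x, "idx", 1)
get_idx_or_na   <- as_mapper("idx", .default = NA) # fallback when "idx" is missing

# Hypothetical input used only to exercise the extractors.
item <- list(idx = c(5, 6), other = "a")

lambda_result  <- add_one_lambda(1)
formula_result <- add_one_formula(1)
idx_result     <- get_idx(item)
first_result   <- get_first(item)
nested_result  <- get_nested(item)
missing_result <- get_idx_or_na(list(other = "a"))
```

The function and formula forms give the same result for the same input, and the string, integer, and list forms are extractors rather than computations.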
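library(data.table)  # provides data.table()
dt <- data.table(x = 1:10, y = 1:5)  # y is recycled, so each y value pairs with two x values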
dt_dlply(dt, .(y), ~.[which.max(x)])
#> $`1`
#> [data.table]:
#> # A data frame: 1 × 2
#> x y
#> <int> <int>
#> 1 6 1
#>
#> $`2`
#> [data.table]:
#> # A data frame: 1 × 2
#> x y
#> <int> <int>
#> 1 7 2
#>
#> $`3`
#> [data.table]:
#> # A data frame: 1 × 2
#> x y
#> <int> <int>
#> 1 8 3
#>
#> $`4`
#> [data.table]:
#> # A data frame: 1 × 2
#> x y
#> <int> <int>
#> 1 9 4
#>
#> $`5`
#> [data.table]:
#> # A data frame: 1 × 2
#> x y
#> <int> <int>
#> 1 10 5
#>
dt_ddply(dt, .(y), ~ top_n(., 1, x))
#> [data.table]:
#> # A data frame: 5 × 2
#> x y
#> <int> <int>
#> 1 6 1
#> 2 7 2
#> 3 8 3
#> 4 9 4
#> 5 10 5
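For comparison, the same split, apply, and combine steps can be sketched with data.table alone. This is not how dt_ddply or dt_dlply are implemented. It only reproduces the results shown in the examples above, and the intermediate names (pieces, applied, combined, grouped) are illustrative.

```r
library(data.table)

dt <- data.table(x = 1:10, y = 1:5)

# Split: one data.table per distinct value of y, returned as a named list.
pieces <- split(dt, by = "y")

# Apply: keep the row with the largest x in each piece
# (the list counterpart of the dt_dlply() example).
applied <- lapply(pieces, function(d) d[which.max(x)])

# Combine: row-bind the pieces back into a single data.table
# (the counterpart of the dt_ddply() example).
combined <- rbindlist(applied)

# The same result in data.table's grouped form; the grouping column comes first.
grouped <- dt[, .SD[which.max(x)], by = y]
```

The explicit three-step version makes each stage visible, while the grouped form is the more idiomatic data.table spelling when the list of pieces is not needed.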