Conditional lapply or bplapply with optional batch processing.
Usage
loop(
  x,
  FUN,
  parallelize = TRUE,
  use_batch = as.logical(Sys.getenv("GDR_USE_BATCH", "FALSE")),
  temp_dir = Sys.getenv("GDR_TEMP_DIR", tempdir()),
  batch_size = as.numeric(Sys.getenv("GDR_BATCH_SIZE", 100)),
  ...
)
Arguments
- x
Vector (atomic or list) or an expression object. Other objects (including classed objects) will be coerced by as.list.
- FUN
A user-defined function to apply to each element of x.
- parallelize
Logical indicating whether to parallelize the computation. Defaults to TRUE.
- use_batch
Logical indicating whether to use batch processing to save intermediate results. Defaults to the value of the GDR_USE_BATCH environment variable, or FALSE if it is unset.
- temp_dir
Character string specifying the directory where batch results are saved. Defaults to the value of the GDR_TEMP_DIR environment variable, or tempdir() if it is unset.
- batch_size
Integer specifying the number of elements to process in each batch during batch mode. Defaults to the value of the GDR_BATCH_SIZE environment variable, or 100 if it is unset.
- ...
Optional arguments passed to bplapply if parallelize == TRUE, else to lapply.
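The defaults for use_batch and batch_size in the usage signature are read from environment variables, so they can be changed for a whole session without editing each call. The following is a minimal sketch using only base R. It evaluates the same default expressions shown under Usage, and the values set here are arbitrary illustrations, not recommendations.

```r
# Set the environment variables for the current R session
# (the values are arbitrary and chosen for illustration only).
Sys.setenv(GDR_USE_BATCH = "TRUE", GDR_BATCH_SIZE = "25")

# These are the default expressions copied from the usage signature.
use_batch  <- as.logical(Sys.getenv("GDR_USE_BATCH", "FALSE"))
batch_size <- as.numeric(Sys.getenv("GDR_BATCH_SIZE", 100))

# Remove the variables again so later calls fall back to FALSE and 100.
Sys.unsetenv(c("GDR_USE_BATCH", "GDR_BATCH_SIZE"))
```

Because the default expressions are evaluated when loop() is called, a variable set with Sys.setenv() before the call should take effect without restarting R. An argument passed explicitly always overrides the environment variable.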
Value
List containing the output of FUN applied to every element in x.
When batch processing is enabled, results are saved incrementally and merged at the end of processing.
Details
The function operates in two modes:
- Regular mode: Directly applies FUN to the elements using lapply or bplapply.
- Batch mode: Saves results in batches to disk, allowing computation to resume from the last saved step. Batch mode is activated by setting use_batch to TRUE.
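The idea behind batch mode can be sketched in base R as follows. This is an illustration only, not the package's implementation: the helper name batch_lapply, the batch_%04d.rds file naming scheme, and the use of saveRDS are all assumptions made for the example, and it runs serially with lapply.

```r
# Hypothetical helper (not part of the package): apply FUN in batches,
# writing each finished batch to disk so that a rerun can skip it.
batch_lapply <- function(x, FUN, temp_dir = tempdir(), batch_size = 100, ...) {
  if (length(x) == 0) return(list())

  # Split element indices into consecutive batches of at most batch_size.
  idx <- seq_along(x)
  batches <- split(idx, ceiling(idx / batch_size))

  # Assumed file naming scheme: one .rds file per batch.
  paths <- file.path(temp_dir, sprintf("batch_%04d.rds", seq_along(batches)))

  for (i in seq_along(batches)) {
    # A batch file left by an earlier, interrupted run is reused as-is.
    if (file.exists(paths[i])) next
    saveRDS(lapply(x[batches[[i]]], FUN, ...), paths[i])
  }

  # Merge the saved batches into one list, in the original order,
  # then remove the intermediate files.
  out <- unlist(lapply(paths, readRDS), recursive = FALSE)
  unlink(paths)
  out
}

# Usage: ten elements in batches of four (three batch files).
work_dir <- tempfile("batches_")
dir.create(work_dir)
res <- batch_lapply(1:10, function(v) v^2, temp_dir = work_dir, batch_size = 4)
```

In this sketch the resume behavior comes entirely from the file.exists() check: if the process stops partway through, the batches already written are read back rather than recomputed on the next call with the same temp_dir.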
Examples
# Regular processing
loop(list(1, 2, 3), function(x) x^2, parallelize = FALSE, use_batch = FALSE)
#> [[1]]
#> [1] 1
#>
#> [[2]]
#> [1] 4
#>
#> [[3]]
#> [1] 9
#>
# Batch processing
loop(1:10, function(x) x^2, parallelize = TRUE, use_batch = TRUE)
#> [[1]]
#> [1] 1
#>
#> [[2]]
#> [1] 4
#>
#> [[3]]
#> [1] 9
#>
#> [[4]]
#> [1] 16
#>
#> [[5]]
#> [1] 25
#>
#> [[6]]
#> [1] 36
#>
#> [[7]]
#> [1] 49
#>
#> [[8]]
#> [1] 64
#>
#> [[9]]
#> [1] 81
#>
#> [[10]]
#> [1] 100
#>