Futureverse

A Unifying Parallelization Framework in R for Everyone

The hexlogo of the future package. A left-facing arrow with the text future underneath - both in bold style filled with yellow-to-orange vertical gradient. The background is dark blue with teeny star-shaped symbols in distance resembling looking deep out in the universe. The hexlogo is surrounded by a light-blue border.

The future framework makes it easy to parallelize existing R code - often with only a minor change of code. The goal is to lower the barriers so that anyone can safely speed up their existing R code in a worry-free manner. As it is a cross-platform solution that requires no additional setups or technical skills, anyone can be up and running within a few minutes.

The future framework removes common hurdles and protects against pitfalls that follow from adding parallelization. Instead of leaving it to the developers and end-users to be aware of and deal with these problems, they are handled at the core of the highly-validated future ecosystem. Just as with sequential R code, output, messages, warnings, and errors work as expected and can be handled using traditional R techniques - regardless how the code is parallelized.

It is designed so that you as a developer can stay with your favorite coding style, may it be base R or tidyverse. If you like base R lapply() there is a corresponding future_lapply() in the future.apply package and if you like tidyverse purrr map() there is a corresponding future_map() in the furrr package. If you prefer foreach() from foreach, then you can do so via doFuture.

Futures makes your web interface asynchronous, e.g. a blocking Shiny application can easily be turned into a non-blocking experience by using futures.

Regardless how you use futures in your code, the user can, with a single setting, switch from running your code sequentially to running it in parallel on their local computer, across multiple machines on their local area network, in the cloud, or distributed on a high-performance compute (HPC) cluster.

For further details and motivations, see Bengtsson (2021).

Here’s the gist of how to use it:

library(future)
plan(multisession)

## Evaluate an R expression sequentially
y <- slow_fcn(X[1])

## Evaluate it in parallel in the background
f <- future(slow_fcn(X[1]))
y <- value(f)

## future.apply: futurized version of base R apply
library(future.apply)
y <-        lapply(X, slow_fcn)
y <- future_lapply(X, slow_fcn)

## furrr: futurized version of purrr
library(furrr)
y <- X |>        map(slow_fcn)
y <- X |> future_map(slow_fcn)

## foreach: futurized version (modern)
library(foreach)
y <- foreach(x = X) %do%       slow_fcn(x)
y <- foreach(x = X) %dofuture% slow_fcn(x)

## foreach: futurized version (traditional)
library(foreach)
doFuture::registerDoFuture()
y <- foreach(x = X) %do%    slow_fcn(x)
y <- foreach(x = X) %dopar% slow_fcn(x)
Bengtsson, Henrik. 2021. A Unifying Framework for Parallel and Distributed Processing in R using Futures.” The R Journal. https://journal.r-project.org/archive/2021/RJ-2021-048/index.html.

References