Quality, Validation & Maintenance

Top priorities: correctness, reproducibility, long-term stability, and trust


Since correctness and reproducibility are essential to all data processing, validation is a top priority and is part of the design and implementation throughout the future ecosystem. Several types of testing are performed.

First, all the essential core packages of the future framework (future, parallelly, globals, and listenv) implement a rich set of package tests. These are validated regularly across the wide range of operating systems (Linux, Solaris, macOS, and MS Windows) and R versions available on CRAN, on continuous integration (CI) services (GitHub Actions, Travis CI, and AppVeyor CI), and on R-hub.

Second, for each new release, these packages undergo full reverse-dependency checks using the revdepcheck package. As of March 2023, the future package is tested against 280 direct reverse dependencies available on CRAN and Bioconductor. These checks are performed on Linux, both with the default settings and when forcing tests to use multisession workers (SOCK clusters), which further validates that globals and packages are identified correctly.
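Why running under multisession workers is a stricter check can be illustrated with a minimal sketch. It assumes only that the future package is installed; the helper `slow_sum()` and the variable `x` are hypothetical names made up for this illustration, and this is not the actual reverse-dependency check setup. With sequential processing the future is resolved in the main R session, whereas with multisession workers it is resolved in a separate R process, so the result can only agree if the globals were identified and exported correctly.

```r
library(future)

# Hypothetical helper and data; both are globals of the future expression
slow_sum <- function(x) sum(x)
x <- 1:10

# Resolve the future in the main R session
plan(sequential)
y_seq <- value(future(slow_sum(x)))

# Resolve the same future in a background R session (SOCK cluster worker);
# 'slow_sum' and 'x' must be identified and exported for this to work
plan(multisession, workers = 2)
y_par <- value(future(slow_sum(x)))

# Shut down the background workers
plan(sequential)

# Both backends are expected to give the same result
stopifnot(identical(y_seq, y_par))
```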

[Screencast: future.tests running in the terminal. It starts with the command-line call, followed by tests being run one by one, filling up the screen. After about 30 tests, the program completes with the results: 0 errors, 0 timeouts.]

Third, a suite of Future API conformance tests available in the future.tests package validates the correctness of all future backends. Any new future backend must pass these tests to comply with the Future API. By conforming to this API, the end-user can trust that the backend will produce the same correct and reproducible results as any other backend, including the ones that the backend developer has tested on. Also, by making it the responsibility of the backend developer to assert that their new future backend conforms to the Future API, we relieve other developers from having to test that their future-based software works on all backends. It would be a daunting task for a developer to validate the correctness of their software with all existing backends. Even if they achieved that, there may be additional third-party future backends that they are not aware of, that they cannot test with, or that have not yet been developed.
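As a rough sketch of how a backend developer might run the conformance tests from within R, the snippet below calls `future.tests::check()` on the built-in multisession backend. It assumes that the future.tests package is installed and that `check()` accepts a `plan` argument and returns an object carrying an `"exit_code"` attribute; consult the package documentation for the exact interface before relying on it. The run is skipped if the package is not available.

```r
# Run the Future API conformance tests against one backend.
# Assumption: future.tests::check() takes 'plan' and its return value
# has an "exit_code" attribute (0 meaning that all tests passed).
if (requireNamespace("future.tests", quietly = TRUE)) {
  results <- future.tests::check(plan = "multisession")
  exit_code <- attr(results, "exit_code")
  if (exit_code != 0) {
    stop("One or more Future API conformance tests failed")
  }
}
```

A developer of a new backend would replace `"multisession"` with the name of their own backend.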

Fourth, since foreach is used by a large number of essential CRAN packages, it provides an excellent opportunity for supplementary validation. Specifically, we dynamically tweak the examples of foreach and of the popular CRAN packages caret, glmnet, NMF, plyr, and TSP to use the doFuture adaptor. This allows us to run these examples with a variety of future backends to validate that they produce no run-time errors, which indirectly validates the backends as well as the Future API. In the past, these types of tests helped to identify and resolve corner cases where automatic identification of global variables would fail. As a side note, several of these foreach-based examples fail when using a parallel foreach adaptor because they do not properly export globals or declare package dependencies. The exceptions are the sequential doSEQ adaptor (the default), fork-based adaptors such as doMC, and the generic doFuture adaptor, which supports any future backend and relies on the future framework for handling globals and packages.
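The idea of redirecting foreach code to a future backend can be sketched as follows. This is a minimal illustration, assuming the foreach, future, and doFuture packages are installed; the variable `a` and the loop body are made up for the example and are not taken from any of the packages mentioned above. Note that the loop declares neither exported variables nor package dependencies; with doFuture registered, that is left to the future framework.

```r
library(foreach)
library(future)
library(doFuture)

# Make %dopar% use whatever future backend is set by plan()
registerDoFuture()
plan(multisession, workers = 2)

# 'a' is a global variable used inside the loop body; it is not
# exported explicitly
a <- 10
y <- foreach(i = 1:3, .combine = c) %dopar% {
  a * i
}

# Shut down the background workers
plan(sequential)

print(y)
```

Switching to another backend only requires changing the `plan()` call; the foreach code itself stays the same.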

Analogously to the reverse-dependency checks performed for each new release, CRAN and Bioconductor continuously run checks on all these direct, and also indirect, reverse dependencies, which further strengthens the validation of the Future API and the future ecosystem at large.

The packages underlying the future ecosystem are well maintained and are continuously updated and improved. For example, the future package is updated approximately six times per year, the future.apply and doFuture packages three times per year, and the infrastructure packages globals and parallelly three to six times per year. The policy is to fix bugs as soon as possible and to avoid breaking updates, in order to maintain full backward compatibility with existing R code in production that relies on futures. In the very rare cases where an existing feature had to be removed, this has been done via a deprecation process taking place over several release cycles, working closely with existing package developers to ensure a smooth transition and with informative warning messages to end-users where needed.