The RcppParallel package provides
high-level functions for parallel programming with Rcpp. For example,
the parallelFor function can be used to convert the work of a standard
serial “for” loop into a parallel one. This article describes using
RcppParallel to transform an R matrix in parallel.

Serial Version

We start with a serial version of the matrix transformation. We take the
square root of each element of a matrix and return a new matrix with the
transformed values. We do this by using std::transform to call the sqrt
function on each element of the matrix:
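A minimal standalone sketch of this serial step follows. It makes some assumptions so that it compiles without R: the `Matrix` struct is a hypothetical stand-in for an R numeric matrix (column-major doubles plus dimensions), and `matrixSqrt` is an illustrative name. In real Rcpp code the same `std::transform` call would run over the iterators of the input and output matrices.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an R numeric matrix: column-major doubles
// plus dimensions. This is an assumption made for a standalone sketch.
struct Matrix {
    std::size_t nrow;
    std::size_t ncol;
    std::vector<double> data;

    Matrix(std::size_t rows, std::size_t cols)
        : nrow(rows), ncol(cols), data(rows * cols, 0.0) {}
};

// Serial transformation: returns a new matrix holding the square root
// of every element of `orig`. The input is left unchanged.
Matrix matrixSqrt(const Matrix& orig) {
    // allocate the matrix we will return
    Matrix out(orig.nrow, orig.ncol);

    // apply sqrt to each element, writing into the new matrix
    std::transform(orig.data.begin(), orig.data.end(), out.data.begin(),
                   [](double x) { return std::sqrt(x); });

    return out;
}
```

Because the elements are stored contiguously, the transformation treats the matrix as one flat sequence and never needs row or column indices.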

Parallel Version

Now we’ll adapt our code to run in parallel using the parallelFor function.
RcppParallel takes care of dividing up work between threads; our job is to
implement a “Worker” function object that is called by the RcppParallel
scheduler.

The SquareRoot function object below includes pointers to the input matrix
as well as the output matrix. Within its operator() method it performs a
std::transform with the sqrt function on the array elements specified by
the begin and end arguments:
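The shape of such a function object can be sketched in standalone C++ as follows. Two assumptions keep it compilable without R: the `Worker` base class here is a hypothetical stand-in for RcppParallel::Worker, and raw `double` pointers stand in for the RMatrix<double> wrappers that real RcppParallel code would hold.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for RcppParallel::Worker: anything that can
// process a half-open index range [begin, end).
struct Worker {
    virtual ~Worker() {}
    virtual void operator()(std::size_t begin, std::size_t end) = 0;
};

// Function object that takes the square root of a slice of elements.
// Plain pointers are an assumption of this sketch; they play the role
// of the thread-safe matrix wrappers.
struct SquareRoot : public Worker {
    const double* input;   // source elements (read only)
    double* output;        // destination elements

    SquareRoot(const double* in, double* out) : input(in), output(out) {}

    // Transform only the elements in [begin, end); everything outside
    // that range is left untouched.
    void operator()(std::size_t begin, std::size_t end) override {
        std::transform(input + begin, input + end, output + begin,
                       [](double x) { return std::sqrt(x); });
    }
};
```

Since each call touches only its own [begin, end) slice of the output, calls on disjoint ranges do not write to the same elements, which is what allows a scheduler to run them on different threads.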

Note that SquareRoot derives from RcppParallel::Worker. This is required
for function objects passed to parallelFor.

Note also that we use the RMatrix<double> type for accessing the matrix.
This is because this code will execute on a background thread where it’s not
safe to call R or Rcpp APIs. The RMatrix class is included in the
RcppParallel package and provides a lightweight, thread-safe wrapper around R
matrices.

Here’s the parallel version of our matrix transformation function that makes
use of the SquareRoot function object. The main difference is that rather
than calling std::transform directly, the parallelFor function is called
with the range to operate on (in this case based on the length of the input
matrix) and an instance of SquareRoot:
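The calling pattern can be sketched in standalone C++ as shown here. Everything outside the standard library is an assumption: `Worker` and `parallelForSketch` are hypothetical stand-ins for RcppParallel::Worker and parallelFor, `parallelMatrixSqrt` is an illustrative name, and a `std::vector<double>` stands in for the matrix data. The stand-in simply starts one `std::thread` per contiguous chunk and makes no claim about how RcppParallel schedules work internally.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for RcppParallel::Worker.
struct Worker {
    virtual ~Worker() {}
    virtual void operator()(std::size_t begin, std::size_t end) = 0;
};

// Hypothetical stand-in for parallelFor: split [begin, end) into
// contiguous chunks and run the worker on each chunk in its own thread.
void parallelForSketch(std::size_t begin, std::size_t end, Worker& worker,
                       std::size_t numThreads = 4) {
    if (end <= begin) return;
    if (numThreads == 0) numThreads = 1;
    const std::size_t n = end - begin;
    const std::size_t chunk = (n + numThreads - 1) / numThreads;

    std::vector<std::thread> threads;
    for (std::size_t lo = begin; lo < end; lo += chunk) {
        const std::size_t hi = std::min(lo + chunk, end);
        threads.emplace_back([&worker, lo, hi] { worker(lo, hi); });
    }
    // wait for every chunk to finish before returning
    for (std::thread& t : threads) t.join();
}

// Worker that takes the square root of the requested range of elements.
struct SquareRoot : public Worker {
    const double* input;
    double* output;
    SquareRoot(const double* in, double* out) : input(in), output(out) {}
    void operator()(std::size_t begin, std::size_t end) override {
        std::transform(input + begin, input + end, output + begin,
                       [](double x) { return std::sqrt(x); });
    }
};

// Parallel counterpart of the serial transformation.
std::vector<double> parallelMatrixSqrt(const std::vector<double>& x) {
    // allocate the output
    std::vector<double> output(x.size());

    // SquareRoot functor (pass input and output storage)
    SquareRoot squareRoot(x.data(), output.data());

    // hand the full index range to the parallel driver
    parallelForSketch(0, x.size(), squareRoot);

    return output;
}
```

The driver never inspects the data itself. It only decides which index ranges go to which thread, so the same driver could run any other worker that implements operator().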

Benchmarks

A comparison of the performance of the two functions shows the parallel
version running about 2.5 times as fast as the serial version on a machine
with 4 cores.