JJ Allaire, written Jul 15, 2014
The RcppParallel package includes high-level functions for parallel programming with Rcpp. For example, the parallelReduce function can be used to aggregate values from a set of inputs in parallel. This article describes using RcppParallel to parallelize the inner-product example previously posted to the Rcpp Gallery.
First, the serial version of the inner product computation. For this we use a simple call to the STL std::inner_product function.
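The serial computation can be sketched in plain C++ as follows. This is a minimal stand-in rather than the Gallery's original listing: it assumes std::vector<double> inputs so that it compiles without R, whereas an Rcpp version would typically take NumericVector arguments and be exported to R. The name innerProduct matches the benchmark later in the article; the equal-length check is an added assumption.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Serial inner product: the sum of x[i] * y[i] over all i.
// Assumption: both vectors have the same length.
double innerProduct(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    // The initial value 0.0 also fixes the accumulator type as double;
    // passing the int 0 would truncate every partial sum.
    return std::inner_product(x.begin(), x.end(), y.begin(), 0.0);
}
```

For x = (1, 2, 3) and y = (4, 5, 6) this computes 1*4 + 2*5 + 3*6 = 32.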
Now we adapt our code to run in parallel. We’ll use the parallelReduce function to do this. This function requires a “worker” function object (defined below as InnerProduct). For details on worker objects see the parallel-vector-sum article on the Rcpp Gallery.
Note that InnerProduct derives from the RcppParallel::Worker class. This is required for function objects passed to parallelReduce.
Note also that we use the RVector<double> type for accessing the vector. This is because this code will execute on a background thread where it’s not safe to call R or Rcpp APIs. The RVector class is included in the RcppParallel package and provides a lightweight, thread-safe wrapper around R vectors.
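The shape of such a worker can be sketched in plain C++. This is an illustration under stated assumptions, not the article's InnerProduct: the name InnerProductWorker and the local Split tag are hypothetical, const std::vector<double> references stand in for RVector<double>, and the struct does not derive from RcppParallel::Worker, so that the block compiles without R. What it is meant to show is the three pieces a reducing worker needs: a "split" constructor that shares the inputs but starts a fresh accumulator, an operator() that processes one index range, and a join that merges two partial results.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Tag type marking the "split" constructor (hypothetical local stand-in).
struct Split {};

// Hypothetical stand-in for a reducing worker.
struct InnerProductWorker {
    // Read-only views of the inputs; no thread ever writes to them.
    const std::vector<double>& x;
    const std::vector<double>& y;

    // Partial result accumulated by this instance only.
    double product;

    InnerProductWorker(const std::vector<double>& x, const std::vector<double>& y)
        : x(x), y(y), product(0.0) {}

    // Split constructor: shares the inputs, starts with an empty accumulator.
    InnerProductWorker(const InnerProductWorker& other, Split)
        : x(other.x), y(other.y), product(0.0) {}

    // Process only the half-open index range [begin, end).
    void operator()(std::size_t begin, std::size_t end) {
        assert(begin <= end && end <= x.size() && end <= y.size());
        product += std::inner_product(x.begin() + begin, x.begin() + end,
                                      y.begin() + begin, 0.0);
    }

    // Fold another worker's partial result into this one.
    void join(const InnerProductWorker& rhs) { product += rhs.product; }
};
```

Because each instance writes only to its own product member and the inputs are read-only, two instances can work on disjoint ranges at the same time and be combined afterwards with join.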
Now that we’ve defined the function object, implementing the parallel inner product function is straightforward. Just initialize an instance of InnerProduct with the input vectors and call parallelReduce:
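The initialize-then-reduce pattern can be sketched without R as follows. Everything here is an assumption made for illustration: reduceInTwoThreads is a hypothetical helper that splits the range once across two std::thread workers, and it is not how RcppParallel schedules work; InnerProductWorker and Split are the same hypothetical stand-ins as before, repeated only so the block compiles on its own. In the real package the call would go to parallelReduce, which is assumed here to receive the index range and the worker.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <system_error>
#include <thread>
#include <vector>

struct Split {};  // hypothetical tag for the split constructor

// Compact worker, repeated here only so the block is self-contained.
struct InnerProductWorker {
    const std::vector<double>& x;
    const std::vector<double>& y;
    double product;

    InnerProductWorker(const std::vector<double>& x, const std::vector<double>& y)
        : x(x), y(y), product(0.0) {}
    InnerProductWorker(const InnerProductWorker& other, Split)
        : x(other.x), y(other.y), product(0.0) {}

    void operator()(std::size_t begin, std::size_t end) {
        product += std::inner_product(x.begin() + begin, x.begin() + end,
                                      y.begin() + begin, 0.0);
    }
    void join(const InnerProductWorker& rhs) { product += rhs.product; }
};

// Hypothetical two-way reduction: split the worker, run each half on its
// own range, then join the partial results. NOT RcppParallel's scheduler.
template <typename Worker>
void reduceInTwoThreads(std::size_t begin, std::size_t end, Worker& worker) {
    assert(begin <= end);
    const std::size_t mid = begin + (end - begin) / 2;

    Worker right(worker, Split{});  // fresh accumulator, shared inputs
    std::thread t;
    try {
        t = std::thread([&right, mid, end] { right(mid, end); });
    } catch (const std::system_error&) {
        // Thread creation failed; t stays non-joinable and the second
        // half is processed on the calling thread below.
    }

    worker(begin, mid);  // first half on the calling thread
    if (t.joinable()) {
        t.join();
    } else {
        right(mid, end);
    }
    worker.join(right);  // combine the two partial sums
}

double parallelInnerProduct(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    InnerProductWorker worker(x, y);
    reduceInTwoThreads(0, x.size(), worker);
    return worker.product;
}
```

A single split keeps the sketch short. A production scheduler would typically split recursively and balance chunks across all available cores. Depending on the toolchain, linking may require a threading flag such as -pthread.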
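A comparison of the performance of the two functions on a machine with 4 cores shows the parallel version running about 4.3 times as fast as the serial version in the benchmark output below (see the relative column), with both well ahead of the plain R expression sum(x * y):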
                            test replications elapsed relative
    3 parallelInnerProduct(x, y)          100   0.023    1.000
    2         innerProduct(x, y)          100   0.099    4.304
    1                 sum(x * y)          100   0.396   17.217
You can learn more about using RcppParallel at https://rcppcore.github.com/RcppParallel.
tags: parallel