Simulating a Vector Autoregressive Process

Dirk Eddelbuettel — written Dec 18, 2012 — source

This example simulates a first-order vector autoregressive process involving simple matrix multiplication in an iterative fashion. It was suggested by Lance Bachmeier as a motivating example for using Rcpp.

So let’s walk through the example. First the plain vanilla R version, this starts with a simple enough loop. After skipping the first row, each iteration multiplies the previous row with the parameters and adds error terms:

## parameter and error terms used throughout
a <- matrix(c(0.5,0.1,0.1,0.5),nrow=2)
e <- matrix(rnorm(10000),ncol=2)

## Let's start with the R version
rSim <- function(coeff, errors) {
   simdata <- matrix(0, nrow(errors), ncol(errors))
   for (row in 2:nrow(errors)) {
      simdata[row,] = coeff %*% simdata[(row-1),] + errors[row,]
   }
   return(simdata)
}

rData <- rSim(a, e)     

We now create a version of the function using the R compiler:

compRsim <- compiler::cmpfun(rSim)
compRData <- compRsim(a,e)              # generated by R 'compiled'
stopifnot(all.equal(rData, compRData))  # checking results

With that, time to turn to C++ using Armadillo via RcppArmadillo:

// [[Rcpp::depends(RcppArmadillo)]]

#include <RcppArmadillo.h>

// [[Rcpp::export]]
arma::mat rcppSim(arma::mat coeff, arma::mat errors) {
   int m = errors.n_rows; int n = errors.n_cols;
   arma::mat simdata(m,n);
   simdata.row(0) = arma::zeros<arma::mat>(1,n);
   for (int row=1; row<m; row++) {
      simdata.row(row) = simdata.row(row-1)*trans(coeff)+errors.row(row);
   }
   return simdata;
}

The C++ code is pretty straightforward as well. We can instatiate Armadillo matrices directly from the R objects we pass down; we then run a similar loop building the result row by row.

Now, with all the build-up, here is the final timing comparison:

rbenchmark::benchmark(rcppSim(a,e),
                      rSim(a,e),
                      compRsim(a,e),
                      columns=c("test", "replications", "elapsed",
                                "relative", "user.self", "sys.self"),
                      order="relative")
            test replications elapsed relative user.self sys.self
1  rcppSim(a, e)          100   0.024     1.00     0.020    0.004
3 compRsim(a, e)          100   1.381    57.54     1.376    0.004
2     rSim(a, e)          100   3.368   140.33     3.344    0.008

So in a real-world example involving looping and some algebra (which is of course already done by BLAS and LAPACK libraries), the new R compiler improves by more than a factor of two, cutting time from 4.14 seconds down to about 2 seconds.

Yet, this still leaves the C++ solution, clocking in at a mere 38 milliseconds, ahead by a factor of over fifty relative to the new R compiler. And compared to just R itself, the simple solution involving Rcpp and RcppArmadillo is almost 110 times faster.

tags: armadillo  matrix 

Related Articles