STL transform + remove_copy for subsetting

Dirk Eddelbuettel — written Dec 29, 2012 — source

We have seen the use of the STL transform functions in the posts STL transform and Transforming a matrix. We use the same logic in conjuction with a logical (ie boolean) vector in order subset an initial vector.

#include <Rcpp.h>

using namespace Rcpp;
using namespace std;

const double flagval = __DBL_MIN__; // works
//const double flagval = NA_REAL;   // does not

// simple double value 'flagging' function
inline double flag(double a, bool b) { return b ? a : flagval; }

// [[Rcpp::export]]
NumericVector subsetter(NumericVector a, LogicalVector b) {
    // We use the flag() function to mark values of 'a' 
    // for which 'b' is false with the 'flagval'
    transform(a.begin(), a.end(), b.begin(), a.begin(), flag);

    // We use sugar's sum to compute how many true values to expect
    NumericVector res = NumericVector(sum(b));

    // And then copy the ones different from flagval from a into
    // res using the remove_copy function from the STL
    remove_copy(a.begin(), a.end(), res.begin(), flagval);
    return res;    
}

We can illustrate this on a simple example or two:

a <- 1:5
subsetter(a, a %% 2 == 0)

[1] 2 4

subsetter(a, a > 2)

[1] 3 4 5

Casual benchmarking (not shown) shows this to be comparable to and even slightly faster than basic indexing in R itself.

tags: stl

Extending R with C++ and Fortran — Dirk Eddelbuettel and JBrandon Duck-Mayr
Benchmarking Rcpp code with RcppClock — Zach DeBruine
Simulation Smoother using RcppArmadillo — Tomasz Woźniak

STL transform + remove_copy for subsetting

Related Articles