Dirk Eddelbuettel — written Jan 2, 2013 — updated Nov 19, 2016 — James Joseph Balamuta — source

## Introduction

Subsetting in Armadillo is a popular topic that frequently appears on StackOverflow. Most of these questions stem from the distinctive approach that Armadillo takes to subset operations. This post looks at how to apply Armadillo’s subsetting capabilities, in particular its non-contiguous submatrix views.

## Vector Subset and Assignment

We start with a StackOverflow question about selecting known locations in a vector and assigning new values to them. The R equivalent would be along the lines of:

 [1]  1  2  3  4  5  6  7  8  9 10

 [1] 1 6 3 3 5 6 7 8 9 1


Within Armadillo, the submatrix view best suited to replicating this behavior is .elem(). The main difference when subsetting with this function is that it needs an arma::uvec containing the locations of the elements to extract. The u prepended to vec stands for unsigned: the vector holds unsigned integers, which can only store non-negative whole numbers (e.g. 0, 1, 2, …, 42, and so on). Note also that Armadillo indices are zero-based, whereas R indices start at 1.

To execute this code, we would use:

 [1]  1  2  3  4  5  6  7  8  9 10

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1    6    3    3    5    6    7    8    9     1


## Finding Values to Subset

Another common operation is to use logical indexing to find where within a vector certain conditions are met. For instance, to find the values that are greater than or equal to 0.5 and set them to 1, the operation in R looks like:

 [1] 0.37667253 0.92201944 0.99187774 0.97723602 0.23629728 0.10706678
 [7] 0.03915213 0.72802690 0.13023496 0.39995203

 [1] 0.37667253 1.00000000 1.00000000 1.00000000 0.23629728 0.10706678
 [7] 0.03915213 1.00000000 0.13023496 0.39995203


Within Armadillo, logical values cannot be used to subset directly; arma::find is needed to replicate R’s logical-index behavior. As discussed above, .elem() requires indices. The arma::find function therefore takes the logical values and converts them into index positions, much like R’s which() function. Finally, to assign a single value to every element of the subset, use .fill().

 [1] 0.37667253 0.92201944 0.99187774 0.97723602 0.23629728 0.10706678
 [7] 0.03915213 0.72802690 0.13023496 0.39995203

          [,1] [,2] [,3] [,4]      [,5]      [,6]       [,7] [,8]     [,9]
[1,] 0.3766725    1    1    1 0.2362973 0.1070668 0.03915213    1 0.130235
        [,10]
[1,] 0.399952


## Subsetting a Matrix

As was the case with vectors, elements can be extracted from a matrix by using arma::find to obtain indices and the .elem() submatrix view to extract the values. For example, to obtain the values of M * M' that are greater than or equal to 100, one could do:

The result:

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

     [,1]
[1,]  108
[2,]  108
[3,]  126


## Obtaining Values with Matrix Subscripts

Another form of subsetting within matrices uses matrix notation, which specifies an element by its row and column location. This question came up on StackOverflow because the non-contiguous submatrix views were initially causing the user problems.

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

     [,1] [,2]
[1,]    1    2
[2,]    3    1
[3,]    2    3

[1] 4 3 8


As one StackOverflow user discovered, the non-contiguous submatrix view obtained via X.submat( vector_of_row_indices, vector_of_column_indices ) does not behave like the traditional R matrix subset. As of Armadillo 7.200, sub2ind() accepts a matrix of subscripts, so several locations can be converted at once. However, it requires a different matrix layout than R. Instead of supplying each row and column pair as one row of the matrix, one must supply each location as a column. That is, the subscript matrix must be $2 \times n$, where $n$ is the number of matrix locations. The first row holds the row location of each element and the second row holds its column location.

The result:

     [,1] [,2] [,3]
[1,]    0    2    1
[2,]    1    0    2

     [,1] [,2] [,3]
[1,]    4    3    8


## Converting from umat to Standard arma Types

A while back, a StackOverflow question asked how to convert from arma::umat to arma::mat. The former is the unsigned-integer matrix type produced by element-wise comparisons; its vector form, arma::uvec, is what arma::find returns.

For the particular example at hand, a call to the conv_to<T>::from() converter provided the solution. For posterity, we rewrite the answer here.

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    1    0    0
[3,]    1    1    0