## A simple array class with specialized linear algebra routines

Fabian Scheipl — written Mar 21, 2014 — source

Data with an array structure are common in statistics, and the design or regression matrix for analysis of such data can often be written as a Kronecker product. Factorial designs, contingency tables and smoothing of data on multidimensional grids are three such general classes of data and models. In such a setting, we develop an arithmetic of arrays which allows us to define the expectation of the data array as a sequence of nested matrix operations on a coefficient array. We show how this arithmetic leads to low storage, high speed computation in the scoring algorithm of the generalized linear model.

For example, they show that if a design matrix `X` has the Kronecker structure `X = kronecker(Xd, ..., X2, X1)` with `X<i>` a partial model matrix with `n<i>` rows and `c<i>` columns, linear functions `X%*%theta` of `X` and a coefficient vector `theta` can be efficiently computed based only on the partial model matrices where the entries in the vector `X%*%theta` (with `nd*...*n2*n1` entries) are the same as the entries in the array with dimension `c(n1, n2, ..., nd)` returned by ```RH(Xd, ... , RH(X2, RH(X1, Theta))...)```.
`Theta` is an array with dimensions `c(c1, c2, ..., cd)` containing `theta` and `RH(X, A)` – the “rotated H-transform” – is an operation generalizing transposed pre-multiplication `t(X %*% A)` of a matrix `A` by a matrix `X` to the case of higher dimensional array-valued `A`.

The code below implements a simple array class for numeric arrays and the rotated H-transform in `RcppArmadillo` and compares the performance to both the naive straight forward matrix multiplication based on the full model matrix and an `R`-implementation of `RH()`:

Set up test case:

Let’s look at a 3-dimensional example where `X = X3 %x% X2 %x% X1` and each `X<i>` is a B-spline basis over `seq(0, 1, len=n<i>)`:

Note that `X` is fairly large, with 6 × 104 rows and 750 columns.

Check correctness:

``` TRUE
```
``` TRUE
```

Check performance:

```                                         test elapsed relative
1 array(X %*% theta_vec, dim = c(n1, n2, n3))  27.252  198.920
3   Reduce(RH, list(X3, X2, X1), Theta, TRUE)   0.137    1.000
2 Reduce(RH_r, list(X3, X2, X1), Theta, TRUE)   0.325    2.372
```

Note: An alternative version with proper formula notation can be found here.