Robin Girard — written Jun 10, 2013 — source
Sensitivity analysis is the task of evaluating the sensitivity of a model output Y to input variables (X1,…,Xp). Quite often, it is assumed that this output is related to the input through a known function f :Y= f(X1,…,Xp).
Sobol indices are generalizing the coefficient of the coefficient of determination in regression. The ith first order indice is the proportion of the output variance that is “due” to the ith input variable. Numerous details and references are found e.g. in this paper. In particular, the mentionned paper defines an estimation procedure that we use here.
This estimation procedure relies on the use of montecarlo estimates where computation speed and precision of the estimate are highly connected. Let us show how to compute these indices efficiently in c++ while keeping the code simple.
To fix idea we use a “test” function f that was introduced in the First papers of Sobol around sensitivity indices estimation, see e.g. his 1993 paper “Sensitivity analysis for non-linear mathematical models”. This function is called “the Sobol Function”.
Here is a cpp implementation of sobol function and of the Sobol indices estimation procedure:
Now compare the results with a similar function to estimate Sobol indices from the excellent package sensitivity (it has a lot of other sensitivity indices and estimation procedures efficiently implemented and well documented).
test elapsed relative user.self sys.self 2 FirstOrderSobol(n) 2.048 1.000 2.048 0.000 1 sensitivity_sobol2002(n) 4.181 2.042 4.111 0.071
It seems that the Rcpp code is faster (by a factor between 5 and 10), and part of the explanation may be in the use of a dataframe in the package sensitivity. The result for the case of the Sobol function are slightly different (see the first two indices) certainly due to different definition of indices.
Total indices From Sensitivity package : 0.667966 0.192648 0.0294923 0.00329963 4.15316e-05 0.000134084 0.000468673 -1.63014e-05 From our function : 0.792887 0.0851894 0.00885642 0.00241633 -7.88222e-05 8.27116e-06 0.000206418 -3.17788e-05
The use of Rcpp here hence allows to have a faster code at the cost of changing manually the estimation loop in C++ if one want to make a sensitivity analysis of another function (than the “Sobol function”).
For standard users this migth be a cost difficult to afford. A solution to overcome this problem is to let the user supply the function f and even the distribution of x. This requires that you know how to pass a function (either implemented in c++ or in R) as a parameter to FirstOrderSobol. Dirk has already done that in package RcppDE. Part of the trick is explained in this Rcpp Gallery post , and the answer here provides more details. Could be the purpose of a further package. In addition my experience with sensitivity analysis is that the Sobol Function is very simple, this means that in a lot of practical situation the gain of implementing the subsequent function in c++ can be much larger.
Finally, I like the fact that student interested in Sobol indices get the details of the used estimator hence not seeing Sobol indices estimation as a black box. Rcpp makes it possible to do so while keeping the code very efficient (here more efficient than the optimized package) !Tweet