Wush Wu — written Apr 6, 2014 — source
This post provides a brief introduction to calling Python from R through Rcpp. The official Python documentation explains how to embed Python into C/C++ applications. Moreover, the Boost.Python library provides seamless interoperability between C++ and the Python programming language. Similarly, Rcpp provides interoperability between C++ and R. Therefore, it is not hard to call Python from R through Rcpp and Boost.Python.
Although the rPython package already provides an interface to Python from R, it is interesting to try to connect R and Python via C++.
In this article, we show how to call Python 2.7 from R on Ubuntu.
The most difficult part is setting up the development environment. On Ubuntu, we need to install the following packages to build against an embedded Python:
sudo apt-get install python2.7 python2.7-dev libboost-python-dev
Then, we pass the following flags to the compiler:
py_cflags <- system("python2.7-config --cflags", intern=TRUE)
Sys.setenv("PKG_CFLAGS"=sprintf("%s %s", Sys.getenv("PKG_CFLAGS"), py_cflags))
Sys.setenv("PKG_CXXFLAGS"=sprintf("%s %s", Sys.getenv("PKG_CXXFLAGS"), py_cflags))
py_ldflags <- system("python2.7-config --ldflags", intern=TRUE)
Sys.setenv("PKG_LIBS"=sprintf("%s %s %s", Sys.getenv("PKG_LIBS"), "-lboost_python-py27", py_ldflags))
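As a cross-check on what python2.7-config reports, the interpreter itself can be asked for its header directory and library name through the standard sysconfig module. The following is only a sketch: embedding_flags is a hypothetical helper, it is written for Python 3, and the flags it builds are a simplification of the real python-config output.

```python
import sysconfig


def embedding_flags():
    """Return (cflags, ldflags) strings that approximate python-config.

    Hypothetical helper for illustration; the real python-config script
    emits additional compiler and linker flags.
    """
    # Directory that contains Python.h for this interpreter.
    include_dir = sysconfig.get_paths()["include"]
    # LDVERSION is set on POSIX builds of Python 3; fall back otherwise.
    version = (sysconfig.get_config_var("LDVERSION")
               or sysconfig.get_config_var("VERSION")
               or sysconfig.get_python_version())
    # LIBDIR may be missing on some platforms, so it is optional here.
    lib_dir = sysconfig.get_config_var("LIBDIR")

    cflags = "-I" + include_dir
    ldflags = "-lpython" + version
    if lib_dir:
        ldflags = "-L" + lib_dir + " " + ldflags
    return cflags, ldflags


if __name__ == "__main__":
    cflags, ldflags = embedding_flags()
    print(cflags)
    print(ldflags)
```

If the include directory printed here differs from the one in the python2.7-config output, the build is probably picking up a different Python installation than intended.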
The following hello world example should then work:
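#include <Rcpp.h>
#include <Python.h>

using namespace Rcpp;

// Python 2.7 keeps the pointer passed to Py_SetProgramName, so the string
// must live in static storage; a static buffer also avoids converting a
// string literal to char*.
static char program_name[] = "";

//[[Rcpp::export]]
void initialize_python() {
    Py_SetProgramName(program_name);  /* optional but recommended */
    Py_Initialize();
}

//[[Rcpp::export]]
void finalize_python() {
    Py_Finalize();
}

//[[Rcpp::export]]
void hello_python() {
    PyRun_SimpleString("from time import time,ctime\n"
                       "print 'Today is',ctime(time())\n");
}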
Let’s call them in R:
initialize_python()
hello_python()
Today is Thu Apr 2 11:32:17 2015
This shows that initialize_python starts the Python engine and that hello_python runs the Python script through PyRun_SimpleString.
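One detail worth noting is that PyRun_SimpleString executes its source in Python's __main__ module, so names defined by one call remain visible to later calls. The sketch below is a rough Python-level analogue written for Python 3 (the article itself targets Python 2.7). The name run_simple_string is a hypothetical helper, not part of any library, and it ignores special cases such as SystemExit.

```python
import sys
import traceback


def run_simple_string(source):
    """Rough analogue of PyRun_SimpleString (hypothetical helper).

    Executes `source` in the namespace of the __main__ module and
    returns 0 on success or -1 if an exception was raised.
    """
    main_namespace = sys.modules["__main__"].__dict__
    try:
        exec(source, main_namespace)
    except Exception:
        # Like the C function, report the error instead of propagating it.
        traceback.print_exc()
        return -1
    return 0


if __name__ == "__main__":
    # The first call defines a name; the second call can still see it.
    run_simple_string("greeting = 'hello from __main__'")
    run_simple_string("print(greeting)")
```

This persistence is what lets a function defined by one script be looked up and called later.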
With Boost.Python and Rcpp, we can easily transfer data between R and Python. The following C++ code copies an R IntegerVector into a Python list:
#include <Rcpp.h>
#include <boost/python/raw_function.hpp>

namespace py = boost::python;
typedef Rcpp::XPtr<py::list> PyList;

using namespace Rcpp;

//[[Rcpp::export]]
SEXP IntVec_to_py_list(IntegerVector src) {
    PyList pretval(new py::list());
    int glue;
    for(int i = 0; i < src.size(); i++) {
        glue = src[i];
        pretval->append(glue);
    }
    return pretval;
}
IntVec_to_py_list(1:10)
<pointer: 0x8e0b410>
The pointer refers to the memory of the transformed Python object.
The following example shows how to define a function in Python and expose it in R.
#include <Rcpp.h>
#include <Python.h>
#include <boost/python/raw_function.hpp>

namespace py = boost::python;
typedef Rcpp::XPtr<py::list> PyList;

using namespace Rcpp;

//[[Rcpp::export]]
void pycall(std::string py_script) {
    PyRun_SimpleString(py_script.c_str());
}

//[[Rcpp::export]]
void pyfun(std::string fun_name, SEXP fun_argument) {
    // create the module of python which is similar to the R_GlobalEnv
    py::object module((py::handle<>(py::borrowed(PyImport_AddModule("__main__")))));
    // look up and retrieve the function of the given name in the module
    py::object pyfun = module.attr("__dict__")[fun_name.c_str()];
    // call the function with the API of boost::python
    py::list argv(*PyList(fun_argument));
    pyfun(argv);
}
pycall("
def print_list(src):
    for i in src:
        print i
")
a <- IntVec_to_py_list(1:10)
pyfun("print_list", a)
1 2 3 4 5 6 7 8 9 10
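Seen from the Python side, pyfun does two things: it looks the function up by name in the dictionary of the __main__ module, and it calls that function with the list as its single argument. The following Python 3 sketch mirrors those two steps. The name call_by_name is a hypothetical helper used only for illustration.

```python
import sys


def call_by_name(fun_name, argument):
    """Look up `fun_name` in __main__ and call it with one argument.

    Hypothetical helper mirroring the lookup done in pyfun.
    """
    main_namespace = sys.modules["__main__"].__dict__
    # A plain dictionary lookup: raises KeyError if the name is undefined.
    fun = main_namespace[fun_name]
    return fun(argument)


if __name__ == "__main__":
    # Define the function in __main__, as pycall does with its script.
    script = (
        "def print_list(src):\n"
        "    for i in src:\n"
        "        print(i)\n"
    )
    exec(script, sys.modules["__main__"].__dict__)
    call_by_name("print_list", list(range(1, 11)))
```

Because the lookup is an ordinary dictionary access, a misspelled function name surfaces as a KeyError rather than a NameError.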
Errors in the Python engine can be handled easily with the C++ try/catch idiom, as the following example shows:
#include <Rcpp.h>
#include <Python.h>
#include <boost/python/raw_function.hpp>

namespace py = boost::python;
typedef Rcpp::XPtr<py::list> PyList;

//[[Rcpp::export]]
void pyfun(std::string fun_name, SEXP fun_argument) {
    try {
        // create the module of python which is similar to the R_GlobalEnv
        py::object module((py::handle<>(py::borrowed(PyImport_AddModule("__main__")))));
        // look up and retrieve the function of the given name in the module
        py::object pyfun = module.attr("__dict__")[fun_name.c_str()];
        // call the function with the API of boost::python
        py::list argv(*PyList(fun_argument));
        pyfun(argv);
    }
    // catch by const reference to avoid copying the exception object
    catch (const py::error_already_set&) {
        // print the pending Python error and clear it
        PyErr_Print();
    }
}
pycall("
def print_list(src):
    for i in src:
        print i
")
a <- IntVec_to_py_list(1:10)
pyfun("print_lists", a) # a typo of the function name
KeyError: 'print_lists'
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/apport_python_hook.py", line 63, in apport_excepthook
    from apport.fileutils import likely_packaged, get_recent_crashes
  File "/usr/lib/python2.7/dist-packages/apport/__init__.py", line 5, in <module>
    from apport.report import Report
  File "/usr/lib/python2.7/dist-packages/apport/report.py", line 16, in <module>
    from xml.parsers.expat import ExpatError
  File "/usr/lib/python2.7/xml/parsers/expat.py", line 4, in <module>
    from pyexpat import *
ImportError: /usr/lib/python2.7/lib-dynload/pyexpat.i386-linux-gnu.so: undefined symbol: _Py_ZeroStruct

Original exception was:
KeyError: 'print_lists'
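The KeyError comes from the dictionary lookup of the misspelled name in __main__. A Python 3 sketch of the same failure, with the error caught and turned into a message, could look like the following. The name safe_call_by_name is a hypothetical helper. Unlike the C++ code above, which prints the error with PyErr_Print, this sketch returns the message to the caller.

```python
import sys


def safe_call_by_name(fun_name, argument):
    """Call a function from __main__ by name and report failures.

    Hypothetical helper: returns (True, result) on success and
    (False, message) if the lookup or the call raises an exception.
    """
    main_namespace = sys.modules["__main__"].__dict__
    try:
        fun = main_namespace[fun_name]  # KeyError for an unknown name
        return True, fun(argument)
    except Exception as err:
        # Format the error roughly the way Python prints its last line.
        return False, "{}: {}".format(type(err).__name__, err)


if __name__ == "__main__":
    ok, message = safe_call_by_name("print_lists", [1, 2, 3])
    print(ok, message)
```

Returning the message instead of printing it is a design choice of this sketch; it would let a wrapper decide how to surface the problem, for example by raising an error on the R side.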
These examples show how to integrate Python and R with Rcpp and Boost.Python. They rely on two C++ libraries that greatly ease the integration work: Rcpp for R, and Boost.Python for Python. The core steps discussed above are initializing the engine (Hello World), transforming the data (Type Conversion), exposing functions (Call Python Function), and handling errors properly (Error Handling).