Sprungmarken

Servicenavigation

Hauptnavigation


You are here:

Home Research Resource-constrained Computing Optimizations of R Programs

Bereichsnavigation

Hauptinhalt

Optimizing Execution Runtimes of R Programs

The GNU R language is very popular in the domain of statistics. Its functional character supports the rapid development of statistical algorithms and analyses. Statisticians around the world profit from the immense R package archive CRAN where researchers offer their algorithms in form of R programs for free usage.

One of the disadvantages of R is that programs have to be evaluated and processed by a runtime interpreter. Such an interpretation requires a lot of time and delays the execution. Thus, a lot of computing power is wasted compared to imperative languages like ANSI C, which can be automatically optimized and translated to machine code by a sophisticated compiler.

In the following, we present some ideas for optimization techniques which allow a significant performance increase of R programs:

GNU R Compiler Toolchain

For most imperative programming language like C, sophisticated compiler toolchains are existing. In order to avoid a time consuming interpretation of GNU R programs, we also propose such a compiler toolchain for R which automatically optimize and translate a program to efficient machine code.


Proposed toolchain for 						optimizing GNU R 						programs

The overall structure of the proposed GNUR R compiler toolchain is depicted in the figure above. Rounded boxes denote representations of the program to optimize at different abstraction levels. Squared boxed represent the different compiler tools working on each stage of the compilation process:

  • Parser
    The parser should always be compliant to the latest GNU R interpreter. It accepts several R source code files within a single compiler run and creates a high-level intermediate representation called R-IR out of them.
  • R-IR
    The R-IR should be designed as a compiler front-end providing a machine-independent intermediate representation (IR) for R code. On such an intermediate representation, architecture-independent code analyses and optimizations can be developed.
  • R2C
    The R2C module should integrate new R-tailored optimization techniques which should be able to remove redundancy and exploit target specific features in order to decrease the execution time of the R program. Afterwards, the optimized R code should be automatically translated to ANSI C source code which is in turn represented by its own intermediate representation C-IR. In this step, existing C libraries could be injected in form of their C code.
  • C-IR
    As intermediate representation of C code, the ICD-C compiler framework can be employed. The ICD-C is designed as a compiler front-end providing a machine-independent intermediate representation (IR) for C code. It provides a built-in interface for target architecture-independent code analyses and optimizations. ICD-C would be used for applying optimizations and finally emitting compilable C code, thus allowing source-to-source optimization.
  • C2C
    The C2C module can implement various existing and new analysis and optimization techniques for C code. The module should make use of the ICD-C interface to read the C-IR to be optimized and write a more efficient representation back.
  • Standard Compiler
    To generate an efficient machine program from the emitted C code, a standard compiler like the GCC is favoured. This avoids time consuming reimplementations of existing techniques and enables the usage of already highly optimized code translation mechanisms. The employed compiler is also able to link existing highly optimized C++ and Fortran libraries to the emitted binary.

Case Study

In order to demonstrate the effectiveness of our proposed toolchain, a case study where we manually applied the standard optimizations common subexpression elimination (CSE) and dead code elimination (DCE) on GNU R code was performed. Additionally, code snippets were manually translated from R to C code and finally to machine code in order to demonstrate the effectiveness of a possible R compiler toolchain.

Common Subexpression Elimination

The common subexpression elimination is an optimization which removes redundancy in form of subexpressions from the code. It searches equal subexpressions which always evaluate to the same value and replaces the repeated computation by a variable holding the precomputed value.

strMCMC(), which is an R package for estimating graphical model by employing a Markov Chain Monte Carlo approach, serves as example. Consider the following R code snippet before CSE:

  x[ which( y[z[1],] == 1 ) ]     x[ which( y[z[1],] == 1 ) ] – 1;

By manually applying CSE, superfluous recomputations are removed:

  cse     x[ cse ]

Dead Code Elimination

The dead code elimination is an optimization which removes code which is never executed. For instances if it can be evaluated at compile time that a condition always evaluates false, the complete condition statement can be removed.

Here, too, strMCMC() serves as example. Consider the following R code snippet before applying DCE:

  ...
  for (i in 1:d2) {
    if (!is.null(tmp))
      ans[[i]]   }
  ...

Since it can be determined at compile time that for each call of the which() function x is always logical, the if expression can be removed:

  ...
  for (i in 1:d2) {
    if (!is.null(tmp))
      ans[[i]]   }
  ...

Transformation to ANSI C

The commonly used rda package is used as example for translating R to C code. It turned out, that a for loop inside the heavily used apply() function was the hotspot of the algorithm.

The following code snippet shows the mentioned loop which is manually translated to equivalent C code:

  ...
  for (i in 1:d2) {
    if (!is.null(tmp))
      ans[[i]]   }
  ...

The resulting C code is translated to efficient machine code and will be called using R's .C interface.

Results

By applying the optimization CSE, the runtime of the strMCMC() function can be decreased by 10% whereas DCE can decrease the execution runtime by up to 5%. The translation of the time consuming for loop inside the apply() function can decrease the overall runtime of the rda package by 70%. The translated code snipped in itself outperforms the R code by a factor of up to 90!

Links and Further Reading