Now:
Measuring Performance: Profiling and Microbenchmarking
(Why) is R slow?

Remaining classes:
Improving Performance
How to speed things up using Rcpp (and hopefully RcppArmadillo)
# packages needed
library(profvis)
library(bench)
library(beeswarm) # for plotting with `bench` package
Profiling means measuring run time of our code line-by-line using realistic inputs in order to identify bottlenecks.
After identifying bottlenecks we experiment with alternatives and find the fastest using a microbenchmark.
We will use the packages profvis and bench for profiling and benchmarking.
It’s tempting to think you just know where the bottlenecks in your code are. I mean, after all, you write it! But trust me, I can’t tell you how many times I’ve been surprised at where exactly my code is spending all its time. The reality is that profiling is better than guessing. — Roger D. Peng
utils::Rprof()
Rprof() is a built-in sampling profiler. It keeps track of the function call stack at regularly sampled intervals.
Note: results are stochastic—we never run a function under the same conditions twice (think memory usage, CPU load, etc.)
Example: profiling a call of replicate().
tmp <- tempfile()
Rprof(tmp, interval = 0.1) # start the profiler
replicate(5, mean(rnorm(1e6)))
Rprof(NULL)                # stop profiling
writeLines(readLines(tmp))
## sample.interval=100000
## "rnorm" "mean" "FUN" "lapply" "sapply" "replicate"
## "rnorm" "mean" "FUN" "lapply" "sapply" "replicate"
## "rnorm" "mean" "FUN" "lapply" "sapply" "replicate"
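The raw stack samples can also be aggregated per function with utils::summaryRprof() (a sketch; the exact numbers vary from run to run):
summaryRprof(tmp) # time per function, "self" vs. "total" (including callees)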
profvis::profvis()
Profile a call of f() and visualise the results using profvis::profvis().
# define some (nested) example functions;
# pause() from profvis halts execution but, unlike Sys.sleep(), shows up in the profile
f <- function() {
  pause(0.1)
  g()
  h()
}
g <- function() {
  pause(0.1)
  h()
}
h <- function() {
  pause(0.1)
}
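To actually run the profiler on f() and open the interactive flame graph (profvis must be loaded so that pause() is available):
library(profvis) # provides pause() and profvis()
profvis(f())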
Example: profiling linear regression
profvis({
  dat <- data.frame(
    x = rnorm(5e4),
    y = rnorm(5e4)
  )
  plot(x ~ y, data = dat)
  m <- lm(x ~ y, data = dat)
  abline(m, col = "red")
})
The following code generates a large number of short-lived objects by copy-on-modification. What does profvis() reveal?
profvis({
  x <- integer()
  for (i in 1:1e4) {
    x <- c(x, i) # each c() call copies the whole vector
  }
})
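For comparison, a preallocated version avoids the repeated copying; a quick microbenchmark (a minimal sketch) shows the difference:
bench::mark(
  grow = {
    x <- integer()
    for (i in 1:1e4) x <- c(x, i)
    x
  },
  prealloc = {
    x <- integer(1e4) # allocate the full vector once
    for (i in 1:1e4) x[i] <- i
    x
  }
)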
Exercise: coercion to another type
Sometimes code statements seem fairly innocent but are very inefficient in terms of both memory and speed.
Can you explain why execution of the third line needs 762.9 MB of memory?
profvis({
  x <- matrix(nrow = 1e4, ncol = 1e4)
  x[1, 1] <- 0
  x[1:3, 1:3]
})
C/C++ (or other compiled) code cannot be profiled. We also cannot profile what happens inside primitive functions, e.g., sum() and sqrt() (these functions are written in C or FORTRAN).
We thus cannot use profiling to see whether the code is slow because something further down the call stack is slow.
Profiling is another good reason to break your code into functions: the profiler can then give useful information about where time is being spent.
"A microbenchmark is a program designed to test a very small snippets of code for a specific task. Microbenchmarks are always artificial and they are not intended to represent normal use."
We usually speak of milliseconds (ms), microseconds (µs), or nanoseconds (ns) here.
Important:
microbenchmarks can rarely be generalised to 'real' code: the observed differences in microbenchmarks will typically be dominated by higher-order effects in real code.
Think of it this way:
A deep understanding of quantum physics is not very helpful when baking cookies.
There are several R packages for microbenchmarking. We will rely on the bench package by Hester (2022), which has the most accurate timer function currently available in R.
The bench package
bench is part of the tidyverse. It uses the highest-precision timing APIs available for each common operating system (often nanosecond-level!).
mark() is the workhorse of the bench package and measures both memory allocation and computation time.
By default, a human-readable statistical summary of the distributions of memory load and timings, based on up to 1e4 iterations, is returned, which also reports on garbage collections.
Benchmarking across a grid of input values is possible with bench::press().
The package also has methods for neat visualisation of the results using ggplot2::autoplot() (the default plot type is beeswarm).
bench::mark()
x <- runif(100)
(lb <- bench::mark(
  sqrt(x),
  x^0.5
))
## # A tibble: 2 × 6
##   expression      min   median `itr/sec` mem_alloc `gc/sec`
##   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
## 1 sqrt(x)       123ns 164.15ns  3232274.      848B        0
## 2 x^0.5         943ns   1.02µs   864867.      848B        0
plot(lb)
bench::mark()
set.seed(42)
dat <- data.frame(
  x = runif(10000, 1, 1000),
  y = runif(10000, 1, 1000)
)
bench::mark(
  dat[dat$x > 500, ],
  dat[which(dat$x > 499), ],
  subset(dat, x > 500)
)
## Error: Each result must equal the first result:
## `dat$x > 500` does not equal `which(dat$x > 499)`
Use check = FALSE to disable checking of consistent results.
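Applied to the example above (a sketch; the mismatch between > 500 and > 499 is deliberate, so the check must be turned off to obtain timings):
bench::mark(
  dat[dat$x > 500, ],
  dat[which(dat$x > 499), ],
  subset(dat, x > 500),
  check = FALSE # results no longer need to match
)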
bench::press()
Sometimes it is useful to benchmark against a parameter grid. This can be done using bench::press().
Benchmark subsetting functions for data frames generated by create_df() for all combinations of rows = c(10000, 100000) and cols = c(10, 100); a sketch follows the function definition below.
create_df <- function(rows, cols) {
  as.data.frame(
    matrix(
      unlist(replicate(cols, runif(rows, 1, 1000), simplify = FALSE)),
      nrow = rows,
      ncol = cols
    )
  )
}
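A minimal sketch of how such a grid benchmark could be set up (the two subsetting expressions are illustrative choices; V1 is the first column name that as.data.frame() assigns here):
results <- bench::press(
  rows = c(10000, 100000),
  cols = c(10, 100),
  {
    dat <- create_df(rows, cols) # rows and cols come from the grid
    bench::mark(
      bracket = dat[dat$V1 > 500, ],
      subset  = subset(dat, V1 > 500)
    )
  }
)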
Instead of using bench::mark(), you could use the built-in function system.time(). It is, however, much less precise, so you need to repeat each operation many times in a loop and then divide to find the average time of each operation, as in the code below.
x <- runif(100)
n <- 1e6
system.time(for (i in 1:n) sqrt(x)) / n
system.time(for (i in 1:n) x^0.5) / n
How do the estimates from system.time() compare to those from bench::mark()? Why are they different?
Here are two other ways to compute the square root of a numeric vector. Which do you think will be fastest? Use microbenchmarking to verify.
x^(1/2)
exp(log(x)/2)
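A quick way to verify (a minimal sketch using bench::mark()):
x <- runif(100)
bench::mark(
  x^(1/2),
  exp(log(x) / 2)
)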
Think of R as both the definition of a language and its implementation.
The R
language
The R
language defines meaning of code statements and how they work (very abstract)
R is an extremely dynamic language, i.e., you have a lot of freedom in modifying objects after they have been created, as the sketch below illustrates.
This dynamism is convenient because it allows you to iterate on your code and alter objects — there's no need to start from scratch when something doesn't work out!
⚡ Changes for improving speed without breaking existing code are problematic.
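A small illustration of this dynamism (a sketch; none of this would be legal in most statically typed languages):
x <- 1:3             # an integer vector
x[2] <- "two"        # the whole vector is silently coerced to character
class(x)             # now "character"

f <- function() 1
body(f) <- quote(2)  # even a function's body can be replaced at run time
f()                  # returns 2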
But:
R is an interpreted language. Its dynamism makes code interpretation relatively time consuming.
x <- 0L
for (i in 1:1e6) {
  x <- x + 1L # each iteration requires a fresh look-up and dispatch of `+`
}
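To get a feeling for this overhead, the loop can be benchmarked against a vectorised equivalent where the iteration happens in compiled code (a sketch):
bench::mark(
  loop = {
    x <- 0L
    for (i in 1:1e6) x <- x + 1L
    x
  },
  vectorised = sum(rep(1L, 1e6)) # same result, loop runs in C
)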
The R Implementation
Processes R code and computes results; commonly base R (GNU-R) from cran.r-project.org.
R (written in FORTRAN, R, and C) is more than 20 years old and was not intended for super fast computations.
🔩 Tweaking for speed improvements is easier (done by the R-core team) but not feasible for end users like us.
Some alternatives address specific issues:
"pretty quick R" pqR (parallelisation of core functions)
Microsoft R Open (some fast extensions for machine learning)
Remember Lexical Scoping:
"Values of variables are searched for in the environment the function belongs to."
If a value is not found in the environment in which a function was defined, the search steps up to the parent environment.
This continues along the sequence of parent environments until the top-level environment (usually the global environment or a package namespace) is reached.
After reaching the top-level environment, the search continues down the search list until we hit the empty environment.
The interpreter asks: which environment does the value of z come from?
f <- function(x, y) {
  x + y / z
}
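Since z is not a formal argument, it must be found via lexical scoping, e.g. in the global environment (a sketch):
z <- 2
f(1, 3) # z is found in the global environment: 1 + 3/2 = 2.5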
# print search list
search()
##  [1] ".GlobalEnv"        "package:emoji"     "package:beeswarm"  "package:icons"
##  [5] "package:tidyr"     "package:dplyr"     "package:tictoc"    "package:parallel"
##  [9] "package:microbenchmark" "tools:rstudio" "package:stats"     "package:graphics"
## [13] "package:grDevices" "package:utils"     "package:datasets"  "package:methods"
## [17] "Autoloads"         "org:r-lib"         "package:base"
The interpreter asks: which environment does the value of z come from?
z <- 1                # global environment
f <- function() {     # f() is the parent environment of g()
  print(z)            # prints 1: z is found in the global environment
  g <- function() {
    z <- 3
    print(z)          # prints 3: the local z inside g()
  }
  z <- 2
  print(z)            # prints 2: z is now defined inside f()
  g()
}
f()
Name look-up is done every time we call print(z)!
Even more problematic: most operations are lexically scoped function calls. This includes + and - but also (sometimes "recklessly" used) operators like ( and {.
(or: a good example of poorly written code)
g <- function(x) x = (1/(1 + x))
h <- function(x) x = ((1/(1 + x)))
i <- function(x) x = (((1/(1 + x))))
x <- sample(1:100, 100, replace = TRUE)
bench::mark(g(x), h(x), i(x))
## # A tibble: 3 x 6
##   expression      min   median `itr/sec` mem_alloc `gc/sec`
##   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
## 1 g(x)          500ns    700ns  1076771.      848B     108.
## 2 h(x)          500ns    700ns  1039037.      848B       0
## 3 i(x)          500ns    700ns   925182.      848B      92.5
f <- function(x, y) {
  (x + y) ^ 2
}
# create an environment holding 26 random values
random_env <- function(parent = globalenv()) {
  letter_list <- setNames(as.list(runif(26)), LETTERS)
  list2env(letter_list, envir = new.env(parent = parent))
}
# assign a new enclosing environment to f
set_env <- function(f, e) {
  environment(f) <- e
  f
}
f2 <- set_env(f, random_env())                # one extra environment
f3 <- set_env(f, random_env(environment(f2))) # two extra environments
f4 <- set_env(f, random_env(environment(f3))) # three extra environments
bench::mark(f(1, 2), f2(1, 2), f3(1, 2), f4(1, 2))
## # A tibble: 4 x 6
##   expression      min   median `itr/sec` mem_alloc `gc/sec`
##   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
## 1 f(1, 2)       300ns    400ns  1568554.        0B       0
## 2 f2(1, 2)      500ns    700ns  1071876.        0B     107.
## 3 f3(1, 2)      500ns    700ns  1149359.        0B       0
## 4 f4(1, 2)      600ns    700ns  1230237.        0B     123.
Each additional environment between f() and the global environment where + and ^ are defined increases computation time.
Remember that function arguments are lazy — they are only evaluated when actually needed.
f1 <- function(a) {
  force(a) # why not simply write 'a'?
  NULL
}
f1(stop("My dog speaks Chinese")) # error: the argument is evaluated
vs.
f1 <- function(a) NULL
f1(stop("My dog speaks Chinese")) # NULL: the promise is never evaluated
Promises are very convenient but should be avoided if speed is crucial.
f1 <- function(a) NULL
f2 <- function(a = 1, b = 2, c = 4, d = 4, e = 5) NULL
bench::mark(f1(), f2())
## # A tibble: 2 × 6
##   expression      min   median `itr/sec` mem_alloc `gc/sec`
##   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
## 1 f1()              0     82ns 10318877.        0B        0
## 2 f2()          123ns    205ns  4135334.        0B        0
Arguments trigger promises each time the function is called. Superfluous arguments thus lead to avoidable overhead.
R is not slow per se — but it is slow compared to other languages:
Speed isn't its strongest suit but accessibility and compatibility are. R was designed to make life easier for you, not for your computer!
More technically: R is an easy-to-use high level programming language. It provides a flexible and extensible toolkit for data analysis and statistics.
We are (mostly) happy to accept the slower speed for the time saved from not having to reinvent the wheel each time we want to implement a new function that, e.g., computes a test statistic.
(As of 2022-06-20 there are 18244 packages available on CRAN, with many useful functions waiting to be discovered by you!)
We cannot overcome the idiosyncrasies of R described in this section. However, there are a few tricks to keep in mind if you want to write performant code. We'll discuss these in the next lecture.