R and closures: a potential problem


R is an environment for exploratory data analysis that is becoming increasingly popular. It is also used for analysis of data in high-impact scientific publications, highlighting the importance of correctness of R programs.

Here is an interesting example I came up with showing that R’s implementation of closures is somewhat strange.

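# h1 returns a 0-ary function that calls f(...) without ever touching '...' itself
h1 = function(f, ...) function() f(...)
# h2 does the same, except that list(...) evaluates the arguments first
h2 = function(f, ...) {list(...); function() f(...)}
g = function(x = 1) x
# create each function and call it immediately
x = c(h1(g, 1)(), h1(g, 2)(), h1(g, 3)())
y = c(h2(g, 1)(), h2(g, 2)(), h2(g, 3)())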

The two functions h1 and h2 essentially bind the formal arguments of g to particular values (this trick is useful if you want to evaluate a function for a large number of parameter choices by just creating a list of 0-ary functions, each of which represents one function call with one parameter setting). After running the above code, x and y are both 1:3, as expected. But if we create the functions first and only call them afterwards:

l1 = sapply(1:3, function(i) h1(g, i))
l2 = sapply(1:3, function(i) h2(g, i))

x = sapply(l1, function(f) f())
y = sapply(l2, function(f) f())

We get that y == 1:3, just as we expect, but x == c(3,3,3). This has to do with how R evaluates the arguments captured by a closure. A closure is really a function plus the environment from which it gets its variables, and the arguments passed to a function are what R calls promises: unevaluated expressions that are only evaluated the first time their value is needed. The '...' parameter list is essentially a collection of such promises. The 'list(...)' statement in h2 forces the evaluation of this parameter list, and everything works as it should. However, this is not done in h1, so the promises are evaluated only when the generated functions are finally called, and by then all of them see the last incarnation of the parameters. Herein lies the problem, as in my opinion h1 should exhibit the behavior of h2.
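The same effect can be reproduced without '...' at all. Below is a minimal sketch using a plain for loop and base R's force(); the names make_lazy and make_forced are made up for this illustration. A for loop is used because, as far as I know, the apply family has forced the arguments of the functions it applies since R 3.2.0, so the sapply version above may no longer show the problem on a recent R.

```r
# make_lazy never touches v, so v stays an unevaluated promise for the expression `i`
make_lazy <- function(v) function() v

# make_forced evaluates v right away, before the inner function is returned
make_forced <- function(v) {
  force(v)
  function() v
}

lazy <- list()
forced <- list()
for (i in 1:3) {
  lazy[[i]] <- make_lazy(i)
  forced[[i]] <- make_forced(i)
}

# The loop has finished, so i is 3 by the time the lazy promises are evaluated
lazy_vals <- sapply(lazy, function(f) f())      # 3 3 3
forced_vals <- sapply(forced, function(f) f())  # 1 2 3
```

Calling force(v) plays the same role as list(...) in h2: it evaluates the promise while the loop variable still has the intended value.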

If one is used to programming in an imperative style, the above examples might seem esoteric, but programmers used to functional languages use closures actively. An example of how closures can be used to take care of “housekeeping” can be seen here, where a fuzzy inference engine is implemented in 20 lines of code.
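As a small illustration of the parameter-binding trick mentioned earlier, here is a sketch of a list of 0-ary functions, each representing one call with one parameter setting. The helper bind and the toy function power are hypothetical names, and unlike h2 this variant stores the evaluated arguments in a list and replays them with do.call, which is just one possible way to avoid the lazy-evaluation surprise.

```r
# bind evaluates its arguments immediately and returns a 0-ary function
bind <- function(f, ...) {
  args <- list(...)            # evaluates '...' now and keeps the values
  function() do.call(f, args)  # replay the stored arguments on demand
}

# Toy function standing in for an expensive computation
power <- function(base, exponent) base^exponent

# One 0-ary function per parameter setting
tasks <- list(
  bind(power, 2, 3),
  bind(power, 3, 2),
  bind(power, base = 10, exponent = 0)
)

# Run all the deferred calls
results <- vapply(tasks, function(task) task(), numeric(1))  # 8 9 1
```

Because the values are stored when bind is called, it does not matter when or in which order the tasks are eventually run.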