Friday, June 21, 2019

How do you code an R function so that it 'knows' to look in 'data' for the variables in other arguments?



If you run:



mod <- lm(mpg ~ factor(cyl), data=mtcars)



It runs, because lm knows to look in mtcars to find both mpg and cyl.



Yet mean(mpg) fails because it can't find mpg, so you have to write mean(mtcars$mpg).



How do you code a function so that it knows to look in 'data' for the variables?



myfun <- function(a, b, data) {
  return(a + b)
}


This will work with:



myfun(mtcars$mpg, mtcars$hp)


but will fail with:




myfun(mpg,hp, data=mtcars )


Cheers


Answer



Here's how I would code myfun():



myfun <- function(a, b, data) {
  eval(substitute(a + b), envir = data, enclos = parent.frame())
}


myfun(mpg, hp, mtcars)
# [1] 131.0 131.0 115.8 131.4 193.7 123.1 259.3 86.4 117.8 142.2 140.8 196.4
# [13] 197.3 195.2 215.4 225.4 244.7 98.4 82.4 98.9 118.5 165.5 165.2 258.3
# [25] 194.2 93.3 117.0 143.4 279.8 194.7 350.0 130.4


If you're familiar with with(), it's interesting to see that it works in almost exactly the same way:



> with.default
# function (data, expr, ...)
# eval(substitute(expr), data, enclos = parent.frame())
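To check the comparison concretely, here is a small sketch that runs the same computation through myfun() (copied from the answer above) and through with(). The names via_myfun and via_with are made up for this illustration.

```r
# myfun() as defined in the answer above
myfun <- function(a, b, data) {
  eval(substitute(a + b), envir = data, enclos = parent.frame())
}

# Same computation, once through myfun() and once through with()
via_myfun <- myfun(mpg, hp, mtcars)
via_with  <- with(mtcars, mpg + hp)

# Both evaluate `mpg + hp` with mtcars supplying the variables
all.equal(via_myfun, via_with)
```

The only real difference is that with() takes the whole expression from the caller, while myfun() hard-codes the `+` and takes just the two operands.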


In both cases, the key idea is to first create an expression from the symbols passed in as arguments and then evaluate that expression using data as the 'environment' of the evaluation.
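As a rough illustration of those two steps, the sketch below splits myfun() into its substitute() and eval() halves. The function name myfun_steps and the intermediate variable expr are hypothetical, introduced here only to make the steps visible.

```r
# Hypothetical two-step version of myfun(), for illustration only
myfun_steps <- function(a, b, data) {
  # Step 1: build an unevaluated call from the argument expressions.
  # For myfun_steps(mpg, hp, mtcars), expr is the call `mpg + hp`.
  expr <- substitute(a + b)

  # Step 2: evaluate that call, looking names up in `data` first and
  # then in the caller's environment.
  eval(expr, envir = data, enclos = parent.frame())
}

res <- myfun_steps(mpg, hp, mtcars)
head(res, 3)
# [1] 131.0 131.0 115.8
```

Note that mpg and hp are never evaluated as ordinary arguments; substitute() captures them as symbols before R has any reason to go looking for them.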



The first part (e.g. turning a + b into the expression mpg + hp) is possible thanks to substitute(). The second part is possible because eval() accepts a data.frame (or a list) as its evaluation environment. Any name it doesn't find there is looked up in enclos, which here is the caller's environment.
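A minimal sketch of that lookup order, assuming a made-up helper eval_in() and a made-up variable k that lives outside the data: names found in data come from the data, and everything else falls through to the caller's environment.

```r
# Hypothetical helper: evaluate an arbitrary expression inside `data`
eval_in <- function(expr, data) {
  eval(substitute(expr), envir = data, enclos = parent.frame())
}

# `k` is not a column of mtcars, so it is found via `enclos`
k <- 2
doubled <- eval_in(mpg * k, mtcars)

# A plain list works as the evaluation "environment" too
from_list <- eval_in(x * k, list(x = 1:3))
from_list
# [1] 2 4 6
```

If a name exists both in data and in the caller's environment, the copy in data is used, which is usually what you want but can hide a typo in a column name.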

