Function Arguments
On Sat, 2005-03-26 at 15:43 -0500, Doran, Harold wrote:
Hello,
I am trying to wrap some code that I repeatedly use into a function
for efficiency. The following is a toy example simply to illustrate
the problem.
foobar.fun<-function(data,idvar,dv){
id.list<-unique(idvar)
result<-numeric(0)
for (i in id.list){
tmp1<-subset(data, idvar == i)
result[i]<-mean(get("tmp1")[[dv]])
}
return(result)
}
The issue is that when the variable 'dv' is replaced by the name of
the actual variable in the dataframe the function works as expected.
However, when 'dv' is used the function does not identify this as a
variable, even though it is one of the function arguments and the
function fails.
How can function arguments be passed to a loop in such cases?
Thank you,
Harold
Harold,
Perhaps I am being confused by your example code, which can all be
replaced by:
tapply(data$dv, list(data$idvar), mean)
Using the 'warpbreaks' data in ?tapply, get the mean of 'breaks' for
each level of 'tension':
tapply(warpbreaks$breaks, list(warpbreaks$tension), mean)
       L        M        H
36.38889 26.38889 21.66667
Of course, 'mean' can be replaced by a more complex function call and
additional arguments.
Or you can use by():
by(warpbreaks$breaks, warpbreaks$tension, mean)
INDICES: L
[1] 36.38889
------------------------------------------------------
INDICES: M
[1] 26.38889
------------------------------------------------------
INDICES: H
[1] 21.66667
Or you can use split() on the 'breaks' column first, followed by sapply():
# split 'breaks' into a list of 3 numeric vectors, one for each
# value of tension
warp.s <- split(warpbreaks$breaks, warpbreaks$tension)
# now use sapply to get the mean of breaks in each group:
sapply(warp.s, mean)
       L        M        H
36.38889 26.38889 21.66667
Or even:
aggregate(warpbreaks$breaks, list(Tension = warpbreaks$tension), mean)
  Tension        x
1       L 36.38889
2       M 26.38889
3       H 21.66667
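As a quick sanity check on the alternatives above, the following sketch confirms that tapply(), split() plus sapply(), and aggregate() give the same group means, and shows how an extra argument is passed through to the summary function. The object names (m.tapply, m.sapply, m.agg, m.trim) are made up for this illustration, and the 10% trimmed mean is just one possible example of an "additional argument".

```r
# Group means of 'breaks' by 'tension', computed three of the ways shown above
m.tapply <- tapply(warpbreaks$breaks, warpbreaks$tension, mean)
m.sapply <- sapply(split(warpbreaks$breaks, warpbreaks$tension), mean)
m.agg    <- aggregate(warpbreaks$breaks,
                      list(Tension = warpbreaks$tension), mean)

# Both comparisons should be TRUE: same values, in the same L, M, H order
isTRUE(all.equal(as.vector(m.tapply), as.vector(m.sapply)))
isTRUE(all.equal(as.vector(m.tapply), m.agg$x))

# Arguments given after the function are passed on to it,
# e.g. trim = 0.1 is handed to mean() for a 10% trimmed mean
m.trim <- tapply(warpbreaks$breaks, warpbreaks$tension, mean, trim = 0.1)
```

The three approaches differ mainly in the shape of the result: tapply() and sapply() return a named vector (or array), while aggregate() returns a data frame.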
However, presuming that your actual code is rather different and the key
is that you are really having problems referencing the column elements
in your data frame, the line:
result[i]<-mean(get("tmp1")[[dv]])
would require that you pass the argument 'dv' as a character variable in
the original function call, such as:
foobar.fun(..., ..., dv = "VectorName")
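A minimal sketch of how the loop could be written so that it works with column names passed as character strings. This is not your original function: the name group.means is hypothetical, and passing 'idvar' as a string too is an assumption made here so that both columns are looked up the same way.

```r
# Sketch only: 'group.means' is a hypothetical name, and both 'idvar'
# and 'dv' are assumed to be character strings naming columns of 'data'
group.means <- function(data, idvar, dv) {
  ids <- unique(as.character(data[[idvar]]))
  result <- numeric(0)
  for (i in ids) {
    # keep only the rows belonging to group i
    tmp <- data[data[[idvar]] == i, , drop = FALSE]
    # dv holds a string, so [[ ]] finds the column of that name
    result[i] <- mean(tmp[[dv]])
  }
  result
}

group.means(warpbreaks, idvar = "tension", dv = "breaks")
```

Called this way, it should return the same three group means as the tapply() call on warpbreaks, named L, M and H.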
When extracting a data frame column or list element using '[' or '[[',
the index value(s) must be either numeric or character; an unquoted
column name is instead looked up as an R object.
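This matters inside a function because the argument is itself a variable holding the name. A small sketch, where the variable name 'col' is made up for illustration:

```r
# 'col' is a hypothetical variable holding a column name as a string
col <- "breaks"

# '[[' evaluates 'col' and uses its value, so this finds the column
by.name <- warpbreaks[[col]]
identical(by.name, warpbreaks$breaks)

# '$' does not evaluate its argument: it looks for a column literally
# named "col", which does not exist in warpbreaks, so the result is NULL
is.null(warpbreaks$col)
```

Both expressions should evaluate to TRUE, which is why '[[' (or '[') rather than '$' is the tool to use with a function argument.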
So, again using the warpbreaks data to get the breaks column:
warpbreaks$breaks
 [1] 26 30 54 25 70 52 51 26 67 18 21 29 17 12 18 35 30 36 36 21 24 18
[23] 10 43 28 15 26 27 14 29 19 29 31 41 20 44 42 26 19 16 39 28 21 39
[45] 29 20 21 24 17 13 15 15 16 28
warpbreaks[["breaks"]]
 [1] 26 30 54 25 70 52 51 26 67 18 21 29 17 12 18 35 30 36 36 21 24 18
[23] 10 43 28 15 26 27 14 29 19 29 31 41 20 44 42 26 19 16 39 28 21 39
[45] 29 20 21 24 17 13 15 15 16 28
warpbreaks[[1]]
 [1] 26 30 54 25 70 52 51 26 67 18 21 29 17 12 18 35 30 36 36 21 24 18
[23] 10 43 28 15 26 27 14 29 19 29 31 41 20 44 42 26 19 16 39 28 21 39
[45] 29 20 21 24 17 13 15 15 16 28
warpbreaks[, "breaks"]
 [1] 26 30 54 25 70 52 51 26 67 18 21 29 17 12 18 35 30 36 36 21 24 18
[23] 10 43 28 15 26 27 14 29 19 29 31 41 20 44 42 26 19 16 39 28 21 39
[45] 29 20 21 24 17 13 15 15 16 28
However:
warpbreaks[[breaks]]
Error in (function(x, i) if (is.matrix(i)) as.matrix(x)[[i]] else .subset2(x, :
        Object "breaks" not found
or
warpbreaks[, breaks]
Error in "[.data.frame"(warpbreaks, , breaks) :
        Object "breaks" not found
HTH,
Marc Schwartz
(Will be away from e-mail for a while)