On Sat, 2005-03-26 at 15:43 -0500, Doran, Harold wrote:
> Hello,
>
> I am trying to wrap some code that I repeatedly use into a function
> for efficiency. The following is a toy example simply to illustrate
> the problem.
>
> foobar.fun<-function(data,idvar,dv){
> id.list<-unique(idvar)
> result<-numeric(0)
> for (i in id.list){
> tmp1<-subset(data, idvar == i)
> result[i]<-mean(get("tmp1")[[dv]])
> }
> return(result)
> }
>
> The issue is that when the variable 'dv' is replaced by the name of
> the actual variable in the dataframe the function works as expected.
> However, when 'dv' is used the function does not identify this as a
> variable, even though it is one of the function arguments and the
> function fails.
>
> How can function arguments be passed to a loop in such cases?
>
> Thank you,
> Harold
Harold,
Perhaps I am being confused by your example code, which can all be
replaced by:
tapply(data$dv, list(data$idvar), mean)
Using the 'warpbreaks' data in ?tapply, get the mean of 'breaks' for
each level of 'tension':
> tapply(warpbreaks$breaks, list(warpbreaks$tension), mean)
       L        M        H
36.38889 26.38889 21.66667
Of course, 'mean' can be replaced by a more complex function call
and additional arguments.
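As a small sketch of that point: arguments given after the function name are passed through to it by tapply() for each group. The choice of 'trim = 0.1' and of the anonymous function below is only for illustration, and the object names 'trimmed' and 'spread' are made up for this example:

```r
# Additional arguments after FUN are passed on to FUN for each group.
# Here: a 10% trimmed mean of 'breaks' per level of 'tension'.
trimmed <- tapply(warpbreaks$breaks, warpbreaks$tension, mean, trim = 0.1)

# An anonymous function works as well, e.g. the width of the range
# of 'breaks' within each level of 'tension'.
spread <- tapply(warpbreaks$breaks, warpbreaks$tension,
                 function(x) diff(range(x)))

print(trimmed)
print(spread)
```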
Or you can use by():
> by(warpbreaks$breaks, warpbreaks$tension, mean)
INDICES: L
[1] 36.38889
------------------------------------------------------
INDICES: M
[1] 26.38889
------------------------------------------------------
INDICES: H
[1] 21.66667
Or you can use split() on the column first, followed by sapply():
# split 'breaks' into a list of 3 numeric vectors by the value of
# tension
> warp.s <- split(warpbreaks$breaks, warpbreaks$tension)
# now use sapply to get the mean of breaks in each vector:
> sapply(warp.s, mean)
       L        M        H
36.38889 26.38889 21.66667
Or even:
> aggregate(warpbreaks$breaks, list(Tension = warpbreaks$tension), mean)
  Tension        x
1       L 36.38889
2       M 26.38889
3       H 21.66667
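For what it is worth, R versions that provide the formula method of aggregate() allow the same summary to be written with the column names directly, which also keeps them in the result. This is only a sketch under that assumption, and 'agg' is a name made up for the example:

```r
# Formula interface: response ~ grouping variable, looked up in 'data'.
# Assumes an R version in which aggregate() has a formula method.
agg <- aggregate(breaks ~ tension, data = warpbreaks, FUN = mean)
print(agg)
```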
However, presuming that your actual code is rather different and the key
is that you are really having problems referencing the column elements
in your data frame, the line:
result[i]<-mean(get("tmp1")[[dv]])
would require that you pass the argument 'dv' as a character variable in
the original function call, such as:
foobar.fun(..., ..., dv = "VectorName")
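A minimal sketch of that idea, assuming both the grouping column and the value column are passed as character strings. The function name 'group.means' is hypothetical, and tapply() is swapped in for the loop in the toy example:

```r
# Hypothetical rewrite of the toy function: 'idvar' and 'dv' are
# column names passed as character strings.
group.means <- function(data, idvar, dv) {
  # '[[' accepts a character string held in a variable, so the
  # function arguments can be used directly as column indices.
  tapply(data[[dv]], data[[idvar]], mean)
}

res <- group.means(warpbreaks, idvar = "tension", dv = "breaks")
print(res)
```

Called this way, the result should match the tapply() output for the 'warpbreaks' data.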
When extracting a data frame column or list element using '[' or '[[',
the index value(s) must be either numeric or character. An unquoted
name is evaluated as an R object, so it must exist and hold such a value.
So, again using the warpbreaks data to get the breaks column:
> warpbreaks$breaks
[1] 26 30 54 25 70 52 51 26 67 18 21 29 17 12 18 35 30 36 36 21 24 18
[23] 10 43 28 15 26 27 14 29 19 29 31 41 20 44 42 26 19 16 39 28 21 39
[45] 29 20 21 24 17 13 15 15 16 28
> warpbreaks[["breaks"]]
[1] 26 30 54 25 70 52 51 26 67 18 21 29 17 12 18 35 30 36 36 21 24 18
[23] 10 43 28 15 26 27 14 29 19 29 31 41 20 44 42 26 19 16 39 28 21 39
[45] 29 20 21 24 17 13 15 15 16 28
> warpbreaks[[1]]
[1] 26 30 54 25 70 52 51 26 67 18 21 29 17 12 18 35 30 36 36 21 24 18
[23] 10 43 28 15 26 27 14 29 19 29 31 41 20 44 42 26 19 16 39 28 21 39
[45] 29 20 21 24 17 13 15 15 16 28
> warpbreaks[, "breaks"]
[1] 26 30 54 25 70 52 51 26 67 18 21 29 17 12 18 35 30 36 36 21 24 18
[23] 10 43 28 15 26 27 14 29 19 29 31 41 20 44 42 26 19 16 39 28 21 39
[45] 29 20 21 24 17 13 15 15 16 28
However:
> warpbreaks[[breaks]]
Error in (function(x, i) if (is.matrix(i)) as.matrix(x)[[i]] else .subset2(x, :
        Object "breaks" not found
or
> warpbreaks[, breaks]
Error in "[.data.frame"(warpbreaks, , breaks) :
Object "breaks" not found
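Both errors arise because the unquoted name is looked up as an R object, and no object called 'breaks' exists. If a variable holding the column name does exist, the same forms work. A short sketch, where 'col.name' is just an illustrative variable name:

```r
# An unquoted name inside '[[' or '[' is evaluated as a variable.
col.name <- "breaks"           # hypothetical variable holding the column name

a <- warpbreaks[[col.name]]    # equivalent to warpbreaks[["breaks"]]
b <- warpbreaks[, col.name]    # equivalent to warpbreaks[, "breaks"]

# Both forms return the same vector as warpbreaks$breaks.
print(identical(a, b))
print(identical(a, warpbreaks$breaks))
```

This is the situation inside a function when 'dv' is passed as a character string.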
HTH,
Marc Schwartz
(Will be away from e-mail for a while)