Hi all,
I'm looking for a function with the same functionality as Hmisc::summarize
that accepts a data frame as input (not just a vector or a matrix).
I'd like to compute the correlation between two variables in my data frame,
grouped by other variables in the same data frame.
For example, consider the following data frame D:
V1 V2 V3
A 1 -1
A 1 1
A -1 -1
B 1 1
B 1 1
I'd like to use Hmisc::summarize(X=D, llist(myvar=D$V1), FUN=corr.V2.V3)
where corr.V2.V3 is defined as follows:
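corr.V2.V3 <- function(x) {
  # x: the rows of D belonging to one group
  d <- as.data.frame(x)
  # Correlation between V2 and V3 within the group, returned as a named value
  out <- c(CORR = cor(d$V2, d$V3))
  return(out)
}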
As I was not able to use Hmisc::summarize in this case, I wrote a
function using sapply instead (to construct the output data frame), but this
is much less convenient.
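A minimal sketch of such a split/sapply workaround, using base R only. The
helper name summarize.df and its arguments are made up for this illustration
and are not part of Hmisc; the per-group function is a variant of corr.V2.V3
that receives a data frame directly.

```r
# Example data from the post
D <- data.frame(
  V1 = c("A", "A", "A", "B", "B"),
  V2 = c(1, 1, -1, 1, 1),
  V3 = c(-1, 1, -1, 1, 1)
)

# Hypothetical helper (not an Hmisc function): split the data frame by one
# or more grouping columns and apply FUN to each sub-data-frame.
summarize.df <- function(data, by, FUN) {
  pieces <- split(data, data[by], drop = TRUE)
  values <- sapply(pieces, FUN)
  data.frame(group = names(pieces), CORR = unname(values), row.names = NULL)
}

# Per-group statistic: correlation between V2 and V3.
corr.V2.V3 <- function(d) {
  # cor() warns and returns NA when a column is constant within a group,
  # so the warning is suppressed here and the NA is kept.
  c(CORR = suppressWarnings(cor(d$V2, d$V3)))
}

res <- summarize.df(D, "V1", corr.V2.V3)
print(res)
```

With the data above, group A gives a correlation of 0.5, while group B gives
NA because V2 and V3 are both constant within that group.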
Thanks in advance,
Arnaud