thr3ads.net - R help - [R] problem with FUN in Hmisc::summarize [Apr 2010]


arnaud chozo

2010-Apr-16 13:21 UTC

[R] problem with FUN in Hmisc::summarize

Hi all,

I'd like to use the Hmisc::summarize function, but it uses a function (FUN)
of a single vector argument to create the statistical summaries.

Consider an easy case: I'd like to compute the correlation between two
variables in my dataframe, grouped according to other variables in the same
dataframe.

For example, consider the following dataframe D:
V1  V2   V3
A     1    -1
A     1     1
A    -1    -1
B     1     1
B     1     1

I'd like to use Hmisc::summarize(X=D, by=llist(myvar=D$V1), FUN=corr.V2.V3)

where corr.V2.V3 is defined as follows:

corr.V2.V3 = function(x) {
  d = cbind(x$V2, x$V3)

  out = c(cor(d))
  names(out) = c("CORR")
  return(out)
}

I was not able to use Hmisc::summarize in this case because FUN should be a
function of a matrix argument. Any idea?

Thanks in advance,
Arnaud


Ista Zahn

2010-Apr-16 14:40 UTC


[R] problem with FUN in Hmisc::summarize

Hi Arnaud,
I'm not sure how to do this with Hmisc::summarize, but it's pretty easy
with plyr::ddply:

D <- read.table(textConnection("V1  V2   V3
A     1    -1
A     1     1
A    -1    -1
B     1     1
B     1     1"), header=TRUE)
closeAllConnections()

corr.V2.V3 = function(x) {
 out = cor(x$V2, x$V3)
 names(out) = "CORR"
 return(out)
}

library(plyr)

ddply(D, .(V1), corr.V2.V3)

-Ista


-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
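
For readers without plyr, the ddply call above can be approximated in base R. This is only a sketch of the same split-apply-combine idea, not what ddply does internally; the names pieces, corr_by_group and result are made up for illustration. With the example data, group B has constant V2 and V3, so its correlation is undefined and cor() returns NA.

```r
# Same data as in the thread, built directly instead of via a text connection.
D <- data.frame(
  V1 = c("A", "A", "A", "B", "B"),
  V2 = c(1, 1, -1, 1, 1),
  V3 = c(-1, 1, -1, 1, 1)
)

# Split the rows by V1, then compute one correlation per piece.
pieces <- split(D, D$V1)
corr_by_group <- sapply(pieces, function(x) {
  # cor() warns and returns NA when a group has zero variance (group B here),
  # so the warning is silenced for this illustration.
  suppressWarnings(cor(x$V2, x$V3))
})

# Combine the per-group values into one row per group.
result <- data.frame(V1 = names(corr_by_group), CORR = unname(corr_by_group))
print(result)
```

Each piece is a full data frame, so the function can use as many columns as it needs.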


hadley wickham

2010-Apr-16 14:54 UTC


[R] problem with FUN in Hmisc::summarize

> corr.V2.V3 = function(x) {
>  out = cor(x$V2, x$V3)
>  names(out) = "CORR"
>  return(out)
> }

A little more concisely:

corr.V2.V3 = function(x) {
 c(CORR = cor(x$V2, x$V3))
}


-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

Frank E Harrell Jr

2010-Apr-16 19:04 UTC


[R] problem with FUN in Hmisc::summarize

See the Hmisc mApply or summary.formula functions, or use tapply using a 
vector of possible subscripts (1:n) as the first argument; then you can 
use the subscripts selected to address multiple variables.

Frank

-- 
Frank E Harrell Jr   Professor and Chairman        School of Medicine
                      Department of Biostatistics   Vanderbilt University
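
The tapply suggestion above can be sketched as follows, using base R only. The row subscripts 1:n are grouped by V1, and the function then uses each group's subscripts to pull the matching rows of V2 and V3. The names idx and corr_by_group are illustrative and not from the thread. Group B has constant values, so cor() returns NA for it.

```r
# Same data as in the original question.
D <- data.frame(
  V1 = c("A", "A", "A", "B", "B"),
  V2 = c(1, 1, -1, 1, 1),
  V3 = c(-1, 1, -1, 1, 1)
)

# First argument is the vector of row subscripts, not a data column.
idx <- seq_len(nrow(D))

# For each level of V1, tapply passes the subscripts of that group's rows;
# they are used to index both V2 and V3.
corr_by_group <- tapply(idx, D$V1, function(i) {
  # Warning silenced: cor() returns NA when a group has zero variance.
  suppressWarnings(cor(D$V2[i], D$V3[i]))
})

print(corr_by_group)
```

Because the function only receives subscripts, it can address any number of columns of D, which is what a single-vector FUN cannot do directly.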
