Hello, fellow R-users.

Let me describe the setup first. I have a data.frame, a sample of which is reported below:

        Company.Name            Periods     Returns  MFR.Factor
    350 Wartsila Oyj A          1996-07-31     6.82        0.02
    351 Custodia Holding AG     1996-07-31     4.15       -0.02
    352 Wartsila Oyj            1996-07-31     7.73        0.09
    353 GEA Group AG            1996-07-31    10.12        0.04
    354 LEGRAND ORD             1996-07-31    -7.46       -0.20
    355 Mayr-Melnhof Karton AG  1996-07-31     4.71       -0.05
    356 GEVAERT NPV             1996-08-30       NA          NA
    357 NOKIA K FMA2.50         1996-08-30     7.65        0.03
    358 Altadis S.A.            1996-08-30     7.65        0.55
    359 Metrovacesa S.A.        1996-08-30     4.55       -0.17
    360 Oce N.V.                1996-08-30     9.43        0.23

The variable "Periods" is a Date object that identifies the month. The variables "Returns" and "MFR.Factor" are numeric. The number of observations varies from month to month: sometimes it is 350, sometimes 320, etc.

What I need is to run cor.test(Returns, MFR.Factor, ...) for each month and produce a data frame with the columns "Period", "cor.estimate" and "p.value".

The simplest way would be tapply() with "Periods" as the grouping factor, but tapply() passes only a single vector to FUN, whereas cor.test() needs both "Returns" and "MFR.Factor".

What is the most painless way to achieve my objective?

Thank you in advance for your help!

Best,
Sergey
?by

On 2/22/07, Sergey Goriatchev <sergeyg@gmail.com> wrote:
> What is the most painless way to achieve my objective?

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
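To make the pointer to ?by concrete, here is a minimal sketch of how by() might be applied to this problem. The data are simulated and purely hypothetical (only the column names follow the original post), and the names dat, res and out are made up for illustration. Unlike tapply(), by() hands the whole sub-data-frame for each period to the function, so both columns are available to cor.test().

```r
# Hypothetical sample data; only the column names come from the original post.
set.seed(1)
dat <- data.frame(
  Periods    = as.Date(rep(c("1996-07-31", "1996-08-30"), each = 20)),
  Returns    = rnorm(40),
  MFR.Factor = runif(40)
)

# by() splits the data frame by period and passes each piece to the function.
res <- by(dat, dat$Periods, function(d) {
  ct <- cor.test(d$Returns, d$MFR.Factor)  # Pearson correlation by default
  c(cor.estimate = unname(ct$estimate), p.value = ct$p.value)
})

# Stack the per-period vectors into a matrix, one row per period.
out <- do.call(rbind, res)
out
```

The result is a matrix whose row names are the periods; it can be turned into a data frame with as.data.frame() if needed.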
One approach is the following:

    # simulated data: three months with 15 observations each
    dat <- data.frame(
        Period = as.Date(rep(c("1996-07-31", "1996-08-31", "1996-09-30"), each = 15)),
        Returns = rnorm(45),
        MFR.Factor = runif(45)
    )

    ###########

    # split the two numeric columns by month, run cor.test() on each piece,
    # and bind the results into a matrix with one row per month
    do.call(rbind, lapply(split(dat[c("Returns", "MFR.Factor")], dat$Period),
        function (x) {
            cr <- cor.test(x$Returns, x$MFR.Factor, method = "spearman")
            # cr$estimate is itself named "rho", so the column becomes "estimate.rho"
            c("estimate" = cr$estimate, "p.value" = cr$p.value)
        }))

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message -----
From: "Sergey Goriatchev" <sergeyg at gmail.com>
To: <r-help at stat.math.ethz.ch>
Sent: Thursday, February 22, 2007 2:35 PM
Subject: [R] Combining tapply() and cor.test()?
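The split()/lapply() approach above returns a matrix with the periods as row names, while the original question asked for a data frame with the columns "Period", "cor.estimate" and "p.value". A minimal sketch of one way to get that shape follows. The data are again simulated and hypothetical, the names pieces and result are made up for illustration, and the default Pearson method is an assumption (pass method = "spearman" to match the reply above). One row is set to NA to mimic the NA row in the posted sample; cor.test() drops incomplete pairs on its own.

```r
# Hypothetical sample data; only the column names come from the original post.
set.seed(42)
dat <- data.frame(
  Periods    = as.Date(rep(c("1996-07-31", "1996-08-30", "1996-09-30"), each = 15)),
  Returns    = rnorm(45),
  MFR.Factor = runif(45)
)
dat[16, c("Returns", "MFR.Factor")] <- NA  # mimic the NA row in the posted sample

# One single-row data frame per period.
pieces <- lapply(split(dat, dat$Periods), function(d) {
  ct <- cor.test(d$Returns, d$MFR.Factor)  # incomplete pairs are dropped
  data.frame(cor.estimate = unname(ct$estimate), p.value = ct$p.value)
})

# Bind the rows and turn the list names back into a Date column.
result <- data.frame(
  Period = as.Date(names(pieces)),
  do.call(rbind, pieces)
)
rownames(result) <- NULL
result
```

Building a small data frame per group keeps the column types intact when the pieces are bound together, at the cost of being somewhat slower than binding plain numeric vectors.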