Pat Schmitz
2009-Jul-30 08:19 UTC
[R] What is the best method to produce means by categorical factors?
I am attempting to replicate some of my experience from SAS in R, and I assume there are best methods for using a combination of summary(), subset(), and which() to produce mean values by categorical or ordinal factors.

Within SAS I would write:

    proc means mean data=dataset;
      class factor1 factor2;
      var variable1 variable2;
    RUN;

producing an output with means for each variable by factor groupings, as below:

    factor1   factor2      obs   variable    mean
    Level A   treatmentA   3     variable1   10
                                 variable2   22
              treatmentB   3     variable1   12
                                 variable2   30
    Level B   treatmentA   3     variable1   10
                                 variable2   22
              treatmentB   3     variable1   12
                                 variable2   30

What is the best way to go about this in R?

--
Patrick Schmitz
Graduate Student
Plant Biology
1206 West Gregory Drive
RM 1500
Petr PIKAL
2009-Jul-30 08:35 UTC
[R] Re: What is the best method to produce means by categorical factors?
Hi,

r-help-bounces at r-project.org wrote on 30.07.2009 10:19:21:

> What is the best way to go about this in R?

See ?aggregate, ?by, ?tapply and maybe also the doBy and plyr packages.

Something like

    aggregate(dataset[, c("variable1", "variable2")],
              list(factor1 = dataset$factor1, factor2 = dataset$factor2), mean)

Best regards
Petr
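As a rough illustration of the aggregate() and tapply() routes mentioned above, here is a minimal sketch. The data frame `dataset` and its values are made up for the example (they are not from the original post); only the column names follow the question.

```r
# Hypothetical data (assumption): 2 x 2 factor layout with 3 replicates per cell
set.seed(1)
dataset <- expand.grid(factor1 = c("Level A", "Level B"),
                       factor2 = c("treatmentA", "treatmentB"),
                       rep = 1:3)
dataset$variable1 <- rnorm(nrow(dataset), mean = 10)
dataset$variable2 <- rnorm(nrow(dataset), mean = 25)

# Formula interface of aggregate(): means of both variables
# for every factor1 x factor2 combination, returned as a data frame
means <- aggregate(cbind(variable1, variable2) ~ factor1 + factor2,
                   data = dataset, FUN = mean)
print(means)

# tapply() handles one variable at a time and returns
# a factor1 x factor2 matrix of means
v1_means <- tapply(dataset$variable1,
                   list(dataset$factor1, dataset$factor2), mean)
print(v1_means)
```

aggregate() is probably the closer match to the SAS output, since it keeps the grouping factors as columns; tapply() is handy when a compact cross-table of a single variable is enough.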
ONKELINX, Thierry
2009-Jul-30 08:39 UTC
[R] What is the best method to produce means by categorical factors?
Dear Pat,

Have a look at recast from the reshape package.

    library(reshape)
    # Example data: 2 x 2 factor combinations with 3 replicates each
    dataset <- expand.grid(factor1 = c("A", "B"), factor2 = c("C", "D"), Rep = 1:3)
    dataset$variable1 <- rnorm(nrow(dataset))
    dataset$variable2 <- rnorm(nrow(dataset), mean = 10)
    # Melt to long format, then average each variable within factor1 x factor2
    recast(factor1 + factor2 + variable ~ ., data = dataset,
           id.var = c("factor1", "factor2", "Rep"), fun = mean)

HTH,

Thierry

ir. Thierry Onkelinx
Research Institute for Nature and Forest (Instituut voor natuur- en bosonderzoek)
Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
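For readers without the reshape package, roughly the same long-format result can be sketched in base R. This is an assumption-laden illustration rather than what recast does internally: the helper data frame `long` and the column name `value` are introduced here for the example.

```r
# Same kind of example data as above (values are random, for illustration only)
set.seed(2)
dataset <- expand.grid(factor1 = c("A", "B"), factor2 = c("C", "D"), Rep = 1:3)
dataset$variable1 <- rnorm(nrow(dataset))
dataset$variable2 <- rnorm(nrow(dataset), mean = 10)

# Stack the two measured columns into long format:
# one row per observation and variable
long <- data.frame(
  factor1  = rep(dataset$factor1, times = 2),
  factor2  = rep(dataset$factor2, times = 2),
  variable = rep(c("variable1", "variable2"), each = nrow(dataset)),
  value    = c(dataset$variable1, dataset$variable2)
)

# One mean per factor1 x factor2 x variable, similar in layout to the SAS listing
long_means <- aggregate(value ~ factor1 + factor2 + variable,
                        data = long, FUN = mean)
long_means <- long_means[order(long_means$factor1,
                               long_means$factor2,
                               long_means$variable), ]
print(long_means)
```

The result has one row per combination of factor1, factor2 and variable, which is the layout the question asks for.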
John Kane
2009-Jul-30 23:09 UTC
[R] What is the best method to produce means by categorical factors?
The most common would be aggregate. You can swap mean for other functions (e.g. sum, length, median, or one you write yourself).

You would probably find Bob Muenchen's book (http://rforsasandspssusers.com/) very useful, and there is a shorter PDF available on that page.

Example:

    dd <- data.frame(factor1 = rep(letters[1:10], 2),
                     factor2 = rep(LETTERS[1:5], 4),
                     var1 = rnorm(20, 10, 2),
                     var2 = rnorm(20, 25, 5))

    # Group means, then group sums, of var1 and var2 by factor1 and factor2
    aggregate(dd[, 3:4], by = list(dd[, 1], dd[, 2]), mean)
    aggregate(dd[, 3:4], by = list(dd[, 1], dd[, 2]), sum)
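Since the SAS listing in the question also reports the number of observations per group, one possible way to get both the count and the means is to run aggregate twice and merge the results. This is only a sketch: the data below are invented, and the column name `obs` is chosen here to mirror the SAS output.

```r
# Hypothetical data (assumption): 3 observations per factor1 x factor2 cell
set.seed(3)
dd <- data.frame(
  factor1 = rep(c("Level A", "Level B"), each = 6),
  factor2 = rep(rep(c("treatmentA", "treatmentB"), each = 3), times = 2),
  var1 = rnorm(12, 10, 2),
  var2 = rnorm(12, 25, 5)
)

# Naming the list elements gives readable grouping columns in the output
groups <- list(factor1 = dd$factor1, factor2 = dd$factor2)

# Number of observations per group (length of any one measured column)
counts <- aggregate(dd["var1"], by = groups, FUN = length)
names(counts)[3] <- "obs"

# Group means of both measured variables
means <- aggregate(dd[c("var1", "var2")], by = groups, FUN = mean)

# Combine counts and means into one table keyed by the two factors
result <- merge(counts, means, by = c("factor1", "factor2"))
print(result)
```

Naming the elements of the `by` list also avoids the default `Group.1`, `Group.2` column names that the unnamed form produces.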