Hi folks,

I have a data frame df.vars with the following structure:

    var1  var2  var3  group

group is a factor.

Now I want to standardize vars 1-3 (actually, there are many more) within each group, so I define

    z.mean.sd <- function(data){
        return.values <- (data - mean(data)) / (sd(data))
        return(return.values)
    }

Now I can call, for each var,

    z.var1 <- by(df.vars$var1, group, z.mean.sd)

which gives me the standardized data for each subgroup in a list, with one element per subgroup.

    z.var1 <- unlist(z.var1)

then gives me the z-standardized data for var1 in one vector. Great!

Now I would like to do this for the whole data frame, but I am probably not thinking vector-wise enough.

    z.df.vars <- by(df.vars, group, z.mean.sd)

does not work. I banged my head on other solutions, trying out sapply and tapply, but did not succeed. Do I need to loop and put everything together by hand? But I want to keep the column names in the vector…

-karsten

Karsten D. Wolf
Didactical Design of Interactive Learning Environments
Universität Bremen - Fachbereich 12
web: http://www.ifeb.uni-bremen.de/wolf/
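For readers who want to reproduce the single-variable step from the question, here is a minimal sketch. The toy values and the group labels "a" and "b" are made up for illustration, and `df.vars$group` is spelled out on the assumption that `group` is a column of the data frame rather than a separate object. One thing worth noticing: `unlist()` on the `by()` result returns the values ordered by group level, not in the original row order.

```r
# Toy data: column names follow the post, the values are invented
set.seed(1)
df.vars <- data.frame(var1 = rnorm(6), var2 = rnorm(6), var3 = rnorm(6),
                      group = factor(c("a", "b", "a", "b", "a", "b")))

# The poster's standardizing function, written compactly
z.mean.sd <- function(data) (data - mean(data)) / sd(data)

# Per-group z-scores for one variable, flattened into one vector
z.var1 <- unlist(by(df.vars$var1, df.vars$group, z.mean.sd))

# The first three values belong to rows 1, 3, 5 (group "a"),
# the last three to rows 2, 4, 6 (group "b")
z.var1
```

If the standardized values have to line up with the original rows, the flattened vector needs to be reordered, or a function that preserves row order has to be used instead.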
Hi Karsten,

Let me assume your data is called d. If I understood what you are trying to do, the following might help:

    res <- apply(d, 2, tapply, d$group, scale)
    res

See ?apply, ?tapply and ?scale for more information.

HTH,
Jorge

On Sun, Nov 29, 2009 at 10:41 AM, Karsten Wolf wrote:
> [...]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
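A minimal sketch of the apply/tapply idea, using the built-in iris data as a stand-in for d (an assumption, since the poster's data is not shown). Because apply() coerces a data frame to a matrix, a factor column would turn everything into character strings, so this sketch passes only the numeric columns. The result is a nested list (one element per column, each holding one scaled block per group), so values are grouped by level rather than kept in the original row order.

```r
# iris stands in for the poster's data frame `d` (assumption)
d <- iris
num <- d[, sapply(d, is.numeric)]   # numeric columns only

# One list element per column; each element holds one scaled block per group
res <- apply(num, 2, tapply, d$Species, scale)

# Check one block: within a group the mean should be ~0 and the sd 1
setosa.z <- res$Sepal.Length[["setosa"]]
c(mean = mean(setosa.z), sd = sd(setosa.z))
```

This form is convenient for inspecting groups one at a time; reassembling a data frame with the original row order and column names takes extra work.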
http://finzi.psych.upenn.edu/R/library/QuantPsyc/html/Make.Z.html

Make.Z in the QuantPsyc package may already do it.

--- On Sun, 11/29/09, Karsten Wolf <wolf at uni-bremen.de> wrote:
> From: Karsten Wolf <wolf at uni-bremen.de>
> Subject: [R] How to z-standardize for subgroups?
> To: r-help at r-project.org
> Received: Sunday, November 29, 2009, 10:41 AM
> [...]
On 11/29/2009 4:23 PM, John Kane wrote:
> http://finzi.psych.upenn.edu/R/library/QuantPsyc/html/Make.Z.html
>
> Make.Z in the QuantPsyc package may already do it.

For a single variable, you could use ave() and scale() together like this:

    with(iris, ave(Sepal.Width, Species, FUN = scale))

To scale more than one variable in a concise call, consider something along these lines:

    apply(iris[,1:4], 2, function(x){ave(x, iris$Species, FUN = scale)})

hope this helps,

Chuck Cleland

--
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
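The ave() approach can be extended so that the result stays a data frame with the original column names and row order, which is what the question asked for. The following is only a sketch under assumptions: iris stands in for df.vars, Species plays the role of group, and the as.vector() wrapper is an addition that strips the matrix attributes scale() attaches.

```r
# iris stands in for df.vars; Species plays the role of `group` (assumptions)
df.vars <- iris
num.cols <- sapply(df.vars, is.numeric)

# Copy the data frame, then overwrite only the numeric columns
z.df.vars <- df.vars
z.df.vars[num.cols] <- lapply(df.vars[num.cols], function(x) {
  # ave() returns a vector in the original row order;
  # as.vector() drops the matrix attributes that scale() adds
  ave(x, df.vars$Species, FUN = function(v) as.vector(scale(v)))
})

# Group means of a standardized column should be ~0, group sds 1
tapply(z.df.vars$Sepal.Width, z.df.vars$Species, mean)
tapply(z.df.vars$Sepal.Width, z.df.vars$Species, sd)
```

Using lapply() over the columns instead of apply() avoids the matrix coercion, so the grouping factor can stay in the data frame untouched.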