Hi,

I am currently trying to z-transform (that is, subtract the mean and divide by the standard deviation) multiple columns of a data.frame at the same time.

My first approach was:

    x <- data.frame(c(0:10), c(10:20))
    (x - colMeans(x)) / apply(x, 2, sd)

This is obviously not working.

Is there a convenient way to z-transform each column separately (in this case, each column represents an independent variable that should be z-transformed)?

Thanks!
On Fri, Jan 20, 2012 at 9:04 AM, Martin Batholdy <batholdy at googlemail.com> wrote:
> Is there a convenient way to z-transform each column separately (so in this case, each column represents an independent variable that should be z-transformed)

scale(x) will scale each column of a matrix/data frame to mean 0 and variance 1.

Peter
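As a quick check of the suggestion above, here is a minimal sketch using the example data from the original post. The column names a and b are made up here for readability; they are not in the original.

```r
# Example data from the original post; the column names a and b are
# an assumption added for readability.
x <- data.frame(a = 0:10, b = 10:20)

# scale() centres each column on its mean and divides by its sample
# standard deviation; the result is a numeric matrix.
z <- scale(x)

# Each column should now have mean 0 and standard deviation 1
# (up to floating-point rounding).
colMeans(z)
apply(z, 2, sd)
```

Note that z is a matrix rather than a data.frame, which may or may not matter for what you do next.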
?scale

    apply(x, 2, scale)

Michael

On Fri, Jan 20, 2012 at 12:04 PM, Martin Batholdy <batholdy at googlemail.com> wrote:
> Is there a convenient way to z-transform each column separately (so in this case, each column represents an independent variable that should be z-transformed)
>
> thanks!
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Hi Martin,

On Fri, Jan 20, 2012 at 12:04 PM, Martin Batholdy <batholdy at googlemail.com> wrote:
> Is there a convenient way to z-transform each column separately (so in this case, each column represents an independent variable that should be z-transformed)

The `scale` function essentially does this; note that it divides by the sample standard deviation (the n - 1 denominator, the same one `sd` uses).

Type `scale.default` into your R session to see its code if you want to see one way you could write this yourself.

HTH,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
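For comparison, one way such a function could be written by hand is sketched below. This is not the code of `scale.default`; `z_transform` is a hypothetical helper name, and the use of `sweep` is just one choice among several. The sketch also shows why the first approach in the original post goes wrong: a plain vector is recycled down the rows of each column rather than matched up with the columns.

```r
# Example data from the original post (column names a and b are made up).
x <- data.frame(a = 0:10, b = 10:20)

# Hypothetical helper: z-transform every column of a numeric data frame.
z_transform <- function(df) {
  centred <- sweep(df, 2, colMeans(df), "-")   # subtract each column's mean
  sweep(centred, 2, sapply(df, sd), "/")       # divide by each column's sd
}

z <- z_transform(x)

# The original attempt recycles colMeans(x) down the rows, so the first
# column has 5, 15, 5, 15, ... subtracted instead of 5 throughout.
wrong <- x - colMeans(x)
head(wrong$a, 4)   # computed as 0 - 5, 1 - 15, 2 - 5, 3 - 15

# The hand-written result agrees with scale() apart from attributes.
all.equal(as.matrix(z), scale(x), check.attributes = FALSE)
```

Unlike scale(), this version returns a data.frame, because sweep() hands the data frame straight to the arithmetic operator.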
On Fri, 2012-01-20 at 18:04 +0100, Martin Batholdy wrote:
> Is there a convenient way to z-transform each column separately (so in this case, each column represents an independent variable that should be z-transformed)

?scale

    > scale(x)
             c.0.10.   c.10.20.
     [1,] -1.5075567 -1.5075567
     [2,] -1.2060454 -1.2060454
     [3,] -0.9045340 -0.9045340
     [4,] -0.6030227 -0.6030227
     [5,] -0.3015113 -0.3015113
     [6,]  0.0000000  0.0000000
     [7,]  0.3015113  0.3015113
     [8,]  0.6030227  0.6030227
     [9,]  0.9045340  0.9045340
    [10,]  1.2060454  1.2060454
    [11,]  1.5075567  1.5075567
    attr(,"scaled:center")
     c.0.10. c.10.20.
           5       15
    attr(,"scaled:scale")
     c.0.10. c.10.20.
    3.316625 3.316625

-- 
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
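The two attributes in the output above record the column means and standard deviations that were used. Here is a small sketch, assuming the same x, of how they might be read back and used to undo the transformation (the object names centre, spread and back are arbitrary):

```r
x <- data.frame(c(0:10), c(10:20))   # example data from the original post
z <- scale(x)

centre <- attr(z, "scaled:center")   # column means used for centring
spread <- attr(z, "scaled:scale")    # column standard deviations used

# Multiply each column by its sd and add back its mean to recover x.
back <- sweep(sweep(z, 2, spread, "*"), 2, centre, "+")
all.equal(back, as.matrix(x), check.attributes = FALSE)
```

This can be handy if new data later has to be put on the same scale as the original columns.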
Great, thank you!

On 20.01.2012, at 18:10, R. Michael Weylandt wrote:
> ?scale
> apply(x, 2, scale)
If you use apply the result will be a matrix, not a data.frame. You could use a for loop

    for(j in seq_len(ncol(x))) {
        x[,j] <- scale(x[,j])
    }

or the odd looking

    x[] <- lapply(x, scale)

to scale all the columns and keep x a data.frame.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Martin Batholdy
> Sent: Friday, January 20, 2012 9:17 AM
> To: R Help
> Subject: Re: [R] z-transform each column of a data.frame
>
> great, thank you!
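The difference in return type can be seen in a short sketch, again assuming the example x from the original post. The as.numeric() wrapper in the last variant is an addition that is not in the message above; it strips the one-column matrix shape and the attributes that scale() attaches to each column.

```r
x <- data.frame(a = 0:10, b = 10:20)   # column names made up for readability

# apply() returns a matrix, so the data.frame class is lost.
m <- apply(x, 2, scale)
class(m)

# Assigning into y[] replaces the columns but keeps y a data.frame.
y <- x
y[] <- lapply(y, scale)
class(y)

# Variation (an assumption, not from the thread): convert each scaled
# column back to a plain numeric vector.
w <- x
w[] <- lapply(w, function(col) as.numeric(scale(col)))
str(w)
```

Which form is preferable depends on whether later code expects plain numeric columns or is happy with whatever scale() returns.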