Hi,
I want to be able to create a vector of z-scores from a vector of 
continuous data, conditional on a group membership vector.
Say you have 20 numbers distributed normally with a mean of 50 and an sd 
of 10:
x <- rnorm(20, 50, 10)
Then you have a vector that delineates 2 groups within x:
group <- sort(rep(c("A", "B"), 10))
test.data <- data.frame(cbind(x, group))
I know that if you break up the x vector into 2 different vectors then 
it becomes easy to calculate the z scores for each vector, then you 
stack them and append them to the original data frame.  Is there any 
way to apply this sort of calculation without splitting the original 
vector up?  I tried a really complex ifelse statement but it didn't 
seem to work.
Thanks in advance,
Matthew Dubins
Hello -
First, I doubt you really want to cbind() those two vectors within the 
data.frame() function call.
test.data <- data.frame(x, group)
is probably what you want. That may be the source of your trouble.
If you really want a vector returned, the following should work given 
your test.data is constructed without the cbind():
unlist(by(test.data$x, test.data$group, function(x) (x - mean(x)) / sd(x)), use.names = FALSE)
Is that what you're after?
Erik
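A quick way to see why the cbind() matters: cbind() on a numeric vector and a character vector builds a character matrix, so x is no longer numeric by the time it reaches data.frame(). The following is only a minimal sketch; the names m, bad and good are made up for illustration, and set.seed() is there purely for reproducibility.

```r
set.seed(1)  # arbitrary seed, only so the example is reproducible
x <- rnorm(20, 50, 10)
group <- sort(rep(c("A", "B"), 10))

# cbind() of a numeric and a character vector yields a character matrix
m <- cbind(x, group)
is.character(m)        # TRUE

bad  <- data.frame(cbind(x, group))  # x arrives as character (factor in older R versions)
good <- data.frame(x, group)         # x stays numeric

is.numeric(bad$x)      # FALSE
is.numeric(good$x)     # TRUE
```

With the non-numeric column, mean() and sd() cannot produce sensible z-scores, which would explain why the group-wise calculation did not seem to work.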
Wayne.W.Jones at shell.com
2007-Sep-27  06:49 UTC
[R] Getting group-wise standard scores of a vector
tapply is also very useful: 
my.df <- data.frame(x = rnorm(20, 50, 10),
                    group = factor(sort(rep(c("A", "B"), 10))))
tapply(my.df$x, my.df$group, function(x) (x - mean(x)) / sd(x))
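Note that tapply() used this way returns a list-like array with one vector of z-scores per group rather than a single vector. Both it and the by() approach order their results by group, which lines up with the rows only when the data are already sorted by group. One possible alternative, assuming base R's ave() is acceptable, returns a plain vector in the original row order. In this sketch the shuffled group column and the column name z are made up for illustration.

```r
set.seed(42)  # arbitrary seed, only so the example is reproducible
my.df <- data.frame(x = rnorm(20, 50, 10),
                    group = factor(sample(rep(c("A", "B"), 10))))  # groups deliberately unsorted

# ave() applies FUN within each group and returns the results in row order
my.df$z <- ave(my.df$x, my.df$group, FUN = function(v) (v - mean(v)) / sd(v))

# Sanity check: each group's z-scores should have mean ~0 and sd 1
tapply(my.df$z, my.df$group, mean)
tapply(my.df$z, my.df$group, sd)
```

Because the result is aligned with the rows, it can be assigned straight back into the data frame without any splitting or stacking.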
-----Original Message-----
From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org]On Behalf Of Matthew Dubins
Sent: 26 September 2007 21:57
To: r-help at r-project.org
Subject: [R] Getting group-wise standard scores of a vector
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.