thr3ads.net - R help - [R] Getting the groupmean for each person [May 2004]


Felix Eschenburg

2004-May-08 20:33 UTC

[R] Getting the groupmean for each person

Hello list !

I have a huge data.frame with several variables observed on about 3000 
persons. For every person (row) there is a variable called GROUP which indicates 
the group the person belongs to. There is also another variable AV for each 
person. Now I want to create a new variable which holds the group mean of AV 
as a value for each person.
With tapply(AV,GROUP,mean) I get the means for each level of GROUP, but I 
cannot find out how to give every person the group mean as a value (every 
person should have the same value as every other person in the same group). 

Does anybody have any ideas how to do that?

Yours sincerely
Felix Eschenburg

Gabor Grothendieck

2004-May-08 21:02 UTC

predict(lm(AV~as.factor(GROUP)))
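
A small check of the one-liner above, on made-up data (the AV and GROUP values are invented for illustration, not taken from Felix's data frame). The fitted values of a model with GROUP as its only factor should be the group means, one per row; the sketch compares them with a lookup into the tapply() result by group label.

```r
# Toy data; the values are invented for illustration only
AV    <- c(1, 3, 10, 20, 30, 5)
GROUP <- c("b", "b", "a", "a", "a", "c")

# Fitted values of a one-factor linear model, one per row
fitted.means <- as.vector(predict(lm(AV ~ as.factor(GROUP))))

# Same quantity via tapply(), looked up by group label
lookup.means <- as.vector(tapply(AV, GROUP, mean)[as.character(GROUP)])

print(cbind(fitted.means, lookup.means))
```

Both columns should agree up to floating-point rounding.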




Thomas Lumley

2004-May-09 19:32 UTC

On Sat, 8 May 2004, Gabor Grothendieck wrote:
>
> predict(lm(AV~as.factor(GROUP)))

If Felix actually has a "huge" data frame this will be slow. Instead
try

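# rowsum() returns group sums, so divide by the group sizes to get means
groupsums <- rowsum(AV, GROUP, reorder=FALSE)
groupmeans <- groupsums / rowsum(rep(1, length(AV)), GROUP, reorder=FALSE)
individual.means <- groupmeans[match(GROUP, unique(GROUP))]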

It uses hashing and takes roughly O(MGlogG) time for M measurements on G
groups, whereas the lm solution takes O(MG^3) [and the space requirements
are O(MG) and O(MG^2)]

Admittedly, with only 3000 observations either one will be fast enough.

	-thomas

> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
>
Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
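
A minimal sketch of the rowsum() approach above on made-up data (the AV and GROUP values are invented for illustration). Because rowsum() returns group sums, the sketch divides by the group sizes to get means. It relies on reorder=FALSE keeping groups in order of first appearance, which is the same order unique(GROUP) uses. The cross-check against ave() is an addition here, not part of the original suggestion.

```r
# Toy data; the values are invented for illustration only
AV    <- c(1, 3, 10, 20, 30, 5)
GROUP <- c("b", "b", "a", "a", "a", "c")

# Group sums and group sizes, both in order of first appearance
groupsums  <- rowsum(AV, GROUP, reorder = FALSE)
groupsizes <- rowsum(rep(1, length(AV)), GROUP, reorder = FALSE)
groupmeans <- groupsums / groupsizes

# unique(GROUP) is also in order of first appearance, so match() lines up
individual.means <- groupmeans[match(GROUP, unique(GROUP))]

# Cross-check against base R's ave(), whose default FUN is mean
stopifnot(isTRUE(all.equal(individual.means, ave(AV, GROUP))))
print(individual.means)
```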

Liaw, Andy

2004-May-10 11:37 UTC

Both of you might have missed my question from Friday:  For very long `x'
(e.g., length=50000), indexing by names can take a long time.  See that
thread for details.  (For small data, you can hardly tell the difference.)

Also, I'm trying to write the function in a way that one can pass in more
than one grouping variable in a list, much like tapply.  The version I
showed is a simplified version to demonstrate the `problem' I had.  I
obviously missed the fact that tapply returns a 1D array...

Best,
Andy
> From: kjetil at acelerate.com 
> 
> On 10 May 2004 at 10:09, Christophe Pallier wrote:
> 
> > 
> > 
> > Liaw, Andy wrote:
> > 
> > >Suppose I
> > >define the function:
> > >
> > >fun <- function(x, f) {
> > >    m <- tapply(x, f, mean)
> > >    ans <- x - m[match(f, unique(f))]
> > >    names(ans) <- names(x)
> > >    ans
> > >}
> > >
> > >  
> > >
> > 
> > May I ask what is the purpose of match(f,unique(f)) ?
> > 
> > To remove the group means, I have been using:
> > 
> > x-tapply(x,f,mean)[f]
> > 
> > for a while, (and I am now changing to 
> > x-tapply(x,f,mean)[as.character(f)] because of the peculiarities of
> 
> wouldn't 
>  sweep(as.array(x), 1, tapply(x,f,mean)[as.character(f)] , "-")
> 
> be more natural?
> 
> Kjetil Halvorsen
> 
> > indexing named vectors with factors )
> > 
> > The use of tapply(x,f,mean)[match(f,unique(f))] assumes a particular
> > order in the result of tapply, no? It seems a bit dangerous to me.
> > 
> > 
> > Christophe Pallier
> > 
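
The ordering concern raised in the quoted exchange can be illustrated with a small sketch on made-up data (the x and f values are invented). With a character grouping vector, tapply() returns its means sorted by group label, while unique(f) lists groups in order of first appearance, so a positional lookup via match(f, unique(f)) can pick up the wrong group's mean. A lookup by label does not depend on the order.

```r
# Toy data; group "b" appears before group "a" on purpose
x <- c(1, 3, 10, 20, 30, 5)
f <- c("b", "b", "a", "a", "a", "c")

# tapply() sorts its result by group label: a, b, c
m <- tapply(x, f, mean)

# Positional lookup assumes m is in the order of unique(f), i.e. b, a, c
by.position <- as.vector(m[match(f, unique(f))])

# Lookup by label is independent of the order of m
by.name <- as.vector(m[as.character(f)])

print(cbind(x, by.position, by.name))
```

In this toy case the positional lookup assigns the mean of group "a" to the rows of group "b" and vice versa, while the lookup by label returns each row's own group mean.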
