thr3ads.net - R help - [R] scale subsets of grouped data in data frame [Jul 2009]

Noah Silverman

2009-Jul-31 23:17 UTC

[R] scale subsets of grouped data in data frame

Hello,

I'm trying to duplicate what's an easy process in RapidMiner.

In RM, we can simply use two operators:
     subgroup iteration
     attribute value selection (Can use a regex for the attribute name.)

I can do this in R with a lot of code and manual steps.  It would be 
really nice to find a more automated way.

My data looks like this

group 	group_height 	group_weight 	height 	weight
g22 	3.2 	8.896 	3.2 	8.896
g22 	2.5 	6.95 	2.5 	6.95
g22 	3.1 	8.618 	3.1 	8.618
g49 	2.4 	6.672 	2.4 	6.672
g49 	4.2 	11.676 	4.2 	11.676
g49 	2.5 	6.95 	2.5 	6.95
g55 	2.6 	7.228 	2.6 	7.228
g55 	3.4 	9.452 	3.4 	9.452
g55 	3.3 	9.174 	3.3 	9.174




What I want to do is scale the data by each group
So in pseudo-code
     for(group in groups){
         if(column_name = regex(group_.*)){
             data[column_name] = scale(data[group,column_name])
         }
     }

This way I get "group wise" normalization of my data, but still have the
original values, which I will normalize "database wide" for some
comparisons.

Can anybody help solve this one?

-N


Steve Lianoglou

2009-Aug-01 01:38 UTC

[R] scale subsets of grouped data in data frame

Hi,

On Jul 31, 2009, at 7:17 PM, Noah Silverman wrote:
> Hello,
>
> I'm trying to duplicate what's an easy process in RapidMiner.
>
> In RM, we can simply use two operators:
>     subgroup iteration
>     attribute value selection (Can use a regex for the attribute name.)
>
> I can do this in R with a lot of code and manual steps.  It would be
> really nice to find a more automated way.
>
> My data looks like this
>
> group 	group_height 	group_weight 	height 	weight
> g22 	3.2 	8.896 	3.2 	8.896
> g22 	2.5 	6.95 	2.5 	6.95
> g22 	3.1 	8.618 	3.1 	8.618
> g49 	2.4 	6.672 	2.4 	6.672
> g49 	4.2 	11.676 	4.2 	11.676
> g49 	2.5 	6.95 	2.5 	6.95
> g55 	2.6 	7.228 	2.6 	7.228
> g55 	3.4 	9.452 	3.4 	9.452
> g55 	3.3 	9.174 	3.3 	9.174
>
> What I want to do is scale the data by each group
> So in pseudo-code
>     for(group in groups){
>         if(column_name = regex(group_.*)){
>             data[column_name] = scale(data[group,column_name])
>         }
>     }
>
> This way I get "group wise" normalization of my data, but still have the
> original values, which I will normalize "database wide" for some
> comparisons.
>
> Can anybody help solve this one?
>
> -N

You can do this quite easily.

Just take what you learned from the last example re: scaling subsets,  
and play around with some of the functions you see in the ?grep help  
page. You'll be using those functions against the strings you get back  
from colnames(data).
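
As a rough sketch of how those pieces might fit together (this is one possible approach, not necessarily the one from the earlier thread): grep() picks out the column names starting with "group_", and ave() applies scale() within each group. The data frame name `data` and the "^group_" pattern are assumptions based on the sample in the question.

```r
# Sample data copied from the original post
data <- data.frame(
  group        = rep(c("g22", "g49", "g55"), each = 3),
  group_height = c(3.2, 2.5, 3.1, 2.4, 4.2, 2.5, 2.6, 3.4, 3.3),
  group_weight = c(8.896, 6.95, 8.618, 6.672, 11.676, 6.95, 7.228, 9.452, 9.174),
  height       = c(3.2, 2.5, 3.1, 2.4, 4.2, 2.5, 2.6, 3.4, 3.3),
  weight       = c(8.896, 6.95, 8.618, 6.672, 11.676, 6.95, 7.228, 9.452, 9.174),
  stringsAsFactors = FALSE
)

# Select the columns whose names start with "group_"
# (the plain "group" column does not match because of the underscore)
group_cols <- grep("^group_", colnames(data), value = TRUE)

# Scale each selected column within its group; ave() returns the
# results in the original row order. scale() returns a one-column
# matrix, so as.numeric() flattens it back to a vector.
for (col in group_cols) {
  data[[col]] <- ave(data[[col]], data$group,
                     FUN = function(x) as.numeric(scale(x)))
}
```

The unprefixed height and weight columns are left untouched here, so they can still be passed to scale() on their own for the "database wide" version.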

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
   |  Memorial Sloan-Kettering Cancer Center
   |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
