thr3ads.net - R help - [R] Data extraction and assembly from a data frame [Jun 2014]

Jun Shen

2014-Jun-20 19:42 UTC

[R] Data extraction and assembly from a data frame

Hi all,

Here is my situation. I have a data frame whose structure is something
like this:

TestData<-data.frame(ID=rep(1:10,each=10),TIME=rep(seq(0.1,1,0.1),10),VAR1=rnorm(100),VAR2=5*rnorm(100),VAR3=10*rnorm(100))

Basically, I want to extract the maximum value from each ID for VAR1, VAR2,
VAR3......

The way I can think of is

do.call(rbind,lapply(split(TestData,TestData$ID),function(x)x[which.max(x$VAR1),'VAR1']))

and do this for each of the variables and put the results back. It's kind
of clumsy but OK for several variables. I have dozens of them. Is there a
better way to do it?

It would be ideal to produce the results like

   ID  VAR1.max  VAR2.max   VAR3.max
    1 1.2828796   8.63276  15.051992
    2 1.1870067  8.691801  10.736301
    3 1.2815352  6.335692   5.827524
    4 1.6719411  5.998597  16.646212
    5 1.5631107  6.067457  15.331046
    6  0.718989  6.610279   7.306005
    7 0.8734315  13.39844  16.965365
    8 2.7447862  10.21613  22.545131
    9  3.490395  10.83543  25.744662
   10 0.4719087  11.73021   7.226687

Thanks for any help.

Jun Shen


MacQueen, Don

2014-Jun-20 19:51 UTC


How about

aggregate(TestData[, c('VAR1', 'VAR2', 'VAR3')],
          by = list(id = TestData$ID),
          FUN = max)
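
Since the original question mentions dozens of variables, the same aggregate() call can be driven by a computed column list instead of typing each name. The following is only a sketch: it assumes the variable columns all start with "VAR" (true for the example data, possibly not for the real data), and the set.seed() call and the names var_cols and res are illustrative additions, not part of the thread.

```r
# Example data from the original post; the seed is an added assumption
# so that the result is reproducible.
set.seed(1)
TestData <- data.frame(ID   = rep(1:10, each = 10),
                       TIME = rep(seq(0.1, 1, 0.1), 10),
                       VAR1 = rnorm(100),
                       VAR2 = 5 * rnorm(100),
                       VAR3 = 10 * rnorm(100))

# Assumption: every column to summarise has a name beginning with "VAR".
var_cols <- grep("^VAR", names(TestData), value = TRUE)

# One row per ID, one column per variable, each holding the group maximum.
res <- aggregate(TestData[var_cols], by = TestData["ID"], FUN = max)

# Add the ".max" suffix asked for in the question (first column is ID).
names(res)[-1] <- paste0(names(res)[-1], ".max")
res
```

If the real column names follow a different pattern, something like setdiff(names(TestData), c("ID", "TIME")) would probably be a safer way to build var_cols.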




-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062






William Dunlap

2014-Jun-20 19:51 UTC


Have you looked at the 'aggregate' function?  E.g.,
  aggregate(TestData[c("VAR1", "VAR2", "VAR3")],
            by = TestData["ID"], max)
Bill Dunlap
TIBCO Software
wdunlap tibco.com
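
For comparison, aggregate() also has a formula interface, where a dot on the left-hand side stands for every column not named on the right. The sketch below assumes TIME should be left out of the summary, so it is dropped before the call; the seed and the names by_formula and by_list are illustrative. One difference worth keeping in mind: the formula method defaults to na.action = na.omit, so rows containing any NA are dropped before grouping, whereas the by= form passes NAs through to max().

```r
# Example data from the original post; the seed is an added assumption.
set.seed(1)
TestData <- data.frame(ID   = rep(1:10, each = 10),
                       TIME = rep(seq(0.1, 1, 0.1), 10),
                       VAR1 = rnorm(100),
                       VAR2 = 5 * rnorm(100),
                       VAR3 = 10 * rnorm(100))

# Formula interface: "." means all remaining columns, so TIME is removed first.
by_formula <- aggregate(. ~ ID,
                        data = TestData[names(TestData) != "TIME"],
                        FUN = max)

# The by= form from the reply above, for side-by-side comparison.
by_list <- aggregate(TestData[c("VAR1", "VAR2", "VAR3")],
                     by = TestData["ID"], max)

by_formula
```

With no missing values in the data, the two calls should give the same per-ID maxima.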



arun

2014-Jun-20 19:56 UTC


You could try:
library(plyr)
res <- ddply(TestData[,-2],.(ID),numcolwise(max)) 
colnames(res)[-1] <- paste0(colnames(res)[-1],".max")
A.K.
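
If installing plyr is not an option, roughly the same result can be put together in base R by extending the split() idea from the original question to all columns at once. This is a sketch under assumptions: it treats every numeric column other than ID and TIME as a variable to summarise (similar in spirit to numcolwise), and the seed and the names keep, per_id, maxes and res are illustrative.

```r
# Example data from the original post; the seed is an added assumption.
set.seed(1)
TestData <- data.frame(ID   = rep(1:10, each = 10),
                       TIME = rep(seq(0.1, 1, 0.1), 10),
                       VAR1 = rnorm(100),
                       VAR2 = 5 * rnorm(100),
                       VAR3 = 10 * rnorm(100))

# Assumption: summarise every numeric column except the ID and TIME columns.
keep <- setdiff(names(TestData)[sapply(TestData, is.numeric)], c("ID", "TIME"))

# Split the selected columns by ID, then take the maximum of each column
# within each piece. sapply() returns variables in rows, so transpose.
per_id <- split(TestData[keep], TestData$ID)
maxes  <- t(sapply(per_id, function(g) sapply(g, max)))

# Reassemble into a data frame with the ".max" suffix from the question.
res <- data.frame(ID = as.integer(rownames(maxes)), maxes)
rownames(res) <- NULL
names(res)[-1] <- paste0(keep, ".max")
res
```

The as.integer() conversion relies on ID being an integer-like value, as it is in the example data.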




