thr3ads.net - R help - [R] Conditional selection of rows [Nov 2004]


F Z

2004-Nov-09 02:33 UTC

[R] Conditional selection of rows

Hi,

I have a data.frame with several variables and 50,000 observations.
i.e.
data[1:2,1:7]
  Iteration Day Production.Type tsUSusc tsASusc tsULat tsALat
         1   0         Generic   17965 8833053      0      0
         1   1         Generic   17965 8833053      0      0
         .
         .
         .
         1 199         Generic   17237 8141028     26  23131
         2 127         Generic   15828 7307583     92  63463

I would like to extract only the observations (rows) for the last "day" for
each "iteration" and store them in a data frame.

I tried lapply nested in a for loop without success.  Any help will be 
greatly appreciated!

Thanks

Francisco



Gabor Grothendieck

2004-Nov-09 02:58 UTC


[R] Conditional selection of rows

F Z <gerifalte28 <at> hotmail.com> writes:

: 
: Hi,
: 
: I have a data.frame with several variables and 50,000 observations.
: i.e.
: data[1:2,1:7]
:   Iteration Day Production.Type tsUSusc tsASusc tsULat tsALat
:          1   0         Generic   17965 8833053      0      0
:          1   1         Generic   17965 8833053      0      0
:          .
:          .
:          .
:          1 199         Generic   17237 8141028     26  23131
:          2 127         Generic   15828 7307583     92  63463
: 
: I would like to extract only the observations (rows) for the last "day" for
: each "iteration" and store them in a data frame.
: 


Try this:

   do.call("rbind", by(data, data$Iteration, tail, 1))
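
As a rough illustration of how this one-liner behaves, here is a minimal sketch on a small made-up data frame. The values and the name last_rows are assumptions for illustration only and do not come from the original data. Note that tail() simply takes the last row of each group as stored, so the sketch assumes the rows are already sorted by Day within each Iteration.

```r
# Toy data (hypothetical values): three iterations with different final days.
data <- data.frame(
  Iteration = c(1, 1, 1, 2, 2, 3),
  Day       = c(0, 1, 2, 0, 1, 0),
  tsULat    = c(0, 5, 9, 0, 4, 0)
)

# by() splits the rows by Iteration; tail(x, 1) keeps the last row of each
# piece; do.call("rbind", ...) stacks the one-row pieces into a data frame.
last_rows <- do.call("rbind", by(data, data$Iteration, tail, 1))
print(last_rows)
```

If the rows might not be in Day order, sorting first with data[order(data$Iteration, data$Day), ] makes the "last row" coincide with the last day.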

Thomas Lumley

2004-Nov-09 15:24 UTC


[R] Conditional selection of rows

On Tue, 9 Nov 2004, F Z wrote:
> Hi,
>
> I have a data.frame with several variables and 50,000 observations.
> i.e.
> data[1:2,1:7]
> Iteration Day Production.Type tsUSusc tsASusc tsULat tsALat
>        1   0         Generic   17965 8833053      0      0
>        1   1         Generic   17965 8833053      0      0
>        .
>        .
>        .
>        1 199         Generic   17237 8141028     26  23131
>        2 127         Generic   15828 7307583     92  63463
>
> I would like to extract only the observations (rows) for the last "day" for
> each "iteration" and store them in a data frame.
>
> I tried lapply nested in a for loop without success.  Any help will be 
> greatly appreciated!

If you reverse the ordering, you are then looking for the first Day in each
Iteration, which can be done efficiently with duplicated().

data <- data[order(data$Iteration, data$Day, decreasing=TRUE),]

subset <- data[!duplicated(data$Iteration),]

If you are sure that the data are in order to begin with you could just 
reverse the entire data set (  data <- data[nrow(data):1,] ), but I'm 
always reluctant to assume this.

 	-thomas
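
The sort-then-duplicated() idea above can be sketched on deliberately shuffled toy data, which is the case where it does not matter whether the input was ordered. The values and the names sorted and last_rows are assumptions for illustration; the sketch also works on a sorted copy rather than overwriting data.

```r
# Toy data (hypothetical values), deliberately out of order.
data <- data.frame(
  Iteration = c(2, 1, 3, 1, 2, 1),
  Day       = c(1, 0, 0, 2, 0, 1),
  tsULat    = c(4, 0, 0, 9, 0, 5)
)

# Sort a copy so the largest Day comes first within each Iteration.
sorted <- data[order(data$Iteration, data$Day, decreasing = TRUE), ]

# duplicated() flags every repeat of an Iteration after its first appearance,
# so negating it keeps one row per Iteration: the one with the largest Day.
last_rows <- sorted[!duplicated(sorted$Iteration), ]

# Optional: restore ascending Iteration order for display.
last_rows <- last_rows[order(last_rows$Iteration), ]
print(last_rows)
```

Because duplicated() is a single vectorised pass, this avoids splitting the data frame into one piece per Iteration.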

F Z

2004-Nov-09 18:48 UTC


[R] Conditional selection of rows

Many thanks to Gabor Grothendieck, Thomas Lumley and James Holtman for their
useful answers on this thread.  All three solutions solved the problem.
Here is a summary of their responses (modified for consistent notation):

>F Z <gerifalte28 <at> hotmail.com> writes:
>: Hi,
>:
>: I have a data.frame with several variables and 50,000 observations.
>: i.e.
>: data[1:2,1:7]
>:   Iteration Day Production.Type tsUSusc tsASusc tsULat tsALat
>:          1   0         Generic   17965 8833053      0      0
>:          1   1         Generic   17965 8833053      0      0
>:          .
>:          .
>:          .
>:          1 199         Generic   17237 8141028     26  23131
>:          2 127         Generic   15828 7307583     92  63463
>:
>: I would like to extract only the observations (rows) for the last "day" for
>: each "iteration" and store them in a data frame.

Gabor Grothendieck's solution:


subset<-do.call("rbind", by(data, data$Iteration, tail, 1))


James Holtman's solution:

subset<- by(data, data$Iteration, function(x)x[nrow(x),])
subset<-do.call('rbind',subset)

Thomas Lumley's solution:

data <- data[order(data$Iteration, data$Day, decreasing=TRUE),]

subset <- data[!duplicated(data$Iteration),]

If you are sure that the data are in order to begin with you could just
reverse the entire data set (  data <- data[nrow(data):1,] ), but I'm
always reluctant to assume this.
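
As a quick sanity check, the three approaches can be compared side by side on toy data. This is only a sketch: the data values, the result names a, b and d, and the helper same() are hypothetical, and the input is assumed to be sorted by Day within each Iteration so that the two by() variants are valid.

```r
# Toy data (hypothetical values), sorted by Day within each Iteration.
data <- data.frame(
  Iteration = rep(1:3, times = c(3, 2, 4)),
  Day       = c(0:2, 0:1, 0:3)
)

# Split by Iteration and keep the last row of each piece with tail().
a <- do.call("rbind", by(data, data$Iteration, tail, 1))

# Same idea, indexing the last row of each piece explicitly.
b <- do.call("rbind", by(data, data$Iteration, function(x) x[nrow(x), ]))

# Sort descending, then keep the first row seen for each Iteration.
s <- data[order(data$Iteration, data$Day, decreasing = TRUE), ]
d <- s[!duplicated(s$Iteration), ]
d <- d[order(d$Iteration), ]

# Compare values only; row names differ between the approaches.
same <- function(x, y) all(x$Iteration == y$Iteration) && all(x$Day == y$Day)
stopifnot(same(a, b), same(a, d))
```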


>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! 
>http://www.R-project.org/posting-guide.html
