Hi,
I have a data.frame with several variables and 50,000 observations.
i.e.
data[1:2,1:7]
  Iteration Day Production.Type tsUSusc tsASusc tsULat tsALat
         1   0         Generic   17965 8833053      0      0
         1   1         Generic   17965 8833053      0      0
         .
         .
         .
         1 199         Generic   17237 8141028     26  23131
         2 127         Generic   15828 7307583     92  63463
I would like to extract only the observations (rows) for the last "day" for
each "iteration" and store them in a data frame.
I tried lapply nested in a for loop without success.  Any help will be 
greatly appreciated!
Thanks
Francisco
F Z <gerifalte28 <at> hotmail.com> writes:
: 
: Hi,
: 
: I have a data.frame with several variables and 50,000 observations.
: i.e.
: data[1:2,1:7]
:   Iteration Day Production.Type tsUSusc tsASusc tsULat tsALat
:          1   0         Generic   17965 8833053      0      0
:          1   1         Generic   17965 8833053      0      0
:          .
:          .
:          .
:          1 199         Generic   17237 8141028     26  23131
:          2 127         Generic   15828 7307583     92  63463
: 
: I would like to extract only the observations (rows) for the last "day" for
: each "iteration" and store them in a data frame.
: 
Try this:
   do.call("rbind", by(data, data$Iteration, tail, 1))
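
A minimal sketch of how the one-liner above behaves, run on a small made-up data frame (the values and the names `pieces` and `last_rows` are invented for illustration, not taken from the poster's data). It assumes the rows are already in Day order within each Iteration, since tail() simply takes whatever row comes last:

```r
# Toy data (invented): iteration 1 runs to day 2, iteration 2 to day 1
data <- data.frame(
  Iteration = c(1, 1, 1, 2, 2),
  Day       = c(0, 1, 2, 0, 1),
  tsULat    = c(0, 5, 26, 0, 92)
)

# by() splits the rows by Iteration; tail(x, 1) keeps the last row of each piece
pieces <- by(data, data$Iteration, tail, 1)

# rbind the one-row pieces back together into a single data frame
last_rows <- do.call("rbind", pieces)
print(last_rows)
```

If the rows might not be sorted, ordering by Iteration and Day first would make the "last row" the last Day.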
On Tue, 9 Nov 2004, F Z wrote:
> Hi,
>
> I have a data.frame with several variables and 50,000 observations.
> i.e.
> data[1:2,1:7]
>   Iteration Day Production.Type tsUSusc tsASusc tsULat tsALat
>          1   0         Generic   17965 8833053      0      0
>          1   1         Generic   17965 8833053      0      0
>          .
>          .
>          .
>          1 199         Generic   17237 8141028     26  23131
>          2 127         Generic   15828 7307583     92  63463
>
> I would like to extract only the observations (rows) for the last "day" for
> each "iteration" and store them in a data frame.
>
> I tried lapply nested in a for loop without success. Any help will be
> greatly appreciated!

If you reverse the ordering you are then looking for the first Day in each
Iteration, which can be done efficiently with duplicated().

   data <- data[order(data$Iteration, data$Day, decreasing=TRUE),]
   subset <- data[!duplicated(data$Iteration),]

If you are sure that the data are in order to begin with you could just
reverse the entire data set ( data <- data[nrow(data):1,] ), but I'm always
reluctant to assume this.

	-thomas
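
To illustrate the sort-then-duplicated() idea, here is a small sketch on invented, deliberately unsorted data (the values and the names `sorted`, `asc`, `last_rows` and `last_rows2` are made up for this example). The second variant, using the `fromLast` argument of duplicated(), is an alternative not mentioned in the thread; it keeps the ascending order instead of reversing it:

```r
# Toy data (invented), deliberately out of order
data <- data.frame(
  Iteration = c(2, 1, 1, 2, 1),
  Day       = c(1, 0, 2, 0, 1),
  tsULat    = c(92, 0, 26, 0, 5)
)

# Sort descending so the largest Day of each Iteration comes first
sorted <- data[order(data$Iteration, data$Day, decreasing = TRUE), ]

# duplicated() flags every repeat of an Iteration after its first appearance,
# so negating it keeps one row per Iteration: the one with the largest Day
last_rows <- sorted[!duplicated(sorted$Iteration), ]

# Alternative: sort ascending and keep the last occurrence of each Iteration
asc <- data[order(data$Iteration, data$Day), ]
last_rows2 <- asc[!duplicated(asc$Iteration, fromLast = TRUE), ]

print(last_rows)
print(last_rows2)
```

Both variants select the same rows; they differ only in whether the result comes out in descending or ascending Iteration order.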
Many thanks to Gabor Grothendieck, Thomas Lumley and James Holtman for their
useful answers on this thread. All three solutions worked for the problem.
Here is a summary of their responses (modified for consistent notation):

>F Z <gerifalte28 <at> hotmail.com> writes:
>: Hi,
>:
>: I have a data.frame with several variables and 50,000 observations.
>: i.e.
>: data[1:2,1:7]
>:   Iteration Day Production.Type tsUSusc tsASusc tsULat tsALat
>:          1   0         Generic   17965 8833053      0      0
>:          1   1         Generic   17965 8833053      0      0
>:          .
>:          .
>:          .
>:          1 199         Generic   17237 8141028     26  23131
>:          2 127         Generic   15828 7307583     92  63463
>:
>: I would like to extract only the observations (rows) for the last "day" for
>: each "iteration" and store them in a data frame.

Gabor Grothendieck's solution:

   subset <- do.call("rbind", by(data, data$Iteration, tail, 1))

James Holtman's solution:

   subset <- by(data, data$Iteration, function(x) x[nrow(x),])
   subset <- do.call('rbind', subset)

Thomas Lumley's solution:

   data <- data[order(data$Iteration, data$Day, decreasing=TRUE),]
   subset <- data[!duplicated(data$Iteration),]

If you are sure that the data are in order to begin with you could just
reverse the entire data set ( data <- data[nrow(data):1,] ), but I'm always
reluctant to assume this.

>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide!
>http://www.R-project.org/posting-guide.html