Hello everyone,

I have a data.frame like this one:

      No.  Change     Date
    A 123  final      2013-01-15
    B 123  error      2013-01-16
    C 123  bug fixed  2013-01-17
    D 111  final      2013-01-12

and now I want a new data.frame that contains only the newest entry for each number. The result should look like this:

      No.  Change     Date
    C 123  bug fixed  2013-01-17
    D 111  final      2013-01-12

Is there any way to filter my data.frame down to the latest data, perhaps with "max"?

Thanks.

Mat

--
View this message in context: http://r.789695.n4.nabble.com/Filter-according-to-the-latest-data-tp4657248.html
Sent from the R help mailing list archive at Nabble.com.
    library(sqldf)

    # Sample data matching the original post
    k1 <- data.frame(ID = LETTERS[1:4],
                     No = c(rep(123, 3), 111),
                     Change = c("final", "error", "bug fixed", "final"),
                     Date = c("2013-01-15", "2013-01-16", "2013-01-17", "2013-01-12"))

    # Convert the date strings to Date; no time zone argument is needed here
    k1$Date <- as.Date(as.character(k1$Date))

    sqldf("select * from k1 group by No having max(Date)")

--- On Fri, 1/2/13, Mat <matthias.weber@fnt.de> wrote:

From: Mat <matthias.weber@fnt.de>
Subject: [R] Filter according to the latest data
To: r-help@r-project.org
Date: Friday, 1 February, 2013, 1:34 PM

> I want a new data.frame which includes only the newest entry for
> each number.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Try this:

    # Read the sample data; the unnamed first column (A-D) becomes the row names
    x <- read.table(text = "  No. Change Date
    A 123 final 2013-01-15
    B 123 error 2013-01-16
    C 123 'bug fixed' 2013-01-17
    D 111 final 2013-01-12",
                    header = TRUE,
                    as.is = TRUE)

    # Split by No. and keep the first row carrying the latest date in each group
    do.call(rbind, lapply(split(x, x$No.), function(.sec) {
      .sec[which(.sec$Date == max(.sec$Date))[1L], ]
    }))

        No.    Change       Date
    111 111     final 2013-01-12
    123 123 bug fixed 2013-01-17

On Fri, Feb 1, 2013 at 3:04 AM, Mat <matthias.weber at fnt.de> wrote:
> and now I want a new data.frame which includes only the newest entry for
> each number.

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
Hi,

Perhaps (untested):

    # Both variants keep the last row of each No group, so they assume the
    # rows are already sorted by Date within each group
    do.call(rbind, lapply(split(dat1, dat1$No), function(x) tail(x, 1)))

    # or
    library(plyr)
    ddply(dat1, .(No), function(x) x[nrow(x), ])

A.K.

On Friday, February 1, 2013 3:04 AM, Mat <matthias.weber at fnt.de> wrote:
> and now I want a new data.frame which includes only the newest entry for
> each number.
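Because the suggestions above take the last row of each group, they depend on row order. Here is a minimal base R sketch that sorts first, so the input order no longer matters. The data frame `dat1` and its contents are made up for illustration (the rows from the original post, deliberately shuffled), and the column is assumed to be named `No` rather than `No.`.

```r
# Hypothetical sample data, deliberately NOT sorted by date
dat1 <- data.frame(
  ID     = c("C", "A", "D", "B"),
  No     = c(123, 123, 111, 123),
  Change = c("bug fixed", "final", "final", "error"),
  Date   = as.Date(c("2013-01-17", "2013-01-15", "2013-01-12", "2013-01-16")),
  stringsAsFactors = FALSE
)

# Sort by No, then by Date, so the newest row comes last within each No
sorted <- dat1[order(dat1$No, dat1$Date), ]

# Keep the last (newest) row of each group
latest <- do.call(rbind, lapply(split(sorted, sorted$No), function(x) tail(x, 1)))
print(latest)
```

If two rows in a group share the same latest date, this keeps whichever of them sorts last, so a tie-breaking column may be worth adding to `order()`.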
On Fri, Feb 1, 2013 at 8:05 AM, nalluri pratap <pratap_stat at yahoo.co.in> wrote:
> library(sqldf)
>
> sqldf("select *
>        from k1
>        group by No
>        having max(Date)")

HAVING only selects which groups to keep; it does not pick a row within a group. The query above gives the expected result for this example only by chance, and would likely stop working if the data changed.

Try this instead. It relies on an SQLite-specific feature which guarantees that, when MAX is used in a GROUP BY query, the other columns are taken from the same row as the maximum:

    sqldf("select ID, No, Change, max(Date) Date from k1 group by No")

      ID  No    Change       Date
    1  D 111     final 2013-01-12
    2  C 123 bug fixed 2013-01-17

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
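For readers who would rather not depend on SQLite-specific behaviour, the same idea (take every column from the row that holds the group maximum) can be sketched in base R with `order()` and `duplicated()`. This is a swapped-in technique, not something proposed in the thread; `k1` is rebuilt here from the earlier post so that the example is self-contained.

```r
# Sample data as defined earlier in the thread
k1 <- data.frame(
  ID     = LETTERS[1:4],
  No     = c(123, 123, 123, 111),
  Change = c("final", "error", "bug fixed", "final"),
  Date   = as.Date(c("2013-01-15", "2013-01-16", "2013-01-17", "2013-01-12")),
  stringsAsFactors = FALSE
)

# Order by No ascending and Date descending, so the newest row leads each group
k1_sorted <- k1[order(k1$No, -as.numeric(k1$Date)), ]

# duplicated() flags every row after the first one per No;
# negating it keeps exactly the newest row of each group
newest <- k1_sorted[!duplicated(k1_sorted$No), ]
print(newest)
```

Unlike the HAVING query, this selects rows explicitly, so it does not rely on how the database engine fills in non-aggregated columns.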