James Rome
2011-May-31 18:51 UTC
[R] How to get the rows corresponding to the maximum of a factor
I have a data frame as follows:

MsgType eotpd               fn
FI      2011-05-13 01:40:00  0
FF      2011-05-13 01:39:53  0
TC      2011-05-13 01:39:45  0
FI      2011-05-14 00:58:46  1
FF      2011-05-14 00:58:46  1
FI      2011-05-15 00:48:32  2
FF      2011-05-15 00:48:21  2
TC      2011-05-15 00:48:15  2
FI      2011-05-16 02:00:01  3
FF      2011-05-16 01:59:46  3
FI      2011-05-17 02:22:05  4
FF      2011-05-17 02:21:58  4
FI      2011-05-18 01:50:35  5
FF      2011-05-18 01:50:30  5
FI      2011-05-19 02:05:24  6
FF      2011-05-19 02:05:20  6
TC      2011-05-19 02:05:19  6
FI      2011-05-13 17:04:15  8
TC      2011-05-13 17:04:04  8
FI      2011-05-16 17:32:40  9
FF      2011-05-16 17:32:19  9
TC      2011-05-16 17:32:06  9
FI      2011-05-17 18:39:42 10
FF      2011-05-17 18:39:38 10
FI      2011-05-18 17:54:55 11
FF      2011-05-18 17:54:57 11
TC      2011-05-18 17:54:50 11
FI      2011-05-19 17:26:01 12
FF      2011-05-19 17:26:01 12
TC      2011-05-19 17:25:53 12
. . .

As you can see, I do not always have all three MsgTypes for a given fn.
The MsgTypes are an ordered factor: FI < FF < TC.
What I want to get is a data frame having the maximum MsgType and its
eotpd for each fn:

MsgType eotpd               fn
TC      2011-05-13 01:39:45  0
FF      2011-05-14 00:58:46  1
TC      2011-05-15 00:48:15  2
FF      2011-05-16 01:59:46  3
FF      2011-05-17 02:21:58  4
FF      2011-05-18 01:50:30  5
TC      2011-05-19 02:05:19  6
TC      2011-05-13 17:04:04  8
TC      2011-05-16 17:32:06  9
FF      2011-05-17 18:39:38 10
TC      2011-05-18 17:54:50 11
TC      2011-05-19 17:25:53 12
. . .

Surely there is a clever way to do this in R?

Thanks for the help,
Jim
David Winsemius
2011-May-31 19:52 UTC
[R] How to get the rows corresponding to the maximum of a factor
On May 31, 2011, at 2:51 PM, James Rome wrote:

> What I want to get is a data frame having the maximum MsgType and its
> eotpd for each fn:

Assuming this is in a data frame, 'rrr' (so named for my annoyance that you did not use dput to offer the example), with this structure:

> str(rrr)
'data.frame':   30 obs. of  3 variables:
 $ V1: Ord.factor w/ 3 levels "FI"<"FF"<"TC": 1 2 3 1 2 1 2 3 1 2 ...
 $ V2: POSIXct, format: "2011-05-13 01:40:00" "2011-05-13 01:39:53" ...
 $ V3: num  0 0 0 1 1 2 2 2 3 3 ...
Then this seems to fit the description:

idx <- sapply( split(seq_len(nrow(rrr)), rrr$V3),
               function(x) { x[which.max(rrr$V1[x])] })
> rrr[idx, ]
   V1                  V2 V3
3  TC 2011-05-13 01:39:45  0
5  FF 2011-05-14 00:58:46  1
8  TC 2011-05-15 00:48:15  2
10 FF 2011-05-16 01:59:46  3
12 FF 2011-05-17 02:21:58  4
14 FF 2011-05-18 01:50:30  5
17 TC 2011-05-19 02:05:19  6
19 TC 2011-05-13 17:04:04  8
22 TC 2011-05-16 17:32:06  9
24 FF 2011-05-17 18:39:38 10
27 TC 2011-05-18 17:54:50 11
30 TC 2011-05-19 17:25:53 12

-- 
David.

David Winsemius, MD
West Hartford, CT
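
For anyone who wants to try the split/which.max idiom without the original data, here is a minimal reproducible sketch. It rebuilds only the rows for fn 0, 1 and 8 from the post, reuses the column names V1, V2, V3 from the str() output above, and assumes UTC for the timestamps (the thread does not state a time zone). The factor is converted with as.integer() before which.max() to make the ranking explicit. The second half shows an alternative using order() and duplicated(); it is not from the reply, just another common way to keep one row per group.

```r
# Rebuild a subset of the example (fn 0, 1 and 8 only).
# tz = "UTC" is an assumption; the post does not give a time zone.
rrr <- data.frame(
  V1 = factor(c("FI", "FF", "TC", "FI", "FF", "FI", "TC"),
              levels = c("FI", "FF", "TC"), ordered = TRUE),
  V2 = as.POSIXct(c("2011-05-13 01:40:00", "2011-05-13 01:39:53",
                    "2011-05-13 01:39:45", "2011-05-14 00:58:46",
                    "2011-05-14 00:58:46", "2011-05-13 17:04:15",
                    "2011-05-13 17:04:04"), tz = "UTC"),
  V3 = c(0, 0, 0, 1, 1, 8, 8)
)

# Idiom from the reply: split the row indices by fn, then within each
# group keep the index whose MsgType has the highest level code.
idx <- sapply(split(seq_len(nrow(rrr)), rrr$V3),
              function(x) x[which.max(as.integer(rrr$V1[x]))])
res_split <- rrr[idx, ]

# Alternative sketch: sort by fn and by descending level code, then keep
# the first row seen for each fn.
ord <- rrr[order(rrr$V3, -as.integer(rrr$V1)), ]
res_order <- ord[!duplicated(ord$V3), ]

print(res_split)
```

Both results select the same rows here. If a group had ties for the top level, which.max() and duplicated() would each keep the first tied row in their respective orderings, so it is worth checking which row is wanted in that case.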