Dear R users,

I am working on allocating the rows within a dataframe into some factor levels. Consider the following dataframe:

    Start.action          Start.time
1  Start.setting 2010-12-30 17:58:00
2  Start.setting 2010-12-30 18:40:00
3  Start.setting 2010-12-31 22:39:00
4  Start.setting 2010-12-31 23:24:00
5  Start.setting 2011-01-01 00:30:00
6  Start.setting 2011-01-01 01:10:00
7  Start.hauling 2011-01-01 07:07:00
8  Start.hauling 2011-01-01 14:25:00
9  Start.hauling 2011-01-01 21:28:00
10 Start.hauling 2011-01-02 03:38:00
11 Start.hauling 2011-01-02 09:28:00
12 Start.hauling 2011-01-02 14:22:00
13 Start.setting 2011-01-02 20:51:00
14 Start.setting 2011-01-02 21:33:00
15 Start.setting 2011-01-02 22:47:00
16 Start.setting 2011-01-02 23:27:00
17 Start.setting 2011-01-03 00:35:00
18 Start.setting 2011-01-03 01:16:00
19 Start.hauling 2011-01-03 04:31:00
20 Start.hauling 2011-01-03 08:57:00

I am trying to assign a factor level like the one below (named "action") according to the sequence of setting and hauling occurring in the "Start.action" column. In fact, it wouldn't even need to be a factor or character; it could simply be numbered (i.e., the set/haul prefix is not essential, as I could simply split it afterwards).

    Start.action          Start.time action
1  Start.setting 2010-12-30 17:58:00   set1
2  Start.setting 2010-12-30 18:40:00   set1
3  Start.setting 2010-12-31 22:39:00   set1
4  Start.setting 2010-12-31 23:24:00   set1
5  Start.setting 2011-01-01 00:30:00   set1
6  Start.setting 2011-01-01 01:10:00   set1
7  Start.hauling 2011-01-01 07:07:00  haul1
8  Start.hauling 2011-01-01 14:25:00  haul1
9  Start.hauling 2011-01-01 21:28:00  haul1
10 Start.hauling 2011-01-02 03:38:00  haul1
11 Start.hauling 2011-01-02 09:28:00  haul1
12 Start.hauling 2011-01-02 14:22:00  haul1
13 Start.setting 2011-01-02 20:51:00   set2
14 Start.setting 2011-01-02 21:33:00   set2
15 Start.setting 2011-01-02 22:47:00   set2
16 Start.setting 2011-01-02 23:27:00   set2
17 Start.setting 2011-01-03 00:35:00   set2
18 Start.setting 2011-01-03 01:16:00   set2
19 Start.hauling 2011-01-03 04:31:00  haul2
20 Start.hauling 2011-01-03 08:57:00  haul2

It seems like such a simple question, yet I just can't think of how to implement this. Any hints or ideas on how I might achieve this would be much appreciated.

Regards,
Darcy
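The example data above can be rebuilt with a few lines of R, which makes it easier to try out any approach to this problem. This is only a sketch: the object name my.data, the use of character (not factor) columns, and the UTC time zone are assumptions, since the original post does not say how the data were read in.

```r
# Rebuild the 20-row example from the question.
# Assumptions: object name `my.data`, character columns, times stored as UTC.
my.data <- data.frame(
  Start.action = rep(c("Start.setting", "Start.hauling",
                       "Start.setting", "Start.hauling"),
                     times = c(6, 6, 6, 2)),
  Start.time = as.POSIXct(c(
    "2010-12-30 17:58:00", "2010-12-30 18:40:00", "2010-12-31 22:39:00",
    "2010-12-31 23:24:00", "2011-01-01 00:30:00", "2011-01-01 01:10:00",
    "2011-01-01 07:07:00", "2011-01-01 14:25:00", "2011-01-01 21:28:00",
    "2011-01-02 03:38:00", "2011-01-02 09:28:00", "2011-01-02 14:22:00",
    "2011-01-02 20:51:00", "2011-01-02 21:33:00", "2011-01-02 22:47:00",
    "2011-01-02 23:27:00", "2011-01-03 00:35:00", "2011-01-03 01:16:00",
    "2011-01-03 04:31:00", "2011-01-03 08:57:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)

str(my.data)
```

Keeping Start.action as character avoids surprises with functions such as rle(), which does not accept factors.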
Maybe clumsy, but it shows the idea: use a numeric index to separate the cases where Start.action is the same. (untested)

# placeholder label, overwritten for every row below
my.data$action <- rep("set.or.haul", nrow(my.data))
# record number used to tell the first and second set/haul apart
my.data$recnums <- seq_len(nrow(my.data))
my.data$action[my.data$Start.action == "Start.setting" & my.data$recnums < 7] <- "set1"
my.data$action[my.data$Start.action == "Start.setting" & my.data$recnums > 12] <- "set2"
my.data$action[my.data$Start.action == "Start.hauling" & my.data$recnums < 13] <- "haul1"
my.data$action[my.data$Start.action == "Start.hauling" & my.data$recnums > 18] <- "haul2"

On 7-Mar-11, at 7:13 PM, Darcy Webber wrote:
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Don McKenzie, Research Ecologist
Pacific Wildland Fire Sciences Lab
US Forest Service

Affiliate Professor
School of Forest Resources, College of the Environment
CSES Climate Impacts Group
University of Washington
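The numeric-index idea in this reply can also be written with cut(), which bins the record number at hand-picked breakpoints. This is a sketch of a swapped-in technique, not the code from the reply; the breakpoints 6, 12 and 18 are read off the example and would have to be changed by hand for other data, and the vector names are made up for illustration.

```r
# Stand-in for the Start.action column of the 20-row example.
start_action <- rep(c("Start.setting", "Start.hauling",
                      "Start.setting", "Start.hauling"),
                    times = c(6, 6, 6, 2))

# Record number of each row.
recnum <- seq_along(start_action)

# Bin the record numbers into (0,6], (6,12], (12,18], (18,20]
# and label each bin. Breakpoints are hard-coded from the example.
action <- as.character(cut(recnum,
                           breaks = c(0, 6, 12, 18, 20),
                           labels = c("set1", "haul1", "set2", "haul2")))

print(data.frame(start_action, action))
```

Like the original suggestion, this only works when the positions of the changes are already known, so it does not generalise to longer logs without editing the breakpoints.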
Hi:

Here's one way to piece it together. All we need is the first variable, so I'll manufacture a vector of Start.action's and go from there.

w <- data.frame(Start.action = c(rep('Start.setting', 3), rep('Start.hauling', 4),
                                 rep('Start.setting', 4), rep('Start.hauling', 6),
                                 rep('Start.setting', 4), rep('Start.hauling', 4)))
wr <- rle(w$Start.action == 'Start.setting')
> wr
Run Length Encoding
  lengths: int [1:6] 3 4 4 6 4 4
  values : logi [1:6] TRUE FALSE TRUE FALSE TRUE FALSE

w$cycle <- rep(cumsum(wr$values), wr$lengths)
w$act <- ifelse(w$Start.action == 'Start.setting', 'set', 'haul')
w$action <- with(w, paste(act, cycle, sep = ''))
w$cycle <- w$act <- NULL
> w
    Start.action action
1  Start.setting   set1
2  Start.setting   set1
3  Start.setting   set1
4  Start.hauling  haul1
5  Start.hauling  haul1
<snip>
20 Start.setting   set3
21 Start.setting   set3
22 Start.hauling  haul3
23 Start.hauling  haul3
24 Start.hauling  haul3
25 Start.hauling  haul3

The rle() function is the key to this. Its argument here is a logical vector that is TRUE for Start.setting and FALSE for Start.hauling, so rle() returns the length and value of each run of identical rows. The cumsum() function applied to the $values component increases by one at each new run of settings, which gives the cycle number of every run, and rep() replicates those numbers according to the $lengths component. Once that is done, we just use a vectorized ifelse() to yield 'set' or 'haul' in a new variable and then paste that together with the numeric vector...and we're done. Run the code one line at a time to understand what each instruction is doing.
HTH,
Dennis
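As a hedged follow-up to the rle() approach, here is a sketch applied to the 20-row sequence from the original question. It differs from the code in the reply in one respect: numbering via cumsum() of the TRUE runs assumes the data begin with a setting run (a log that began with hauling would label the first haul "haul0"), so this variant numbers the runs of each type separately with ave(). The variable names are made up for illustration.

```r
# Stand-in for the Start.action column of the 20-row example,
# kept as character because rle() does not accept factors.
start_action <- rep(c("Start.setting", "Start.hauling",
                      "Start.setting", "Start.hauling"),
                    times = c(6, 6, 6, 2))

# One entry per run of identical consecutive values.
r <- rle(start_action)

# Number the runs within each action type: 1st set, 1st haul, 2nd set, ...
run_number <- ave(seq_along(r$values), r$values, FUN = seq_along)

# Short prefix for each run, then one label per run.
run_prefix <- ifelse(r$values == "Start.setting", "set", "haul")
run_label  <- paste0(run_prefix, run_number)

# Expand the per-run labels back to one label per row.
action <- rep(run_label, r$lengths)

print(data.frame(start_action, action))
```

If only a numeric grouping is needed, as the question allows, rep(seq_along(r$lengths), r$lengths) gives a plain run counter without any prefix.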
On Mar 7, 2011, at 10:13 PM, Darcy Webber wrote:

> I am trying to assign a factor level like the one below (named
> "action") according to the sequence of setting and hauling occurring in
> the "Start.action" column. In fact, it wouldn't even need to be a
> factor or character, it could simply be numbered (i.e., the set/haul
> prefix is useless as I could simply split it afterwards).

# c(0, cumsum(...)) counts the changes in column 1 seen so far;
# even counts are sets, odd counts are hauls
w$action <- paste(c("set", "haul")[1 + c(0, cumsum(w[2:nrow(w), 1] != w[1:(nrow(w)-1), 1])) %% 2],
                  1 + c(0, cumsum(w[2:nrow(w), 1] != w[1:(nrow(w)-1), 1])) %/% 2,
                  sep = "")

David Winsemius, MD
West Hartford, CT
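The one-liner above is easier to follow when split into steps. The sketch below does the same arithmetic on the 20-row sequence from the question; the intermediate names (changes, prefix, number) are introduced here for illustration and are not part of the original reply. Like the one-liner, it assumes the data start with a setting run.

```r
# Stand-in for column 1 (Start.action) of the 20-row example.
x <- rep(c("Start.setting", "Start.hauling",
           "Start.setting", "Start.hauling"),
         times = c(6, 6, 6, 2))

# Compare each row with the previous one; the cumulative sum counts
# how many set/haul changes have happened up to each row (0 for row 1).
changes <- c(0, cumsum(x[-1] != x[-length(x)]))

# An even change count means a set, an odd one means a haul.
prefix <- c("set", "haul")[1 + changes %% 2]

# Every two changes start a new set/haul cycle.
number <- 1 + changes %/% 2

action <- paste(prefix, number, sep = "")
print(data.frame(x, changes, action))
```

Because %% and %/% bind more tightly than +, the expression 1 + changes %% 2 is evaluated as 1 + (changes %% 2), which is what makes the indexing into c("set", "haul") work.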