Dear R users,
I am working on allocating the rows within a dataframe into some
factor levels. Consider the following dataframe:
    Start.action          Start.time
1  Start.setting 2010-12-30 17:58:00
2  Start.setting 2010-12-30 18:40:00
3  Start.setting 2010-12-31 22:39:00
4  Start.setting 2010-12-31 23:24:00
5  Start.setting 2011-01-01 00:30:00
6  Start.setting 2011-01-01 01:10:00
7  Start.hauling 2011-01-01 07:07:00
8  Start.hauling 2011-01-01 14:25:00
9  Start.hauling 2011-01-01 21:28:00
10 Start.hauling 2011-01-02 03:38:00
11 Start.hauling 2011-01-02 09:28:00
12 Start.hauling 2011-01-02 14:22:00
13 Start.setting 2011-01-02 20:51:00
14 Start.setting 2011-01-02 21:33:00
15 Start.setting 2011-01-02 22:47:00
16 Start.setting 2011-01-02 23:27:00
17 Start.setting 2011-01-03 00:35:00
18 Start.setting 2011-01-03 01:16:00
19 Start.hauling 2011-01-03 04:31:00
20 Start.hauling 2011-01-03 08:57:00
I am trying to assign a factor level like the one below (named
"action") according to the sequence of setting and hauling occurring in
the "Start.action" column. In fact, it wouldn't even need to be a
factor or character, it could simply be numbered (i.e., the set/haul
prefix is useless as I could simply split it afterwards).
    Start.action          Start.time action
1  Start.setting 2010-12-30 17:58:00   set1
2  Start.setting 2010-12-30 18:40:00   set1
3  Start.setting 2010-12-31 22:39:00   set1
4  Start.setting 2010-12-31 23:24:00   set1
5  Start.setting 2011-01-01 00:30:00   set1
6  Start.setting 2011-01-01 01:10:00   set1
7  Start.hauling 2011-01-01 07:07:00  haul1
8  Start.hauling 2011-01-01 14:25:00  haul1
9  Start.hauling 2011-01-01 21:28:00  haul1
10 Start.hauling 2011-01-02 03:38:00  haul1
11 Start.hauling 2011-01-02 09:28:00  haul1
12 Start.hauling 2011-01-02 14:22:00  haul1
13 Start.setting 2011-01-02 20:51:00   set2
14 Start.setting 2011-01-02 21:33:00   set2
15 Start.setting 2011-01-02 22:47:00   set2
16 Start.setting 2011-01-02 23:27:00   set2
17 Start.setting 2011-01-03 00:35:00   set2
18 Start.setting 2011-01-03 01:16:00   set2
19 Start.hauling 2011-01-03 04:31:00  haul2
20 Start.hauling 2011-01-03 08:57:00  haul2
It seems like such a simple question, yet I just can't think of how to
implement this. Any hints or ideas on how I might achieve this would
be much appreciated.
Regards,
Darcy
Maybe clumsy but shows the activity. The idea is to use a numeric
index to separate cases where Start.action is the same.
(untested)
my.data$action <- rep("set.or.haul", 20)
my.data$recnums <- 1:20
my.data$action[my.data$Start.action == "Start.setting" &
               my.data$recnums < 7] <- "set1"
my.data$action[my.data$Start.action == "Start.setting" &
               my.data$recnums > 12] <- "set2"
my.data$action[my.data$Start.action == "Start.hauling" &
               my.data$recnums < 13] <- "haul1"
my.data$action[my.data$Start.action == "Start.hauling" &
               my.data$recnums > 18] <- "haul2"
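Since the code above is marked untested, here is a minimal sketch for trying it out. It assumes the data frame is called my.data, as in the reply, and rebuilds only the Start.action column of the posted example (Start.time is not needed for the labelling). The names is.set and expected are introduced here purely for illustration; expected is the desired column from the question, retyped for comparison.

```r
# Rebuild the Start.action column of the posted 20-row example.
my.data <- data.frame(
  Start.action = rep(c("Start.setting", "Start.hauling",
                       "Start.setting", "Start.hauling"),
                     times = c(6, 6, 6, 2)),
  stringsAsFactors = FALSE
)

# Row numbers serve as the numeric index that separates the blocks.
my.data$recnums <- seq_len(nrow(my.data))
is.set <- my.data$Start.action == "Start.setting"

# Hard-coded cut points: valid for this particular data set only.
my.data$action <- NA_character_
my.data$action[ is.set & my.data$recnums < 7]  <- "set1"
my.data$action[ is.set & my.data$recnums > 12] <- "set2"
my.data$action[!is.set & my.data$recnums < 13] <- "haul1"
my.data$action[!is.set & my.data$recnums > 18] <- "haul2"

# Desired labels from the original question, for comparison.
expected <- rep(c("set1", "haul1", "set2", "haul2"), times = c(6, 6, 6, 2))
identical(my.data$action, expected)  # TRUE
```

The cut points (7, 12, 13, 18) would have to be edited by hand for any other data set, so this only suits a one-off labelling job.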
On 7-Mar-11, at 7:13 PM, Darcy Webber wrote:
> <snip>
Don McKenzie, Research Ecologist
Pacific Wildland Fire Sciences Lab
US Forest Service
Affiliate Professor
School of Forest Resources, College of the Environment
CSES Climate Impacts Group
University of Washington
desk: 206-732-7824
cell: 206-321-5966
dmck at uw.edu
donaldmckenzie at fs.fed.us
Hi:
Here's one way to piece it together. All we need is the first variable, so
I'll manufacture a vector of Start.action's and go from there.
w <- data.frame(Start.action = c(rep('Start.setting', 3),
                                 rep('Start.hauling', 4),
                                 rep('Start.setting', 4),
                                 rep('Start.hauling', 6),
                                 rep('Start.setting', 4),
                                 rep('Start.hauling', 4)))
wr <- rle(w$Start.action == 'Start.setting')
> wr
Run Length Encoding
lengths: int [1:6] 3 4 4 6 4 4
values : logi [1:6] TRUE FALSE TRUE FALSE TRUE FALSE
w$cycle <- rep(cumsum(wr$values), wr$lengths)
w$act <- ifelse(w$Start.action == 'Start.setting', 'set', 'haul')
w$action <- with(w, paste(act, cycle, sep = ''))
w$cycle <- w$act <- NULL
> w
    Start.action action
1  Start.setting   set1
2  Start.setting   set1
3  Start.setting   set1
4  Start.hauling  haul1
5  Start.hauling  haul1
<snip>
20 Start.setting   set3
21 Start.setting   set3
22 Start.hauling  haul3
23 Start.hauling  haul3
24 Start.hauling  haul3
25 Start.hauling  haul3
The rle() function is the key to this. Given the logical vector
w$Start.action == 'Start.setting' as its argument, it returns the length
of each run and its value: TRUE for a run of Start.setting and FALSE for
a run of Start.hauling. Applying cumsum() to the $values component counts
the TRUE runs, so each setting run and the hauling run that follows it get
the same cycle number; we then replicate those numbers according to the
vector of $lengths from rle(). Once that is done, we just use a vectorized
ifelse() to yield 'set' or 'haul' in a new variable and then piece that
together with the numeric vector...and we're done. Run the code one line
at a time to understand what each instruction is doing.
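Since the original question says a plain number would do, a variant of the same rle() idea is sketched below for the 20-row example. It is only an illustration: the column name run and the helper names r, n.within and prefix are made up here, and the per-type counting with ave() is an alternative to cumsum() that should not depend on which action comes first.

```r
# Start.action column of the 20-row example from the question.
w <- data.frame(
  Start.action = rep(c("Start.setting", "Start.hauling",
                       "Start.setting", "Start.hauling"),
                     times = c(6, 6, 6, 2)),
  stringsAsFactors = FALSE
)

# rle() needs an atomic vector, so convert in case the column is a factor.
r <- rle(as.character(w$Start.action))

# One id per run: 1, 2, 3, 4 for set, haul, set, haul.
w$run <- rep(seq_along(r$lengths), r$lengths)

# Number the runs separately within each action type.
n.within <- ave(seq_along(r$values), r$values, FUN = seq_along)
prefix   <- ifelse(r$values == "Start.setting", "set", "haul")
w$action <- rep(paste0(prefix, n.within), r$lengths)

# One row per run, showing the numeric id next to its label.
unique(w[, c("run", "action")])
```

The run column alone may be enough for splitting the data, for example with split(w, w$run).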
HTH,
Dennis
On Mar 7, 2011, at 10:13 PM, Darcy Webber wrote:
> I am trying to assign a factor level like the one below (named
> "action") according to the sequence of setting and hauling occurring in
> the "Start.action" column. In fact, it wouldn't even need to be a
> factor or character, it could simply be numbered (i.e., the set/haul
> prefix is useless as I could simply split it afterwards).

w$action <- paste(c("set", "haul")[1 + c(0, cumsum(w[2:nrow(w), 1] != w[1:(nrow(w)-1), 1])) %% 2],
                  1 + c(0, cumsum(w[2:nrow(w), 1] != w[1:(nrow(w)-1), 1])) %/% 2,
                  sep = "")

David Winsemius, MD
West Hartford, CT
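The one-liner above can be unpacked into named steps, which may make the modulo arithmetic easier to follow. This is a sketch on the 20-row example, assuming (as the one-liner does) that the first run is a setting run; the intermediate names x, n, change and run0 are introduced here for illustration only.

```r
# Start.action column of the 20-row example from the question.
w <- data.frame(
  Start.action = rep(c("Start.setting", "Start.hauling",
                       "Start.setting", "Start.hauling"),
                     times = c(6, 6, 6, 2)),
  stringsAsFactors = FALSE
)

x <- as.character(w[[1]])
n <- length(x)

# TRUE wherever a row differs from the row before it (a new run starts).
change <- x[-1] != x[-n]

# Zero-based run index: 0 for the first run, 1 for the second, and so on.
run0 <- c(0, cumsum(change))

# Even runs are "set" and odd runs are "haul" (run0 %% 2);
# each set/haul pair shares one number (run0 %/% 2).
w$action <- paste(c("set", "haul")[1 + run0 %% 2], 1 + run0 %/% 2, sep = "")
```

If the data started with a hauling run instead, the first haul would be labelled "set1", so the order of c("set", "haul") would need swapping in that case.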