Hi,

I am having some problems setting up some indicators and would appreciate any help.

I have some data called 'lights' with 3 variables called x, a and b.

x - is the date
a - equals 1 to indicate an 'on' button is activated
b - equals 1 to indicate an 'off' button is activated

Essentially I want to create 2 new variables, c and d:

c - will reflect the current state of the light (1 being on)
d - will be a count of how many days the light has been on

Here's some data, with the date omitted, to illustrate what I have and what is required.

a b c d
0 0 0 0
0 0 0 0
1 0 1 1
1 0 1 2
0 0 1 3
0 0 1 4
1 0 1 5
0 0 1 6
0 0 1 7
0 1 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
1 0 1 1
0 0 1 2
0 0 1 3
0 0 1 4
0 1 0 0
0 0 0 0

After some considerable time I have managed to create variable c with a loop, but it's really slow with the volume of data I have. Could anyone please show me how to do this efficiently?

I hope this is clear...

Thanks

Roy
Try this:

> x <- read.table(textConnection("a b c d
+ 0 0 0 0
+ 0 0 0 0
+ 1 0 1 1
+ 1 0 1 2
+ 0 0 1 3
+ 0 0 1 4
+ 1 0 1 5
+ 0 0 1 6
+ 0 0 1 7
+ 0 1 0 0
+ 0 0 0 0
+ 0 0 0 0
+ 0 0 0 0
+ 0 0 0 0
+ 1 0 1 1
+ 0 0 1 2
+ 0 0 1 3
+ 0 0 1 4
+ 0 1 0 0
+ 0 0 0 0"), header=TRUE)
> closeAllConnections()
> # initialize 'c' & 'd'
> x$c <- NA
> x$d <- 0
> # assume 'c' is initially off
> x$c[1] <- 0
> # set 'c' to value of 'a' or 'b'
> x$c[x$a == 1] <- 1
> x$c[x$b == 1] <- 0
> # use the 'zoo' package for na.locf function
> require(zoo)
> # carry forward the value in 'c'
> x$c <- na.locf(x$c)
> # add a column for grouping
> x$grp <- cumsum(c(0, diff(x$c) != 0))
> # now put on the count
> x$d <- ave(x$c, x$grp, FUN=cumsum)
> # remove 'grp'
> x$grp <- NULL
> x
   a b c d
1  0 0 0 0
2  0 0 0 0
3  1 0 1 1
4  1 0 1 2
5  0 0 1 3
6  0 0 1 4
7  1 0 1 5
8  0 0 1 6
9  0 0 1 7
10 0 1 0 0
11 0 0 0 0
12 0 0 0 0
13 0 0 0 0
14 0 0 0 0
15 1 0 1 1
16 0 0 1 2
17 0 0 1 3
18 0 0 1 4
19 0 1 0 0
20 0 0 0 0
>

On Mon, Aug 2, 2010 at 8:15 AM, Roy Davy <roydavy at hotmail.com> wrote:
> After some considerable time i have managed to create variable c with a loop but it's really slow with the volume of data i have.
> Could anyone please show me how to do this efficiently?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
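
The grouping step in the reply above (cumsum over diff, then ave with cumsum) may be easier to follow on a short vector. The following is only a small base R sketch of that one step; the vector 'state' and the names 'grp' and 'days.on' are made up for illustration and are not part of the original data.

```r
# Hypothetical on/off state, already carried forward (1 = on)
state <- c(0, 0, 1, 1, 1, 0, 0, 1, 1, 0)

# A new group starts wherever the state differs from the previous row
grp <- cumsum(c(0, diff(state) != 0))

# Within each group, a running sum of the state counts the rows spent 'on'
# (groups where the state is 0 simply stay at 0)
days.on <- ave(state, grp, FUN = cumsum)

print(rbind(state, grp, days.on))
```

Because every run of identical values gets its own group number, the running sum restarts each time the light is switched on again.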
On Mon, 2 Aug 2010, Roy Davy wrote:

> After some considerable time i have managed to create variable c with a
> loop but it's really slow with the volume of data i have. Could anyone
> please show me how to do this efficiently?

Two solutions:

1) inline - if you are configured to compile source, use the inline package to render your loop as C or Fortran

2) findInterval - like this:

	# 'a' and 'b' are plain vectors here (e.g. lights$a and lights$b)
	on.pos <- which(a==1)
	off.pos <- which(b==1)
	# 1:20 indexes the 20 example rows; seq_along(a) works for any length
	last.on <- c(0,on.pos)[ 1 + findInterval(1:20,on.pos)]
	last.off <- c(0,off.pos)[ 1 + findInterval(1:20,off.pos)]
	cee <- as.integer(last.on>last.off)

Don't use 'c' as a variable name, as it is a common function.

HTH,

Chuck

Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
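
The findInterval suggestion above only produces the state column. A minimal sketch of how it might be extended to the day count as well, using base R only, is given below. The cummax step and the names 'idx', 'state' and 'days.on' are assumptions added for illustration and do not come from the thread; the vectors 'a' and 'b' are typed in from the example data in the original post.

```r
# Columns a and b from the example data in the original post
a <- c(0,0,1,1,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0)
b <- c(0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0)

idx <- seq_along(a)          # row positions, any length
on.pos  <- which(a == 1)     # rows where 'on' was pressed
off.pos <- which(b == 1)     # rows where 'off' was pressed

# Position of the most recent 'on' / 'off' press at each row (0 = none yet)
last.on  <- c(0, on.pos)[1 + findInterval(idx, on.pos)]
last.off <- c(0, off.pos)[1 + findInterval(idx, off.pos)]

# The light is on when the latest 'on' press is more recent than the latest 'off'
state <- as.integer(last.on > last.off)

# Assumed extension: rows since the most recent 'off' row.
# cummax() carries forward the index of the last row where state was 0.
days.on <- idx - cummax(idx * (state == 0))

print(data.frame(a, b, state, days.on))
```

If both buttons are pressed on the same row, last.on equals last.off, so this sketch treats the light as off, which matches the order of assignments in the na.locf solution earlier in the thread.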