Hi list, based on the following data.frame I would like to create a variable that indicates the number of occurrences of A in the 3 years prior to the current year: DF = data.frame(read.table(textConnection(" A B 8025 1995 8026 1995 8029 1995 8026 1996 8025 1997 8026 1997 8025 1997 8027 1997 8026 1999 8027 1999 8028 1995 8029 1998 8025 1997 8027 1997 8026 1999 8027 1999 8028 1995 8029 1998"),head=TRUE,stringsAsFactors=FALSE)) becomes: A B C 8025 1995 0 8026 1995 0 8029 1995 0 8026 1996 1 8025 1997 1 8026 1997 2 8025 1997 1 8027 1997 0 8026 1999 2 8027 1999 2 8028 1995 0 8029 1998 1 8025 1997 1 8027 1997 0 8026 1999 2 8027 1999 2 8028 1995 0 8029 2000 1 So 8026 in 1997 = 2 because 8026 can be found in 1995 and 1996 which are both within the appropriate window (1996 - 1994). Any ideas? I looked at the rollapply vignette, but couldn't figure out how to apply it to my data. Thanks a lot! -- View this message in context: http://r.789695.n4.nabble.com/Counting-occurrences-in-a-moving-window-tp3568658p3568658.html Sent from the R help mailing list archive at Nabble.com.
Would it be possible to use the sqldf package and the ave function to simply run ave over a limited set? So something like: DF = data.frame(read.table(textConnection(" A B 8025 1995 8026 1995 8029 1995 8026 1996 8025 1997 8026 1997 8025 1997 8027 1997 8026 1999 8027 1999 8028 1995 8029 1998 8025 1997 8027 1997 8026 1999 8027 1999 8028 1995 8029 1998"),head=TRUE,stringsAsFactors=FALSE)) library(sqldf) years<-c(1995:1999) for (t in 1:length(years)) { year = as.numeric(years[t]) m<-sqldf('select * from DF where B between $year-1 AND $year-4') n<-ave(m$A,m$A,FUN = length) } How do I get the correct values in DF$C? Thanks!! -- View this message in context: http://r.789695.n4.nabble.com/Counting-occurrences-in-a-moving-window-tp3568658p3570652.html Sent from the R help mailing list archive at Nabble.com.
On Thu, Jun 2, 2011 at 11:41 AM, mdvaan <mathijsdevaan at gmail.com> wrote:> Hi list, based on the following data.frame I would like to create a variable > that indicates the number of occurrences of A in the 3 years prior to the > current year: > > DF = data.frame(read.table(textConnection(" ?A ?B > 8025 ?1995 > 8026 ?1995 > 8029 ?1995 > 8026 ?1996 > 8025 ?1997 > 8026 ?1997 > 8025 ?1997 > 8027 ?1997 > 8026 ?1999 > 8027 ?1999 > 8028 ?1995 > 8029 ?1998 > 8025 ?1997 > 8027 ?1997 > 8026 ?1999 > 8027 ?1999 > 8028 ?1995 > 8029 ?1998"),head=TRUE,stringsAsFactors=FALSE)) > > becomes: > > A ? ? ? ? ? ?B ? ? ?C > 8025 ?1995 ?0 > 8026 ?1995 ?0 > 8029 ?1995 ?0 > 8026 ?1996 ?1 > 8025 ?1997 ?1 > 8026 ?1997 ?2 > 8025 ?1997 ?1 > 8027 ?1997 ?0 > 8026 ?1999 ?2 > 8027 ?1999 ?2 > 8028 ?1995 ?0 > 8029 ?1998 ?1 > 8025 ?1997 ?1 > 8027 ?1997 ?0 > 8026 ?1999 ?2 > 8027 ?1999 ?2 > 8028 ?1995 ?0 > 8029 ?2000 ?1 > > So 8026 in 1997 = 2 because 8026 can be found in 1995 and 1996 which are > both within the appropriate window (1996 - 1994). > > Any ideas? I looked at the rollapply vignette, but couldn't figure out how > to apply it to my data. >Try this:> DF$C <- sapply(1:nrow(DF), function(i)+ sum(DF$B < DF$B[i] & DF$B >= DF$B[i]-3 & DF$A[i] == DF$A))> DFA B C 1 8025 1995 0 2 8026 1995 0 3 8029 1995 0 4 8026 1996 1 5 8025 1997 1 6 8026 1997 2 7 8025 1997 1 8 8027 1997 0 9 8026 1999 2 10 8027 1999 2 11 8028 1995 0 12 8029 1998 1 13 8025 1997 1 14 8027 1997 0 15 8026 1999 2 16 8027 1999 2 17 8028 1995 0 18 8029 1998 1 -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com