Hi list, based on the following data.frame I would like to create a variable
that indicates the number of occurrences of A in the 3 years prior to the
current year:
DF = data.frame(read.table(textConnection(" A B
8025 1995
8026 1995
8029 1995
8026 1996
8025 1997
8026 1997
8025 1997
8027 1997
8026 1999
8027 1999
8028 1995
8029 1998
8025 1997
8027 1997
8026 1999
8027 1999
8028 1995
8029 1998"),head=TRUE,stringsAsFactors=FALSE))
becomes:
A B C
8025 1995 0
8026 1995 0
8029 1995 0
8026 1996 1
8025 1997 1
8026 1997 2
8025 1997 1
8027 1997 0
8026 1999 2
8027 1999 2
8028 1995 0
8029 1998 1
8025 1997 1
8027 1997 0
8026 1999 2
8027 1999 2
8028 1995 0
8029 2000 1
So 8026 in 1997 = 2 because 8026 can be found in 1995 and 1996 which are
both within the appropriate window (1996 - 1994).
Any ideas? I looked at the rollapply vignette, but couldn't figure out how
to apply it to my data.
Thanks a lot!
--
View this message in context:
http://r.789695.n4.nabble.com/Counting-occurrences-in-a-moving-window-tp3568658p3568658.html
Sent from the R help mailing list archive at Nabble.com.
Would it be possible to use the sqldf package and the ave function to simply
run ave over a limited set? So something like:
DF = data.frame(read.table(textConnection(" A B
8025 1995
8026 1995
8029 1995
8026 1996
8025 1997
8026 1997
8025 1997
8027 1997
8026 1999
8027 1999
8028 1995
8029 1998
8025 1997
8027 1997
8026 1999
8027 1999
8028 1995
8029 1998"),head=TRUE,stringsAsFactors=FALSE))
library(sqldf)
years<-c(1995:1999)
for (t in 1:length(years))
{
year = as.numeric(years[t])
m<-sqldf('select * from DF where B between $year-1 AND $year-4')
n<-ave(m$A,m$A,FUN = length)
}
How do I get the correct values in DF$C? Thanks!!
--
View this message in context:
http://r.789695.n4.nabble.com/Counting-occurrences-in-a-moving-window-tp3568658p3570652.html
Sent from the R help mailing list archive at Nabble.com.
On Thu, Jun 2, 2011 at 11:41 AM, mdvaan <mathijsdevaan at gmail.com> wrote:> Hi list, based on the following data.frame I would like to create a variable > that indicates the number of occurrences of A in the 3 years prior to the > current year: > > DF = data.frame(read.table(textConnection(" ?A ?B > 8025 ?1995 > 8026 ?1995 > 8029 ?1995 > 8026 ?1996 > 8025 ?1997 > 8026 ?1997 > 8025 ?1997 > 8027 ?1997 > 8026 ?1999 > 8027 ?1999 > 8028 ?1995 > 8029 ?1998 > 8025 ?1997 > 8027 ?1997 > 8026 ?1999 > 8027 ?1999 > 8028 ?1995 > 8029 ?1998"),head=TRUE,stringsAsFactors=FALSE)) > > becomes: > > A ? ? ? ? ? ?B ? ? ?C > 8025 ?1995 ?0 > 8026 ?1995 ?0 > 8029 ?1995 ?0 > 8026 ?1996 ?1 > 8025 ?1997 ?1 > 8026 ?1997 ?2 > 8025 ?1997 ?1 > 8027 ?1997 ?0 > 8026 ?1999 ?2 > 8027 ?1999 ?2 > 8028 ?1995 ?0 > 8029 ?1998 ?1 > 8025 ?1997 ?1 > 8027 ?1997 ?0 > 8026 ?1999 ?2 > 8027 ?1999 ?2 > 8028 ?1995 ?0 > 8029 ?2000 ?1 > > So 8026 in 1997 = 2 because 8026 can be found in 1995 and 1996 which are > both within the appropriate window (1996 - 1994). > > Any ideas? I looked at the rollapply vignette, but couldn't figure out how > to apply it to my data. >Try this:> DF$C <- sapply(1:nrow(DF), function(i)+ sum(DF$B < DF$B[i] & DF$B >= DF$B[i]-3 & DF$A[i] == DF$A))> DFA B C 1 8025 1995 0 2 8026 1995 0 3 8029 1995 0 4 8026 1996 1 5 8025 1997 1 6 8026 1997 2 7 8025 1997 1 8 8027 1997 0 9 8026 1999 2 10 8027 1999 2 11 8028 1995 0 12 8029 1998 1 13 8025 1997 1 14 8027 1997 0 15 8026 1999 2 16 8027 1999 2 17 8028 1995 0 18 8029 1998 1 -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com