Hi,
The as.numeric() calls in 'indx' are not needed; Date objects can be compared directly.
indx1 <- (as.Date(AB$Start) <= as.Date(AB$Date)) &
         (as.Date(AB$Date) <= as.Date(AB$End))
identical(indx, indx1)
#[1] TRUE
AB[indx1, -c(5:7)]   # drops Year, Start and End
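To see why the conversion is redundant, here is a minimal, self-contained sketch. The values d, start and end are made up for illustration and are not part of the data in this thread.

```r
# Hypothetical values, chosen only to illustrate the comparison
d     <- as.Date(c("2002-05-12", "2002-12-17"))
start <- as.Date("2002-05-10")
end   <- as.Date("2002-11-01")

# Comparison operators work on Date objects directly
in_range <- start <= d & d <= end

# Same test after converting to days since 1970-01-01
in_range_num <- as.numeric(start) <= as.numeric(d) &
                as.numeric(d) <= as.numeric(end)

identical(in_range, in_range_num)
```

Both versions give the same logical vector, so the shorter form is enough.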
A.K.
----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: R help <r-help at r-project.org>
Cc: Matthew Guzzo <mattguzzo12 at gmail.com>
Sent: Sunday, September 8, 2013 1:37 AM
Subject: Re: Sub setting multiple ids based on a 2nd data frame
Hi Matt,
I changed the dates a little to include some that fall outside the ranges in
dataset B.
A <- read.table(text="
ID   Date         Depth   Temp
1    2002-05-12   10      12
1    2003-05-13   10      12
1    2003-05-14   10      12
1    2004-04-15   10      12
2    2002-05-16   10      12
2    2002-12-17   10      12
2    2003-04-18   10      12
2    2002-05-19   10      12
3    2003-05-10   10      12
3    2004-05-21   10      12
3    2004-05-22   10      12
3    2005-05-10   10      12
3    2006-05-24   10      12
",sep="",header=TRUE,stringsAsFactors=FALSE)
B <- read.table(text="
Year   Start        End
2002   2002-05-10   2002-11-01
2003   2003-05-11   2003-11-02
2004   2004-05-12   2004-11-03
2005   2005-05-13   2005-11-04
2006   2006-05-14   2006-11-05
",sep="",header=TRUE,stringsAsFactors=FALSE)
A$Year <- gsub("-.*", "", A$Date)   # extract the year from each date
library(plyr)
AB <- join(A, B, by="Year")         # attach each year's Start and End
indx <- (as.numeric(as.Date(AB$Start)) <= as.numeric(as.Date(AB$Date))) &
        (as.numeric(as.Date(AB$Date)) <= as.numeric(as.Date(AB$End)))
res <- AB[indx, -c(6,7)]            # drop the Start and End columns
res
#   ID       Date Depth Temp Year
#1   1 2002-05-12    10   12 2002
#2   1 2003-05-13    10   12 2003
#3   1 2003-05-14    10   12 2003
#5   2 2002-05-16    10   12 2002
#8   2 2002-05-19    10   12 2002
#10  3 2004-05-21    10   12 2004
#11  3 2004-05-22    10   12 2004
#13  3 2006-05-24    10   12 2006
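If you would rather not depend on plyr, something along these lines should also work in base R. This is only a sketch: it uses merge() in place of join(), a trimmed-down copy of the data so that it runs on its own, and the names keep and res2 are just for illustration.

```r
# Trimmed-down version of the example data (illustrative only)
A <- data.frame(
  ID    = c(1, 1, 2, 3),
  Date  = c("2002-05-12", "2004-04-15", "2002-12-17", "2006-05-24"),
  Depth = 10,
  Temp  = 12,
  stringsAsFactors = FALSE
)
B <- data.frame(
  Year  = 2002:2006,
  Start = c("2002-05-10", "2003-05-11", "2004-05-12", "2005-05-13", "2006-05-14"),
  End   = c("2002-11-01", "2003-11-02", "2004-11-03", "2005-11-04", "2006-11-05"),
  stringsAsFactors = FALSE
)

# Integer year key, so it has the same type as B$Year
A$Year <- as.integer(format(as.Date(A$Date), "%Y"))

# merge() is the base R join; all.x = TRUE keeps rows of A whose year is not in B
AB <- merge(A, B, by = "Year", all.x = TRUE)

# Keep rows whose Date lies within that year's Start..End window;
# rows with no matching year (NA Start) are dropped
d    <- as.Date(AB$Date)
keep <- !is.na(AB$Start) & as.Date(AB$Start) <= d & d <= as.Date(AB$End)

res2 <- AB[keep, c("ID", "Date", "Depth", "Temp", "Year")]
res2
```

Note that merge() sorts the result by the key column by default, so the row order can differ from the join() result.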
A.K.
Hi All,
I accidentally posted this in the data.table forum and deleted it to post here.
I have some telemetry data that spans multiple years (2002 - 2013) with
multiple individuals per year. I want to subset the telemetry data to
include only those data points that fall between specific dates which are
provided in a 2nd data frame. The telemetry df is in the form of:
DF "A"
ID   Date         Depth   Temp
1    2002-05-12   10      12
1    2002-05-13   10      12
1    2002-05-14   10      12
1    2002-05-15   10      12
2    2002-05-16   10      12
2    2002-05-17   10      12
2    2002-05-18   10      12
2    2002-05-19   10      12
3    2002-05-20   10      12
3    2002-05-21   10      12
3    2002-05-22   10      12
3    2002-05-23   10      12
3    2002-05-24   10      12
And the df with the dates I want to use to subset is formatted as follows:
DF "B"
Year   Start        End
2002   2002-05-10   2002-11-01
2003   2003-05-11   2003-11-02
2004   2004-05-12   2004-11-03
2005   2005-05-13   2005-11-04
2006   2006-05-14   2006-11-05
So, for each ID in DF A, I want to keep only those data points collected on a
date that falls between the start and end dates for the corresponding year in
DF B.
I am unsure whether a loop or plyr (which I am unfamiliar with) is my best
bet. I am relatively new to R, so this seems a bit above my head. Any help
is much appreciated.
Thanks in advance!