Dear R Community,

I am new to R, and have a question that I suspect may be quite simple but is proving a formidable roadblock for me. I have a large data set that includes water-quality measurements collected over many 24-hour periods. The date and time of sample collection are in a combined Date/Time field in the format yyyy-mm-dd hh:mm:ss. I need to be able to subset the data for analysis of different date and time windows. Thus far, I have tried casting the Date/Time field using several approaches, such as:

DataSet$NewDateTime <- strptime(DataSet$DateTime, '%Y-%m-%d %H:%M:%S')
DataSet$NewDateTime <- as.POSIXlt(strptime(DataSet$DateTime, '%Y-%m-%d %H:%M:S'))

These instructions seem to cast the NewDateTime field correctly (at least it appears to be in the correct format, and I assume R sees the field as a date and a time) but I am then unable to subset the data using instructions such as:

with(DataSet, subset(DataSet, DataSet$NewDateTime < '2004-08-05 14:15:00'))
DataSubset <- subset(DataSet, DataSet$NewDateTime < '2004-08-05 14:00:00', select = DataSet)

I have tried also separating the date and time fields in the input file, and casting with instructions such as:

DataSet$NewTime <- strptime(DataSet$Time, '%H:%M:%S')
DataSet$NewTime <- as.POSIXct(strptime(DataSet$Time, '%H:%M:%S'))

but these seem to generate a NewTime field that contains today's date + the time data, and also will not subset based on date/time.

I appreciate greatly any help and advice,
Steve

--
View this message in context: http://r.789695.n4.nabble.com/help-subsetting-data-based-on-date-AND-time-tp3799933p3799933.html
Sent from the R help mailing list archive at Nabble.com.

Try altering your subset operation from this:

with(DataSet, subset(DataSet, DataSet$NewDateTime < '2004-08-05 14:15:00'))

to this:

with(DataSet, subset(DataSet, DataSet$NewDateTime < as.POSIXct('2004-08-05 14:15:00')))

and see if you get the desired effect.

The statement DataSet$NewDateTime < '2004-08-05 14:15:00' asks R to find all of the rows in DataSet$NewDateTime that are less than the *character* value '2004-08-05 14:15:00'. You need to convert that *character* value to a POSIX time value first, using as.POSIXct(). Then you can carry out the comparison between the date-time values in DataSet$NewDateTime and your newly created POSIXct value.

Because your character time value is in the standard format (yyyy-mm-dd HH:MM:SS), you don't need to pass the format string (%Y-%m-%d %H:%M:%S) to as.POSIXct(), which saves a little typing. If it were in another format (for example mm-dd-yyyy), you'd need to use the format argument of as.POSIXct() to make the character-to-POSIXct conversion correctly.
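
To make the advice above concrete, here is a minimal sketch. The small data frame is invented for illustration (it is not Steve's data), and tz = "UTC" is an assumption; substitute the zone the samples were recorded in. The column is stored as POSIXct rather than the POSIXlt that strptime() returns, because POSIXct is the usual class for a data frame column. The last statement shows the format argument for a string that is not in the standard layout.

```r
## Hypothetical stand-in for DataSet; the values are made up
DataSet <- data.frame(
  DateTime = c("2004-08-05 13:45:00", "2004-08-05 14:15:00",
               "2004-08-05 14:30:00"),
  Value    = c(7.1, 7.3, 7.2),
  stringsAsFactors = FALSE
)

## Convert once to POSIXct; tz = "UTC" is an assumption
DataSet$NewDateTime <- as.POSIXct(DataSet$DateTime,
                                  format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

## Build the cutoff as POSIXct in the same time zone, then compare
cutoff <- as.POSIXct("2004-08-05 14:15:00", tz = "UTC")
early  <- subset(DataSet, NewDateTime < cutoff)

## A non-standard layout (mm-dd-yyyy) needs an explicit format string
cutoff2 <- as.POSIXct("08-05-2004 14:15:00",
                      format = "%m-%d-%Y %H:%M:%S", tz = "UTC")
```

Giving the column and the cutoff the same explicit time zone avoids surprises when the script is run on a machine with a different local zone.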
On Thu, Sep 8, 2011 at 4:03 PM, Steve E. <searl@vt.edu> wrote:
--
___________________________
Luke Miller
Postdoctoral Researcher
Marine Science Center
Northeastern University
Nahant, MA
(781) 581-7370 x318
Steve,
Just below are some examples that I hope will help.
With regard to what you've tried, I don't see any reason for using with() or the select argument to subset(). They both look unnecessary to me.

## examples of subsetting date-time values
## create fake data: 42 evenly spaced times
tmp <- seq(as.POSIXct('2011-08-01 13:00'), as.POSIXct('2011-08-05 03:00'), length.out = 42)
df <- data.frame(tm = tmp, x = seq(42))
## subset examples
## on or before the 2nd at 01:30
df1 <- subset(df, tm <= as.POSIXct('2011-08-02 01:30'))
## everything on the 3rd
df2 <- subset(df, format(tm, '%d') == '03')
## everything in hours 11am through 1pm inclusive
df3 <- subset(df, format(tm, '%H') %in% c('11', '12', '13'))
## 11 am through 3:59 pm on the 2nd (strictly before 16:00)
df4 <- subset(df, tm >= as.POSIXct('2011-08-02 11:00') & tm < as.POSIXct('2011-08-02 16:00'))
## just for reference, a sequence of every 15 minutes
tmp <- seq(as.POSIXct('2011-08-01 13:00'), as.POSIXct('2011-08-02 03:00'), by = '15 min')

Note that all comparisons use POSIXct class objects, converting character
to POSIXct where needed. As Luke mentioned, if the character strings are
in standard format, 'yyyy-mm-dd HH:MM:SS', just use as.POSIXct() without
any additional args.
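
On the second attempt in the original question (parsing the time field by itself): a POSIX date-time always carries a date, so a bare time gets today's date filled in. One possible way to select a time-of-day window across all days, sketched here with invented data and an assumed tz = "UTC", is to compare the formatted time strings instead. Zero-padded "HH:MM:SS" strings sort in clock order, so ordinary string comparison works.

```r
## Fake data: one reading every 6 hours over three days (tz is an assumption)
tm  <- seq(as.POSIXct("2011-08-01 00:00", tz = "UTC"),
           as.POSIXct("2011-08-03 18:00", tz = "UTC"), by = "6 hours")
dat <- data.frame(tm = tm, x = seq_along(tm))

## Time of day as zero-padded text, independent of the date
tod <- format(dat$tm, "%H:%M:%S")

## Every reading taken from 06:00 through 12:00 inclusive, on any day
morning <- dat[tod >= "06:00:00" & tod <= "12:00:00", ]

## Combine with a date window: the same hours, but only on 2 August
aug2 <- subset(morning, as.Date(tm, tz = "UTC") == as.Date("2011-08-02"))
```

This keeps a single POSIXct column as the source of truth and derives the date and time-of-day views from it only when a particular window is needed.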
-Don
--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062