Dear R Community,

I am new to R, and have a question that I suspect may be quite simple but is proving a formidable roadblock for me. I have a large data set that includes water-quality measurements collected over many 24-hour periods. The date and time of sample collection are in a combined Date/Time field in the format yyyy-mm-dd hh:mm:ss. I need to be able to subset the data for analysis of different date and time windows.

Thus far, I have tried casting the Date/Time field using several approaches, such as:

  DataSet$NewDateTime <- strptime(DataSet$DateTime, '%Y-%m-%d %H:%M:%S')
  DataSet$NewDateTime <- as.POSIXlt(strptime(DataSet$DateTime, '%Y-%m-%d %H:%M:%S'))

These instructions seem to cast the NewDateTime field correctly (at least it appears to be in the correct format, and I assume R sees the field as a date and a time), but I am then unable to subset the data using instructions such as:

  with(DataSet, subset(DataSet, DataSet$NewDateTime < '2004-08-05 14:15:00'))
  DataSubset <- subset(DataSet, DataSet$NewDateTime < '2004-08-05 14:00:00', select = DataSet)

I have also tried separating the date and time fields in the input file, and casting with instructions such as:

  DataSet$NewTime <- strptime(DataSet$Time, '%H:%M:%S')
  DataSet$NewTime <- as.POSIXct(strptime(DataSet$Time, '%H:%M:%S'))

but these seem to generate a NewTime field that contains today's date plus the time data, and it also will not subset based on date/time.

I greatly appreciate any help and advice,
Steve

--
View this message in context: http://r.789695.n4.nabble.com/help-subsetting-data-based-on-date-AND-time-tp3799933p3799933.html
Sent from the R help mailing list archive at Nabble.com.
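The symptoms described in the question can be reproduced with a small sketch, which may help when reading the replies. The values are made up, and the "UTC" time zone is an assumption chosen only so the result does not depend on the machine's settings.

```r
## Hypothetical values, made up for illustration.
x <- strptime("2004-08-05 14:15:00", "%Y-%m-%d %H:%M:%S", tz = "UTC")
class(x)                      # "POSIXlt" "POSIXt"

## as.POSIXct() stores the same instant as seconds since 1970-01-01 UTC.
y <- as.POSIXct(x)
class(y)                      # "POSIXct" "POSIXt"

## A time with no date: strptime() fills in the current year, month and day.
t_only <- strptime("14:15:00", "%H:%M:%S", tz = "UTC")
format(t_only, "%Y-%m-%d")    # the current date, not a date from the data
format(t_only, "%H:%M:%S")    # "14:15:00"
```

So strptime() returns a POSIXlt object (a list-like structure), while as.POSIXct() gives one number per value, which is usually the more convenient form for a data frame column.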
Try altering your subset operation from this:

  with(DataSet, subset(DataSet, DataSet$NewDateTime < '2004-08-05 14:15:00'))

to this:

  with(DataSet, subset(DataSet, DataSet$NewDateTime < as.POSIXct('2004-08-05 14:15:00')))

and see if you get the desired effect.

The statement DataSet$NewDateTime < '2004-08-05 14:15:00' asks R to compare the date-time values in DataSet$NewDateTime with the *character* value '2004-08-05 14:15:00'. Convert that *character* value to a POSIX time value first, using as.POSIXct(), so that both sides of the comparison are explicitly date-times. Then you can carry out the comparison between the date-time values in DataSet$NewDateTime and your newly created POSIX time value.

Because your character time value is written in the standard format (yyyy-mm-dd HH:MM:SS), you don't need to pass the format string ('%Y-%m-%d %H:%M:%S') to as.POSIXct(), which saves a little typing. If it were in another format (for example mm-dd-yyyy), you would need the format argument of as.POSIXct() for the character-to-POSIXct conversion to come out correctly.

On Thu, Sep 8, 2011 at 4:03 PM, Steve E. <searl@vt.edu> wrote:
> [...]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
___________________________
Luke Miller
Postdoctoral Researcher
Marine Science Center
Northeastern University
Nahant, MA
(781) 581-7370 x318
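The advice in the reply above can be sketched end to end as follows. This is only an illustration under stated assumptions: the data frame contents are made up, the Temp column is hypothetical, and the "UTC" time zone is an assumption used to keep the result independent of the local machine.

```r
## Hypothetical data standing in for the poster's DataSet; values are made up.
DataSet <- data.frame(
  DateTime = c("2004-08-05 13:45:00", "2004-08-05 14:00:00",
               "2004-08-05 14:15:00", "2004-08-05 14:30:00"),
  Temp     = c(21.3, 21.6, 21.9, 22.1),
  stringsAsFactors = FALSE
)

## Convert the character column once to POSIXct, with an explicit time zone.
DataSet$NewDateTime <- as.POSIXct(DataSet$DateTime,
                                  format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

## Convert the cutoff explicitly, in the same time zone, then compare.
cutoff <- as.POSIXct("2004-08-05 14:15:00", tz = "UTC")
DataSubset <- subset(DataSet, NewDateTime < cutoff)
nrow(DataSubset)   # 2 (the 13:45 and 14:00 rows)

## A cutoff written as mm-dd-yyyy needs the format argument.
cutoff2 <- as.POSIXct("08-05-2004 14:15:00",
                      format = "%m-%d-%Y %H:%M:%S", tz = "UTC")
cutoff == cutoff2
```

Inside subset() the column can be referred to by its bare name, so neither with() nor the DataSet$ prefix is needed in the condition.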
Steve,

Just below are some examples that I hope will help. With regard to what you've tried, I don't see any reason for using with(), or the select argument to subset(). They both look unnecessary to me: subset() already looks up column names inside the data frame, and select is for choosing columns.

## examples of subsetting date-time values

## create fake data
tmp <- seq(as.POSIXct('2011-08-01 13:00'), as.POSIXct('2011-08-05 03:00'), length.out = 42)
df <- data.frame(tm = tmp, x = seq(42))

## subset examples

## on or before the 2nd at 01:30
df1 <- subset(df, tm <= as.POSIXct('2011-08-02 01:30'))

## everything on the 3rd
df2 <- subset(df, format(tm, '%d') == '03')

## everything in hours 11am through 1pm inclusive (11:00 through 13:59)
df3 <- subset(df, format(tm, '%H') %in% c('11', '12', '13'))

## 11 am through 3:59 pm on the 2nd (that is, before 4 pm)
df4 <- subset(df, tm >= as.POSIXct('2011-08-02 11:00') & tm < as.POSIXct('2011-08-02 16:00'))

## just for reference, a sequence of every 15 minutes
tmp <- seq(as.POSIXct('2011-08-01 13:00'), as.POSIXct('2011-08-02 03:00'), by = '15 min')

Note that the comparisons made with <, <= and >= use POSIXct class objects on both sides, converting character to POSIXct where needed. As Luke mentioned, if the character strings are in standard format, 'yyyy-mm-dd HH:MM:SS', just use as.POSIXct() without any additional args.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 9/8/11 1:03 PM, "Steve E." <searl at vt.edu> wrote:
> [...]
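Two cases from the original question are not covered by the examples above: selecting the same time-of-day window on every day, and building a date-time from separate date and time fields. The sketch below is one possible approach rather than the only one. All data are made up, the column names Date and Time are hypothetical, and the "UTC" time zone is an assumption.

```r
## Hypothetical data: readings every 6 hours over three days (values made up).
tm <- seq(as.POSIXct("2011-08-01 00:00", tz = "UTC"),
          as.POSIXct("2011-08-03 18:00", tz = "UTC"), by = "6 hours")
df <- data.frame(tm = tm, x = seq_along(tm))

## Time of day as zero-padded "HH:MM" text; such strings sort in clock order,
## so a plain string comparison selects the same window on every day.
hhmm <- format(df$tm, "%H:%M")
midday <- df[hhmm >= "06:00" & hhmm <= "12:00", ]
nrow(midday)   # 6: the 06:00 and 12:00 readings on each of the 3 days

## Separate date and time columns: paste them together before converting,
## so the time is attached to its own date rather than to today's date.
raw <- data.frame(Date = c("2004-08-05", "2004-08-06"),
                  Time = c("14:15:00", "02:30:00"),
                  stringsAsFactors = FALSE)
raw$DateTime <- as.POSIXct(paste(raw$Date, raw$Time),
                           format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

Once raw$DateTime is a POSIXct column, it can be subset with the same comparisons used in the examples above.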