Sam Albers
2012-May-29 18:39 UTC
[R] Extract time from irregular date and time data records
Hello,

I am having a problem making use of some data outputted from an instrument in a somewhat weird format. The instrument outputs two columns: one called JulianDay.Hour and one called Minutes.Seconds. I would like to convert these columns into a single column with a time, so I was using substr() and paste() to extract that info. This works fine for the JulianDay.Hour column, as there are always five characters in an entry. However, in the Minutes.Seconds column any leading zeroes are dropped by the instrument, so if I use substr() to select based on character position I end up with incorrect times. For example:

## df

df<-structure(list(Temperature = c(18.63, 18.4, 18.18, 16.99, 16.86,
11.39, 11.39, 11.37, 11.37, 11.37, 11.37), JulianDay.Hour = c(22610L,
22610L, 22610L, 22610L, 22610L, 22611L, 22611L, 22611L, 22611L,
22611L, 22611L), Minutes.Seconds = c(4608L, 4611L, 4614L, 4638L,
4641L, 141L, 144L, 208L, 211L, 214L, 238L)), .Names = c("Temperature",
"JulianDay.Hour", "Minutes.Seconds"), row.names = c(3176L, 3177L,
3178L, 3179L, 3180L, 3079L, 3080L, 3054L, 3055L, 3056L, 3057L
), class = "data.frame")

## Extraction method for times
df$Time.Incorrect <- paste(substr(df$JulianDay.Hour, 4,5),":",
                substr(df$Minutes.Seconds,1,2),":",
                substr(df$Minutes.Seconds,3,4),
                sep="")

## Manual generation of desired time
df$Time.Correct <- c("10:46:08", "10:46:11","10:46:14","10:46:38","10:46:41",
                "11:01:41","11:01:44","11:02:08","11:02:11","11:02:14","11:02:38")

## Note the absence of leading zeroes in the Minutes.Seconds leading
## to incomplete time records (df$Time.Incorrect)
df

##

So can anyone recommend a good way to extract a time from variables like these two? Basically this is a string subsetting issue.

Thanks in advance!

Sam
Sam Albers
2012-May-29 19:05 UTC
[R] Extract time from irregular date and time data records
Apologies. I was searching using the wrong search terms. This is clearly a string issue. I've added the solution below.

Sam

On Tue, May 29, 2012 at 11:39 AM, Sam Albers <tonightsthenight at gmail.com> wrote:
> Hello,
>
> I am having a problem making use of some data outputted from an
> instrument in a somewhat weird format. The instrument outputs two
> columns - one called JulianDay.Hour and one called Minutes.Seconds. I
> would like to convert these columns into a single column with a time.
> So I was using substr() and paste to extract that info. This works
> fine for the JulianDay.Hour column as there are always five characters
> in an entry. However in the Minutes.Seconds column any leading zeroes
> are dropped by the instrument. So if I use substr() to select based
> on character position I end up with incorrect times. So for example:
>
> ## df
>
> df<-structure(list(Temperature = c(18.63, 18.4, 18.18, 16.99, 16.86,
> 11.39, 11.39, 11.37, 11.37, 11.37, 11.37), JulianDay.Hour = c(22610L,
> 22610L, 22610L, 22610L, 22610L, 22611L, 22611L, 22611L, 22611L,
> 22611L, 22611L), Minutes.Seconds = c(4608L, 4611L, 4614L, 4638L,
> 4641L, 141L, 144L, 208L, 211L, 214L, 238L)), .Names = c("Temperature",
> "JulianDay.Hour", "Minutes.Seconds"), row.names = c(3176L, 3177L,
> 3178L, 3179L, 3180L, 3079L, 3080L, 3054L, 3055L, 3056L, 3057L
> ), class = "data.frame")
>
> ## Extraction method for times
> df$Time.Incorrect <- paste(substr(df$JulianDay.Hour, 4,5),":",
>                 substr(df$Minutes.Seconds,1,2),":",
>                 substr(df$Minutes.Seconds,3,4),
>                 sep="")
>

## Addition of leading zeroes
df$Time.Correct <- paste(substr(df$JulianDay.Hour, 4,5),":",
                substr(formatC(df$Minutes.Seconds, width = 4, format = "d", flag = "0"),1,2),":",
                substr(formatC(df$Minutes.Seconds, width = 4, format = "d", flag = "0"),3,4),
                sep="")

>
> ## Note the absence of leading zeroes in the Minutes.Seconds leading
> to incomplete time records (df$Time.Incorrect)
> df
>
> ##
>
> So can anyone recommend a good way to extract a time from variables
> like these two? Basically this is a string subsetting issue.
>
> Thanks in advance!
>
> Sam
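
An alternative that was not posted in this thread, offered only as a sketch: because both columns are integers, the hour, minute, and second can be split off with integer arithmetic instead of padding and substr(). The helper name to_time is hypothetical, and the sketch assumes that the last two digits of JulianDay.Hour are always the hour and the last two digits of Minutes.Seconds are always the seconds, as in the sample data above.

```r
# Sketch only: rebuild HH:MM:SS with integer arithmetic rather than substr().
# `to_time` is a hypothetical helper, not something from the original thread.
to_time <- function(julian_day_hour, minutes_seconds) {
  julian_day_hour <- as.integer(julian_day_hour)
  minutes_seconds <- as.integer(minutes_seconds)
  hour   <- julian_day_hour %% 100L    # last two digits of JulianDay.Hour
  minute <- minutes_seconds %/% 100L   # digits above the last two
  second <- minutes_seconds %% 100L    # last two digits of Minutes.Seconds
  # %02d restores the leading zeroes that the instrument drops
  sprintf("%02d:%02d:%02d", hour, minute, second)
}

# A few rows taken from the sample data in the original post
df <- data.frame(
  JulianDay.Hour  = c(22610L, 22610L, 22611L, 22611L),
  Minutes.Seconds = c(4608L, 4641L, 141L, 208L)
)
df$Time <- to_time(df$JulianDay.Hour, df$Minutes.Seconds)
df
```

For these four rows the result agrees with the manually entered Time.Correct values from the first message ("10:46:08", "10:46:41", "11:01:41", "11:02:08"). The arithmetic version does not depend on how many characters the instrument printed, so a value such as 5 (zero minutes, five seconds) would also be handled.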