Sam Albers
2012-May-29 18:39 UTC
[R] Extract time from irregular date and time data records
Hello,
I am having a problem making use of some data output by an
instrument in a somewhat unusual format. The instrument outputs two
columns: one called JulianDay.Hour and one called Minutes.Seconds. I
would like to convert these columns into a single column with a time.
So I was using substr() and paste() to extract that info. This works
fine for the JulianDay.Hour column as there are always five characters
in an entry. However, in the Minutes.Seconds column any leading zeroes
are dropped by the instrument. So if I use substr() to select based
on character position I end up with incorrect times. For example:
## df
df<-structure(list(Temperature = c(18.63, 18.4, 18.18, 16.99, 16.86,
11.39, 11.39, 11.37, 11.37, 11.37, 11.37), JulianDay.Hour = c(22610L,
22610L, 22610L, 22610L, 22610L, 22611L, 22611L, 22611L, 22611L,
22611L, 22611L), Minutes.Seconds = c(4608L, 4611L, 4614L, 4638L,
4641L, 141L, 144L, 208L, 211L, 214L, 238L)), .Names = c("Temperature",
"JulianDay.Hour", "Minutes.Seconds"), row.names = c(3176L, 3177L,
3178L, 3179L, 3180L, 3079L, 3080L, 3054L, 3055L, 3056L, 3057L
), class = "data.frame")
## Extraction method for times
df$Time.Incorrect <- paste(substr(df$JulianDay.Hour, 4,5),":",
substr(df$Minutes.Seconds,1,2),":",
substr(df$Minutes.Seconds,3,4),
sep="")
## Manual generation of desired time
df$Time.Correct <- c("10:46:08",
"10:46:11","10:46:14","10:46:38","10:46:41","11:01:41","11:01:44","11:02:08","11:02:11","11:02:14","11:02:38")
## Note the absence of leading zeroes in Minutes.Seconds, leading
## to incomplete time records (df$Time.Incorrect)
df
##
So can anyone recommend a good way to extract a time from variables
like these two? Basically this is a string-subsetting issue.
Thanks in advance!
Sam
Sam Albers
2012-May-29 19:05 UTC
[R] Extract time from irregular date and time data records
Apologies. I was searching using the wrong search terms. This is
clearly a string issue. I've added the solution below.
Sam
## Addition of leading zeroes
df$Time.Correct <- paste(substr(df$JulianDay.Hour, 4,5),":",
substr(formatC(df$Minutes.Seconds, width = 4, format = "d", flag = "0"),1,2),":",
substr(formatC(df$Minutes.Seconds, width = 4, format = "d", flag = "0"),3,4),
sep="")
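For comparison, here is a minimal sketch of the same zero-padding idea written with sprintf(), plus a variant that avoids string positions altogether by using integer division and modulo. It is not part of the original solution. The vector names (julian_hour, min_sec, padded, time_str, time_num) are made up for illustration, the three sample values are copied from df, and the arithmetic variant assumes the last two digits of JulianDay.Hour are always the hour.

```r
# Illustrative sample values in the instrument's format (copied from df)
julian_hour <- c(22610L, 22611L, 22611L)
min_sec     <- c(4608L, 141L, 208L)

# Pad Minutes.Seconds to four digits so character positions are stable
padded <- sprintf("%04d", min_sec)

# Same substr()/paste() idea as before, now on the padded strings
time_str <- paste(substr(julian_hour, 4, 5),
                  substr(padded, 1, 2),
                  substr(padded, 3, 4),
                  sep = ":")

# Variant without string positions: split the integers arithmetically.
# Assumes the last two digits of JulianDay.Hour hold the hour.
time_num <- sprintf("%02d:%02d:%02d",
                    julian_hour %% 100L,  # hour
                    min_sec %/% 100L,     # minutes
                    min_sec %% 100L)      # seconds

time_str
time_num
```

Both vectors should agree with the manually typed times for the same rows. The arithmetic variant does not depend on how many characters the instrument happens to write, which may make it a little more robust if other fields also lose leading zeroes.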