Dear All,

I usually work with time series data. The data may come in AM/PM (12-hour) format or in 24-hour format. R cannot tell the two apart automatically, at least for me; I have to tell R explicitly which time format the data is in. It seems that Pandas knows how to handle dates without being told the format. The problem arises when I try to shift the time by a certain amount. Say I add 3600 to shift it forward by one hour; in that case I have to use something like:

    Measured_data$Date <- as.POSIXct(as.character(Measured_data$Date), tz = "", format = "%m/%d/%Y %I:%M %p") + 3600

or

    Measured_data$Date <- as.POSIXct(as.character(Measured_data$Date), tz = "", format = "%m/%d/%Y %H:%M") + 3600

depending on the format. The date also gets MDT or MST (and so on) attached. Merging two data frames whose dates are in different formats may create a problem (I think). When I get data from Excel it could be in any format, and I have needed to convert the dates to one of the above formats before using them in R. Any tips for automatic processing, with no need to specify the date format explicitly?

Another problem I saw is with rbind. When binding data frames, if a column in one of the data frames is character data (for example "none", coming from MySQL), R doesn't know how to concatenate the numeric column from the other data frame to it. I needed to change the numeric column to character and, after the binding, convert it back to numeric. But this causes problems in an automated environment. Any suggestions?

Thanks
Mihretu
Have a look at the "lubridate" package. It claims to try to make dealing with dates easier.

-- Bert

On Fri, Nov 8, 2013 at 11:41 AM, Alemu Tadesse <alemu.tadesse at gmail.com> wrote:
> Any TIPS - for automatic processing with no need to specifically tell
> the data format ?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374
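
A minimal sketch of the lubridate suggestion, assuming the package is installed and that its mdy_hm() parser accepts both the AM/PM and the 24-hour layout in one vector (worth verifying against your own data). The sample strings are made up for illustration and do not come from the original post.

```r
library(lubridate)

# Hypothetical sample values: the same instant written once in
# 12-hour AM/PM style and once in 24-hour style.
x <- c("11/08/2013 02:30 PM", "11/08/2013 14:30")

# mdy_hm() is told only the field order (month, day, year, hour, minute),
# not a strptime-style format string. tz = "UTC" is an assumption made here
# so that no MST/MDT label is attached to the result.
parsed <- mdy_hm(x, tz = "UTC")

# Shift forward by one hour.
shifted <- parsed + hours(1)
print(shifted)
```

If some strings fail to parse, lubridate returns NA for them with a warning, so checking anyNA(parsed) before further processing is a reasonable safeguard in an automated job.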
Hi Mihretu,

Can you grep for "AM" or "PM"? If so, build your format string depending upon whether one of these exists in the date string.

Jim

On 11/09/2013 06:41 AM, Alemu Tadesse wrote:
> I usually work with time series data. The data may come in AM/PM date
> format or on 24 hour time basis. R can not recognize the two differences
> automatically - at least for me. I have to specifically tell R in which
> time format the data is.
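
Jim's suggestion could be sketched in base R roughly as follows. The helper name parse_mixed, the sample strings and the choice of tz = "UTC" are all assumptions made for illustration; the two format strings are the ones from the original post. Note that %p matching depends on the locale, so this assumes an English-style "AM"/"PM".

```r
# Hypothetical helper: pick the format per element, based on whether
# the string contains an AM/PM marker.
parse_mixed <- function(x, tz = "UTC") {
  x <- as.character(x)
  has_ampm <- grepl("AM|PM", x, ignore.case = TRUE)
  # One format string per element; strptime() recycles format along x.
  fmt <- ifelse(has_ampm, "%m/%d/%Y %I:%M %p", "%m/%d/%Y %H:%M")
  as.POSIXct(strptime(x, format = fmt, tz = tz))
}

# Hypothetical sample input mixing both layouts.
dates <- c("11/08/2013 02:30 PM", "11/08/2013 14:30", "11/09/2013 12:05 AM")

parsed  <- parse_mixed(dates)
shifted <- parsed + 3600  # shift forward by one hour
print(shifted)
```

The detection step matters because strptime() ignores trailing characters: applying the 24-hour format to "02:30 PM" would not fail, it would silently return 02:30. Using a fixed zone such as "UTC" instead of tz = "" also keeps the result from switching between MST and MDT labels.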
Hi

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Alemu Tadesse
> Sent: Friday, November 08, 2013 8:41 PM
> To: r-help at r-project.org
> Subject: [R] Date handling in R is hard to understand
>
> Another problem I saw was that when using r bind to bind data frames,
> if one column of one of the data frames is a character data (say for
> example none - coming from mysql) format R doesn't know how to
> concatenate numeric column from the other data frame to it. I needed to
> change the numeric to character and later after binding takes place I
> had to re-convert it to numeric. But, this causes problem in an
> automated environment. Any suggestion ?

rbind/cbind can use the data.frame method, which keeps a column-specific type for each column. With the "normal" method, however, the result is a matrix, which must have one common data type in all columns (a matrix is actually only a vector with dimensions).

    > str(cbind(airquality, 1:153))
    'data.frame': 153 obs. of 7 variables:
     $ ozone  : int 41 36 12 18 NA 28 23 19 8 NA ...
     $ solar.r: int 190 118 149 313 NA NA 299 99 19 194 ...
     $ wind   : num 7.4 8 12.6 11.5 14.3 14.9 8.6 13.8 20.1 8.6 ...
     $ temp   : int 67 72 74 62 56 66 65 59 61 69 ...
     $ month  : int 5 5 5 5 5 5 5 5 5 5 ...
     $ day    : int 1 2 3 4 5 6 7 8 9 10 ...
     $ 1:153  : int 1 2 3 4 5 6 7 8 9 10 ...

Regards
Petr
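
For the rbind question itself, one possible approach (a sketch, not something proposed in the thread) is to turn the placeholder text into NA and convert the column to numeric before binding, so no round trip through character is needed. The data frames a and b, their column names and the placeholder "none" are hypothetical stand-ins for the MySQL data described in the original post.

```r
# Hypothetical data: 'a' mimics a MySQL result where a missing number
# arrived as the text "none"; 'b' has a proper numeric column.
a <- data.frame(id = 1:2, value = c("none", "3.5"), stringsAsFactors = FALSE)
b <- data.frame(id = 3:4, value = c(1.2, 4.8))

# Replace the placeholder with NA first, then convert to numeric.
# Doing it in this order avoids the "NAs introduced by coercion" warning.
a$value[a$value == "none"] <- NA
a$value <- as.numeric(a$value)

# Both 'value' columns are numeric now, so the data.frame method of
# rbind() keeps the column numeric.
combined <- rbind(a, b)
str(combined)
```

In an automated job the same two lines can be applied to every column known to be numeric, before any binding takes place.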