Hello Everyone,

The column begins populated with integers, like so:

    1/1/2013 0:00 in the spreadsheet equals 41257 in R's dataframe
    1/1/2013 0:15 in the spreadsheet equals 41257.010416666664 in R's dataframe
    ...

41257 must be in minutes, since 1440 min/day * .010416666664 day = 15 minutes. But 41257 minutes is about 29 days: 41257 min / 1440 min/day = 28.65 days. So I don't know why the dataframe is showing 41257 for 1/1/2013 0:00.

Oddly, R sees the vector as NULL despite the fact it has integers in each record in the column:

    data_type = str(df2_TZ$DateTimeStamp)

produces a NULL (empty) variable.

I tried:

    df2_TZ = read.xlsx2("DF_exp.xlsx", sheetName = "Sheet1")
    Sys.setenv(TZ = "GMT")
    testdtm <- as.POSIXct(df2_TZ$DateTimeStamp, format = "%m/%d/%Y %H:%M")
    # Inspect the result
    testdtm
    str(testdtm)

testdtm is a vector filled with NA values, which figures since DateTimeStamp is NULL.

I noticed in the table on page 32 of the R Help Desk pdf you linked to that dp - as.POSIXct(format(dp, tz="GMT")) is the only option listed for time zone difference. So I tried:

    df2_TZ = read.xlsx2("DF_exp.xlsx", sheetName = "Sheet1")
    df2_TZ_seq <- as.POSIXct(format(dt2_TZ, tz="GMT"))

and got:

    Error in format(dt2_TZ, tz = "GMT") : object 'dt2_TZ' not found

Is the vector neither character nor factor, since it's NULL? Where do I go from here?

Thank You,
Doug


Hi Doug,

What you have done is to ask whether the character string "DF_exp.xlsx" is a character string. I think Yogi Berra, were he still around, could have told you that. What will give you some useful information is:

    str(DF_exp.xlsx)

which asks for information about the object, not its name.

Jim


On Friday, February 19, 2016 12:41 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:

This is a mailing list. I don't know how you are interacting with it... using a website rather than an email program can lead to some confusion since there can be many ways to accomplish the task of interacting with the mailing list. My email program has a "reply-all" button when I am looking at an email. It also has an option to write the email in plain text, which often prevents the message from getting corrupted (recipient not seeing what you sent to the list).

Using the str function on a literal string (the name of a file) will indeed tell you that you gave it a character string. Specifying a column in your data might tell you something more interesting... e.g.

    str( df2_TZ$DateTimeStamp )

If that says you have character data then Jim Lemon's suggestion would be a good next thing to look at. If it is factor data then you should use the as.character function on the data column and then follow Jim's suggestion. If it is numeric then you probably need to convert it using an appropriate origin (e.g. as described at [1] or [2]).

I have had best luck setting the default timezone string when converting to POSIXt types... e.g.

    # specify timezone assumed by input data
    Sys.setenv( TZ="GMT" )
    testdtm <- as.POSIXct( "1/1/2016 00:00", format = "%m/%d/%Y %H:%M" )
    # inspect the result
    testdtm
    str( testdtm )
    # view data from a different timezone
    Sys.setenv( TZ="Etc/GMT+8" )
    # no change to the underlying data, but it prints out differently now
    # because the tz attribute is "" which implies using the default TZ
    testdtm

[1] http://blog.mollietaylor.com/2013/08/date-formats-in-r.html
[2] https://www.r-project.org/doc/Rnews/Rnews_2004-1.pdf

--
Sent from my phone. Please excuse my brevity.


On February 19, 2016 7:48:31 AM PST, D Wolf <doug45290 at yahoo.com> wrote:

Hello Jeff,

I ran str() on the vector and it returned character.

    > str("DF_exp.xlsx")
     chr "DF_exp.xlsx"

This is my first thread on this forum, and I'm not sure how to reply to the thread instead of just sending the reply to your email account; I don't see a 'reply' link in the thread. I've read this page and I don't think it advises on how to reply in the thread: R: Posting Guide: How to ask good questions that prompt useful answers (www.r-project.org).

Thank You,
Doug Wolfinger


On Friday, February 19, 2016 12:51 AM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:

You are being rather scattershot in your explanation, so I suspect you are not being systematic in your troubleshooting. Use the str function to examine the data column after you pull it in from excel. It may be numeric, factor, or character, and the approach depends on which that function returns.


On February 18, 2016 1:12:40 PM PST, D Wolf via R-help <r-help at r-project.org> wrote:

Hello,

I am trying to read a data frame column named DateTimeStamp. The time is in GMT in this format: 1/4/2013 23:30

    require(xlsx)
    df2_TZ = read.xlsx2("DF_exp.xlsx", sheetName = "Sheet1")

It's good up to that line. But these three lines, each of which modifies the dataframe, convert the column's values to NA:

    df2_TZ$DateTimeStamp = as.POSIXct(df2_TZ$DateTimeStamp, format="%m/%d/%Y %H:%M:%S", tz="GMT")

and...

    df2_TZ$DateTimeStamp = as.POSIXct(as.character(df2_TZ$DateTimeStamp), format = "%m/%d/%Y %H:%M:%S")

and...

    df2_TZ$DateTimeStamp = as.Date(df2_TZ$DateTimeStamp, format = "%m/%d/%Y %H:%M:%S")

This line returns an error...

    df2_TZ$DateTimeStamp = as.POSIXct(as.Date(df2_TZ$DateTimeStamp), format = "%m/%d/%Y %H:%M:%S")

    Error in charToDate(x) :
      character string is not in a standard unambiguous format

Additionally, I need to convert from GMT to North American time zones, and I think the advice on this page would be good for that: http://blog.revolutionanalytics.com/2009/06/converting-time-zones.html

My ultimate goal is to write an R program that finds data in another variable in df2_TZ that corresponds to a date and time that match up with the date and time in another data frame. For now, any help reading the column would be much appreciated.

Thank You,
Doug

[[alternative HTML version deleted]]

R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
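
For readers following the thread, here is a minimal sketch of two points raised in the original question. It assumes the column holds character strings such as "1/4/2013 23:30" (later messages show that read.xlsx2 actually delivered something else). It illustrates why a format containing %S yields NA for strings that carry no seconds, and how a GMT time can be displayed in a North American zone. The zone name "America/New_York" and all variable names are illustrative assumptions, not taken from the thread.

```r
# Hypothetical sample value in the format described in the question.
stamp <- "1/4/2013 23:30"

# A format that asks for seconds cannot match a string without seconds,
# so the result is NA.
with_seconds <- as.POSIXct(stamp, format = "%m/%d/%Y %H:%M:%S", tz = "GMT")
is.na(with_seconds)

# Dropping %S lets the string parse; tz = "GMT" states the zone of the input.
gmt_time <- as.POSIXct(stamp, format = "%m/%d/%Y %H:%M", tz = "GMT")

# The stored instant is unchanged; only the zone used for display differs.
eastern_text <- format(gmt_time, "%Y-%m-%d %H:%M %Z", tz = "America/New_York")
eastern_text
```

Passing tz to format() changes only the printed text, so the same column can be matched against other GMT data and still be reported in local time.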
It is not minutes... read the Excel documentation for representing dates... it is days since December 30, 1899 on Windows. Read the links I provided in my last email.

Also read ?str ... that function does not return anything... it only prints out information, so don't expect to get anything useful by assigning the output of that function to a variable.

Also read the examples section of the help file ?read.xlsx2 for relevant help.
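
A minimal sketch of the two points above, under stated assumptions: the column is imagined to have arrived as a factor of number-like text (one plausible outcome of read.xlsx2, not confirmed in the thread), and the serial used is 41275, which is the Excel serial for 2013-01-01 under the 1899-12-30 origin. The 41257 quoted earlier would map to 2012-12-14 under that origin, so it is treated here as a probable transposition; that reading is an assumption. All variable names are hypothetical.

```r
# Hypothetical stand-in for the imported column: serial numbers stored as
# text in a factor (an assumption about how the column arrived).
serial_col <- factor(c("41275", "41275.010416666664"))

# str() only prints; its return value is NULL, so assigning it tells nothing.
str_result <- str(serial_col)
is.null(str_result)

# class() does return a value that can be stored and tested.
col_class <- class(serial_col)

# Factor -> character -> numeric, so the labels (not the level codes) convert.
serial_days <- as.numeric(as.character(serial_col))

# Days since 1899-12-30 become seconds. round() guards against a
# floating-point shortfall that could otherwise display as 00:14:59.
stamps <- as.POSIXct(round(serial_days * 86400),
                     origin = "1899-12-30", tz = "GMT")
format(stamps, "%Y-%m-%d %H:%M", tz = "GMT")
```

Multiplying by 86400 keeps the time of day; as.Date() with the same origin would keep only the date part.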
Hi Doug,

It is difficult for us to work out what is happening as we don't have access to a toy data set that we can play with. Excel spreadsheets are one of those things that you can't just attach to your email to the help list. If there is somewhere you can leave a _small_ Excel sample file (take the first 10 rows, say) that we can download (Google Drive, Dropbox?) and include the URL in your email, maybe someone can offer more than guesses.

Jim
On 22 Feb 2016, at 18:30, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:

> .. read the Excel documentation for representing dates... it is days since December 30, 1899 on Windows.

I seem to recall that that is actually only true for dates after March 1, 1900. (The reason that it is not counting from December 31st being that someone thought that 1900 was a leap year.)

<Checks Wikipedia: Yep, 1900 is still a leap year in Excel. The original perpetrator was Lotus 1-2-3 and Microsoft went for bug-compatibility.>

-pd

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
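
The caveat above can be checked with a short sketch. It relies on Excel's 1900 date system numbering 1900-01-01 as serial 1 and counting a fictitious 1900-02-29 as serial 60; the variable names are hypothetical.

```r
# Origin commonly used to convert Excel (1900 date system) serials in R.
excel_origin <- "1899-12-30"

# From serial 61 (1900-03-01) onward this origin agrees with Excel.
after_bug <- as.Date(61, origin = excel_origin)
after_bug

# Before the fictitious leap day the same origin lands one day early:
# Excel displays serial 1 as 1900-01-01.
before_bug <- as.Date(1, origin = excel_origin)
before_bug
```

For timestamps from 2013, as in this thread, the discrepancy does not arise.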
In addition to my previous message, DF_extract_clean.R is the program in the dropbox folder that I am currently working on.

Doug
Hi Doug,

I see what the problem is now. When your Excel file is read in with read.xlsx2, the DateTimeStamp is read as days since Microsoft's time epoch (see earlier posts on this). As these values are numeric, they cannot be converted in the same way as a human readable date/time string. The easiest way I could think of to get around this is to export the XLSX file as CSV. Then you will have the date/time strings and can convert them to POSIX date/time values. Note that your format spec was slightly wrong - day is first.

    # first export the EXCEL file as a CSV file then
    df2_TZ = read.csv("/media/KINGSTON/DF_exp2.csv",stringsAsFactors=FALSE)
    df2_TZ$DateTimeStamp<-strptime(df2_TZ$DateTimeStamp,"%d/%m/%Y %H:%M")
    # and I get
    df2_TZ$DateTimeStamp
     [1] "2013-01-01 00:00:00 EST" "2013-01-01 01:00:00 EST"
     [3] "2013-01-02 23:15:00 EST" "2013-01-02 23:30:00 EST"
     [5] "2013-01-02 23:45:00 EST" "2013-01-03 00:00:00 EST"
     [7] "2013-01-03 01:00:00 EST" "2013-01-03 01:15:00 EST"
     [9] "2013-01-04 23:00:00 EST" "2014-11-24 15:04:00 EST"
    [11] "2013-01-04 23:15:00 EST" "2013-01-04 23:30:00 EST"
    [13] "2013-01-05 00:30:00 EST" "2013-01-05 00:45:00 EST"
    [15] "2013-01-26 00:00:00 EST" "2013-07-19 15:42:00 EST"

Jim
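
A self-contained variant of the CSV route, offered as a sketch rather than as Jim's exact method: it swaps strptime() for as.POSIXct() (which stores one number per row and so fits a data frame column more compactly) and passes tz = "GMT", since the output above is labelled EST while the original data were described as GMT. The inline CSV text and the Value column are made up for illustration.

```r
# Hypothetical CSV content standing in for the exported spreadsheet; the
# DateTimeStamp name matches the thread, the Value column is invented.
csv_text <- "DateTimeStamp,Value
01/01/2013 00:00,10
02/01/2013 23:15,11
26/01/2013 00:00,12"

df_csv <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Day-first format, as in the strptime call above; tz states the input's zone.
df_csv$DateTimeStamp <- as.POSIXct(df_csv$DateTimeStamp,
                                   format = "%d/%m/%Y %H:%M", tz = "GMT")

# Any NA here would mean a row did not match the format.
has_na <- anyNA(df_csv$DateTimeStamp)
has_na
format(df_csv$DateTimeStamp, "%Y-%m-%d %H:%M", tz = "GMT")
```

Checking for NA right after the conversion catches a month-first versus day-first mix-up early, because a value such as 26/01/2013 cannot parse under a month-first format.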
You are overthinking this. The answer is in the help file for read.xlsx2.