Hello R users I tried to get maximum of sale date from my dataframe using sqldf in R. First time when i was executing the following code>sqldf("select max(sale_date) from test1")i got the result as 9997.0 BUT when i was running the same for second time, the result was 2031-04-09 (this is what correct one!) why it was happened? thanks. -- View this message in context: http://www.nabble.com/Bug-tp25548042p25548042.html Sent from the R help mailing list archive at Nabble.com.
Please read and follow the last line to every message on r-help. On Thu, Sep 24, 2009 at 5:32 AM, dhansekaran <dhana.sa at gmail.com> wrote:> > Hello R users > > I tried to get maximum of sale date from my dataframe using sqldf in R. > First time when i was executing the following code > >>sqldf("select max(sale_date) from test1") > > i got the result as 9997.0 > > BUT > > when i was running the same for second time, the result was 2031-04-09 (this > is what correct one!) > > why it was happened? > > thanks. > -- > View this message in context: http://www.nabble.com/Bug-tp25548042p25548042.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >
You have found a bug. It would be best to use dput(test1) to show unambiguously display what is in test1 but in the absence of that I will assume that its as in test1 shown below.> library(sqldf) > test1 <- data.frame(sale_date = as.Date(c("2008-08-01", "2031-01-09",+ "1990-01-03", "2007-02-03", "1997-01-03", "2004-02-04")))> sqldf("select max(sale_date) from test1")max(sale_date) 1 9864.0 Evidently it is taking the internal numeric representation and then storing it in the database as characters and then taking the maximum of those characters. As the fifth entry starts with 9 its the maximum when sorted alphabetically:> as.numeric(test1[[1]])[1] 14092 22288 7307 13547 9864 12452 I will have to investigate whether the problem is in sqldf or the underlying software. In the meantime if you represent the Date data as character you should be ok:> test2 <- transform(test1, sale_date = as.character(sale_date)) > sqldf("select max(sale_date) from test2")max(sale_date) 1 2031-01-09> packageDescription("sqldf")$Version[1] "0-1.7"> R.version.string[1] "R version 2.9.2 Patched (2009-09-08 r49647)" Please provide the output of dput(test1) so that we know unambiguously what your data looks like. On Thu, Sep 24, 2009 at 9:07 AM, dhanasekaran <dhana.sa at gmail.com> wrote:> The data looks like > > "2008-08-01" > "2031-01-09" > "1990-01-03" > "2007-02-03" > "1997-01-03" > "2004-02-04" > > Thanks. > > On Thu, Sep 24, 2009 at 5:20 PM, Gabor Grothendieck > <ggrothendieck at gmail.com> wrote: >> >> Please read and follow the last line to every message on r-help. >> >> On Thu, Sep 24, 2009 at 5:32 AM, dhansekaran <dhana.sa at gmail.com> wrote: >> > >> > Hello R users >> > >> > I tried to get maximum of sale date from my dataframe using sqldf in R. >> > First time when i was executing the following code >> > >> >>sqldf("select max(sale_date) from test1") >> > >> > i got the result as 9997.0 >> > >> > BUT >> > >> > when i was running the same for second time, the result was 2031-04-09 >> > (this >> > is what correct one!) >> > >> > why it was happened? >> > >> > thanks. >> > -- >> > View this message in context: >> > http://www.nabble.com/Bug-tp25548042p25548042.html >> > Sent from the R help mailing list archive at Nabble.com. >> > >> > ______________________________________________ >> > R-help at r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > > > > > -- > Best > Dhanasekaran > > "Without trust, words become the hollow sound of a wooden gong. With trust, > words become life itself.? >
Thank you very much for help. Here is my values of Sale Date>sampletest[1:100, 76] 1 1989-08-01 2 1900-01-01 3 2003-11-18 4 2003-05-30 5 2005-08-18 6 1990-04-01 7 1989-01-01 8 1900-01-01 9 1996-03-12 10 1900-01-01 11 2005-11-14 12 2002-05-08 13 2000-10-10 14 1900-01-01 15 2007-03-27 Gabor Grothendieck wrote:> > You have found a bug. > > It would be best to use dput(test1) to show unambiguously display what > is in test1 but in the absence of that I will assume that its as in > test1 shown below. > >> library(sqldf) >> test1 <- data.frame(sale_date = as.Date(c("2008-08-01", "2031-01-09", > + "1990-01-03", "2007-02-03", "1997-01-03", "2004-02-04"))) > >> sqldf("select max(sale_date) from test1") > max(sale_date) > 1 9864.0 > > Evidently it is taking the internal numeric representation and then > storing it in the database as characters and then taking the maximum > of those characters. As the fifth entry starts with 9 its the maximum > when sorted alphabetically: > >> as.numeric(test1[[1]]) > [1] 14092 22288 7307 13547 9864 12452 > > I will have to investigate whether the problem is in sqldf or the > underlying software. In the meantime if you represent the Date data > as character you should be ok: > >> test2 <- transform(test1, sale_date = as.character(sale_date)) >> sqldf("select max(sale_date) from test2") > max(sale_date) > 1 2031-01-09 > > > > >> packageDescription("sqldf")$Version > [1] "0-1.7" >> R.version.string > [1] "R version 2.9.2 Patched (2009-09-08 r49647)" > > Please provide the output of dput(test1) so that we know unambiguously > what your data looks like. > > On Thu, Sep 24, 2009 at 9:07 AM, dhanasekaran <dhana.sa at gmail.com> wrote: >> The data looks like >> >> "2008-08-01" >> "2031-01-09" >> "1990-01-03" >> "2007-02-03" >> "1997-01-03" >> "2004-02-04" >> >> Thanks. >> >> On Thu, Sep 24, 2009 at 5:20 PM, Gabor Grothendieck >> <ggrothendieck at gmail.com> wrote: >>> >>> Please read and follow the last line to every message on r-help. >>> >>> On Thu, Sep 24, 2009 at 5:32 AM, dhansekaran <dhana.sa at gmail.com> wrote: >>> > >>> > Hello R users >>> > >>> > I tried to get maximum of sale date from my dataframe using sqldf in >>> R. >>> > First time when i was executing the following code >>> > >>> >>sqldf("select max(sale_date) from test1") >>> > >>> > i got the result as 9997.0 >>> > >>> > BUT >>> > >>> > when i was running the same for second time, the result was 2031-04-09 >>> > (this >>> > is what correct one!) >>> > >>> > why it was happened? >>> > >>> > thanks. >>> > -- >>> > View this message in context: >>> > http://www.nabble.com/Bug-tp25548042p25548042.html >>> > Sent from the R help mailing list archive at Nabble.com. >>> > >>> > ______________________________________________ >>> > R-help at r-project.org mailing list >>> > https://stat.ethz.ch/mailman/listinfo/r-help >>> > PLEASE do read the posting guide >>> > http://www.R-project.org/posting-guide.html >>> > and provide commented, minimal, self-contained, reproducible code. >>> > >> >> >> >> -- >> Best >> Dhanasekaran >> >> "Without trust, words become the hollow sound of a wooden gong. With >> trust, >> words become life itself.? >> > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > >-- View this message in context: http://www.nabble.com/Bug-tp25548042p25610059.html Sent from the R help mailing list archive at Nabble.com.