Here is a way to reproduce the problem:

> data.table::fread("9876543210\n") # number bigger than 2^31-1
              V1
1: 4.879661e-314

and your work-around does fix things up:

> data.table::fread("9876543210\n", colClasses="numeric")
           V1
1: 9876543210

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Wed, Mar 22, 2017 at 9:58 AM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
> You failed to provide a reproducible example, and you posted HTML so the
> quality of any answer will be limited by the quality of your question.
>
> My stab at your problem is that you should read ?fread, and in particular
> should try using the colClasses argument.
> --
> Sent from my phone. Please excuse my brevity.
>
> On March 22, 2017 8:52:55 AM PDT, Santosh <santosh2005 at gmail.com> wrote:
>>Hi
>>
>>I have been using the "fread" utility of the "data.table" package on a
>>dataset of about 20 million rows. It's a fantastic package to read
>>datasets. Thank you, Matt D.
>>
>>However, I am faced with a peculiar instance of certain numbers in a
>>column being transformed.
>>
>>In the dataset, a column has values ranging from 1 to 9##########
>>(nchar(x)=11, e.g. 98765432109). After using "fread" to read the
>>dataset, values in all the columns are displayed correctly up to the
>>first 1000 rows.
>>If "fread" is applied for reading >1000 rows of the total of 20 million
>>rows, the values in only this column (having a wide range of values) are
>>displayed as x.xxxxxxxe-3yy (e.g. 3.5639877e-324).
>>
>>I tried reading all the columns as "character" and it didn't help.
>>
>>Would highly appreciate your assistance!
>>
>>Thanks so much in advance.
>>
>>Best regards,
>>Santosh
>>
>> [[alternative HTML version deleted]]
>>
>>______________________________________________
>>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide
>>http://www.R-project.org/posting-guide.html
>>and provide commented, minimal, self-contained, reproducible code.
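The odd value in the reproduction above can be recovered by hand. The following is a minimal base-R sketch, assuming (as the thread's later diagnosis suggests) that fread stored the number as a 64-bit integer and that R then printed the same 8 bytes as an ordinary double. The variable names are illustrative only, and the split into two 32-bit halves only works here because the low half happens to be below 2^31.

```r
# Illustrative sketch: reinterpret the 8 bytes of the 64-bit integer
# 9876543210 as an IEEE-754 double, using base R only.
x  <- 9876543210
lo <- x %% 2^32    # low 32 bits  (1286608618, fits in an R integer)
hi <- x %/% 2^32   # high 32 bits (2)

# Lay out the two halves as 8 little-endian bytes, as a 64-bit integer would be
bytes <- writeBin(as.integer(c(lo, hi)), raw(), size = 4, endian = "little")

# Read those same 8 bytes back as a double
as_double <- readBin(bytes, what = "double", n = 1, size = 8, endian = "little")
print(format(as_double, digits = 7))
```

The bit pattern of a small positive integer has an all-zero exponent field, so it decodes as a denormal double, which is why the printed values look like x.xxxe-314.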
Thanks Bill for cc.

Santosh,

I'm almost certain you don't have package bit64 installed. When you do it
works fine:

> remove.packages("bit64")
> data.table::fread("9876543210\n")
              V1
1: 4.879661e-314
> install.packages("bit64")
> data.table::fread("9876543210\n")
           V1
1: 9876543210

News for data.table v1.10.2 on CRAN 31 Jan 2017 contained:

  * When fread() or print() see integer64 columns are present, bit64's
    namespace is now automatically loaded for convenience.

However, when data.table loads the namespace there is a bug in this
function:

> data.table:::require_bit64
function ()
{
    tt = try(requireNamespace("bit64", quietly = TRUE))
    if (inherits(tt, "try-error"))
        warning("Some columns are type 'integer64' but package bit64 is not installed. Those columns will print as strange looking floating point data. There is no need to reload the data. Simply install.packages('bit64') to obtain the integer64 print method and print the data again.")
}

The intent was to display that nice helpful message to you. Due to this
report, I can see now that I shouldn't have wrapped requireNamespace() with
try() because requireNamespace() returns TRUE or FALSE anyway. Even though
requireNamespace() prints 'Failed with error' it doesn't actually throw an
error. I'll change data.table's function to the following:

    if (!requireNamespace("bit64", quietly = TRUE))
        warning("Some columns ...")

bit64 is correctly Suggests not Depends. It's just unfortunate the intended
message wasn't displayed.

Santosh, in future please follow the data.table support guide here:
https://github.com/Rdatatable/data.table/wiki/Support. r-help is not
supposed to be used for package support. The main thing though is thanks
for helping me find this bug.
Thanks,
Matt

On Wed, Mar 22, 2017 at 10:22 AM, William Dunlap <wdunlap at tibco.com> wrote:
> Here is a way to reproduce the problem:
> > data.table::fread("9876543210\n") # number bigger than 2^31-1
>               V1
> 1: 4.879661e-314
> and your work-around does fix things up
> > data.table::fread("9876543210\n", colClasses="numeric")
>            V1
> 1: 9876543210
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
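The try() problem Matt describes can be checked without data.table. This is a small base-R sketch, not data.table's actual code: the package name below is a made-up placeholder assumed not to be installed, and the variable names are hypothetical.

```r
# Hypothetical package name, assumed NOT to be installed on this machine
pkg <- "notARealPackage12345"

# Original pattern: requireNamespace() returns FALSE instead of throwing,
# so try() never yields a "try-error" and the warning branch is never reached
tt <- try(requireNamespace(pkg, quietly = TRUE), silent = TRUE)
old_would_warn <- inherits(tt, "try-error")

# Proposed pattern: test the logical return value directly
new_would_warn <- !requireNamespace(pkg, quietly = TRUE)

print(c(old = old_would_warn, new = new_would_warn))
```

With the package missing, only the second test is TRUE, which is consistent with the intended warning never having been shown.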
Thanks so much for your suggestions! Will try them out!

Santosh

On Wed, Mar 22, 2017 at 12:17 PM, Matt Dowle <mattjdowle at gmail.com> wrote:
> Thanks Bill for cc.
>
> Santosh,
>
> I'm almost certain you don't have package bit64 installed. When you do it
> works fine :
>
> > remove.packages("bit64")
> > data.table::fread("9876543210\n")
>               V1
> 1: 4.879661e-314
> > install.packages("bit64")
> > data.table::fread("9876543210\n")
>            V1
> 1: 9876543210