Hi,

I have been using the "fread" utility of the "data.table" package on a dataset of about 20 million rows. It's a fantastic package for reading datasets. Thank you, Matt D.

However, I am faced with a peculiar instance of certain numbers in a column being transformed.

In the dataset, a column has values ranging from 1 to 9########## (nchar(x)=11, e.g. 98765432109). After using "fread" to read the dataset, values in all the columns are displayed correctly when reading up to the first 1000 rows. If "fread" is used to read more than 1000 rows of the total of 20 million rows, the values in only this column (the one with the wide range of values) are displayed as x.xxxxxxxe-3yy (e.g. 3.5639877e-324).

I tried reading all the columns as "character", but that didn't help.

Would highly appreciate your assistance!

Thanks so much in advance.

Best regards,
Santosh
You failed to provide a reproducible example, and you posted HTML, so the quality of any answer will be limited by the quality of your question.

My stab at your problem is that you should read ?fread, and in particular should try using the colClasses argument.
--
Sent from my phone. Please excuse my brevity.

On March 22, 2017 8:52:55 AM PDT, Santosh <santosh2005 at gmail.com> wrote:
>[...]
>______________________________________________
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
Here is a way to reproduce the problem:

   > data.table::fread("9876543210\n") # number bigger than 2^31-1
                 V1
   1: 4.879661e-314

and your work-around does fix things up:

   > data.table::fread("9876543210\n", colClasses="numeric")
              V1
   1: 9876543210

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Mar 22, 2017 at 9:58 AM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
> My stab at your problem is that you should read ?fread, and in particular should try using the colClasses argument.
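For anyone following the thread, here is a minimal sketch of the colClasses work-around together with two variations. It assumes a data.table version recent enough to have the text= argument of fread; header=FALSE is added only to keep the one-line input unambiguous, and the object names (big, dt_num, dt_chr, dt_dbl, as_subnormal) are made up for illustration. The integer64= argument is a documented fread option. The last two lines rest on an assumption that is not confirmed anywhere in the thread: that the odd value is a 64-bit integer whose bits are being printed as a double. The arithmetic is at least consistent with the output shown above.

```r
library(data.table)

# Same one-value input as in the reproduction above.
big <- "9876543210\n"

# Ask for a double column, as suggested in the thread.
dt_num <- fread(text = big, header = FALSE, colClasses = "numeric")

# Keep the digits as text, e.g. when the column is really an identifier.
dt_chr <- fread(text = big, header = FALSE, colClasses = "character")

# Or tell fread how to store any column it detects as a 64-bit integer.
dt_dbl <- fread(text = big, header = FALSE, integer64 = "double")

print(dt_num)
print(dt_chr)
print(dt_dbl)

# Assumption, not confirmed in the thread: a 64-bit integer n whose bits
# are read as a double is the subnormal number n * 2^-1074.
as_subnormal <- 9876543210 * 2^-1074
print(as_subnormal)
```

If the column has to stay an exact 64-bit integer rather than a double or a string, installing and loading the bit64 package before printing is the other route usually suggested; I have not checked that against the poster's data.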