Hi,

I have been using the "fread" utility of the "data.table" package on a dataset of about 20 million rows. It's a fantastic package for reading datasets. Thank you, Matt D.

However, I am faced with a peculiar instance of certain numbers in a column being transformed.

In the dataset, a column has values ranging from 1 to 9########## (nchar(x)=11, e.g. 98765432109). After using "fread" to read the dataset, values in all the columns are displayed correctly up to the first 1000 rows. If "fread" is used to read more than 1000 rows of the total of 20 million rows, the values in only this column (the one with the wide range of values) are displayed as x.xxxxxxxe-3yy (e.g. 3.5639877e-324).

I tried reading all the columns as "character" and it didn't help.

Would highly appreciate your assistance!

Thanks so much in advance.

Best regards,
Santosh
You failed to provide a reproducible example, and you posted HTML, so the quality of any answer will be limited by the quality of your question.

My stab at your problem is that you should read ?fread, and in particular should try using the colClasses argument.
--
Sent from my phone. Please excuse my brevity.

On March 22, 2017 8:52:55 AM PDT, Santosh <santosh2005 at gmail.com> wrote:
>[...]
>
>______________________________________________
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
Here is a way to reproduce the problem:
> data.table::fread("9876543210\n") # number bigger than 2^31-1
V1
1: 4.879661e-314
and your work-around does fix things up
> data.table::fread("9876543210\n", colClasses="numeric")
V1
1: 9876543210
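A possible explanation for the odd value, offered as an assumption rather than something confirmed in this thread: by default fread reads whole numbers too large for a 32-bit integer as the bit64 "integer64" class, which keeps a 64-bit integer in the storage of a double. If the bit64 package is not loaded, such a column prints as though those bits were an ordinary double. The base-R sketch below reinterprets the bits of 9876543210 by hand to show the arithmetic; the variable names are made up for illustration.

```r
# Assumption: the column was stored as a 64-bit integer but printed as a double.
n  <- 9876543210
hi <- as.integer(n %/% 2^32)   # upper 32 bits of the 64-bit integer
lo <- as.integer(n %% 2^32)    # lower 32 bits (fits in an R integer for this n)

# Lay the two halves out as 8 little-endian bytes, then read them back as one double.
bytes <- writeBin(c(lo, hi), raw(), size = 4L, endian = "little")
as_double <- readBin(bytes, what = "double", n = 1L, size = 8L, endian = "little")

print(as_double)

# A bit pattern this small is a subnormal double, equal to n * 2^-1074.
print(identical(as_double, n * 2^-1074))
```

The first print should show the same 4.879661e-314 as the reproduction above, since 9876543210 * 2^-1074 is about 4.88e-314.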
Bill Dunlap
TIBCO Software
wdunlap tibco.com
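For reference, a short sketch of a few ways to get ordinary values back. It assumes an installed data.table whose fread accepts the integer64 argument (check ?fread for your version); the column name x and the two-row input are invented for illustration. Doubles hold whole numbers exactly up to 2^53, so 11-digit values are not rounded.

```r
library(data.table)

# Hypothetical one-column input; "x" is an illustrative header, not from the thread.
txt <- "x\n1\n98765432109\n"

# 1. Force the column type, as suggested in the thread.
a <- fread(txt, header = TRUE, colClasses = "numeric")

# 2. Ask fread to read large whole numbers as double instead of integer64.
b <- fread(txt, header = TRUE, integer64 = "double")

# 3. Read as text when the values are identifiers rather than quantities.
cc <- fread(txt, header = TRUE, colClasses = "character")

str(a)
str(b)
str(cc)
```

Another route, if integer64 columns are wanted, is to load the bit64 package before printing so that the class has its print method available.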
On Wed, Mar 22, 2017 at 9:58 AM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
> You failed to provide a reproducible example, and you posted HTML so the
> quality of any answer will be limited by the quality of your question.
>
> My stab at your problem is that you should read ?fread, and in particular
> should try using the colClasses argument.
> --
> Sent from my phone. Please excuse my brevity.