Hello,

I am trying to read a dataset with 100,000 rows and around 365 columns into R, using read.table/read.csv.

In Windows XP, with 32-bit R, I am able to read only 15266 rows and no more. I tried the same in R running on Ubuntu, and it also reads only 15266 rows.

Using the nrows parameter I can read fewer than 15266 rows, but when I use a value larger than 15266, it still reads only 15266 rows.

Thank you for your patience and responses.

Regards
Harsh Singhal
Bangalore, India
Harsh wrote:
> In Windows XP, with R 32 bit, I am able to read only 15266 rows and
> not more than that.
> [...]
> Using the nrows parameter I can read rows less than 15266, but when I
> used a value larger than 15266, it reads only 15266 nevertheless.

What happens exactly? Is there an error message? What does your file look like in rows 15260-15270?

Uwe Ligges
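One way to look at the raw text around the row where reading stops is sketched below. This is only an illustration: `peek_lines()` is a hypothetical helper (not part of base R), and the temporary file stands in for the real dataset, whose name is not given in this thread. It uses `readLines()`, which does no quote or separator parsing, so it shows the lines exactly as they are stored.

```r
# peek_lines() is a hypothetical helper, not part of base R.
# It returns the raw text of lines `from`..`to` of a file without
# interpreting quotes or separators.
peek_lines <- function(path, from, to) {
  con <- file(path, open = "r")
  on.exit(close(con))
  if (from > 1) readLines(con, n = from - 1)  # read and discard earlier lines
  readLines(con, n = to - from + 1)           # the lines of interest
}

# Small demonstration on a temporary file standing in for the real dataset.
tmp <- tempfile(fileext = ".csv")
writeLines(sprintf("row%d,%d", 1:20, 1:20), tmp)
print(peek_lines(tmp, 8, 12))

# For the file discussed here one would call something like the following
# ("mydata.csv" is an assumed name):
# peek_lines("mydata.csv", 15260, 15270)
```

Note that a header line, if present, counts as line 1 of the file, so data row k sits on line k + 1.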
Prof Brian Ripley
2008-Dec-02 09:20 UTC
[R] Limit on number of Rows when reading R dataset
Take a look at your dataset at around that row. Perhaps you have an unmatched quote?

The limit on the number of rows of a data frame is far larger than 100,000 (2^31-1, but you will run out of address space on a 32-bit platform before that - see ?"Memory-limits").

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
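The unmatched-quote effect suggested above can be reproduced on a small made-up file. The sketch below is an illustration only, not the poster's data: the file contents and column names are invented. It counts double quotes per raw line to flag lines with an odd number of them, then compares how many rows `read.csv()` returns with default quoting and with quoting disabled (`quote = ""`).

```r
# Build a 20-row CSV in which data row 10 contains a stray double quote.
# All names and values here are invented for the demonstration.
tmp <- tempfile(fileext = ".csv")
rows <- sprintf("%d,name%d,%d", 1:20, 1:20, (1:20) * 10L)
rows[10] <- "10,\"name10,100"          # opening quote with no closing quote
writeLines(c("id,name,value", rows), tmp)

# Flag raw lines holding an odd number of double quotes.
raw <- readLines(tmp)
n_quotes <- lengths(regmatches(raw, gregexpr("\"", raw, fixed = TRUE)))
suspect <- which(n_quotes %% 2 == 1)   # file line numbers, header included
print(suspect)

# Default quoting: text after the stray quote is treated as one long
# quoted field, so fewer rows come back (R warns about this).
default_read <- suppressWarnings(read.csv(tmp))

# Quoting disabled: every line is split on commas only.
raw_read <- read.csv(tmp, quote = "")

print(c(default = nrow(default_read), no_quoting = nrow(raw_read)))
```

Disabling quoting is a diagnostic step here rather than a general fix: if the file legitimately contains quoted fields with embedded commas, the offending line needs to be repaired instead.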
Thank you Uwe and Prof. Ripley.

The problem was solved. The row in question did indeed contain garbage data, which was possibly truncating the number of lines read. I apologise for the oversight.

Thank you once again.

Regards
Harsh Singhal
Bangalore, India