I am reading in a very large file with names in it, and R is truncating the number of rows it reads in. The separator in this file is a pipe '|', so I use

    dat <- read.delim('pathToMyFile', header = TRUE, sep = '|')

It turns out that it reads up to row 61145 and stops. I think I see why, but I am not sure of the best solution to this problem. The name of the person in the next row has a quote in it, such as:

    Joe Sm"ith

I *think* this is causing a problem in the read. In fact, whenever I use

    tail(dat)

or

    dat[61145,]

R crashes.

But it doesn't crash when I use head(dat) or index any other row. I could change my raw data and manually delete this ". However, is there another solution within the arguments of read.delim that would let me avoid manually changing my raw data?

Harold
Harold -
   If there aren't any true quoted fields in the file, you could pass the quote="" option to read.delim().

                                       - Phil Spector
                                         Statistical Computing Facility
                                         Department of Statistics
                                         UC Berkeley
                                         spector at stat.berkeley.edu

On Wed, 28 Jul 2010, Doran, Harold wrote:

> I am reading in a very large file with names in it and R is truncating the number of rows it reads in. [...]
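A minimal sketch of the quote="" suggestion above. The sample lines and the column names (id, name, score) are made up for illustration and are not taken from the original file; the point is only to compare the default quote handling with quote="" on a pipe-separated input that contains one stray double quote.

```r
# Hypothetical pipe-delimited data with a stray double quote in one name
txt <- c("id|name|score",
         "1|Ann Lee|10",
         "2|Joe Sm\"ith|20",
         "3|Bob Ray|30",
         "4|Cy Moe|40")
input <- paste(txt, collapse = "\n")

# Default quote = "\"": the stray quote opens a quoted string that never
# closes, so the rows after it are typically swallowed (R warns about this)
bad <- suppressWarnings(
  read.delim(text = input, header = TRUE, sep = "|")
)

# quote = "" switches quote processing off, so every line is one row
good <- read.delim(text = input, header = TRUE, sep = "|", quote = "")

nrow(bad)    # expected to be smaller than nrow(good)
nrow(good)
```

With quote="" the double quote is kept as an ordinary character in the name field, so this only suits files in which no field relies on quoting to protect an embedded separator.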
Thank you, Phil. Unfortunately, there are quotes used properly elsewhere.

----- Original Message -----
From: Phil Spector <spector at stat.berkeley.edu>
To: Doran, Harold
Cc: r-help at r-project.org <r-help at r-project.org>
Sent: Wed Jul 28 18:29:32 2010
Subject: Re: [R] read.delim()

Harold -
   If there aren't any true quoted fields in the file, you could pass the quote="" option to read.delim().
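When some fields are quoted properly, one possible workaround (a sketch, not something proposed in the thread) is to read the raw lines first, strip quotes only from lines where they are unbalanced, and then parse the cleaned text. It assumes that no legitimate quoted field spans more than one line and that a stray quote always leaves an odd number of quotes on its line. The sample data and column names are hypothetical; for a real file the lines would come from readLines() rather than an inline vector.

```r
# Hypothetical sample: one properly quoted field that contains the
# separator, and one name with a stray double quote
raw <- c("id|name|note",
         "1|Ann Lee|\"a|b\"",
         "2|Joe Sm\"ith|plain",
         "3|Bob Ray|ok")

# Count the double quotes on each line
n_quotes <- lengths(regmatches(raw, gregexpr("\"", raw, fixed = TRUE)))

# An odd count means the quotes on that line are unbalanced
odd <- n_quotes %% 2 == 1

# Remove the quotes from the unbalanced lines only (assumption: such
# lines hold no properly quoted field that still needs protecting)
raw[odd] <- gsub("\"", "", raw[odd], fixed = TRUE)

# Parse with the default quote handling so proper quotes still work
dat <- read.delim(text = paste(raw, collapse = "\n"),
                  header = TRUE, sep = "|")
dat
```

The raw file on disk is left untouched; only the in-memory copy is cleaned. Inspecting which(odd) before the gsub() call shows which lines would be altered.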