I am reading in a very large file with names in it and R is truncating the
number of rows it reads in. The separator in this file is a pipe '|' and
so I use
    dat <- read.delim('pathToMyFile', header = TRUE, sep = '|')
It turns out that it is reading up to row 61145 and stopping and I think I see
why, but am not sure of the best solution to this problem. I see the name of the
person in the next row has a quote in it, such as:
Joe Sm"ith
I *think* this is causing a problem when the file is read in. In fact, whenever I use
    tail(dat)
or
    dat[61145,]
R crashes.
But it doesn't crash when I use head(dat) or index any other row. I could
change my raw data and manually delete this ". However, is there an argument
to read.delim that would handle this, so that I would not have to manually
change my raw data?
Harold
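One way to confirm which rows carry a stray quote, before changing how the file is read, is to count the double quotes on each raw line: a line with an odd count has an unbalanced quote. This is only a minimal sketch on a small made-up vector. For the real data, raw_lines would come from readLines('pathToMyFile'), and the names raw_lines, n_quotes, and bad are illustrative, not from the original post.

```r
# Made-up sample standing in for the real file contents.
raw_lines <- c(
  'id|name',
  '1|Ann Lee',
  '2|Joe Sm"ith',    # stray quote inside the name
  '3|"Bob Ray"'      # properly quoted field
)

# Count the double quotes on each line.
n_quotes <- lengths(regmatches(raw_lines,
                               gregexpr('"', raw_lines, fixed = TRUE)))

# Lines with an odd number of quotes are the suspects.
bad <- which(n_quotes %% 2 == 1)
raw_lines[bad]
```

Note that the indices refer to lines of the file, so with a header line the data row number is one less.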
Harold -
If there aren't any true quoted fields in the file, you
could pass the quote="" option to read.delim().
- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spector at stat.berkeley.edu
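As a rough illustration of the quote="" suggestion, here is a minimal sketch on made-up data, assuming (as the suggestion does) that no field in the file is genuinely quoted. The sample lines and the use of the text argument in place of a file path are assumptions for the sake of a self-contained example.

```r
# Made-up sample: no properly quoted fields, one stray quote in a name.
lines <- c(
  'id|name',
  '1|Ann Lee',
  '2|Joe Sm"ith',
  '3|Bob Ray'
)

# quote = "" turns off quote processing, so the " is kept as an
# ordinary character and cannot swallow the rows that follow it.
dat <- read.delim(text = lines, header = TRUE, sep = '|',
                  quote = "", stringsAsFactors = FALSE)

nrow(dat)
dat$name
```

With a real file, the same call would take the file path as its first argument instead of text = lines.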
On Wed, 28 Jul 2010, Doran, Harold wrote:
Thank you, Phil. Unfortunately, there are quotes used properly elsewhere.
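When legitimate quoted fields are mixed with stray quotes, one possible workaround is to clean the lines in memory before parsing, leaving the raw file untouched. The sketch below is not an option of read.delim itself. It assumes that a legitimate quote always sits directly next to a separator or at the start or end of a line, and that the file does not use doubled quotes ("") as an escape inside quoted fields. Under those assumptions, any quote with a non-separator character on both sides is treated as stray and dropped. The sample data and the names raw_lines and clean are made up for illustration.

```r
# Made-up sample; for the real data: raw_lines <- readLines('pathToMyFile')
raw_lines <- c(
  'id|name|city',
  '1|"Smith|Jane"|Boston',   # properly quoted field containing the separator
  '2|Joe Sm"ith|Denver',     # stray quote inside a field
  '3|Ann Lee|"St. Paul"'     # properly quoted field at end of line
)

# Remove only quotes that have a non-pipe character on both sides.
# Quotes next to a pipe, or at the start/end of a line, are kept.
clean <- gsub('(?<=[^|])"(?=[^|])', '', raw_lines, perl = TRUE)

# Parse the cleaned lines with normal quote handling.
dat <- read.delim(text = clean, header = TRUE, sep = '|',
                  stringsAsFactors = FALSE)
dat
```

This drops the stray character rather than preserving it, so the name is read as Joe Smith. If the quote itself must be kept, the data would need a different escaping scheme.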