I'm trying to read in some data from a .csv file and have come across the following issue. Here is a simple example for replication.

    # A sample .csv format
    schid,sch_name
    331-802-7081,School One
    464-551-7357,School Two
    388-517-7627,School Three \& Four
    388-517-4394,School Five

Note that the third data line includes the \ character. However, when I read the data in I get

    > read.csv(file.choose())
             schid            sch_name
    1 331-802-7081          School One
    2 464-551-7357          School Two
    3 388-517-7627 School Three & Four
    4 388-517-4394         School Five

It turns out to be very important to read in this character, as I have a program that loops through a data set and Sweaves about 30,000 files. The variable sch_name gets dropped into the tex file using \Sexpr{tmp$sch_name}. However, if there is an &, the LaTeX file won't compile properly. So, what I need is for the data to be read in as

             schid             sch_name
    1 331-802-7081           School One
    2 464-551-7357           School Two
    3 388-517-7627 School Three \& Four
    4 388-517-4394          School Five

I am obligated by a client to include the & in the school name, so eliminating that isn't an option. I thought maybe comment.char or quote would be what I needed, but they didn't resolve the issue. I'm certain I'm missing something simple, I just can't see it.

Any thoughts?

Harold
Dear Harold,

One thing you can do is read the file as it is, even if the "\" is lost, and then put the backslash back inside R by changing the string values with gsub.

Sincerely,

Carlos J. Gil Bellosta
http://www.datanalytics.com
http://www.data-mining-blog.com

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
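A minimal sketch of the read-then-fix approach suggested above, assuming the backslash has already been lost on reading. The helper name `escape_ampersand` is hypothetical and not from the thread; it uses `fixed = TRUE` so that both the pattern and the replacement are taken literally, and it handles either a factor or a character column.

```r
# Hypothetical helper (not from the thread): prefix every "&" with one backslash.
escape_ampersand <- function(x) {
  # fixed = TRUE: no regex, so "\\&" is the two characters backslash + ampersand.
  esc <- function(s) gsub("&", "\\&", s, fixed = TRUE)
  if (is.factor(x)) {
    levels(x) <- esc(levels(x))   # relabel the levels, keep the factor
    x
  } else {
    esc(as.character(x))
  }
}

# Sample data as it looks once the backslash is gone.
y <- read.csv(text = "schid,sch_name
331-802-7081,School One
388-517-7627,School Three & Four", stringsAsFactors = FALSE)

y$sch_name <- escape_ampersand(y$sch_name)
writeLines(y$sch_name)   # shows the strings as stored
```

Note that the helper escapes every "&" it finds, so it should only be applied to values where the backslash is known to be missing; running it on an already escaped name would add a second backslash.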
Try 'gsub'

    > y
             schid            sch_name
    1 331-802-7081          School One
    2 464-551-7357          School Two
    3 388-517-7627 School Three & Four
    4 388-517-4394         School Five
    > levels(y$sch_name) <- gsub("&", "\\\\&", levels(y$sch_name))
    > y
             schid              sch_name
    1 331-802-7081            School One
    2 464-551-7357            School Two
    3 388-517-7627 School Three \\& Four
    4 388-517-4394           School Five
    >

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
On Wed, 2006-08-16 at 14:43 -0400, Doran, Harold wrote:
> Note the third line includes the \ character. However, when I read the
> data in I get
>
> > read.csv(file.choose())
> 3 388-517-7627 School Three & Four

Harold,

What version of R and OS are you running?

Under Version 2.3.1 Patched (2006-08-06 r38829) on FC5:

    > read.csv("test.csv")
             schid              sch_name
    1 331-802-7081            School One
    2 464-551-7357            School Two
    3 388-517-7627 School Three \\& Four
    4 388-517-4394           School Five

The '\' is doubled in the printed output.

Take note of the impact of the 'allowEscapes' argument:

    > read.csv("test.csv", allowEscapes = TRUE)
             schid            sch_name
    1 331-802-7081          School One
    2 464-551-7357          School Two
    3 388-517-7627 School Three & Four
    4 388-517-4394         School Five

The '\' is lost.

Try it with 'allowEscapes = FALSE' explicitly.

HTH,

Marc Schwartz
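The comparison above can be reproduced without a file on disk. This is a sketch only: the temporary file and the variable names `csv_path`, `kept`, and `lost` are assumptions for illustration, and `stringsAsFactors = FALSE` is set so the result is a plain character column regardless of R version.

```r
# Write the sample data from the original post to a temporary file.
# "\\&" in R source is one backslash followed by "&" in the file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "schid,sch_name",
  "331-802-7081,School One",
  "464-551-7357,School Two",
  "388-517-7627,School Three \\& Four",
  "388-517-4394,School Five"
), csv_path)

# Escapes read verbatim: the backslash from the file stays in the string.
kept <- read.csv(csv_path, allowEscapes = FALSE, stringsAsFactors = FALSE)

# Escapes processed: "\&" is not a control escape, so only "&" remains.
lost <- read.csv(csv_path, allowEscapes = TRUE, stringsAsFactors = FALSE)

writeLines(kept$sch_name[3])   # string as stored, backslash kept
writeLines(lost$sch_name[3])   # string as stored, backslash dropped

unlink(csv_path)
```

Using writeLines() rather than print() avoids the display escaping that makes a single stored backslash appear as two.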
OK, thanks to you and Carlos. I see how this works. Now, I just want one "\" (MiKTeX doesn't work with \\). I tried tinkering with the replacement argument in your gsub call. Is it possible to have only one \?

________________________________

From: jim holtman [mailto:jholtman@gmail.com]
Sent: Wednesday, August 16, 2006 3:10 PM
To: Doran, Harold
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] read.csv issue

Try 'gsub'

    > levels(y$sch_name) <- gsub("&", "\\\\&", levels(y$sch_name))
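On the question of getting a single backslash: print() escapes backslashes when it displays a string, so a value shown as `\\&` holds only one backslash. A small sketch to check this, using an illustrative variable `x` rather than the data frame from the thread:

```r
# Same replacement as in the gsub call quoted above, applied to one string.
x <- gsub("&", "\\\\&", "School Three & Four")

print(x)        # display form: backslashes are shown escaped
writeLines(x)   # stored form: what would be written to a .tex file

# The escaped string is exactly one character longer than the original.
nchar("School Three & Four")
nchar(x)
```

If nchar() grows by one, only one backslash was inserted, and text written out with writeLines() or cat() will carry `\&`, not `\\&`.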
Set allowEscapes = FALSE when reading. See the help page for more details.

There is perhaps an argument for changing the default for allowEscapes under read.csv, especially as people have now changed that for comment.char (in R-devel).

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Well, for this particular program I am using 2.1.1, though I normally use 2.3.0. It's a long story why the old version is used, but I must use it for this particular program.

> -----Original Message-----
> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> Sent: Wednesday, August 16, 2006 3:26 PM
> To: Doran, Harold
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] read.csv issue
>
> On Wed, 16 Aug 2006, Prof Brian Ripley wrote:
>
> > Set allowEscapes = FALSE when reading. See the help page for more details.
> >
> > There is perhaps an argument for changing the default for allowEscapes
> > under read.csv, especially as people have now changed that for
> > comment.char (in R-devel).
>
> Oops, it was already changed in 2.2.0. What version of R is this?