Hello,

I have a problem with the variable type assigned when reading a csv file with read.csv2.

Here is a test file saved as test.csv:

    var1;var2;var3
    TI;1995;4.5
    VD;1990;4.8
    FR;1994;3.9
    VS;1993;5.1
    FR;1995;4.7
    FR;1992;5.8

That I read in R with:

    read.csv2("test.csv")->don;don
    don$var3
    ## [1] 4.5 4.8 3.9 5.1 4.7 5.8
    ## Levels: 3.9 4.5 4.7 4.8 5.1 5.8

    as.double(don$var3)
    ## [1] 2 4 1 5 3 6

Why is it by default a <levels> type? And how can I get the decimal values for var3?

Thanks a lot for your answer.
With my best regards,

Pascale Voirin
Voirin Pascale <Pascale.Voirin at hefr.ch> writes:

> I have a problem with the variable type defined by reading a csv file with read.csv2.
> [...]
> Why is it by default a <levels> type ? And how can I get the decimal value for var3

You very likely have a non-numeric character in your column named var3. Just check the levels after the import, and you should see it.

Cheers,

Rainer

> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Rainer M. Krug
email: Rainer<at>krugs<dot>de
PGP: 0x0F52F982
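The check suggested here can be sketched roughly as follows. The factor `x` and its stray "n/a" entry are made up for illustration and are not from the test file; the idea is only that any level which does not parse as a number shows up as NA.

```r
# Hypothetical column: a factor whose levels should all be numbers,
# except for one stray entry ("n/a") added here for illustration.
x <- factor(c("4.5", "4.8", "n/a", "5.1"))

# The levels are the distinct strings R actually read from the file.
levels(x)

# Levels that cannot be parsed as numbers become NA under as.numeric();
# suppressWarnings() hides the expected coercion warning.
bad <- levels(x)[is.na(suppressWarnings(as.numeric(levels(x))))]
bad
```

Applied to the levels shown in the original post ("3.9", "4.5", ...), this check comes back empty, since each of those strings parses as a number when "." is taken as the decimal mark.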
On 29/09/2016 4:59 AM, Voirin Pascale wrote:

> I have a problem with the variable type defined by reading a csv file with read.csv2.
> [...]
> Why is it by default a <levels> type ? And how can I get the decimal value for var3

It's a "factor". read.csv2() defaults to a decimal separator of "," rather than ".", so the last column doesn't look like numbers; its values are read as character strings and then automatically converted to a factor. Reading the file with

    read.csv2("test.csv", dec = ".")

should give you what you want, or you can convert after the fact with

    as.numeric(as.character(don$var3))

Duncan Murdoch
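For anyone wanting to reproduce this without a file on disk, here is a minimal sketch that feeds the test data through the `text =` argument. One assumption to flag: `stringsAsFactors = TRUE` is passed explicitly to mimic the default at the time of this thread; since R 4.0.0 the default is FALSE, so without it the column would come back as character rather than factor.

```r
# The test file from the original post, supplied inline via `text =`
# so that no file on disk is needed.
csv <- "var1;var2;var3
TI;1995;4.5
VD;1990;4.8
FR;1994;3.9
VS;1993;5.1
FR;1995;4.7
FR;1992;5.8"

# Default dec = ",": var3 is not recognised as numeric.
# stringsAsFactors = TRUE reproduces the pre-4.0.0 default (assumption:
# newer R versions return a character column without it).
don <- read.csv2(text = csv, stringsAsFactors = TRUE)
class(don$var3)        # "factor"
as.double(don$var3)    # the integer level codes, not the values

# Fix 1: declare the decimal separator when reading.
don_fixed <- read.csv2(text = csv, dec = ".")
class(don_fixed$var3)  # "numeric"

# Fix 2: convert after the fact, going through character first
# so that the level labels (not the codes) are converted.
var3 <- as.numeric(as.character(don$var3))
var3
```

Going through as.character() matters in the second fix: calling as.numeric() directly on the factor returns the codes 2 4 1 5 3 6 seen in the original post.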
Hello,

The defaults in read.csv2 are ";" as the field separator and "," as the decimal symbol. It seems that the file you import is not a true csv, since it mixes the two conventions.

You can solve your problem by setting the dec argument to ".":

    read.csv2("test.csv",dec=".")->don

Alain

--
Alain Guillet
Statistician and Computer Scientist

SMCS - IMMAQ - Université catholique de Louvain
http://www.uclouvain.be/smcs

Bureau c.316
Voie du Roman Pays, 20 (bte L1.04.01)
B-1348 Louvain-la-Neuve
Belgium

Tel: +32 10 47 30 50

Accès: http://www.uclouvain.be/323631.html
> On 29 Sep 2016, at 11:40 , Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>
> It's a "factor". read.csv2() defaults to a decimal separator of "," rather than ".", so the last column doesn't look like numbers, and they're being read as character strings, and then automatically converted to a factor. Reading as
>

Yep, that's the whole point of read.csv2: someone stupidly decided (back in the 1990s) that the use of comma as a decimal separator in some languages should extend to file storage formats. That, of course, ruined the standard CSV use of comma as a field separator and prompted the double-standard situation where .csv files can be comma/period or semicolon/comma style, often depending on language settings, which in turn can make data transfer between different languages an ungodly mess.

As Duncan points out, R provides settings that will (mostly) let you handle csv files that are written in hybrid formats like semicolon/period.
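To make the two styles concrete, here is a small sketch that writes the same data in both and reads each back with its matching reader. The data frame `d` and its values are made up for illustration; the files go to temporary paths.

```r
# A small data frame with a decimal column (values are made up).
d <- data.frame(id = c("a", "b"), x = c(1.5, 2.25))

# Comma/period style: comma separates fields, period marks decimals.
f1 <- tempfile(fileext = ".csv")
write.csv(d, f1, row.names = FALSE)
readLines(f1)

# Semicolon/comma style: semicolon separates fields, comma marks decimals.
f2 <- tempfile(fileext = ".csv")
write.csv2(d, f2, row.names = FALSE)
readLines(f2)

# Each reader recovers the numbers from its own style.
r1 <- read.csv(f1)
r2 <- read.csv2(f2)
str(r1)
str(r2)
```

A semicolon/period file such as the one in this thread matches neither pair of defaults, which is why one argument has to be overridden by hand.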
Or, you can just use read.csv with sep=';'

    read.csv("test.csv", sep=';') -> don

Dan

Daniel Nordlund, PhD
Research and Data Analysis Division
Services & Enterprise Support Administration
Washington State Department of Social and Health Services

> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Alain Guillet
> Sent: Thursday, September 29, 2016 2:42 AM
> To: r-help at r-project.org
> Subject: Re: [R] using read.csv2()
>
> You can solve your problem in defining the dec option equals to ".":
>
> read.csv2("test.csv",dec=".")->don
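As far as I can tell, the two suggestions in this thread end up as the same read.table() call, since read.csv and read.csv2 differ only in their sep and dec defaults. A quick sketch to check that, using the first rows of the test file supplied inline via `text =`:

```r
# First rows of the test file, supplied inline instead of read from disk.
csv <- "var1;var2;var3\nTI;1995;4.5\nVD;1990;4.8\nFR;1994;3.9"

# Dan's suggestion: keep dec = "." and override the field separator.
a <- read.csv(text = csv, sep = ";")

# Duncan's and Alain's suggestion: keep sep = ";" and override dec.
b <- read.csv2(text = csv, dec = ".")

# Compare the two data frames and inspect the column types.
identical(a, b)
sapply(a, class)
```

Which one to use is then mostly a matter of which default you would rather spell out.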