Hello,
I have a problem with the variable type defined by reading a csv file with
read.csv2.
Here is a test file saved as < test.csv > :
var1;var2;var3
TI;1995;4.5
VD;1990;4.8
FR;1994;3.9
VS;1993;5.1
FR;1995;4.7
FR;1992;5.8
That I read in R with :
read.csv2("test.csv")->don;don
don$var3
## [1] 4.5 4.8 3.9 5.1 4.7 5.8
## Levels: 3.9 4.5 4.7 4.8 5.1 5.8
as.double(don$var3)
## [1] 2 4 1 5 3 6
Why is it a <levels> type by default? And how can I get the decimal
values for var3?
Thanks a lot for your answer.
With my best regards,
Pascale Voirin
Voirin Pascale <Pascale.Voirin at hefr.ch> writes:
> Why is it by default a <levels> type ? And how can I get the decimal value for var3

You very likely have a character in your column named var3. Just check your Levels after the import, and you should see it.

Cheers,
Rainer

> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Rainer M. Krug
email: Rainer<at>krugs<dot>de
PGP: 0x0F52F982
On 29/09/2016 4:59 AM, Voirin Pascale wrote:
> Why is it by default a <levels> type ? And how can I get the decimal value for var3

It's a "factor". read.csv2() defaults to a decimal separator of "," rather than ".", so the last column doesn't look like numbers, and they're being read as character strings, and then automatically converted to a factor. Reading as

read.csv2("test.csv", dec = ".")

should give you what you want, or you can convert after the fact with

as.numeric(as.character(don$var3))

Duncan Murdoch
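For anyone who wants to reproduce this, here is a minimal sketch of the behaviour and of both fixes. It recreates the test file in a temporary location (the names path, don_bad and fixed are only for illustration). One assumption to note: since R 4.0.0 the default is stringsAsFactors = FALSE, so on a recent R the column would arrive as character rather than factor; the argument is set explicitly below to mimic the behaviour described in this thread.

```r
# Recreate the test file from the original post in a temporary location.
path <- tempfile(fileext = ".csv")
writeLines(c("var1;var2;var3",
             "TI;1995;4.5", "VD;1990;4.8", "FR;1994;3.9",
             "VS;1993;5.1", "FR;1995;4.7", "FR;1992;5.8"), path)

# Default read.csv2(): dec = ",", so "4.5" is not recognised as a number.
# stringsAsFactors = TRUE mimics the pre-4.0 default (assumption, see above).
don_bad <- read.csv2(path, stringsAsFactors = TRUE)
class(don_bad$var3)        # the column is a factor
as.double(don_bad$var3)    # integer codes of the levels, not the values

# Fix 1: tell read.csv2() that the decimal mark is ".".
don <- read.csv2(path, dec = ".")
class(don$var3)            # the column is now numeric

# Fix 2: convert an existing factor through its labels.
fixed <- as.numeric(as.character(don_bad$var3))
fixed
```

Going through as.character() first matters: as.double() applied directly to a factor returns the internal level codes, which is where the 2 4 1 5 3 6 in the original post comes from.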
Hello,
The defaults in read.csv2 are ";" as the separator and "," as the decimal
symbol. It seems that the file you import is not a true csv since it mixes
two conventions.
You can solve your problem by setting the dec option to ".":
read.csv2("test.csv",dec=".")->don
Alain
--
Alain Guillet
Statistician and Computer Scientist
SMCS - IMMAQ - Université catholique de Louvain
http://www.uclouvain.be/smcs
Bureau c.316
Voie du Roman Pays, 20 (bte L1.04.01)
B-1348 Louvain-la-Neuve
Belgium
Tel: +32 10 47 30 50
Accès: http://www.uclouvain.be/323631.html
> On 29 Sep 2016, at 11:40 , Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>
> It's a "factor". read.csv2() defaults to a decimal separator of "," rather than ".", so the last column doesn't look like numbers, and they're being read as character strings, and then automatically converted to a factor. Reading as
>

Yep, that's the whole point of read.csv2 -- someone stupidly decided (back in the 90's) that the use of comma as a decimal separator in some languages should extend to storage file formats. That, of course, ruined the CSV standard use of comma as a field separator and prompted the double-standard situation where .csv files can be comma/period or semicolon/comma style, often depending on language settings, which in turn can make data transfer between different languages an ungodly mess....

As Duncan points out, R provides settings that will (mostly) let you handle csv files that are written in hybrid formats like semicolon/period.
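To make the two styles concrete, here is a small sketch that writes the same data frame in comma/period style, in semicolon/comma style, and in the hybrid semicolon/period style of the file in this thread, then reads each one back. The data frame and the file names are made up for illustration.

```r
# A tiny data frame with one decimal column (illustrative data only).
df <- data.frame(id = c("a", "b"), x = c(4.5, 5.8))

comma_style <- tempfile(fileext = ".csv")  # fields ",", decimals "."
semi_style  <- tempfile(fileext = ".csv")  # fields ";", decimals ","
hybrid      <- tempfile(fileext = ".csv")  # fields ";", decimals "."

write.csv(df,  comma_style, row.names = FALSE)
write.csv2(df, semi_style,  row.names = FALSE)
write.table(df, hybrid, sep = ";", dec = ".", row.names = FALSE)

# Look at the raw text to see the difference between the styles.
readLines(comma_style)
readLines(semi_style)
readLines(hybrid)

# Each reader matches its writer; the hybrid file needs explicit settings.
a <- read.csv(comma_style)
b <- read.csv2(semi_style)
h <- read.table(hybrid, header = TRUE, sep = ";", dec = ".")

# Check that the decimal column is numeric in all three results.
sapply(list(a = a, b = b, h = h), function(d) is.numeric(d$x))
```

The sep and dec arguments are the only things that differ between the three round trips, so a file in any of these styles can be read as long as both are stated explicitly.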
Or, you can just use read.csv with sep=';'
read.csv("test.csv", sep=';') -> don
Dan
Daniel Nordlund, PhD
Research and Data Analysis Division
Services & Enterprise Support Administration
Washington State Department of Social and Health Services
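A short sketch comparing the two suggestions made in this thread, on a shortened copy of the test file (the object names via_csv and via_csv2 are illustrative). As far as I know, read.csv and read.csv2 are both thin wrappers around read.table that differ only in their sep and dec defaults, so the two calls should give the same data frame.

```r
# Shortened copy of the test file from the original post.
path <- tempfile(fileext = ".csv")
writeLines(c("var1;var2;var3",
             "TI;1995;4.5", "VD;1990;4.8", "FR;1994;3.9"), path)

# read.csv defaults to dec = ".", so only the field separator is overridden.
via_csv <- read.csv(path, sep = ";")

# read.csv2 defaults to sep = ";", so only the decimal mark is overridden.
via_csv2 <- read.csv2(path, dec = ".")

# Compare the two results and check the type of the decimal column.
all.equal(via_csv, via_csv2)
is.numeric(via_csv$var3)
```

Which form to prefer is mostly a matter of which default you would rather state explicitly.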
> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Alain Guillet
> Sent: Thursday, September 29, 2016 2:42 AM
> To: r-help at r-project.org
> Subject: Re: [R] using read.csv2()
>
> Hello,
>
> The defaults in read.csv2 are ";" as the separator and "," as the decimal
> symbol. It seems that the file you import is not a true csv since it mixes up
> two norms.
>
> You can solve your problem in defining the dec option equals to ".":
>
> read.csv2("test.csv",dec=".")->don
>
>
> Alain