Don't be so US-centric, Abby... how do you know that javad's version of Excel doesn't default to using semicolons?

?read.csv2

On July 1, 2019 6:06:32 PM PDT, Abby Spurdle <spurdle.a at gmail.com> wrote:
>> I am trying to read an excel CSV file (1.csv). When I read it as csv file
>> in R, the R shows me the exact number of row. But it puts all columns in
>> one column, while I have 3 or 4 columns in the data frame.
>> d4 = read.table("./4.csv",sep=";",header=TRUE)
>
>Firstly, I recommend against naming your file "1.csv".
>(Start with a letter not a number).
>
>Secondly, a CSV file should be separated by commas not semicolons.
>You can specify sep=",", however, it's probably easier to use the
>read.csv() function.
>
>Note that you should be able to open your file in a text editor to see the
>separators.
>
>> I dont know why in the "save as type" box Unicode text (*.txt)
>
>Other posters have suggested that you need to specify the encoding.
>Assuming that you create your CSV file correctly in Excel, I doubt that
>this is necessary, but I could be wrong...
>
>Your comment suggests that you have saved your document as "Unicode text".
>You need to tell Excel to save the file as a CSV file.
>(There should be a list of save options).
>
>Simply typing a file name with a .csv file extension is unlikely to produce
>the desired result.
>
>______________________________________________
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

--
Sent from my phone. Please excuse my brevity.
> Don't be so US-centric, Abby... how do you know that javad's version of
> Excel doesn't default to using semicolons?

I don't.

However, Comma-Separated Values (CSV) are comma separated, by definition.
So, if the files use semicolons, then...

Also, the use of the wrong sep="my.delim" argument is the most likely cause of single column output.

However, you're right, I don't really know, I'm just guessing...
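To illustrate the separator point, here is a minimal sketch, assuming a small semicolon-separated file written to a temporary path (the file contents and the variable names are made up for illustration, they are not the poster's data). Reading it with the default comma separator should give a single column, while sep=";" or read.csv2() should give two.

```r
# Hypothetical two-column file that uses semicolons as separators
tf <- tempfile(fileext = ".csv")
writeLines(c("a;b", "1;2", "3;4"), tf)

wrong  <- read.csv(tf)             # default sep = ",": whole line lands in one column
right1 <- read.csv(tf, sep = ";")  # explicit separator
right2 <- read.csv2(tf)            # defaults to sep = ";" and dec = ","

ncol(wrong)
ncol(right1)
ncol(right2)
```

Note that read.csv2() also assumes a decimal comma (dec = ","), so for a file with semicolon separators but decimal points, read.csv(..., sep = ";") is probably the safer choice.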
Dear all;
I used your suggestion but got the same warning messages. I changed the file name (Data.csv).
"
d4<-read.csv("./Data.csv",sep=";",header=TRUE,encoding="UTF-16")
Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote, :
  line 1 appears to contain embedded nulls
2: In read.table(file = file, header = header, sep = sep, quote = quote, :
  line 2 appears to contain embedded nulls
3: In read.table(file = file, header = header, sep = sep, quote = quote, :
  line 3 appears to contain embedded nulls
4: In read.table(file = file, header = header, sep = sep, quote = quote, :
  line 4 appears to contain embedded nulls
5: In read.table(file = file, header = header, sep = sep, quote = quote, :
  line 5 appears to contain embedded nulls
6: In read.table(file = file, header = header, sep = sep, quote = quote, :
  line 1 appears to contain embedded nulls
7: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
  embedded nul(s) found in input
"

I opened the Data in notepad. This is the head of Data.csv. The columns are separated by semicolons.
"
"INLET Time";"INLET ValueY";"TRATED WATER TANK Time";"TRATED WATER TANK ValueY"
10/28/2018;550.057861328125;10/28/2018;487.812530517578
10/28/2018 12:00:01 ?.?;550.057861328125;10/28/2018 12:00:01 ?.?;487.812530517578
10/28/2018 12:00:02 ?.?;550.057861328125;10/28/2018 12:00:02 ?.?;487.812530517578
10/28/2018 12:00:03 ?.?;550.057861328125;10/28/2018 12:00:03 ?.?;487.812530517578
10/28/2018 12:00:04 ?.?;550.057861328125;10/28/2018 12:00:04 ?.?;487.812530517578
.
.
.
"
Thanks.

On Tue, Jul 2, 2019 at 6:14 AM Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
> Don't be so US-centric, Abby... how do you know that javad's version of
> Excel doesn't default to using semicolons?
>
> ?read.csv2

--
Best Regards
Javad Bayat
M.Sc. Environment Engineering
Alternative Mail: bayat194 at yahoo.com
If I recall correctly, Excel's 'Unicode' used to mean "UTF-16", which R's scan() did not recognize without a hint. The relevant argument is fileEncoding, not encoding. UTF-16 files generally contain many null bytes, whereas UTF-8 text files contain none, so if you try to read UTF-16 as UTF-8 you get the embedded-null warning.

I don't have Excel installed, but the following example is from R-3.5.2 on a Linux box.

  > f8 <- file(tf8 <- tempfile(), open="w", encoding="UTF-8")
  > cat("\u0416;zh\n", file=f8); close(f8)
  > readBin(tf8, what="raw", n=file.size(tf8))
  [1] d0 96 3b 7a 68 0a
  >
  > f16 <- file(tf16 <- tempfile(), open="w", encoding="UTF-16")
  > cat("\u0416;zh\n", file=f16); close(f16)
  > readBin(tf16, what="raw", n=file.size(tf16))
   [1] ff fe 16 04 3b 00 7a 00 68 00 0a 00
  >
  > read.csv(tf8, sep=";", header=FALSE)
    V1 V2
  1  Ж zh
  > read.csv(tf16, sep=";", header=FALSE)
  Error in type.convert.default(data[[i]], as.is = as.is[i], dec = dec, :
    invalid multibyte string at '<ff><fe>'
  In addition: Warning messages:
  1: In read.table(file = file, header = header, sep = sep, quote = quote, :
    line 1 appears to contain embedded nulls
  2: In read.table(file = file, header = header, sep = sep, quote = quote, :
    line 2 appears to contain embedded nulls
  3: In read.table(file = file, header = header, sep = sep, quote = quote, :
    incomplete final line found by readTableHeader on '/tmp/RtmpzfG6eG/file40e53389f40e'
  > read.csv(tf16, sep=";", header=FALSE, fileEncoding="UTF-16")
    V1 V2
  1  Ж zh

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Jul 1, 2019 at 8:12 PM Abby Spurdle <spurdle.a at gmail.com> wrote:
> > Don't be so US-centric, Abby... how do you know that javad's version of
> > Excel doesn't default to using semicolons?
>
> I don't.
>
> However, Comma-Separated Values (CSV) are, comma separated, by definition.
> So, if the files use semicolons, then...
>
> Also, the use of the wrong sep="my.delim" argument is the most likely cause
> of single column output.
>
> However, you're right, I don't really know, I'm just guessing...
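The byte dump in the example above suggests a quick check that can be run before choosing a fileEncoding. What follows is only a sketch: guess_file_encoding() is a made-up helper name, not part of R, and it only looks for a byte order mark (BOM) at the start of the file, so a UTF-16 file saved without a BOM would not be detected. The test file is written byte by byte so that its content does not depend on the platform.

```r
# Hypothetical helper: suggest a fileEncoding value from the leading bytes (BOM)
guess_file_encoding <- function(path) {
  b <- readBin(path, what = "raw", n = 3)
  if (length(b) >= 2 &&
      (identical(b[1:2], as.raw(c(0xff, 0xfe))) ||    # UTF-16 little-endian BOM
       identical(b[1:2], as.raw(c(0xfe, 0xff))))) {   # UTF-16 big-endian BOM
    return("UTF-16")
  }
  if (length(b) == 3 && identical(b, as.raw(c(0xef, 0xbb, 0xbf)))) {
    return("UTF-8-BOM")                               # UTF-8 with a BOM
  }
  NA_character_                                       # no BOM found
}

# "a;b\n" as UTF-16LE preceded by the ff fe BOM
tf16 <- tempfile()
writeBin(as.raw(c(0xff, 0xfe, 0x61, 0x00, 0x3b, 0x00, 0x62, 0x00, 0x0a, 0x00)), tf16)

# The same text as plain ASCII, without any BOM
tf_plain <- tempfile()
writeLines("a;b", tf_plain)

enc16     <- guess_file_encoding(tf16)
enc_plain <- guess_file_encoding(tf_plain)
enc16
enc_plain
```

If the helper returns a value, it could be passed on as read.csv(path, sep = ";", fileEncoding = enc); if it returns NA, the file carries no BOM and the encoding has to be established some other way.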
Hi Javad,

I could not make sense of the data structure of the csv file as it was copied into your previous message. Would you mind sending a link so that one can download your csv file directly (or at least its first few lines), so that people can check the exact properties of your file?

Yours.
Olivier.

On Tue, 2 Jul 2019 07:56:07 +0430 javad bayat <j.bayat194 at gmail.com> wrote:
> Dear all;
> I use your suggestion but I gave the same warning messages. I changed
> the file name (Data.csv).

--
Olivier Crouzet, PhD
/Maître de Conférences/
@LLING - Laboratoire de Linguistique de Nantes
UMR6310 CNRS / Université de Nantes
/Guest Researcher/
@UMCG (University Medical Center Groningen)
ENT department
Rijksuniversiteit Groningen
Try changing encoding="UTF-16" to fileEncoding="UTF-16".

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Jul 1, 2019 at 9:30 PM javad bayat <j.bayat194 at gmail.com> wrote:
> Dear all;
> I use your suggestion but I gave the same warning messages. I changed the
> file name (Data.csv).
>
> d4<-read.csv("./Data.csv",sep=";",header=TRUE,encoding="UTF-16")
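A minimal sketch of the suggested change, assuming a small UTF-16 file that imitates the first rows posted earlier in the thread (the temporary file and the shortened two-column layout are made up for illustration). According to ?read.table, encoding only marks strings as being in a given encoding and does not re-encode the input, whereas fileEncoding converts the file while it is read, which is why the argument name matters here.

```r
# Write a small semicolon-separated file in UTF-16 (illustrative stand-in for Data.csv)
tf <- tempfile(fileext = ".csv")
con <- file(tf, open = "w", encoding = "UTF-16")
cat('"INLET Time";"INLET ValueY"\n',
    '10/28/2018;550.057861328125\n',
    '10/28/2018;487.812530517578\n',
    sep = "", file = con)
close(con)

# fileEncoding (not encoding) tells R to convert the file from UTF-16 on input
d4 <- read.csv(tf, sep = ";", header = TRUE, fileEncoding = "UTF-16")
str(d4)
```

With the default check.names = TRUE, the column names should come out as INLET.Time and INLET.ValueY, with the second column read as numeric.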