Dear R helpers,

I am trying to read a CSV file called EUROPE into R (originally an Excel file, which I have saved as a CSV file) using the command

EUROPEDATA <- read.csv("EUROPE.csv")

EUROPE.csv is basically a matrix of dimension 440*44 and has a line of headers, i.e. each column has a name.

Using read.csv I can't load the data into R properly. Although the first 20 columns or so are read in properly, some of the data from the remaining columns are missing. For example, in column 29 the first 120 observations are not read and NA is put in their place, whereas the rest of the column is read in properly! I find this really strange.

I have tried the read.table and scan commands as well, with the header = T option, but the problem is still not solved. Please note that the columns are all formatted in the same way and contain numbers (apart from the header row).

Does anybody have any idea how I can read the data properly into R?
I think that command should work, assuming the file is *comma* delimited rather than semicolon delimited. Semicolons are used in countries where a comma is the decimal point; in that case you should use read.csv2 instead. So, is your data definitely as clean as you think? Have you looked at the data in a text editor? What are the dimensions of the resulting data frame?

-- 
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
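The checks suggested in this reply can be sketched in R roughly as follows. This is only an illustration: the small temporary file and its contents (including the stray "n.a." entry) are made up here to stand in for EUROPE.csv, which is not available in the thread.

```r
# Hypothetical stand-in for EUROPE.csv: three data rows, one stray text entry
path <- tempfile(fileext = ".csv")
writeLines(c("A,B,C", "1,2,3", "4,n.a.,6", "7,8,9"), path)

# Look at the raw text, much as one would in a text editor
print(head(readLines(path), 3))

# Number of comma-separated fields on each line; a clean file gives one value.
# comment.char = "" matches read.csv, which does not treat "#" as a comment.
n_fields <- count.fields(path, sep = ",", comment.char = "")
print(table(n_fields))

# Read the file, then check its dimensions and how each column was typed
dat <- read.csv(path)
print(dim(dat))
print(sapply(dat, class))

# For a semicolon-delimited file with decimal commas, one would call
# read.csv2(path) instead of read.csv(path).
```

A column that is expected to hold numbers but is reported as character or factor is usually a sign that at least one cell contains text that could not be parsed as a number.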
Check your file for Unicode characters; they will get in the way. I'm new to R myself but have used both read.delim and read.csv, and I've only had problems when the files contained Unicode. I don't know Excel, but in StarOffice you need to set the character set explicitly when saving if the file contained Unicode on input.

Cheers,
Geoff Russell
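One possible way to test this suggestion is to scan the file for bytes outside the ASCII range. The sketch below is an assumption about how one might do that, not something from the thread; the temporary file is made up, and its last line deliberately contains a UTF-8 encoded "é".

```r
# Hypothetical file: the third line ends in "café", written as raw UTF-8 bytes
path <- tempfile(fileext = ".csv")
writeBin(c(charToRaw("A,B\n1,2\n3,caf"), as.raw(c(0xc3, 0xa9)), charToRaw("\n")),
         path)

lines <- readLines(path, warn = FALSE)

# A line is flagged when any of its bytes has a value above 127
non_ascii <- vapply(
  lines,
  function(x) any(as.integer(charToRaw(x)) > 127L),
  logical(1),
  USE.NAMES = FALSE
)

# Line numbers (counting the header as line 1) that contain non-ASCII bytes
print(which(non_ascii))
```

If such lines turn up, the fileEncoding argument of read.csv (for example fileEncoding = "UTF-8") may be worth trying, depending on how the file was saved.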
Thanks David,

It has actually worked; the problem was the formatting of the N/A values in Excel. R apparently doesn't like the #N/A that Excel produces when a formula cannot return a value. Saving the file as a CSV (comma delimited) file, removing all the #N/A entries so that those cells are left empty, and then loading the file into R works fine.

I hope this is helpful for other users as well.

Thanks again
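As an alternative to deleting the #N/A cells by hand, read.csv can be told to treat them as missing values through its na.strings argument. The following is a minimal sketch under the assumption that the exported file contains the literal text #N/A; the temporary file is a made-up stand-in for EUROPE.csv.

```r
# Hypothetical stand-in for the exported file, with one Excel-style #N/A cell
path <- tempfile(fileext = ".csv")
writeLines(c("A,B,C", "1,2,3", "4,#N/A,6", "7,8,9"), path)

# Treat "NA", "#N/A" and empty cells as missing values while reading
dat <- read.csv(path, na.strings = c("NA", "#N/A", ""))

# Column B should now be numeric, with NA where #N/A appeared
str(dat)

# Number of missing values per column
print(colSums(is.na(dat)))
```

This leaves the spreadsheet untouched. Note that read.table, unlike read.csv, treats "#" as a comment character by default, so with read.table one would also pass comment.char = "".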