Hi,

I have a dataset saved in *.csv format that contains 13 columns (the first column holds the title names and the rest hold experiments) and about 2500 rows. Not every row has data in all columns, e.g.:

BS00,-0.084,0.0136,-0.1569,-0.6484,1.103,1.7859,0.40287,0.5368,0.08461,-0.1935,-0.147974,0.30685
BS01,0.491270283,0.875826172,,,,,,,,,,
BS02,0.090794476,0.225858954,,,0.32643,0.34317,0.133145295,,,0.115832599,0.47636458,
BS03,0.019828221,-0.095735935,-0.122767219,-0.0676,0.002533,-0.1510361,0.736247,2.053192,-0.423658,0.4591219,1.1245015,
BS04,-0.435189342,-0.041595955,-0.781281128,-1.923036,-3.230167102,,,,0.152322609,-1.495513519,,

I am using R to compute correlations, but I get an error when I try to read the data:

  > person.data<-read.table("datafile.csv",header=TRUE,sep=',',row.names=1)
  Error in scan (file = file, what = what, sep = sep, quote = quote, dec = dec, :
          line 1919 did not have 13 elements
  Execution halted

The error looks as though the last element is not being read when it is blank. I could put a placeholder such as "na" in the blank elements, but I do not want to do that because it would hinder my later analysis.

Can someone suggest a way to overcome this problem when reading the data, or is there something I have missed that would make the data readable?

Thank you in advance,

PS: The data was imported from an experiment, saved in an Excel sheet as a *.csv file, and then used.
Use

    x <- count.fields('datafile.csv', sep=',')

on your file. This will tell you the number of columns that R thinks is on each line. Then do

    table(x)

to see if all the lines have the same number of columns. If not, find the lines that don't:

    which(x != legal.length)

(where legal.length is the number of columns you expect, 13 here) and then look at your input data for those lines.

On 7/21/06, Ahamarshan jn <ashgene@yahoo.co.in> wrote:
> person.data<-read.table("datafile.csv",header=TRUE,sep=',',row.names=1)
>
> Error in scan (file = file, what = what, sep = sep,
> quote = quote, dec = dec, :
> line 1919 did not have 13 elements
> Execution halted
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
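The check described above can be sketched on a small made-up file. Everything here is an assumption for illustration: the temporary file stands in for datafile.csv, it has 4 columns rather than 13, and `expected` plays the role of the legal length (taken from the header line).

```r
# Write a tiny stand-in for datafile.csv: a header plus three rows.
# The last row has lost its trailing comma, so it has only 3 fields.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,e1,e2,e3",
             "BS00,-0.084,0.0136,-0.1569",
             "BS01,0.4913,0.8758,",
             "BS02,0.0908,0.2259"), tmp)

x <- count.fields(tmp, sep = ",")   # number of fields R sees on each line
print(table(x))                     # how many lines have each field count

expected <- x[1]                    # assumption: the header line is correct
bad <- which(x != expected)         # line numbers with a different count
print(bad)
```

An empty field followed by a comma still counts as a field, so only lines that are genuinely short of separators show up in `bad`.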
See if this works:

    read.csv("datafile.csv", row.names = 1, fill = TRUE)

On 7/21/06, Ahamarshan jn <ashgene at yahoo.co.in> wrote:
> person.data<-read.table("datafile.csv",header=TRUE,sep=',',row.names=1)
>
> Error in scan (file = file, what = what, sep = sep,
> quote = quote, dec = dec, :
> line 1919 did not have 13 elements
> Execution halted
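A minimal sketch of what this suggestion does, again on a made-up 4-column file rather than the real datafile.csv (the file contents and the name `strict` are assumptions for illustration). Note that read.csv() already sets header = TRUE and fill = TRUE by default, so spelling out fill only makes the intent explicit.

```r
# Stand-in for datafile.csv; the last row is one field short.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,e1,e2,e3",
             "BS00,-0.084,0.0136,-0.1569",
             "BS01,0.4913,0.8758,",
             "BS02,0.0908,0.2259"), tmp)

# read.table() with sep = ',' does not fill by default, so it stops
# at the short line with a "did not have 4 elements" error.
strict <- try(read.table(tmp, header = TRUE, sep = ",", row.names = 1),
              silent = TRUE)
print(inherits(strict, "try-error"))

# With fill = TRUE the short row is padded. Blank and padded cells in
# numeric columns come back as NA without editing the file.
person.data <- read.csv(tmp, row.names = 1, fill = TRUE)
print(person.data)
```

Because blanks are read as NA, there is no need to type placeholders into the file. For the correlation step, cor() accepts a `use` argument (for example use = "pairwise.complete.obs") that controls how those NA values are treated.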
Hi!

On Fri, Jul 21, 2006 at 05:43:03AM -0700, Ahamarshan jn wrote:
> I have a dataset saved in *.csv format, that contains
[...]
> BS00,-0.084,0.0136,-0.1569,-0.6484,1.103,1.7859,0.40287,0.5368,0.08461,-0.1935,-0.147974,0.30685
> BS01,0.491270283,0.875826172,,,,,,,,,,
> BS02,0.090794476,0.225858954,,,0.32643,0.34317,0.133145295,,,0.115832599,0.47636458,
> BS03,0.019828221,-0.095735935,-0.122767219,-0.0676,0.002533,-0.1510361,0.736247,2.053192,-0.423658,0.4591219,1.1245015,
> BS04,-0.435189342,-0.041595955,-0.781281128,-1.923036,-3.230167102,,,,0.152322609,-1.495513519,,

> person.data<-read.table("datafile.csv",header=TRUE,sep=',',row.names=1)
>
> Error in scan (file = file, what = what, sep = sep,
> quote = quote, dec = dec, :
> line 1919 did not have 13 elements
> Execution halted "

R does handle empty elements fine. The error message you quote occurs if a row does not contain the expected number of elements (empty or not). Did you have a look at row 1919? Does it really contain the same number of separators (commas) as the other ones?

Some programs handle empty elements at the end of a row in a 'lazy' way and simply omit them. If this is the case, you can use the option 'fill=TRUE' to tell read.table that you want it to silently pad short rows with empty elements.

Another 'popular' reason for funny errors with read.table is the unexpected occurrence of quotation or comment characters in the data...

cu
	Philipp

--
Dr. Philipp Pagel                            Tel.  +49-8161-71 2131
Dept. of Genome Oriented Bioinformatics      Fax.  +49-8161-71 2186
Technical University of Munich
Science Center Weihenstephan
85350 Freising, Germany

 and

Institute for Bioinformatics / MIPS          Tel.  +49-89-3187 3675
GSF - National Research Center               Fax.  +49-89-3187 3585
      for Environment and Health
Ingolstädter Landstrasse 1
85764 Neuherberg, Germany
http://mips.gsf.de/staff/pagel
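To illustrate the last point, here is a hedged sketch of how a stray comment character changes the field count. The file is invented for the example (a '#' inside one row name), and the variable names are hypothetical. Both count.fields() and read.table() treat '#' as a comment character by default and accept quote and comment.char arguments to turn that handling off.

```r
# Stand-in file: the second data row has a '#' in its name.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,e1,e2,e3",
             "BS00,1.1,2.2,3.3",
             "BS#01,4.4,5.5,6.6"), tmp)

# Default comment.char = "#": everything after '#' on that line is
# dropped, so the line appears to have too few fields.
default.counts <- count.fields(tmp, sep = ",")
print(default.counts)

# With quote and comment handling switched off, the counts agree.
clean.counts <- count.fields(tmp, sep = ",", quote = "", comment.char = "")
print(clean.counts)

# The same two arguments let read.table() read the file as plain data.
person.data <- read.table(tmp, header = TRUE, sep = ",", row.names = 1,
                          quote = "", comment.char = "")
print(rownames(person.data))
```

Switching quoting off is only appropriate when no field relies on quotes to protect an embedded comma, so it is worth inspecting the offending lines first.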
On Fri, 2006-07-21 at 05:43 -0700, Ahamarshan jn wrote:
> person.data<-read.table("datafile.csv",header=TRUE,sep=',',row.names=1)
>
> Error in scan (file = file, what = what, sep = sep,
> quote = quote, dec = dec, :
> line 1919 did not have 13 elements
> Execution halted "
>
> PS: The data was imported from a experiment and saved
> in excel sheet as a *.csv and then used.

You have already had other replies, to which I would add: be sure to read Chapter 8 of the R Data Import/Export manual, which covers importing Excel files and options other than exporting to a CSV file.

In addition, Excel generating CSV files with the last column missing on some rows is a known issue, reported in the MSKB here:

  http://support.microsoft.com/default.aspx?scid=kb;EN-US;q77295

Even though the latest version of Excel that the article lists as affected is 97, I had this problem with 2000 and 2003 as well.

I would instead use OpenOffice.org's Calc to do the export when this was required. Calc did not have this problem.

HTH,

Marc Schwartz