Good Afternoon

I have noticed results similar to the following several times as I have used R over the past several years. My .csv file has a header row and 3073 rows of data.

> rskreg<-read.table('D:/data/riskregions.csv',header=T,sep=",")
> dim(rskreg)
[1] 2722 13
> rskreg<-read.csv('D:/data/riskregions.csv',header=T)
> dim(rskreg)
[1] 3073 13

Does someone know what could be causing the read.table and read.csv functions to give different results on some occasions? The riskregions.csv file was generated with and saved from MS.Excel.

Joe A

Barry Rowlingson
2009-Mar-13 20:47 UTC
[R] different outcomes using read.table vs read.csv
2009/3/13 jatwood <jatwood at montana.edu>:
> Good Afternoon
> I have noticed results similar to the following several times as I have used
> R over the past several years.
> My .csv file has a header row and 3073 rows of data.
>
>> rskreg<-read.table('D:/data/riskregions.csv',header=T,sep=",")
>> dim(rskreg)
> [1] 2722 13
>> rskreg<-read.csv('D:/data/riskregions.csv',header=T)
>> dim(rskreg)
> [1] 3073 13
>
> Does someone know what could be causing the read.table and read.csv
> functions to give different results on some occasions? The riskregions.csv
> file was generated with and saved from MS.Excel.

read.table has 'comment.char="#"', so if a line starts with # it gets ignored. read.csv doesn't have this set, so it might explain why read.csv gets more than read.table...

Do you have lines starting with #? Try read.table with comment.char="" and see if you get the right number. See the help for read.table for more info.

I'd not seen this before, hope it hasn't bitten me...

Barry

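To make the suggestion above concrete, here is a minimal sketch using a small made-up file (the file contents and variable names are hypothetical, not the poster's data). It also covers a second default that differs between the two functions and may be worth checking: read.table uses quote="\"'" while read.csv uses quote="\"", so apostrophes in the data (as in O'Neil) can open a quoted field that swallows the following lines.

```r
# Hypothetical demo data, written to a temporary file for illustration only.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,name,value",
  "1,alpha,10",
  "2,beta,20",
  "3,gamma,30",
  "4,delta,40",
  "5,epsilon,50",
  "#6,zeta,60",     # starts with '#': a comment under read.table defaults
  "7,O'Neil,70",    # apostrophe opens a quoted field under read.table defaults
  "8,eta,80",
  "9,D'Arcy,90",    # the next apostrophe closes it, so lines 7-9 merge
  "10,theta,100"
), csv_path)

# read.table defaults: comment.char = "#", quote = "\"'"
via_table <- read.table(csv_path, header = TRUE, sep = ",")

# read.csv defaults: comment.char = "", quote = "\""
via_csv <- read.csv(csv_path, header = TRUE)

# read.table with both defaults overridden to match read.csv
via_table_fixed <- read.table(csv_path, header = TRUE, sep = ",",
                              quote = "\"", comment.char = "")

cat("read.table (defaults):", nrow(via_table), "rows\n")
cat("read.csv:             ", nrow(via_csv), "rows\n")
cat("read.table (fixed):   ", nrow(via_table_fixed), "rows\n")
```

If the row counts agree once quote and comment.char are overridden, one of those two defaults is the likely cause. Whether that is what happened with riskregions.csv cannot be told from the thread.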
Without the data it is a bit difficult to say. However, you may want to check out the following:

library(prob)

That is from: http://finzi.psych.upenn.edu/R/R-devel/archive/26683.html

It allows you to diff the data.frames, so you can find out which rows are missing. Maybe some NA rows were automatically removed.

--- On Fri, 3/13/09, jatwood <jatwood at montana.edu> wrote:
> From: jatwood <jatwood at montana.edu>
> Subject: [R] different outcomes using read.table vs read.csv
> To: r-help at r-project.org
> Date: Friday, March 13, 2009, 3:32 PM
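
The idea of diffing the two data frames can also be sketched in base R, without an extra package. This assumes the data has a column that identifies each row (here a hypothetical "id" column) and uses made-up inline data rather than the poster's file.

```r
# Hypothetical inline data; the second data line starts with '#'.
csv_text <- c(
  "id,name",
  "1,alpha",
  "#2,beta",
  "3,gamma"
)

# Read the same text both ways.
partial <- read.table(text = csv_text, header = TRUE, sep = ",")
full <- read.csv(text = csv_text, header = TRUE)

# Compare the key column as character, since the two reads may infer
# different column types.
missing_ids <- setdiff(as.character(full$id), as.character(partial$id))

# Rows that read.csv kept but read.table dropped.
print(full[as.character(full$id) %in% missing_ids, ])
```

Looking at the rows that differ usually shows what they have in common, such as a leading # or a stray quote character.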