Rameswara Sashi Kiran Challa
2012-May-04 06:16 UTC
[R] read.table() vs read.delim() any difference??
Hi,

I have a tab-separated file with 206 rows and 30 columns.

I read the file into R using the read.table() function. When I checked the dim() of the resulting data frame, it had only 103 rows (exactly half) and 30 columns. Then I tried reading the file using the read.delim() function, and this time dim() showed 206 rows and 30 columns, as expected.

Reading the R help page for read.table(), I came across the count.fields() function. Using it on the tab-separated file, I learned that the header line alone has 30 fields and the rest of the rows have 9 fields. I am now wondering why read.delim() was able to read the file correctly while read.table() was not able to read it completely.

Could anyone please throw some light on this?

Thanks for your valuable time,

Regards
Sashi
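The file itself was not posted, so the following is only an illustrative sketch of how count.fields() depends on its sep argument. The sample file and its contents are hypothetical. By default count.fields() splits on white space, so spaces inside a field and empty fields can make the per-line counts differ from what a tab-based reading gives.

```r
# Hypothetical tab-separated file: one field contains a space,
# another field is empty (two tabs in a row).
path <- tempfile(fileext = ".txt")
writeLines(c(
  "id\tname\tcity",
  "1\tAnn Lee\tOslo",
  "2\t\tNew York"
), path)

# Default: fields are split on any white space.
ws_counts <- count.fields(path)

# Split on tabs only, which is what read.delim() does.
tab_counts <- count.fields(path, sep = "\t")

print(ws_counts)
print(tab_counts)
```

If the tab-based counts are the same on every line while the white-space counts vary, the separator is a likely suspect.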
read.delim calls read.table, so any differences between the two are caused by differences in the default values of some of the parameters. Take a look at the help file ?read.table:

read.table uses white space as separator; read.delim uses tabs
read.table uses " and ' as quotes; read.delim just "
etc.

Jan
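As a small sketch, the differing defaults mentioned above can be listed side by side with formals() instead of reading them off the help page. The selection of arguments to compare is my own choice, taken from the ones discussed in this thread.

```r
# Arguments whose defaults differ between the two functions.
args <- c("header", "sep", "quote", "fill", "comment.char")

# deparse() turns each default into the text you would type in a call.
defaults <- data.frame(
  read.table = sapply(formals(read.table)[args], deparse),
  read.delim = sapply(formals(read.delim)[args], deparse)
)

print(defaults)
```

Note that the fill default of read.table is shown as an expression (it depends on blank.lines.skip) rather than as a plain TRUE or FALSE.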
On May 4, 2012, at 08:16, Rameswara Sashi Kiran Challa wrote:

> Could anyone please throw some light on this?

This can't be answered in the abstract. However, all that read.delim does is to call read.table with a specific set of arguments, so you should be able to get the right result from

    read.table(......., header = TRUE, sep = "\t", quote = "\"", dec = ".", fill = TRUE, comment.char = "")

So check that it works. If you are curious as to what is causing the difference, just knock out the arguments one by one.

-- 
Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
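The "knock out the arguments one by one" suggestion could be automated roughly as follows. This is a sketch under assumptions: the sample file is hypothetical (the original file was not posted), and try_read() is a helper name invented here. Each run reports either the dimensions of the result or the error message, so the argument that matters for a given file should stand out.

```r
# Hypothetical sample file standing in for the poster's data;
# some fields contain an apostrophe.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "id\tname\tnote",
  "1\tO'Brien\tfirst",
  "2\tSmith\tsecond",
  "3\tD'Arcy\tthird",
  "4\tJones\tfourth"
), path)

# The argument set quoted in the reply above.
delim_args <- list(header = TRUE, sep = "\t", quote = "\"",
                   dec = ".", fill = TRUE, comment.char = "")

# Invented helper: read the file with the given arguments and
# return "rows x cols", or the error message if the read fails.
try_read <- function(args) {
  tryCatch(
    paste(dim(do.call(read.table, c(list(file = path), args))),
          collapse = " x "),
    error = function(e) paste("error:", conditionMessage(e))
  )
}

# Full argument set first, then each argument dropped in turn.
results <- c(
  all = try_read(delim_args),
  sapply(names(delim_args), function(a) {
    try_read(delim_args[names(delim_args) != a])
  })
)
names(results)[-1] <- paste("without", names(delim_args))

print(results)
```

To apply this to a real file, point path at that file instead of the temporary one. Warnings from read.table (if any) are still shown and can be a useful hint in their own right.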