Since many people commenting on the gene name problem in Excel essentially tell us
"This could never have happened with R",
I want to show you a somewhat related issue:

ff1 <- tempfile()
cat(file = ff1, "12345", "1E002", sep = "\n")
xdf1 <- read.fwf(ff1, widths = 5, stringsAsFactors=FALSE)

ff2 <- tempfile()
cat(file = ff2, "12345", "1E002","1A010", sep = "\n")
xdf2 <- read.fwf(ff2, widths = 5, stringsAsFactors=FALSE)

In xdf1, the variable is numeric; in xdf2, it is a character variable.
Of course, in hindsight this makes sense. But the problem is similar to the
Excel problem where something which could be a date is interpreted as a date.

A possible solution to my read.fwf problem would be to have a parameter
forcing variables to be read as strings.
What, like the colClasses argument? Darn that ellipsis and its consequent deferred documentation... but it _is_ mentioned in passing in ?read.fwf.

On September 13, 2016 10:54:44 PM PDT, Erich Neuwirth <erich.neuwirth at univie.ac.at> wrote:
>A possible solution with my read.fwf problem would be to have a
>parameter
>forcing variables to be read as strings.
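For anyone following along, here is a minimal sketch of what the colClasses suggestion might look like in practice. It relies on read.fwf() passing its `...` arguments on to read.table(); the two-row file mirrors the first example in the thread, and the names `guessed` and `forced` are made up for illustration.

```r
# Write the same two fixed-width records used in the original post.
ff <- tempfile()
writeLines(c("12345", "1E002"), ff)

# Default behaviour: the column type is guessed from the data,
# so "1E002" is parsed as a number in scientific notation.
guessed <- read.fwf(ff, widths = 5, stringsAsFactors = FALSE)

# colClasses is handed through `...` to read.table() and
# switches off the guessing for this column.
forced <- read.fwf(ff, widths = 5, colClasses = "character")

class(guessed$V1)  # numeric
class(forced$V1)   # character
forced$V1          # the original text, including "1E002"
```

With more than one column, colClasses can also be given as a vector with one entry per field, so only the identifier-like columns need to be pinned to "character".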
I see it (after a lot of peering at the code). It is a nasty problem, but I suspect one that would get flagged later in an analysis (well, in most cases).

The Excel problem is serious in another way. Many people use Excel or other spreadsheets as a data entry tool, which I think was the cause of the issue in the gene study, and can lose the data completely if there is no paper backup.

In your example, one can run str(), diagnose the problem, and recover (i.e. convert) the data. If I have 30,000 rows of data in a spreadsheet, is there any way I can tell whether some of my character data has been converted to numerical dates, and convert it back?

John Kane
Kingston ON Canada

> -----Original Message-----
> From: erich.neuwirth at univie.ac.at
> Sent: Wed, 14 Sep 2016 07:54:44 +0200
> To: r-help at r-project.org
> Subject: [R] gene name problem in Excel, and an R analogue
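As a rough sketch of the str() diagnosis mentioned above, again using the two-row file from the first example (the names `col_classes`, `converted` and `reread` are made up for illustration). One caveat worth hedging: converting the numeric column back with as.character() gives the parsed value, not the original spelling, so in this sketch the original text is recovered by re-reading the file with colClasses.

```r
# Recreate the file from the first example in the thread.
ff <- tempfile()
writeLines(c("12345", "1E002"), ff)
xdf <- read.fwf(ff, widths = 5, stringsAsFactors = FALSE)

# Diagnose: str() prints the type that was guessed for each column.
str(xdf)

# The same check in a form that can be compared programmatically.
col_classes <- vapply(xdf, function(col) class(col)[1], character(1))
col_classes

# Converting in place only gives back the parsed number as text:
# the field that was "1E002" in the file comes back as "100".
converted <- as.character(xdf$V1)
converted

# Re-reading the source file as character keeps the original text.
reread <- read.fwf(ff, widths = 5, colClasses = "character")
reread$V1
```

This only works because the source file is still around to be read again, which is the point of contrast with data typed directly into a spreadsheet.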