msa@biostat.mgh.harvard.edu
1999-May-27 16:40 UTC
Factor structures not preserved after dump/dput (PR#200)
Full_Name: Marek Ancukiewicz
Version: 0.64.0
OS: Linux (RedHat 6.0)
Submission from: (NULL) (132.183.12.87)

I've noticed that factor structures get recoded when the data is
dumped using either dump or dput and then restored with source
or dget. This occurs when the values taken by factors do not
include 1. For example:

  a <- factor(1:5,1:5,c('a','b','c','d','e'))
  b <- a[3:5]
  dput(b,'b.data')
  new.b <- dget('b.data')

Then b is not the same as new.b:

  > b
  [1] c d e
  Levels: a b c d e
  > new.b
  [1] a b c
  Levels: a b c d e

This seems to be a very serious bug. It can lead one to
mislabel treatments: a very embarrassing (and potentially
disastrous) mistake. The reason for this bug seems to
lie in the way in which structure() treats factor structures.

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
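
The example in the report can be turned into a self-checking round trip. What follows is only a sketch, assuming a current R release rather than 0.64.0, where the recoding is not expected to occur; tempfile() stands in for the 'b.data' file name used in the report, and the names same_labels and same_codes are illustrative.

```r
# Round-trip check based on the example in the report.
# Assumption: run on a current R release, not 0.64.0.
a <- factor(1:5, levels = 1:5, labels = c("a", "b", "c", "d", "e"))
b <- a[3:5]

path <- tempfile(fileext = ".data")  # stands in for 'b.data'
dput(b, file = path)
new.b <- dget(path)
unlink(path)

# Compare labels and internal codes separately, so that a recoding
# like the one reported (c d e turning into a b c) would be caught.
same_labels <- identical(as.character(b), as.character(new.b))
same_codes  <- identical(as.integer(b), as.integer(new.b))
print(c(labels = same_labels, codes = same_codes))
```

Checking the integer codes as well as the labels matters here: in the reported failure the levels attribute survived intact and only the codes shifted.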
Peter Dalgaard BSA
1999-May-27 17:06 UTC
Factor structures not preserved after dump/dput (PR#200)
msa@biostat.mgh.harvard.edu writes:

> a <- factor(1:5,1:5,c('a','b','c','d','e'))
> b <- a[3:5]
> dput(b,'b.data')
> new.b <- dget('b.data')
>
> Then b is not the same as new.b:
>
> > b
> [1] c d e
> Levels: a b c d e
> > new.b
> [1] a b c
> Levels: a b c d e
>
> This seems to be a very serious bug. It can lead one to
> mislabel treatments: a very embarrassing (and potentially
> disastrous) mistake. The reason for this bug seems to
> lie in the way in which structure() treats factor structures.

Ouch! The fix seems to be to have structure() contain

    if (is.numeric(.Data) && any(names(attrib) == "levels"))
        .Data <- factor(.Data, levels = 1:max(1,.Data))

rather than just plain factor(.Data). Will commit this in a moment.

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907
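
The difference between the two factor() calls named in the reply can be seen on the integer codes alone. This is a minimal sketch, not the actual structure() source: the variable names codes, shifted and kept are illustrative, and the remark about zero-length input is an assumption about why max(1, .Data) is used instead of max(.Data).

```r
codes <- 3:5   # integer codes of the subsetted factor from the report

# Plain factor() keeps only the observed values as levels, so the
# codes are renumbered from 1.
shifted <- as.integer(factor(codes))

# Supplying the full range 1..max as levels leaves the codes unchanged.
kept <- as.integer(factor(codes, levels = 1:max(1, codes)))

print(shifted)
print(kept)

# Assumed purpose of the max(1, ...) guard: zero-length input still
# yields a valid level range (1:1) rather than a warning from max().
empty <- factor(integer(0), levels = 1:max(1, integer(0)))
print(length(empty))
```

With the renumbered codes 1 2 3, attaching the levels a to e produces a b c, which is the mislabelling shown in the report; with the codes kept at 3 4 5 the labels remain c d e.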