Liaw, Andy
2005-Feb-11 18:24 UTC
[Rd] Notes on bug reports 3229 and 3242 - as.matrix.data.frame
> From: Gorjanc Gregor
>
> ! Look for the ! character !
>
> From: Prof Brian Ripley [mailto:ripley@stats.ox.ac.uk]
> You too have not given a reproducible example!
>
> ! Yes, I was not able to do it from my data. But below is one. It is
> ! a stupid one, but it works. The problem is the use of as.data.frame in
> ! tmp1$L <- as.data.frame(tmp$L). This seems to produce a corrupted
> ! data.frame. If I use just tmp1$L <- tmp$L, write.table and
> ! as.matrix.data.frame work OK. I still think that my proposal can
> ! be beneficial, since it also works on corrupted data frames.
>
> data(warpbreaks)
> tmp <- as.data.frame(tapply(breaks, list(wool, tension), mean))
> tmp1 <- data.frame(level=rownames(tmp))
> tmp1$L <- as.data.frame(tmp$L)

Here's the problem that Brian is referring to: Why do you make one
variable in the data frame a data frame? That's what caused the problem
in write.table()!

Andy

> write.table(tmp1)
> Error in as.matrix.data.frame(x) : dim<- : dims [product 2]
> do not match the length of object [3]
>
> tmp1$L <- tmp$L
> write.table(tmp1)
> "level" "L"
> "1" "A" 44.55556
> "2" "B" 28.22222
>
> If you have a corrupt data frame, the function may fail, which is what
> happened in the PR# you quote.
>
> Please note: you should not be calling as.matrix.data.frame, but
> as.matrix.
>
> ! I called it because I had problems with write.table, and that function
> ! calls as.matrix.data.frame.
>
> On Fri, 11 Feb 2005, Gorjanc Gregor wrote:
>
> > Hello R developers.
> >
> > I encountered the same problem as Uwe Ligges with as.matrix.data.frame()
> > in bug reports 3229 and 3242 - under section not-reproducible.
> >
> > The example I have is:
> >
> >> tmp
> >                              level 2100-D
> > 1       biological_process unknown     NA
> > 2                 cellular process  -5.88
> > 3                      development  -8.42
> > 4            physiological process  -6.55
> > 5 regulation of biological process     NA
> > 6                 viral life cycle     NA
> >
> >> str(tmp)
> > `data.frame': 6 obs. of 2 variables:
> >  $ level      : Factor w/ 6 levels "biological_..",..: 1 2 3 4 5 6
> >  $ 2100-D_mean:`data.frame': 6 obs. of 1 variable:
> >   ..$ 2100-D: num NA -5.88 -8.42 -6.55 NA NA
>
> I think you have a data frame column in a data frame, and that cannot be
> made directly into a matrix. It's the steps that got you here that are
> the problem.
>
> >> as.matrix.data.frame(tmp)
> > Error in as.matrix.data.frame(tmp) : dim<- : dims [product 6] do not
> > match the length of object [7]
> >
> > The error associated with this comes up at the end of the function
> > as.matrix.data.frame, where this is used:
> >
> >   dim(X) <- c(n, length(X)/n)
> >
> > ?dim says
> >   'dim' has a method for 'data.frame's, which returns the length of
> >   the 'row.names' attribute of 'x' and the length of 'x' (the
> >   numbers of "rows" and "columns").
> >
> > This part is OK. The problem is with X, which is "intensively"
> > modified through the function. Before this (dim(X) <- ...) call,
> > X in my case is:
> >
> >> x <- tmp
> >> "code from as.matrix.data.frame down to dim(X) <- ..."
> >> X
> > [[1]]
> > [1] "biological_process unknown"
> >
> > [[2]]
> > [1] "cellular process"
> >
> > [[3]]
> > [1] "development"
> >
> > [[4]]
> > [1] "physiological process"
> >
> > [[5]]
> > [1] "regulation of biological process"
> >
> > [[6]]
> > [1] "viral life cycle"
> >
> > [[7]]
> > [1] NA -5.88 -8.42 -6.55 NA NA
> >
> > So we can see that X is somehow destroyed: the first column of tmp
> > has become six separate elements, while the second column is a single
> > element of length six. For the dim command this should really be one
> > long vector. So the problem lies in the line
> >
> >   X <- unlist(X, recursive = FALSE, use.names = FALSE)
> >
> > where it should be
> >
> >   X <- unlist(X, recursive = TRUE, use.names = FALSE)
> >                            ^^^^
> >
> > I have checked the source code for that function in R as well as
> > in the R-devel sources. I was not successful in reproducing the above
> > with the data frame below, though. It did not report any problems
> > with the old as.matrix.data.frame. There must be some trick with the
> > first column in my data. So I am quite sure my suggestion is
> > OK.
> >
> > tmp1 <- data.frame(level=c("A A", "B B"), x=c(NA, -5.8))
> >
> > --
> > Lep pozdrav / With regards,
> > Gregor GORJANC
> >
> > ---------------------------------------------------------------
> > University of Ljubljana
> > Biotechnical Faculty       URI: http://www.bfro.uni-lj.si
> > Zootechnical Department    email: gregor.gorjanc <at> bfro.uni-lj.si
> > Groblje 3                  tel: +386 (0)1 72 17 861
> > SI-1230 Domzale            fax: +386 (0)1 72 17 888
> > Slovenia
> >
> > ______________________________________________
> > R-devel@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> --
> Brian D. Ripley, ripley@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
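The nested column described in this message can be reproduced and repaired with a few lines of R. This is only a minimal sketch: `with()` is added here so that `breaks`, `wool` and `tension` are found without attaching `warpbreaks`, and the name `nested` is a hypothetical helper variable, not something from the thread.

```r
# Build the 2 x 3 table of mean breaks; with() supplies the warpbreaks
# columns explicitly (an assumption, the thread relies on them being visible).
tmp <- as.data.frame(with(warpbreaks, tapply(breaks, list(wool, tension), mean)))
tmp1 <- data.frame(level = rownames(tmp))

# Assigning a data frame instead of a vector creates a data frame column
# inside a data frame, which is the structure Brian and Andy point at.
tmp1$L <- as.data.frame(tmp$L)

# Detect columns that are themselves data frames.
nested <- vapply(tmp1, is.data.frame, logical(1))
print(nested)

# Repair: replace the one-column data frame by the vector it wraps.
tmp1$L <- tmp1$L[[1]]
str(tmp1)
```

Checking for nested columns before calling write.table() or as.matrix() points at the step that built the data frame, rather than at the function that later fails on it.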
Gorjanc Gregor
2005-Feb-11 21:41 UTC
[Rd] Notes on bug reports 3229 and 3242 - as.matrix.data.frame
From: Liaw, Andy [mailto:andy_liaw@merck.com]

> From: Gorjanc Gregor
>
> ! Yes, I was not able to do it from my data. But below is one. It is
> ! a stupid one, but it works. The problem is the use of as.data.frame in
> ! tmp1$L <- as.data.frame(tmp$L). This seems to produce a corrupted
> ! data.frame. If I use just tmp1$L <- tmp$L, write.table and
> ! as.matrix.data.frame work OK. I still think that my proposal can
> ! be beneficial, since it also works on corrupted data frames.
>
> data(warpbreaks)
> tmp <- as.data.frame(tapply(breaks, list(wool, tension), mean))
> tmp1 <- data.frame(level=rownames(tmp))
> tmp1$L <- as.data.frame(tmp$L)

Here's the problem that Brian is referring to: Why do you make one
variable in the data frame a data frame? That's what caused the problem
in write.table()!

! I agree completely, and as I described above, it is my own fault that
! I had problems with as.matrix.data.frame when using write.table.
! But I think that my proposal is nice, since as.matrix.data.frame would
! be more robust.

! With regards, Gregor
Gorjanc Gregor
2005-Feb-12 03:40 UTC
[Rd] Notes on bug reports 3229 and 3242 - as.matrix.data.frame
I agree. Sorry for bothering.

With regards, Gregor

-----Original Message-----
From: Prof Brian Ripley [mailto:ripley@stats.ox.ac.uk]
Sent: Fri 2005-02-11 22:35
To: Gorjanc Gregor
Cc: Liaw, Andy; r-devel@stat.math.ethz.ch
Subject: RE: [Rd] Notes on bug reports 3229 and 3242 - as.matrix.data.frame

...

> ! I agree completely and as I have described up it is my fault that
> ! I have/had problems with as.matrix.data.frame by use of write.table.
> ! But I think that my proposal is nice, since as.matrix.data.frame would
> ! be more robust.

It is actually much less robust. It would work for embedded data frames
of one column, but you could have a list column with entries of
different lengths, e.g.

X <- data.frame(x=1:2, y = I(list(a=1, b=3:4)))
> as.matrix(X)
  x y
a 1 1
b 2 Integer,2

With your fix, this becomes an error. And I could replace those entries
by data frames containing lists of dates ....

Note that in R-devel write.table does not convert data frames to
matrices, so this does not arise.

We could treat your example specially, but surely it was an error that
is better found out about than hushed up.
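Brian's objection can be illustrated outside as.matrix.data.frame with a small sketch. The list `cols` below is a hypothetical stand-in for the internal list of columns the function would hold for his two-row example; it is an assumption made for illustration, not the function's actual code.

```r
# Hypothetical stand-in for the internal list of columns of a two-row
# data frame with x = 1:2 and a list column y = list(a = 1, b = 3:4).
n <- 2L
cols <- list(x = as.list(1:2), y = list(a = 1, b = 3:4))

# recursive = FALSE flattens one level only, so every cell stays a single
# list element and the result can be shaped into a 2 x 2 list-matrix.
shallow <- unlist(cols, recursive = FALSE, use.names = FALSE)
dim(shallow) <- c(n, length(shallow) / n)

# recursive = TRUE flattens completely, so the cell 3:4 contributes two
# elements and the total length is no longer a multiple of n.
deep <- unlist(cols, recursive = TRUE, use.names = FALSE)
res <- tryCatch({
  dim(deep) <- c(n, length(deep) / n)
  "ok"
}, error = function(e) conditionMessage(e))

length(shallow)  # 4 elements, one per cell
length(deep)     # 5 elements, one more than the 2 x 2 shape allows
print(res)       # the error message raised by dim<-
```

Under these assumptions, full flattening only helps when every nested entry contributes exactly one element per row, which is the narrow case of an embedded one-column data frame.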