Gorjanc Gregor
2005-Feb-11 15:41 UTC
[Rd] Notes on bug reports 3229 and 3242 - as.matrix.data.frame
Hello R developers.

I encountered the same problem as Uwe Ligges with as.matrix.data.frame()
in bug reports 3229 and 3242, which are filed under the not-reproducible
section.

My example is:

> tmp
                             level 2100-D
1       biological_process unknown     NA
2                 cellular process  -5.88
3                      development  -8.42
4            physiological process  -6.55
5 regulation of biological process     NA
6                 viral life cycle     NA

> str(tmp)
`data.frame':   6 obs. of  2 variables:
 $ level      : Factor w/ 6 levels "biological_..",..: 1 2 3 4 5 6
 $ 2100-D_mean:`data.frame':    6 obs. of  1 variable:
  ..$ 2100-D: num  NA -5.88 -8.42 -6.55 NA NA

> as.matrix.data.frame(tmp)
Error in as.matrix.data.frame(tmp) : dim<- : dims [product 6] do not
match the length of object [7]

The error is raised at the end of as.matrix.data.frame(), at the line

    dim(X) <- c(n, length(X)/n)

?dim says

    'dim' has a method for 'data.frame's, which returns the length of
    the 'row.names' attribute of 'x' and the length of 'x' (the
    numbers of "rows" and "columns").

This part is OK. The problem is with X, which is heavily modified
throughout the function. Just before the dim(X) <- ... call, X in my
case is:

> x <- tmp
> "code from as.matrix.data.frame down to dim(X) <- ..."
> X
[[1]]
[1] "biological_process unknown"

[[2]]
[1] "cellular process"

[[3]]
[1] "development"

[[4]]
[1] "physiological process"

[[5]]
[1] "regulation of biological process"

[[6]]
[1] "viral life cycle"

[[7]]
[1]    NA -5.88 -8.42 -6.55    NA    NA

So we can see that X is malformed: the two columns of tmp are treated
differently. The first column has been split into six single-element
entries, while the second is kept as one numeric vector of length six,
which gives seven elements in total. For the dim<- call, X should
really be one long vector. So the problem lies in the line

    X <- unlist(X, recursive = FALSE, use.names = FALSE)

where it should be

    X <- unlist(X, recursive = TRUE, use.names = FALSE)
                               ^^^^

I have checked the source code of that function in R as well as in the
R-devel sources. I was not successful in reproducing the above with the
data frame below, though: the old as.matrix.data.frame did not report
any problems with it. There must be some trick with the first column in
my data. So I am quite sure my suggestion is OK.

    tmp1 <- data.frame(level=c("A A", "B B"), x=c(NA, -5.8))

--
Lep pozdrav / With regards,
    Gregor GORJANC

---------------------------------------------------------------
University of Ljubljana
Biotechnical Faculty       URI: http://www.bfro.uni-lj.si
Zootechnical Department    email: gregor.gorjanc <at> bfro.uni-lj.si
Groblje 3                  tel: +386 (0)1 72 17 861
SI-1230 Domzale            fax: +386 (0)1 72 17 888
Slovenia
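The difference between the two unlist() calls discussed in this message can be seen on a small list. This is only a sketch: `cols` is a hypothetical stand-in for the list of columns that as.matrix.data.frame() builds internally, and the values are illustrative rather than taken from the original data.

```r
# Hypothetical stand-in for the internal list of columns: one character
# column plus one column that is itself a data frame (i.e. a list).
cols <- list(c("A", "B"), data.frame(L = c(44.6, 28.2)))

# Non-recursive unlist removes only one level of nesting: the character
# vector is split into single elements, the inner numeric column stays whole.
shallow <- unlist(cols, recursive = FALSE, use.names = FALSE)
is.list(shallow)    # the result is still a list
length(shallow)     # 2 strings + 1 numeric vector, too short for 2 x 2

# Recursive unlist flattens everything into one atomic (character) vector,
# so a dim attribute can be set on it.
deep <- unlist(cols, recursive = TRUE, use.names = FALSE)
dim(deep) <- c(2, length(deep) / 2)
deep
```

Note that recursive flattening coerces all values to a common type, so whether it is the right repair for a data frame with a nested column is a separate question from whether it avoids the dim<- error.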
Prof Brian Ripley
2005-Feb-11 16:47 UTC
[Rd] Notes on bug reports 3229 and 3242 - as.matrix.data.frame
You too have not given a reproducible example!

If you have a corrupt data frame, the function may fail, which is what
happened in the PR# you quote.

Please note: you should not be calling as.matrix.data.frame, but
as.matrix.

On Fri, 11 Feb 2005, Gorjanc Gregor wrote:

> Hello R developers.
>
> I encountered the same problem as Uwe Ligges with as.matrix.data.frame()
> in bug reports 3229 and 3242 - under section not-reproducible.
>
> [...]
>
>> str(tmp)
> `data.frame': 6 obs. of 2 variables:
>  $ level      : Factor w/ 6 levels "biological_..",..: 1 2 3 4 5 6
>  $ 2100-D_mean:`data.frame': 6 obs. of 1 variable:
>   ..$ 2100-D: num NA -5.88 -8.42 -6.55 NA NA

I think you have a data frame column in a data frame, and that cannot
be made directly into a matrix. It's the steps that got you here that
are the problem.

> [...]

--
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
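The diagnosis in this reply (a data frame stored as a column of another data frame) can be checked directly. A minimal sketch follows; the small data frame and the name `nested` are assumptions made for illustration and do not come from the thread.

```r
# Build a small data frame with a nested data-frame column, the structure
# diagnosed above. The values are illustrative only.
tmp <- data.frame(level = c("A", "B"))
tmp$L <- data.frame(L = c(44.6, 28.2))

# One logical per column: TRUE where the column is itself a data frame.
nested <- vapply(tmp, is.data.frame, logical(1))

# Names of the offending columns.
names(nested)[nested]
```

A check like this before calling as.matrix() or write.table() points at the step that created the nested column, which is where the reply says the real problem lies.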
Gorjanc Gregor
2005-Feb-11 17:21 UTC
[Rd] Notes on bug reports 3229 and 3242 - as.matrix.data.frame
! My replies are on the lines marked with "!".

From: Prof Brian Ripley [mailto:ripley@stats.ox.ac.uk]

You too have not given a reproducible example!

! Yes, I was not able to produce one from my data, but below is one. It
! is a silly one, but it works. The problem is the use of as.data.frame
! in tmp1$L <- as.data.frame(tmp$L), which appears to produce a corrupted
! data.frame. If I use just tmp1$L <- tmp$L, write.table and
! as.matrix.data.frame work OK. I still think that my proposal would be
! beneficial, since it also works on corrupted data frames.

data(warpbreaks)
tmp <- as.data.frame(with(warpbreaks, tapply(breaks, list(wool, tension), mean)))
tmp1 <- data.frame(level=rownames(tmp))
tmp1$L <- as.data.frame(tmp$L)
write.table(tmp1)
Error in as.matrix.data.frame(x) : dim<- : dims [product 2] do not match
the length of object [3]

tmp1$L <- tmp$L
write.table(tmp1)
"level" "L"
"1" "A" 44.55556
"2" "B" 28.22222

If you have a corrupt data frame, the function may fail, which is what
happened in the PR# you quote.

Please note: you should not be calling as.matrix.data.frame, but
as.matrix.

! I called it because I had problems with write.table, and that function
! calls as.matrix.data.frame.
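The two assignment routes compared in this message can be put side by side. This is a sketch using the built-in warpbreaks data; the names `bad` and `good` are introduced here for illustration only, and no claim is made about how current versions of write.table() treat the nested column.

```r
# Cell means of breaks by wool and tension, from the built-in data set.
tmp <- as.data.frame(with(warpbreaks,
                          tapply(breaks, list(wool, tension), mean)))

# Problematic route: as.data.frame() wraps the numeric vector in a
# one-column data frame, which then becomes a nested column.
bad <- data.frame(level = rownames(tmp))
bad$L <- as.data.frame(tmp$L)
is.data.frame(bad$L)

# Working route: assign the numeric vector directly.
good <- data.frame(level = rownames(tmp))
good$L <- tmp$L
is.numeric(good$L)

# With only atomic columns the conversion to a matrix is straightforward.
as.matrix(good)
```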