I've encountered something that didn't arise using earlier versions of
R (Linux).

A dataframe is created and new columns added to it by doing
calculations using apply with various functions on some of the
original columns.  It's somewhat too involved to give a toy example
that's reproducible.  However, the resulting phenomenon can be
characterised by the following:

Browse[1]> dim(mod.df)
[1] 409   5
Browse[1]> object.size(mod.df)
[1] 31520
Browse[1]> is.array(mod.df)
[1] FALSE
Browse[1]> mod.df[1:5,]
Error in as.data.frame.default(x[[i]], optional = TRUE) :
        can't coerce array into a data.frame

The whole dataframe would display correctly, so I figured it couldn't
have much wrong with it.  So I tried this:

Browse[1]> write.table(mod.df, "mod.tmp", quote = F, sep = "\t", row.names = F)
Browse[1]> mod.df <- read.table("mod.tmp", T, sep = "\t")
Browse[1]> is.array(mod.df)
[1] FALSE
Browse[1]> object.size(mod.df)
[1] 16164
Browse[1]> mod.df[1:5,]
        Site System Cultivar  Type CFU
1 Canterbury    ifp braeburn fruit 388
2 Canterbury    ifp braeburn fruit 920
3 Canterbury    ifp braeburn fruit 868
4 Canterbury    ifp braeburn fruit 328
5 Canterbury    ifp braeburn fruit 656

The size of the object using R-1.8.0 (which had no subsetting
problems) was

Browse[1]> object.size(mod.df)
[1] 21160

I suspect it could have something to do with some of the changes
mentioned in this part of the NEWS file:

    o   Subscripting for data.frames has been rationalized:

But I'm not smart enough to see which of those dozen or so changes
would have a bearing on this case.  I don't think the drop argument
comes into what I've done.

If that's not sufficient to give anyone a hint what could be
happening, I'll have another attempt to get a toy version.

Thanks.

PS: Is there a more elegant way using a text connection instead of
creating a temporary file in my work around?
-- 
Patrick Connolly
HortResearch
Mt Albert
Auckland
New Zealand
Ph: +64-9 815 4200 x 7188
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~
I have the world's largest collection of seashells. I keep it on all
the beaches of the world ... Perhaps you've seen it.  ---Steven Wright
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~
The changes you point to were between 1.7.1 and 1.8.0, not between
1.8.0 and 1.8.1.  1.7.1 was quite capable of producing invalid data
frames from erroneous usages.

I think we really do need to see a reproducible example.

On Tue, 25 Nov 2003, Patrick Connolly wrote:

> [...]
> I suspect it could have something to do with some of the changes
> mentioned in this part of the NEWS file:
>
>     o   Subscripting for data.frames has been rationalized:
>
> [...]
> PS: Is there a more elegant way using a text connection instead of
> creating a temporary file in my work around?

Yes!  Use an anonymous file connection opened for rw.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
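A minimal sketch of the anonymous-file round trip suggested above. It
assumes that file() called with no arguments opens an anonymous
temporary file for both writing and reading, as the R connections
documentation describes. The data frame df and its values are made up
here as a stand-in for mod.df.

```r
# Hypothetical stand-in for mod.df; the values are illustrative only.
df <- data.frame(Site = c("Canterbury", "Canterbury", "Canterbury"),
                 System = c("ifp", "ifp", "ifp"),
                 CFU = c(388L, 920L, 868L))

# file() with no description is assumed to open an anonymous
# temporary file for writing and reading.
con <- file()

# Write the data frame to the connection instead of a named file ...
write.table(df, con, quote = FALSE, sep = "\t", row.names = FALSE)

# ... and read it straight back from the same connection.
df2 <- read.table(con, header = TRUE, sep = "\t")

# Closing the connection discards the anonymous file.
close(con)
```

Compared with the write.table/read.table workaround, this leaves no
mod.tmp file behind to clean up.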
Patrick Connolly <p.connolly at hortresearch.co.nz> writes:

> I've encountered something that didn't arise using earlier versions of
> R (Linux).
>
> A dataframe is created and new columns added to it by doing
> calculations using apply with various functions on some of the
> original columns.  It's somewhat too involved to give a toy example
> that's reproducible.

You can try harder, though.  See below.

> However, the resulting phenomenon can be
> characterised by the following:
>
> Browse[1]> dim(mod.df)
> [1] 409   5
> Browse[1]> object.size(mod.df)
> [1] 31520
> Browse[1]> is.array(mod.df)
> [1] FALSE
> Browse[1]> mod.df[1:5,]
> Error in as.data.frame.default(x[[i]], optional = TRUE) :
>         can't coerce array into a data.frame

Looks like one of the columns of mod.df is not what it should have
been.  So what does str(mod.df) say?  Also, just before the subsetting,
try setting debug(as.data.frame.default) and see what its argument is
in the case that fails.

> The whole dataframe would display correctly, so I figured it couldn't
> have much wrong with it.

My bet is that it does...

> I suspect it could have something to do with some of the changes
> mentioned in this part of the NEWS file:
>
>     o   Subscripting for data.frames has been rationalized:
>
> But I'm not smart enough to see what in those dozen or so would have a
> bearing on this case.  I don't think the drop argument comes into what
> I've done.

Note that this was changed already in 1.8.0, which you say has no
problems...  My guess is that the code is not quite smart enough yet,
e.g.

  > x <- data.frame(a=0:9,b=2:11)
  > x$b <- array(1:10,10)
  > x
  Error in as.data.frame.default(x[[i]], optional = TRUE) :
          can't coerce array into a data.frame

but it's not like that has worked before (certainly not in 1.7.1
anyway).
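The str() check suggested above can be complemented by a direct test
for columns that carry a dim attribute. The following is only a
sketch: it rebuilds the small example x rather than the real mod.df,
the names has_dim and bad_cols are invented for illustration, and
current versions of R may treat such a column more leniently than
1.8.1 did.

```r
# Rebuild the small example: column b is a 1-d array, not a plain vector.
x <- data.frame(a = 0:9, b = 2:11)
x$b <- array(1:10, 10)

# str() shows the structure of every column.
str(x)

# Flag the columns that have a dim attribute (hypothetical helper names).
has_dim <- sapply(x, function(col) !is.null(dim(col)))
bad_cols <- names(x)[has_dim]
print(bad_cols)
```

Any column reported in bad_cols is a candidate for the column that
as.data.frame.default refuses to coerce.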
One difference is that indexing used to clean up this kind of
corrupted data frame, but now it gives you a data frame which is
corrupted in the same way:

1.8.0 (x as above):

  > z <- x[1:5,]
  > z
    a b
  1 0 1
  2 1 2
  3 2 3
  4 3 4
  5 4 5
  > x
  Error in as.data.frame.default(x[[i]], optional = TRUE) :
          can't coerce array into a data.frame

1.8.1:

  > z <- x[1:5,]
  > z
  Error in as.data.frame.default(x[[i]], optional = TRUE) :
          can't coerce array into a data.frame

> If that's not sufficient to give anyone a hint what could be
> happening, I'll have another attempt to get a toy version.
>
> Thanks.
>
> PS: Is there a more elegant way using a text connection instead of
> creating a temporary file in my work around?

Not really.  Shouldn't have to do it though.  You probably want to put
an as.vector around those apply() calls instead.

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
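A sketch of the as.vector() suggestion, under stated assumptions: the
original apply() calls are not shown in the thread, so the object res
below is a hypothetical stand-in for a computed result that carries a
dim attribute.

```r
# Hypothetical stand-in for a result that comes back as a 1-d array.
res <- array(1:10, 10)

# Strip the dim attribute before storing the result as a column.
x <- data.frame(a = 0:9, b = 2:11)
x$b <- as.vector(res)
is.array(x$b)        # the column is now a plain vector

# Row subsetting then works on ordinary columns.
z <- x[1:5, ]

# An already affected column can be repaired in place the same way.
y <- data.frame(a = 0:9, b = 2:11)
y$b <- array(1:10, 10)
y$b <- as.vector(y$b)
```

Converting at the point where the column is created keeps the data
frame valid from the start, so no write/read round trip is needed.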