thr3ads.net - R help - [R] 1.8.1 and subsetting dataframes [Nov 2003]

Patrick Connolly

2003-Nov-24 21:23 UTC

[R] 1.8.1 and subsetting dataframes

I've encountered something that didn't arise using earlier versions of
R (Linux).

A dataframe is created and new columns added to it by doing
calculations using apply with various functions on some of the
original columns.  It's somewhat too involved to give a toy example
that's reproducible.  However, the resulting phenomenon can be
characterised by the following:

Browse[1]> dim(mod.df)
[1] 409   5
Browse[1]> object.size(mod.df)
[1] 31520
Browse[1]> is.array(mod.df)
[1] FALSE
Browse[1]> mod.df[1:5,]
Error in as.data.frame.default(x[[i]], optional = TRUE) : 
	can't coerce array into a data.frame

The whole dataframe would display correctly, so I figured it couldn't
have much wrong with it.  So I tried this:


Browse[1]>     write.table(mod.df, "mod.tmp", quote = F, sep = "\t", row.names = F)
Browse[1]>     mod.df <- read.table("mod.tmp", T, sep = "\t")
Browse[1]> is.array(mod.df)
[1] FALSE
Browse[1]> object.size(mod.df)
[1] 16164
Browse[1]> mod.df[1:5,]
        Site System Cultivar  Type CFU
1 Canterbury    ifp braeburn fruit 388
2 Canterbury    ifp braeburn fruit 920
3 Canterbury    ifp braeburn fruit 868
4 Canterbury    ifp braeburn fruit 328
5 Canterbury    ifp braeburn fruit 656


The size of the object using R-1.8.0 (which had no subsetting
problems) was

Browse[1]> object.size(mod.df)
[1] 21160


I suspect it could have something to do with some of the changes
mentioned in this part of the NEWS file:

    o	Subscripting for data.frames has been rationalized:

But I'm not smart enough to see what in those dozen or so would have a
bearing on this case.  I don't think the drop argument comes into what
I've done.

If that's not sufficient to give anyone a hint what could be
happening, I'll have another attempt to get a toy version.


Thanks.

PS: Is there a more elegant way using a text connection instead of
creating a temporary file in my workaround?

-- 
Patrick Connolly
HortResearch
Mt Albert
Auckland
New Zealand 
Ph: +64-9 815 4200 x 7188
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~
I have the world`s largest collection of seashells. I keep it on all
the beaches of the world ... Perhaps you`ve seen it.  ---Steven Wright 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~

Prof Brian Ripley

2003-Nov-24 21:59 UTC

[R] 1.8.1 and subsetting dataframes

The changes you point to were between 1.7.1 and 1.8.0, not between 1.8.0 
and 1.8.1.

1.7.1 was quite capable of producing invalid data frames from erroneous 
usages.

I think we really do need to see a reproducible example.

On Tue, 25 Nov 2003, Patrick Connolly wrote:
> I've encountered something that didn't arise using earlier versions of
> R (Linux).
> 
> A dataframe is created and new columns added to it by doing
> calculations using apply with various functions on some of the
> original columns.  It's somewhat too involved to give a toy example
> that's reproducible.  However, the resulting phenomenon can be
> characterised by the following:
> 
> Browse[1]> dim(mod.df)
> [1] 409   5
> Browse[1]> object.size(mod.df)
> [1] 31520
> Browse[1]> is.array(mod.df)
> [1] FALSE
> Browse[1]> mod.df[1:5,]
> Error in as.data.frame.default(x[[i]], optional = TRUE) : 
> 	can't coerce array into a data.frame
> 
> The whole dataframe would display correctly, so I figured it couldn't
> have much wrong with it.  So I tried this:
> 
> 
> Browse[1]>     write.table(mod.df, "mod.tmp", quote = F, sep = "\t", row.names = F)
> Browse[1]>     mod.df <- read.table("mod.tmp", T, sep = "\t")
> Browse[1]> is.array(mod.df)
> [1] FALSE
> Browse[1]> object.size(mod.df)
> [1] 16164
> Browse[1]> mod.df[1:5,]
>         Site System Cultivar  Type CFU
> 1 Canterbury    ifp braeburn fruit 388
> 2 Canterbury    ifp braeburn fruit 920
> 3 Canterbury    ifp braeburn fruit 868
> 4 Canterbury    ifp braeburn fruit 328
> 5 Canterbury    ifp braeburn fruit 656
> 
> 
> The size of the object using R-1.8.0 (which had no subsetting
> problems) was
> 
> Browse[1]> object.size(mod.df)
> [1] 21160
> 
> 
> I suspect it could have something to do with some of the changes
> mentioned in this part of the NEWS file:
> 
>     o	Subscripting for data.frames has been rationalized:
> 
> But I'm not smart enough to see what in those dozen or so would have a
> bearing on this case.  I don't think the drop argument comes into what
> I've done.
> 
> If that's not sufficient to give anyone a hint what could be
> happening, I'll have another attempt to get a toy version.
> 
> 
> Thanks.
> 
> PS: Is there a more elegant way using a text connection instead of
> creating a temporary file in my workaround?
Yes!  Use an anonymous file connection opened for rw.
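
A minimal sketch of that suggestion, assuming a small stand-in data frame (the values below are hypothetical, not the poster's data). It relies on file("") opened in mode "w+", which R documents as an anonymous temporary file that can be written and then read back through the same connection:

```r
# Toy data standing in for mod.df (hypothetical values).
mod.df <- data.frame(Site   = rep("Canterbury", 3),
                     System = rep("ifp", 3),
                     CFU    = c(388, 920, 868))

# Anonymous temporary file, open for both writing and reading.
con <- file("", open = "w+")

# Same round trip as the workaround, but without a named file on disk.
write.table(mod.df, con, quote = FALSE, sep = "\t", row.names = FALSE)
clean.df <- read.table(con, header = TRUE, sep = "\t")
close(con)

clean.df[1:2, ]
```

The round trip rebuilds every column from text, which is why it also discards any stray dim attributes.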


-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Peter Dalgaard

2003-Nov-24 22:36 UTC

[R] 1.8.1 and subsetting dataframes

Patrick Connolly <p.connolly at hortresearch.co.nz> writes:
> I've encountered something that didn't arise using earlier versions of
> R (Linux).
> 
> A dataframe is created and new columns added to it by doing
> calculations using apply with various functions on some of the
> original columns.  It's somewhat too involved to give a toy example
> that's reproducible.  
You can try harder, though. See below.
> However, the resulting phenomenon can be
> characterised by the following:
> 
> Browse[1]> dim(mod.df)
> [1] 409   5
> Browse[1]> object.size(mod.df)
> [1] 31520
> Browse[1]> is.array(mod.df)
> [1] FALSE
> Browse[1]> mod.df[1:5,]
> Error in as.data.frame.default(x[[i]], optional = TRUE) : 
> 	can't coerce array into a data.frame
Looks like one of the columns of mod.df is not what it should have
been. So what does str(mod.df) say? Also, just before the subsetting,
try setting debug(as.data.frame.default) and see what its argument is
in the case that fails.
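
As a rough sketch of that kind of check (an assumption about how one might do it, not code from this thread), the columns can be tested one by one for a dim attribute. The toy data frame here is hypothetical; recent R versions may accept such a column without raising the error above, but the check itself still applies:

```r
# Toy data frame with one column replaced by a 1-d array.
x <- data.frame(a = 0:9, b = 2:11)
x$b <- array(1:10, 10)

# TRUE marks columns that are arrays rather than plain vectors.
is_arr <- sapply(x, is.array)
names(x)[is_arr]

# Show the offending dimensions; unclass() avoids data frame methods.
lapply(unclass(x)[is_arr], dim)
```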
> The whole dataframe would display correctly, so I figured it couldn't
> have much wrong with it.  
My bet is that it does...
> I suspect it could have something to do with some of the changes
> mentioned in this part of the NEWS file:
> 
>     o	Subscripting for data.frames has been rationalized:
> 
> But I'm not smart enough to see what in those dozen or so would have a
> bearing on this case.  I don't think the drop argument comes into what
> I've done.
Note that this was changed already in 1.8.0, which you say has no
problems... 

My guess is that the code is not quite smart enough yet, e.g.
> x <- data.frame(a=0:9,b=2:11)
> x$b <- array(1:10,10)
> x
Error in as.data.frame.default(x[[i]], optional = TRUE) :
        can't coerce array into a data.frame

but it's not like that has worked before (certainly not in 1.7.1
anyway). One difference is that indexing used to clean up this kind of
corrupted data frame, but now it gives you a data frame which is
corrupted in the same way:

1.8.0 (x as above):
> z <- x[1:5,]
> z
  a b
1 0 1
2 1 2
3 2 3
4 3 4
5 4 5
> x
Error in as.data.frame.default(x[[i]], optional = TRUE) :
        can't coerce array into a data.frame

1.8.1:
> z <- x[1:5,]
> z
Error in as.data.frame.default(x[[i]], optional = TRUE) :
        can't coerce array into a data.frame

> If that's not sufficient to give anyone a hint what could be
> happening, I'll have another attempt to get a toy version.
> 
> 
> Thanks.
> 
> PS: Is there a more elegant way using a text connection instead of
> creating a temporary file in my workaround?
Not really. Shouldn't have to do it though. You probably want to put
an as.vector around those apply() calls instead.
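
A small sketch of the as.vector idea, using the toy x from earlier in this message. The object `derived` is a hypothetical stand-in for whatever the real calculation returns with a dim attribute attached:

```r
x <- data.frame(a = 0:9, b = 2:11)

# Stand-in for a computed result that carries a dim attribute.
derived <- array(1:10, 10)

# as.vector() drops dim, so the column is stored as a plain vector.
x$b <- as.vector(derived)
is.array(x$b)

# Row subsetting now operates on ordinary vector columns.
head5 <- x[1:5, ]
head5
```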

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
