thr3ads.net - R help - [R] getting data frame rows out of a by object [Apr 2004]

Ed L Cashin

2004-Apr-08 00:14 UTC

[R] getting data frame rows out of a by object

Hi.  I can quickly create a by object that selects rows from a data
frame.  After that, though, I don't know how to merge the rows back
into a data frame that I can use.

Here is an example where there is a data frame with three columns, a,
b, and c.  I update it so that there are two rows for each combination
of a and b.  I use by to select the subgroups of rows that share the
same a and b values, and then I take only the row with the highest c
value.

I can see little data frames inside the by object, but I can't get a
new data frame containing only the rows with the highest c value.  In
the example below most of the by object is NULL, but it contains data
frames with the rows I'm interested in selecting.

  > d <- data.frame(a=1:4,b=4:1,c=31:34)
  > d
    a b  c
  1 1 4 31
  2 2 3 32
  3 3 2 33
  4 4 1 34
  > b <- by(d, list(d$a,d$b,d$c), function(x) x)
  > d <- data.frame(a=1:4,b=4:1,c=31:34)
  > d <- rbind(d, data.frame(a=1:4,b=4:1,c=41:44))
  > b <- by(d, list(d$a,d$b,d$c), function(x) x[x$c == max(x$c),])
  > b
  : 1
  : 1
  : 31
  NULL
  ------------------------------------------------------------ 
  : 2
  : 1
  : 31
  NULL
  ------------------------------------------------------------ 
...  
  ------------------------------------------------------------ 
  : 3
  : 2
  : 43
     a b  c
  31 3 2 43
...  
  > merge(b)
  Error in as.data.frame.default(x) : can't coerce by into a data.frame
  > 


Any help is most appreciated.  


-- 
--Ed L Cashin            |   PGP public key:
  ecashin at uga.edu        |   http://noserose.net/e/pgp/
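
Editorial note: the transcript above can be checked with a small sketch, assuming the same toy data as in the post. by() returns a list array with one cell for every combination of the grouping levels, and combinations that never occur are stored as NULL. Because c is included in the grouping list here, every non-empty cell holds exactly one row, so taking max(x$c) within a cell cannot discard anything. The helper names cell_dims, rows_per_cell, n_empty and max_rows are illustrative only.

```r
# Toy data from the post: two rows for each (a, b) combination.
d <- data.frame(a = 1:4, b = 4:1, c = 31:34)
d <- rbind(d, data.frame(a = 1:4, b = 4:1, c = 41:44))
b <- by(d, list(d$a, d$b, d$c), function(x) x[x$c == max(x$c), ])

# One cell per combination of levels: 4 (a) x 4 (b) x 8 (c).
cell_dims <- dim(b)

# Rows held in each cell; NULL cells (combinations with no data) count as 0.
rows_per_cell <- vapply(b, function(x) if (is.null(x)) 0L else nrow(x),
                        integer(1))
n_empty  <- sum(rows_per_cell == 0L)  # cells with no data
max_rows <- max(rows_per_cell)        # largest group size
```

Grouping on a and b only would give cells with two rows each, which is what the description in the post asks for.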

Julian Taylor

2004-Apr-08 01:02 UTC

[R] getting data frame rows out of a by object

Ed L Cashin wrote:
>
> Hi.  I can quickly create a by object that selects rows from a data
> frame.  After that, though, I don't know how to merge the rows back
> into a data frame that I can use.
> 
> Here is an example where there is a data frame with three columns, a,
> b, and c.  I update it so that there are two rows for each combination
> of a and b.  I use by to select the subgroups of rows that share the
> same a and b values, and then I take only the row with the highest c
> value.
> 
> I can see little data frames inside the by object, but I can't get a
> new data frame containing only the rows with the highest c value.  In
> the example below most of the by object is NULL, but it contains data
> frames with the rows I'm interested in selecting.
> 
>   > d <- data.frame(a=1:4,b=4:1,c=31:34)
>   > d
>     a b  c
>   1 1 4 31
>   2 2 3 32
>   3 3 2 33
>   4 4 1 34
>   > b <- by(d, list(d$a,d$b,d$c), function(x) x)
>   > d <- data.frame(a=1:4,b=4:1,c=31:34)
>   > d <- rbind(d, data.frame(a=1:4,b=4:1,c=41:44))
>   > b <- by(d, list(d$a,d$b,d$c), function(x) x[x$c == max(x$c),])

You are better off using other tools to get the right subsets.  Try

d <- do.call("rbind",
             lapply(split(d, factor(paste(d$a, d$b, sep = ""))),
                    function(el) el[el$c == max(el$c), ]))

HTH,
Jules

-- 
---
Julian Taylor			phone: +61 8 8303 6751
ARC Research Associate            fax: +61 8 8303 6760
BiometricsSA,                  mobile: +61 4 1638 8180  
University of Adelaide/SARDI    email: julian.taylor at adelaide.edu.au
Private Mail Bag 1                www:
http://www.BiometricsSA.adelaide.edu.au
Glen Osmond SA 5064

"There is no spoon."   -- Orphan boy  
---
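
Editorial note: a minimal sketch of the approach suggested in the reply, assuming the toy data from the question. Two details differ from the one-liner and are this note's choices rather than anything stated in the thread. First, split() is given a list of the two grouping columns instead of a pasted key, since paste(d$a, d$b, sep = "") would map a = 1, b = 11 and a = 11, b = 1 to the same string. Second, a by() variant is shown that groups on a and b only; rbind() skips the NULL cells, so the result can be bound directly. The names top_c, groups, res_split and res_by are illustrative.

```r
# Same toy data as in the question: two rows per (a, b) combination.
d <- data.frame(a = 1:4, b = 4:1, c = 31:34)
d <- rbind(d, data.frame(a = 1:4, b = 4:1, c = 41:44))

# Pick the row(s) with the largest c within one group.
top_c <- function(g) g[g$c == max(g$c), ]

# Option 1: split on a and b; drop = TRUE discards empty combinations.
groups <- split(d, list(d$a, d$b), drop = TRUE)
res_split <- do.call(rbind, lapply(groups, top_c))

# Option 2: keep by(), but group on a and b only, then bind the cells.
# rbind() skips the NULL cells left by combinations that never occur.
res_by <- do.call(rbind, by(d, list(d$a, d$b), top_c))

# Row names carry group labels or original row numbers; reset if unwanted.
rownames(res_split) <- NULL
rownames(res_by) <- NULL
```

Both results are ordinary data frames with one row per (a, b) group. If several rows tie for the maximum c within a group, all of them are kept.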
