Hello,

I very much enjoy "with" and "subset" semantics for data frames and was
wondering if we could have something similar with split, basically by
evaluating the second argument "with" the data frame:

split.data.frame <- function(x, f, drop = FALSE, ...){
    call <- match.call( )
    # build the call with(x, f) from the unevaluated arguments
    fcall <- call( "with", data = call[["x"]], expr = call[["f"]] )
    # evaluate it in the caller's frame to obtain the splitting factor
    ff <- eval( fcall, parent.frame(1) )
    lapply(split(seq_len(nrow(x)), ff, drop = drop, ...),
           function(ind) x[ind, , drop = FALSE])
}

> split( df, y )
$`1`
  x y
1 1 1
2 2 1
3 3 1
4 4 1
5 5 1

$`2`
    x y
6   6 2
7   7 2
8   8 2
9   9 2
10 10 2

> split( df, x > 3 )
$`FALSE`
  x y
1 1 1
2 2 1
3 3 1

$`TRUE`
    x y
4   4 1
5   5 1
6   6 2
7   7 2
8   8 2
9   9 2
10 10 2

Romain
--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/HlX9 : new package : bibtex
|- http://tr.im/Gq7i : ohloh
`- http://tr.im/FtUu : new package : highlight
I agree, I would definitely appreciate that.
A simpler implementation:
split.data.frame <- function(x, f, drop = FALSE, ...)
{
    ff <- eval(substitute(f), x, parent.frame())
    lapply(split(seq_len(nrow(x)), ff, drop = drop, ...),
           function(ind) x[ind, , drop = FALSE])
}

df <- data.frame(x = 1:10, y = rep(1:2, each = 5))
split( df, df$y )
split( df, y )
split( df, x > 3 )
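
As a quick sanity check, the same body can be tried under a different name so that the base method is not masked. The name split_with is made up for this sketch and is not part of any package; the rest is the implementation above, unchanged.

```r
# split_with is a hypothetical name, used here only to avoid masking
# base::split.data.frame while experimenting.
split_with <- function(x, f, drop = FALSE, ...)
{
    # look f up among the columns of x first, then in the caller's frame
    ff <- eval(substitute(f), x, parent.frame())
    lapply(split(seq_len(nrow(x)), ff, drop = drop, ...),
           function(ind) x[ind, , drop = FALSE])
}

df <- data.frame(x = 1:10, y = rep(1:2, each = 5))

by_name   <- split_with(df, y)      # y is found inside df
by_vector <- split(df, df$y)        # base R, explicit vector
by_expr   <- split_with(df, x > 3)  # expression evaluated inside df

# compare the non-standard-evaluation result with the explicit one
identical(by_name, by_vector)
names(by_expr)
```

Looking the argument up in x before the calling frame is the design choice that makes split( df, y ) work; it is also what decides which object wins when a column and a local variable share a name.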

2009/12/15 Romain Francois <romain.francois at dbmail.com>:
> Hello,
>
> I very much enjoy "with" and "subset" semantics for data frames and was
> wondering if we could have something similar with split, basically by
> evaluating the second argument "with" the data frame :
> [...]

--
Felix Andrews
Postdoctoral Fellow
Integrated Catchment Assessment and Management (iCAM) Centre
Fenner School of Environment and Society [Bldg 48a]
The Australian National University
Canberra ACT 0200 Australia
M: +61 410 400 963
T: + 61 2 6125 4670
E: felix.andrews at anu.edu.au
CRICOS Provider No. 00120C
--
http://www.neurofractal.org/felix/

Romain Francois wrote:
> Hello,
>
> I very much enjoy "with" and "subset" semantics for data frames and was
> wondering if we could have something similar with split, basically by
> evaluating the second argument "with" the data frame :

I seem to recall that this idea was considered and rejected when the
current split.data.frame was written (10 years ago!). The main reasons
were that

- it's not really THAT hard to evaluate a single splitting expression
  using with() or eval()

- not all applications will have the splitting factor inside the df to
  split ( split(df[-1], df[[1]]) for a simple case)

- if you need a computed splitting factor, there's a risk of inadvertent
  variable capture. I.e., if you inside a function do

  ....
  grp <- ...whatever...
  spl <- split(x, grp)
  ....

  and x has a variable called grp, what do you get?

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
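
The three points above can be made concrete with a small sketch. The names split_with and demo and the example data are made up for illustration; split_with reuses the eval(substitute(f), x, parent.frame()) body posted earlier in the thread, under a different name so that the base method stays available for comparison.

```r
# Hypothetical helper: the proposed semantics under a non-masking name.
split_with <- function(x, f, drop = FALSE, ...)
{
    ff <- eval(substitute(f), x, parent.frame())
    lapply(split(seq_len(nrow(x)), ff, drop = drop, ...),
           function(ind) x[ind, , drop = FALSE])
}

df <- data.frame(x = 1:10, y = rep(1:2, each = 5))

# Point 1: with() already evaluates a splitting expression inside df.
explicit <- split(df, with(df, x > 3))

# Point 2: the splitting factor need not be a column of the frame being split.
external <- split(df[-1], df[[1]] > 3)

# Point 3: variable capture. The data frame happens to have a column grp.
d <- data.frame(val = 1:6, grp = rep(c("a", "b", "c"), times = 2))

demo <- function(x) {
    grp <- x$val > 3                      # the factor the caller means to use
    list(captured = split_with(x, grp),   # grp resolves to the column x$grp
         intended = split(x, grp))        # base split uses the local grp
}

res <- demo(d)
names(res$captured)   # "a" "b" "c"
names(res$intended)   # "FALSE" "TRUE"
```

Under the proposed semantics the column silently wins over the local variable, so the answer to "what do you get?" depends on the contents of x rather than on the code in the function.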