Karl Brand
2012-Dec-06  13:49 UTC
[R] function to filter identical data.frames using less than (<) and greater than (>)
Esteemed UseRs,

I've got many biggish data frames which need a lot of subsetting, like in this example:

# example
eg <- data.frame(A = rnorm(10), B = rnorm(10), C = rnorm(10), D = rnorm(10))
egsub <- eg[eg$A < 0 & eg$B < 1 & eg$C > 0, ]
egsub
egsub2 <- eg[eg$A > 1 & eg$B > 0, ]
egsub2

# To make this clearer than 1000s of lines of extractions with []
# I tried to make a function like this:

# func(data="eg", A="< 0", B="< 1", C="> 0")

# Which would also need to be run as

# func(data="eg", A="> 1", B="> 0", C=NA)
#end

Notably:
- the signs* "<" and ">" need to be flexible _and_ optional
- the quantities also need to be flexible
- column header names, i.e., A, B and C, don't need flexibility, i.e., can remain fixed

* "less than" and "greater than" so google picks up this thread

Once again I find just how limited my grasp of R is... Is do.call() the best way to call binary operators like < & > in a function? Is an ifelse statement needed for each column to make filtering on it optional? etc....

Anyone with the patience to show their working version of such a function would receive my undying Rdulation. With thanks in advance,

Karl

-- 
Karl Brand
Dept of Cardiology and Dept of Bioinformatics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
T +31 (0)10 703 2460 |M +31 (0)642 777 268 |F +31 (0)10 704 4161
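On the do.call() question: in R the comparison operators are ordinary functions, so do.call() works but is not required. The short sketch below only illustrates that point; the vector x and the names r1 to r4 and op are made up for the example and are not part of the original post.

```r
x <- c(-2, -0.5, 0.3, 1.7)

# These three calls are equivalent: infix form, backtick (prefix) form,
# and do.call() with the operator given by name
r1 <- x < 0
r2 <- `<`(x, 0)
r3 <- do.call("<", list(x, 0))

# match.fun() looks the operator up by name and returns it as a function,
# which can then be called directly
op <- match.fun(">")
r4 <- op(x, 1)
```

Any of these forms lets the operator be chosen at run time from a string such as "<" or ">".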
Rui Barradas
2012-Dec-06  14:48 UTC
[R] function to filter identical data.frames using less than (<) and greater than (>)
Hello,
Something like this?
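func <- function(data, A, B, C){
     # f(a) returns a function that evaluates the string "x <a>", e.g. "x < 0",
     # with x bound to the column passed in.
     # Note: eval(parse()) runs whatever text it is given, so only pass trusted strings.
     f <- function(a)
         function(x) eval(parse(text = paste("x", a)))
     # NA means "do not filter on this column"; TRUE is recycled to keep every row
     iA <- if(is.na(A)) TRUE else f(A)(data$A)
     iB <- if(is.na(B)) TRUE else f(B)(data$B)
     iC <- if(is.na(C)) TRUE else f(C)(data$C)
     # keep the rows that satisfy all active conditions
     data[iA & iB & iC, ]
}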
func(eg, "> 0", NA, NA)
func(data=eg, A="< 0", B="< 1", C="> 0")
Hope this helps,
Rui Barradas
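For comparison, here is a minimal sketch of the same idea without eval(parse()): the condition string is split into an operator and a number, and the operator is looked up with match.fun(). The names parse_cond and filter_rows are hypothetical, the NA defaults are an assumption made so that unused columns can simply be left out, and only numeric comparisons of the form "< 0" or ">= 1.5" are handled.

```r
# Hypothetical helper: turn a condition string such as "< 0" into a
# logical vector for the column x. NA (or NULL) means "no filter".
parse_cond <- function(x, spec) {
  if (is.null(spec) || is.na(spec)) return(rep(TRUE, length(x)))
  pattern <- "^[[:space:]]*(<=|>=|==|!=|<|>)[[:space:]]*([-+0-9.eE]+)[[:space:]]*$"
  m <- regmatches(spec, regexec(pattern, spec))[[1]]
  if (length(m) != 3) stop("cannot parse condition: ", spec)
  # m[2] is the operator, m[3] the threshold
  match.fun(m[2])(x, as.numeric(m[3]))
}

# Hypothetical wrapper with the fixed column names A, B and C from the question
filter_rows <- function(data, A = NA, B = NA, C = NA) {
  keep <- parse_cond(data$A, A) & parse_cond(data$B, B) & parse_cond(data$C, C)
  data[keep, , drop = FALSE]
}

set.seed(42)
eg <- data.frame(A = rnorm(10), B = rnorm(10), C = rnorm(10), D = rnorm(10))
res1 <- filter_rows(eg, A = "< 0", B = "< 1", C = "> 0")
res2 <- filter_rows(eg, A = "> 1", B = "> 0")   # C left out, so not filtered
```

Restricting the input to a known set of operators means a malformed string raises an error instead of being evaluated as arbitrary code.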
Jeff Newmiller
2012-Dec-06  14:57 UTC
[R] function to filter identical data.frames using less than (<) and greater than (>)
You have not indicated why the subset function is insufficient for your needs...
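For reference, a sketch of what the two extractions from the original example might look like with subset(), which evaluates the condition inside the data frame so the eg$ prefixes are not needed. The set.seed() call is an addition here, only to make the example reproducible.

```r
set.seed(1)
eg <- data.frame(A = rnorm(10), B = rnorm(10), C = rnorm(10), D = rnorm(10))

# Same rows as eg[eg$A < 0 & eg$B < 1 & eg$C > 0, ]
egsub <- subset(eg, A < 0 & B < 1 & C > 0)

# Same rows as eg[eg$A > 1 & eg$B > 0, ]
egsub2 <- subset(eg, A > 1 & B > 0)
```

Unlike indexing with [ ], subset() also drops rows where the condition evaluates to NA.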
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.