Eric Fail
2012-Mar-29  08:28 UTC
[R] subtract a list of vectors from a list of data.frames in an elegant way
Dear R experts,
I've realized that it might not be possible to define a negative SELCET
statement in a SQL call so now I'm looking for the smoothest way to generate
a list of what I would like from my large database by first pulling all the
names with a query like this "SELECT top 1 * FROM your_table" (thank
you Bart Joosen for the idea) and then subtract the variables I am not allow to
pull manually ending up with a 'positive' definition of what I want,
something I can use in a SQL SELCT statement (see my email on this list from
yesterday for more on that).
When I query the database for the variable names I get something similar to
'DBquery' in my working example below, but considerable longer with over
2400 hundred variables. As I only need to remove two or three variables I would
like to define a lookup table (like the list 'lookup' in my example) and
subtract that from my data base query. Now to my question. Is there a way I can
subtract one list from another? Like setoff or alike?
I would like to end up with a list like the one shown in my example called
'result.' In short, I would like to subtract 'lookup' from
'DBquery' and end up with 'result,' please note that
'result' is a list fo vecktors and not a list of dataframes. In my real
life example DBquery is considerable longer so defining that by hand would make
a really really long syntax.
Hope someone know some smart function that I can use to solve my problem in an
elegant way.
Thanks for reading.
Erick
###### begin R code ######
DBquery ? ? <- list(tableA=data.frame(id = numeric(0), atwin = numeric(0),
atrout = numeric(0)),
 ? ? ? ? ? ? ? ? ? ?tableB=data.frame(id = numeric(0), mq ? = numeric(0), z =
numeric(0), m = numeric(0)),
 ? ? ? ? ? ? ? ? ? ?tableC=data.frame(V1 = numeric(0), mfn ? = numeric(0), iiff
= numeric(0)),
 ? ? ? ? ? ? ? ? ? ?tableD=data.frame(id ? ?= numeric(0), msf ? = numeric(0),
oom ?= numeric(0)))
lookup ? ? <- list(tableA= ? c('atwin', 'atrout'), 
 ? ? ? ? ? ? ? ? ? tableB= ? ?c('m', 'z'),
 ? ? ? ? ? ? ? ? ? tableC= ? ?'ALL')
### ...
result ? ? <- list(tableA= c('id'),
 ? ? ? ? ? ? ? ? ? ?tableB= c('id', 'mq'),
 ? ? ? ? ? ? ? ? ? ?tableC= c('V1', 'mfn', 'iiff'))
Jim Holtman
2012-Mar-29  11:07 UTC
[R] subtract a list of vectors from a list of data.frames in an elegant way
?setdiff Sent from my iPad On Mar 29, 2012, at 4:28, "Eric Fail" <eric.fail at gmx.us> wrote:> Dear R experts, > > I've realized that it might not be possible to define a negative SELCET statement in a SQL call so now I'm looking for the smoothest way to generate a list of what I would like from my large database by first pulling all the names with a query like this "SELECT top 1 * FROM your_table" (thank you Bart Joosen for the idea) and then subtract the variables I am not allow to pull manually ending up with a 'positive' definition of what I want, something I can use in a SQL SELCT statement (see my email on this list from yesterday for more on that). > > When I query the database for the variable names I get something similar to 'DBquery' in my working example below, but considerable longer with over 2400 hundred variables. As I only need to remove two or three variables I would like to define a lookup table (like the list 'lookup' in my example) and subtract that from my data base query. Now to my question. Is there a way I can subtract one list from another? Like setoff or alike? > > I would like to end up with a list like the one shown in my example called 'result.' In short, I would like to subtract 'lookup' from 'DBquery' and end up with 'result,' please note that 'result' is a list fo vecktors and not a list of dataframes. In my real life example DBquery is considerable longer so defining that by hand would make a really really long syntax. > > Hope someone know some smart function that I can use to solve my problem in an elegant way. > > Thanks for reading. > > Erick > > ###### begin R code ###### > > DBquery <- list(tableA=data.frame(id = numeric(0), atwin = numeric(0), atrout = numeric(0)), > tableB=data.frame(id = numeric(0), mq = numeric(0), z = numeric(0), m = numeric(0)), > tableC=data.frame(V1 = numeric(0), mfn = numeric(0), iiff = numeric(0)), > tableD=data.frame(id = numeric(0), msf = numeric(0), oom = numeric(0))) > > lookup <- list(tableA= c('atwin', 'atrout'), > tableB= c('m', 'z'), > tableC= 'ALL') > > ### ... > > result <- list(tableA= c('id'), > tableB= c('id', 'mq'), > tableC= c('V1', 'mfn', 'iiff')) > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
Gabor Grothendieck
2012-Mar-31  21:56 UTC
[R] subtract a list of vectors from a list of data.frames in an elegant way
On Thu, Mar 29, 2012 at 4:28 AM, Eric Fail <eric.fail at gmx.us> wrote:> Dear R experts, > > I've realized that it might not be possible to define a negative SELCET statement in a SQL call so now I'm looking for the smoothest way to generate a list of what I would like from my large database by first pulling all the names with a query like this "SELECT top 1 * FROM your_table" (thank you Bart Joosen for the idea) and then subtract the variables I am not allow to pull manually ending up with a 'positive' definition of what I want, something I can use in a SQL SELCT statement (see my email on this list from yesterday for more on that). > > When I query the database for the variable names I get something similar to 'DBquery' in my working example below, but considerable longer with over 2400 hundred variables. As I only need to remove two or three variables I would like to define a lookup table (like the list 'lookup' in my example) and subtract that from my data base query. Now to my question. Is there a way I can subtract one list from another? Like setoff or alike? > > I would like to end up with a list like the one shown in my example called 'result.' In short, I would like to subtract 'lookup' from 'DBquery' and end up with 'result,' please note that 'result' is a list fo vecktors and not a list of dataframes. In my real life example DBquery is considerable longer so defining that by hand would make a really really long syntax. >One can use except in sqlite and in a number of other database systems: library(sqldf) # first two rows of BOD BOD12 <- BOD[1:2, ] # all rows of BOD except those in BOD12 sqldf("select * from BOD except (select * from BOD12)") -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com