Is there a cheat-sheet anywhere that describes how to do SQL-like manipulations on a data frame?

My knowledge of R is rather limited. But from my experience it seems as though one can think of data frames as being similar to tables in a database: there are rows, columns, and values. Also, one can perform similar manipulations on a data frame as one can on a table. For example:

    select * from foo where bar < 10 ;

is similar to

    foo[foo["bar"] < 10,]

I'm just wondering how many other SQL-like manipulations can be done on a data frame? As an extreme example, is it reasonable to assume there is an R equivalent to:

    select bar, bat, baz, baz*100 as 'pctbaz' from foo where bar like %xyz% order by bat, baz desc ;

Regards,
- Robert
http://www.cwelug.org/downloads
Help others get OpenSource software. Distribute FLOSS for Windows, Linux, *BSD, and MacOS X with BitTorrent
This goes the other way - all SQL manipulations are a subset of what can be done with R. Read up on indexing and see ?merge, ?aggregate, ?by, ?tapply, among others. (For the R equivalent to your query, check ?grep and ?order, and search the list if needed.) Also, this example might be a good start:

    gby <- function(var, BY, byname = "BY") {
      # summarize() comes from Hmisc; you need to install Hmisc
      if (!exists("summarize")) library(Hmisc)
      grouped <- summarize(var, BY, function(x) {
        c(count = length(x), min = min(x), max = max(x), mean = mean(x))
      })
      colnames(grouped) <- c(byname, "count", "min", "max", "mean")
      grouped
    }
    #---------------
    x <- rnorm(1000)
    state <- sample(c("A", "B", "C", "D"), 1000, replace = TRUE)
    city <- sample(1:5, 1000, replace = TRUE)
    gby(x, paste(state, city, sep = "-"), "State-City")

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Robert Citek
> Sent: Thursday, May 04, 2006 6:56 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] SQL like manipulations on data frames
>
> Is there a cheat-sheet anywhere that describes how to do SQL-like
> manipulations on a data frame?
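As a rough illustration of the ?grep, ?order and ?aggregate pointers above, here is a minimal base-R sketch of the "extreme" query from the original question, followed by a GROUP BY style summary. The sample data in `foo`, `x` and `grp` are made up for the example, and `bat` and `baz` are assumed to be numeric; this is one way to do it, not the only one.

```r
# Hypothetical sample data; only the column names come from the thread.
foo <- data.frame(
  bar = c("axyzb", "xyz", "abc", "qxyz"),
  bat = c(2, 1, 1, 1),
  baz = c(0.5, 0.2, 0.9, 0.7),
  stringsAsFactors = FALSE
)

# WHERE bar LIKE '%xyz%': grepl() gives a logical vector;
# fixed = TRUE matches the literal substring, not a regular expression.
hits <- foo[grepl("xyz", foo$bar, fixed = TRUE), ]

# SELECT bar, bat, baz, baz*100 AS pctbaz
result <- data.frame(hits[, c("bar", "bat", "baz")], pctbaz = hits$baz * 100)

# ORDER BY bat, baz DESC: negate a numeric column to sort it descending.
result <- result[order(result$bat, -result$baz), ]
rownames(result) <- NULL
print(result)

# SELECT grp, count(*), avg(x) FROM ... GROUP BY grp, with aggregate()
x   <- c(1, 2, 3, 4, 5, 6)
grp <- c("A", "A", "B", "B", "B", "C")
counts <- aggregate(x, by = list(group = grp), FUN = length)
means  <- aggregate(x, by = list(group = grp), FUN = mean)
print(merge(counts, means, by = "group", suffixes = c(".count", ".mean")))
```

The final merge() plays the role of a join on the `group` column, so the counts and means end up side by side in one data frame.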
Robert,

In addition to the functions already suggested, see ?subset and ?with.

Try also the package data.table. It does nothing new, it appears, but just makes it a bit easier to call subset(). So for example:

"select * from foo where bar < 10"

    foo[foo["bar"] < 10,]    (your R code)
    foo[bar<10]              (if foo is a data.table)

"select bar, bat, baz, baz*100 as 'pctbaz' from foo where bar like %xyz% order by bat, baz desc" would be:

    ans = foo[grep('xyz',bar), c("bar","bat","baz")][order(bat, -baz)]
    ans$pctbaz = ans$baz*100

(order() sorts ascending; negating the numeric column baz gives the descending sort.)

Regards,
Mark

Date: Fri, 5 May 2006 14:43:00 -0400
From: "bogdan romocea" <br44114@gmail.com>
Subject: Re: [R] SQL like manipulations on data frames

> This goes the other way - all SQL manipulations are a subset of what
> can be done with R. Read up on indexing and see ?merge, ?aggregate,
> ?by, ?tapply, among others.
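To make the ?subset and ?with suggestion concrete, here is a small base-R sketch of the simple "where bar < 10" query. The data frame contents are invented for illustration; only the names `foo`, `bar`, `bat` and `baz` come from the thread.

```r
# Hypothetical sample data.
foo <- data.frame(
  bar = c(5, 12, 8, 20),
  bat = c("a", "b", "c", "d"),
  baz = c(0.1, 0.2, 0.3, 0.4),
  stringsAsFactors = FALSE
)

# select * from foo where bar < 10
small <- subset(foo, bar < 10)

# The same rows by plain indexing, as in the original post.
small_idx <- foo[foo$bar < 10, ]

# select bar, baz from foo where bar < 10
picked <- subset(foo, bar < 10, select = c(bar, baz))

# select avg(baz) from foo where bar < 10
# with() evaluates the expression using the columns of foo as variables.
mean_baz <- with(foo, mean(baz[bar < 10]))

print(small)
print(picked)
print(mean_baz)
```

One difference worth knowing: subset() drops rows where the condition is NA, while plain logical indexing returns a row of NAs for them.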
Do you know the RSQLite package? It uses the DBI package, which gives a common interface to various DB engines. With it, you can explicitly treat data.frames as tables and execute SQL queries on them.

Antonio, Fabio Di Narzo.

2006/5/5, Robert Citek <rwcitek@alum.calberkeley.org>:
> Is there a cheat-sheet anywhere that describes how to do SQL-like
> manipulations on a data frame?
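A minimal sketch of that approach, assuming the DBI and RSQLite packages are installed (the code is skipped otherwise). The data frame is first copied into an in-memory SQLite database, and the query from the original question then runs as ordinary SQL. The sample data and the table name "foo" are made up for the example.

```r
# Hypothetical sample data; only the column names come from the thread.
foo <- data.frame(
  bar = c("axyzb", "xyz", "abc", "qxyz"),
  bat = c(2, 1, 1, 1),
  baz = c(0.5, 0.2, 0.9, 0.7),
  stringsAsFactors = FALSE
)

# Assumption: DBI and RSQLite are installed; otherwise this block does nothing.
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  # Open a throwaway in-memory database.
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

  # Copy the data frame into a table named "foo".
  DBI::dbWriteTable(con, "foo", foo)

  # Run the SQL and get the result back as a data frame.
  res <- DBI::dbGetQuery(con, "
    SELECT bar, bat, baz, baz * 100 AS pctbaz
    FROM foo
    WHERE bar LIKE '%xyz%'
    ORDER BY bat, baz DESC
  ")
  print(res)

  DBI::dbDisconnect(con)
}
```

Note that the data frame is copied into the database rather than queried in place, so later changes to `foo` in R are not visible to SQL until the table is written again.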