Is there a cheat-sheet anywhere that describes how to do SQL-like
manipulations on a data frame?

My knowledge of R is rather limited. But from my experience it seems
as though one can think of data frames as being similar to tables in
a database: there are rows, columns, and values. Also, one can
perform similar manipulations on a data frame as one can on a table.
For example:

  select * from foo where bar < 10 ;

is similar to

  foo[foo["bar"] < 10,]

I'm just wondering how many other SQL-like manipulations can be done
on a data frame? As an extreme example, is it reasonable to assume
there is an R equivalent to:

  select bar, bat, baz, baz*100 as 'pctbaz' from foo
  where bar like %xyz% order by bat, baz desc ;

Regards,
- Robert
http://www.cwelug.org/downloads
Help others get OpenSource software. Distribute FLOSS
for Windows, Linux, *BSD, and MacOS X with BitTorrent

This goes the other way - all SQL manipulations are a subset of what
can be done with R. Read up on indexing and see ?merge, ?aggregate,
?by, ?tapply, among others. (For the R equivalent to your query, check
?grep and ?order, and search the list if needed.) Also, this example
might be a good start:

gby <- function(var, BY, byname = "BY")
{
  if (!exists("summarize")) library(Hmisc)  # you need to install Hmisc
  grouped <- summarize(var, BY, function(x) {
    c(count = length(x), min = min(x), max = max(x), mean = mean(x))
  })
  colnames(grouped) <- c(byname, "count", "min", "max", "mean")
  grouped
}
#---------------
x <- rnorm(1000)
state <- sample(c("A", "B", "C", "D"), 1000, replace = TRUE)
city <- sample(1:5, 1000, replace = TRUE)
gby(x, paste(state, city, sep = "-"), "State-City")
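
As a rough sketch of how the help pages mentioned above map onto SQL,
the base R snippet below pairs merge() with a JOIN and aggregate() and
tapply() with GROUP BY. The orders and customers tables and their
column names are invented for illustration; they are not from the
original post.

```r
# Hypothetical tables, made up for this example.
orders <- data.frame(
  id     = 1:6,
  cust   = c("a", "b", "a", "c", "b", "a"),
  amount = c(10, 20, 30, 40, 50, 60),
  stringsAsFactors = FALSE
)
customers <- data.frame(
  cust  = c("a", "b", "c"),
  state = c("MO", "IL", "MO"),
  stringsAsFactors = FALSE
)

# SQL: select * from orders join customers using (cust)
joined <- merge(orders, customers, by = "cust")

# SQL: select state, sum(amount) from joined group by state
totals <- aggregate(amount ~ state, data = joined, FUN = sum)

# SQL: select cust, count(*) from orders group by cust
counts <- tapply(orders$amount, orders$cust, length)

print(totals)
print(counts)
```

merge() does an inner join by default; its all, all.x and all.y
arguments give the outer-join variants.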
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Robert Citek
> Sent: Thursday, May 04, 2006 6:56 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] SQL like manipulations on data frames
>
> Is there a cheat-sheet anywhere that describes how to do SQL-like
> manipulations on a data frame? [...]

Robert,
In addition to the functions already suggested, see ?subset and ?with.
Also try the data.table package. It does not appear to do anything new;
it just makes calls like subset() a bit easier to write. For example:

"select * from foo where bar < 10"

  foo[foo["bar"] < 10,]     (your R code)
  foo[bar<10]               (if foo is a data.table)

"select bar, bat, baz, baz*100 as 'pctbaz' from foo where bar like
%xyz% order by bat, baz desc"

would be:

  ans = foo[grep('xyz',bar), c("bar","bat","baz")][desc(bat,baz)]
  ans$pctbaz = ans$baz*100
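
For comparison, here is a minimal sketch of the same query in base R
alone, using grepl() for the LIKE clause and order() for ORDER BY, as
the ?grep and ?order pointers earlier in the thread suggest. The
contents of foo are made up for illustration, and negating baz inside
order() for the descending sort assumes baz is numeric.

```r
# Hypothetical data; only the column names come from the original query.
foo <- data.frame(
  bar = c("axyzb", "xyz", "abc", "xyzzy"),
  bat = c(2, 1, 1, 1),
  baz = c(0.5, 0.2, 0.9, 0.7),
  stringsAsFactors = FALSE
)

# where bar like %xyz%  ->  keep rows whose bar contains the literal "xyz"
hits <- foo[grepl("xyz", foo$bar, fixed = TRUE), c("bar", "bat", "baz")]

# baz*100 as 'pctbaz'
hits$pctbaz <- hits$baz * 100

# order by bat, baz desc  ->  bat ascending, baz descending
ans <- hits[order(hits$bat, -hits$baz), ]
print(ans)

# The simpler filter "select * from foo where bar < 10" can also be
# written with subset(), shown here on the numeric column bat.
small <- subset(foo, bat < 2)
```

subset() evaluates its condition inside the data frame, which is why
bat can be named without the foo$ prefix.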
Regards,
Mark

Message: 37
Date: Fri, 5 May 2006 14:43:00 -0400
From: "bogdan romocea" <br44114@gmail.com>
Subject: Re: [R] SQL like manipulations on data frames
To: rwcitek@alum.calberkeley.org
Cc: r-help <R-help@stat.math.ethz.ch>

> This goes the other way - all SQL manipulations are a subset of what
> can be done with R. [...]

Do you know the RSQLite package? It uses the DBI package, which gives a
common interface to various DB engines. With it, you can explicitly
treat data.frames as tables and execute SQL queries on them.

Antonio, Fabio Di Narzo.
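
A minimal sketch of that approach, assuming the DBI and RSQLite
packages are installed: copy a data frame into an in-memory SQLite
database and run the original query against it. The contents of foo
are invented for illustration, and the LIKE pattern is quoted as SQL
requires.

```r
library(DBI)

# Hypothetical data; only the column names come from the original query.
foo <- data.frame(
  bar = c("axyzb", "xyz", "abc", "xyzzy"),
  bat = c(2, 1, 1, 1),
  baz = c(0.5, 0.2, 0.9, 0.7),
  stringsAsFactors = FALSE
)

# An in-memory database that disappears when the connection is closed.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Expose the data frame as a table named "foo".
dbWriteTable(con, "foo", foo)

# Run the query; the result comes back as an ordinary data frame.
res <- dbGetQuery(con, "
  select bar, bat, baz, baz*100 as pctbaz
  from foo
  where bar like '%xyz%'
  order by bat, baz desc
")
print(res)

dbDisconnect(con)
```

Because the result is a plain data frame, it can be passed straight
back to indexing, merge() or aggregate().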