My late friend Morven Gentleman, not long after he stepped down from being chair
of Computer Science at Waterloo, said that it seemed computer scientists had to
create a new computer language for every new problem they encountered.
If we could use least squares to measure this approximation, we'd likely be
suspicious of a terribly small error measure or overly high R^2.
JN
On 2024-12-11 11:11, avi.e.gross at gmail.com wrote:
> Akshay,
>
> Your question has way too many answers.
>
> SQL has a long history and early versions came long before R arrived on the
> scene. There is a huge embedded base of hardware and software dedicated to
> managing databases. It has some features that most R programs do not even
> dream of doing. Besides easily handling massive amounts of data or sometimes
> tweaking queries to possibly run more efficiently, there are all kinds of
> issues of how to manage multiple people accessing and changing the data at
> about the same time, or rolling the data back to an earlier checkpoint.
>
> R came along later and, as Ben pointed out, adds all kinds of things SQL
> does not have and likely does not need, or alternate ways to do things.
>
> For many people now, the workflow is to use a programming language (and R is
> not the only one used) which has been enhanced with packages or modules that
> allow access in a fairly general way to one or many databases running
> various versions of SQL. The programmer uses this API in many ways.
>
> In some ways, it is just a way to tell the database what to do without much
> other processing. You can ask to open a connection to the server, do a query
> that gets translated to SQL (or you can provide the actual SQL) and let the
> remote (or local) machine do much of the work. For example, imagine a
> database with terabytes of data and all you want is a few rows/columns that
> meet your query. In R, you might have to open a collection of huge CSV files
> and fill more memory than you have and do the query somehow. If the data is
> remote, we are talking about receiving a huge amount of data. Using SQL
> divides the work so you do parts here and parts there.
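A minimal sketch of the pattern described above (open a connection, send a
query, let the database engine do the filtering), written in Python with the
standard-library sqlite3 module only so that it runs without a server. The
table name `readings`, its columns, the sample rows, and the threshold are all
made up for illustration. With MySQL or another server the connection call
would differ, but the overall shape would likely be similar.

```python
import sqlite3


def fetch_hot_readings(conn, threshold):
    """Ask the database for only the rows above `threshold`.

    The WHERE clause runs inside the database engine, so only the
    matching rows are handed back to the calling program.
    """
    cur = conn.execute(
        "SELECT site, temp FROM readings WHERE temp > ? ORDER BY temp DESC",
        (threshold,),
    )
    return cur.fetchall()


# Hypothetical stand-in for a large database: a tiny in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (site TEXT, temp REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("A", 12.5), ("B", 31.0), ("C", 8.0), ("D", 35.5)],
)
conn.commit()

hot = fetch_hot_readings(conn, 30.0)
print(hot)  # [('D', 35.5), ('B', 31.0)]
```

Only the two matching rows cross from the database into the program. The
CSV alternative would mean reading every row first and filtering afterwards.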
>
> Why use a local MySQL? Part of the answer is that you have a fairly
> optimized and debugged system that does it well and lets the programmer
> focus on the parts they need to add within R like complex analyses. Part is
> portability, as you can later move the data outside your machine and with
> minor changes, your program should still work. And, there are many other
> scenarios such as wanting to gather data from different sources such as
> connecting to multiple remote databases and getting filtered data and doing
> an analysis across that data and perhaps updating them.
>
> R used in ways like this provides lots of flexibility. But part of the
> question is like asking why there are a hundred programming languages still
> in use out there. Why do we need so many? In short, we don't necessarily
> need all or even most of them but they are there because various people
> developed them and used them and it is not trivial to get people to switch
> and maybe abandon all the older software or try to rewrite it.
>
> Having said that, I think a large fraction of R users have never had any
> particular reason to learn SQL. Many have never used it directly or even
> indirectly. I know someone I have programmed for who calls some expert
> to do a SQL query and save the results in CSV files and then works directly
> in R on those files. I have pointed out to them that their life could be
> even easier if they got a more focused dump of the SQL data with some of the
> added processing done in SQL and then a smaller amount of data coming into
> the R side.
>
> I also note that languages like R and Python can have parts that run fairly
> slowly. Arguably, most versions of SQL have been tuned over decades ...
>
>
> -----Original Message-----
> From: R-help <r-help-bounces at r-project.org> On Behalf Of akshay kulkarni
> Sent: Wednesday, December 11, 2024 8:17 AM
> To: R help Mailing list <r-help at r-project.org>
> Subject: [R] SQL and R
>
> Dear Members,
> I have recently started studying SQL and MySQL.
> My question is, what exactly is SQL used for? That is, whatever can be done
> by SQL, like subsetting and filtering of data sets, can also be done by R.
> What, then, is the advantage of SQL? It is OK if you tag this question as
> off-topic, but I couldn't find any info on the web. Can you please refer me
> to some online resources that shed some light on this? Finally, how does SQL
> complement R? Are both dependent?
>
> Thanking you,
> Yours sincerely,
> AKSHAY M KULKARNI
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> https://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.