From my point of view, the logic is this:
If the external database is Oracle, use ROracle
If the external database is MySQL, use RMySQL
and similarly for other databases
If there is no R package specific to the database, then you drop back to RODBC
or RJDBC. Hopefully you can get the necessary drivers or java files to support
the database
Your steps look good (I do them all the time with Oracle and MySQL), and realize
that you don't have to grab an entire table; you can send SQL queries that
join tables and subset rows, etc. You can also write results back to the
database if that's useful.
I prefer to use packages that are based on the DBI package.
-Don
--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
?On 7/25/18, 7:57 AM, "R-help on behalf of Doran, Harold"
<r-help-bounces at r-project.org on behalf of HDoran at air.org> wrote:
I'm doing some work now to learn which SQL database package is the most
optimal for the task I am working on. There are many packages, and I'm
reviewing the documentation on some of them now. I am seeking advice from those
of you who might suggest a package to use for the task I am currently working
with.
The work is currently as follows. My users currently use another tool to
extract tables from a server, save those tables as .csv files, and then those
csv files are read into R and stuff is done on the data in those files. This
adds overhead that can be bypassed if users instead can directly access the
database from within R and grab the tables they need and then those tables are
data frames in the R session and available to do stuff.
The sequence of work (I think) I just this:
Step 1: Connect to the remote server (connection string and authenticate the
user)
Step 2: Have a SQL query statement that grabs the tables from the remote
server
Step 3: Close the connection
The two packages I have narrowed my studies to are Dbplyr and RODBC, both of
which seem to be similar.
Any experiences out there to suggest these two packages are in fact right
for this task, or would there be other packages that might be more optimal for
this?
Thanks,
Harold
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Thanks for this. I'm using the RODBC stuff now. It works well and is
currently embedded in a shiny app. So, the entire SQL stuff is transparent to
the user who simply interacts with the UI. It appears to be working in a local
windows version. That is, I can successfully open the connection, do my
sqlQuery, and save those data as objects in the R session.
But when I run the same code on my dev server (which runs Centos 7), the code is
breaking and it is seemingly related to the driver. It just cannot open the
connection. That portion of my code is (with certain things blanked out for
security):
cn <- odbcDriverConnect(connection="Driver={SQL Server Native Client
11.0};
server=1.1.1.1;
database=xyz;
uid=*****;
pwd=***;"
)
I'm doing my homework now on the right drivers that might be appropriate for
centos, but if anyone happens to know, hints are appreciated
Harold
-----Original Message-----
From: MacQueen, Don [mailto:macqueen1 at llnl.gov]
Sent: Thursday, July 26, 2018 11:26 AM
To: Doran, Harold <HDoran at air.org>; 'r-help at r-project.org'
<r-help at r-project.org>
Subject: Re: [R] SQL Database
From my point of view, the logic is this:
If the external database is Oracle, use ROracle
If the external database is MySQL, use RMySQL and similarly for other
databases
If there is no R package specific to the database, then you drop back to RODBC
or RJDBC. Hopefully you can get the necessary drivers or java files to support
the database
Your steps look good (I do them all the time with Oracle and MySQL), and realize
that you don't have to grab an entire table; you can send SQL queries that
join tables and subset rows, etc. You can also write results back to the
database if that's useful.
I prefer to use packages that are based on the DBI package.
-Don
--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
?On 7/25/18, 7:57 AM, "R-help on behalf of Doran, Harold"
<r-help-bounces at r-project.org on behalf of HDoran at air.org> wrote:
I'm doing some work now to learn which SQL database package is the most
optimal for the task I am working on. There are many packages, and I'm
reviewing the documentation on some of them now. I am seeking advice from those
of you who might suggest a package to use for the task I am currently working
with.
The work is currently as follows. My users currently use another tool to
extract tables from a server, save those tables as .csv files, and then those
csv files are read into R and stuff is done on the data in those files. This
adds overhead that can be bypassed if users instead can directly access the
database from within R and grab the tables they need and then those tables are
data frames in the R session and available to do stuff.
The sequence of work (I think) I just this:
Step 1: Connect to the remote server (connection string and authenticate the
user)
Step 2: Have a SQL query statement that grabs the tables from the remote
server
Step 3: Close the connection
The two packages I have narrowed my studies to are Dbplyr and RODBC, both of
which seem to be similar.
Any experiences out there to suggest these two packages are in fact right
for this task, or would there be other packages that might be more optimal for
this?
Thanks,
Harold
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Harold,
I don't have much experience with ODBC/RODBC, but given that it's
working on Win, a driver problem seems plausible.
-Don
--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
?On 7/26/18, 9:37 AM, "Doran, Harold" <HDoran at air.org> wrote:
Thanks for this. I'm using the RODBC stuff now. It works well and is
currently embedded in a shiny app. So, the entire SQL stuff is transparent to
the user who simply interacts with the UI. It appears to be working in a local
windows version. That is, I can successfully open the connection, do my
sqlQuery, and save those data as objects in the R session.
But when I run the same code on my dev server (which runs Centos 7), the
code is breaking and it is seemingly related to the driver. It just cannot open
the connection. That portion of my code is (with certain things blanked out for
security):
cn <- odbcDriverConnect(connection="Driver={SQL Server Native Client
11.0};
server=1.1.1.1;
database=xyz;
uid=*****;
pwd=***;"
)
I'm doing my homework now on the right drivers that might be appropriate
for centos, but if anyone happens to know, hints are appreciated
Harold
-----Original Message-----
From: MacQueen, Don [mailto:macqueen1 at llnl.gov]
Sent: Thursday, July 26, 2018 11:26 AM
To: Doran, Harold <HDoran at air.org>; 'r-help at
r-project.org' <r-help at r-project.org>
Subject: Re: [R] SQL Database
From my point of view, the logic is this:
If the external database is Oracle, use ROracle
If the external database is MySQL, use RMySQL and similarly for other
databases
If there is no R package specific to the database, then you drop back to
RODBC or RJDBC. Hopefully you can get the necessary drivers or java files to
support the database
Your steps look good (I do them all the time with Oracle and MySQL), and
realize that you don't have to grab an entire table; you can send SQL
queries that join tables and subset rows, etc. You can also write results back
to the database if that's useful.
I prefer to use packages that are based on the DBI package.
-Don
--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
On 7/25/18, 7:57 AM, "R-help on behalf of Doran, Harold"
<r-help-bounces at r-project.org on behalf of HDoran at air.org> wrote:
I'm doing some work now to learn which SQL database package is the
most optimal for the task I am working on. There are many packages, and I'm
reviewing the documentation on some of them now. I am seeking advice from those
of you who might suggest a package to use for the task I am currently working
with.
The work is currently as follows. My users currently use another tool to
extract tables from a server, save those tables as .csv files, and then those
csv files are read into R and stuff is done on the data in those files. This
adds overhead that can be bypassed if users instead can directly access the
database from within R and grab the tables they need and then those tables are
data frames in the R session and available to do stuff.
The sequence of work (I think) I just this:
Step 1: Connect to the remote server (connection string and authenticate
the user)
Step 2: Have a SQL query statement that grabs the tables from the remote
server
Step 3: Close the connection
The two packages I have narrowed my studies to are Dbplyr and RODBC,
both of which seem to be similar.
Any experiences out there to suggest these two packages are in fact
right for this task, or would there be other packages that might be more optimal
for this?
Thanks,
Harold
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.