Stephan Lindner
2009-Mar-30 16:19 UTC
[R] Importing csv file with character values into sqlite3 and subsequent problem in R / RSQLite
Dear all, I'm trying to import a csv file into sqlite3 and from there into R. Everything looks fine exepct that R outputs the character values in an odd fashion: they are shown as "\"CHARACTER\"" instead of "CHARACTER", but only if I show the character variable as a vector. Does someone know why this happens? Below is a sample code. The first part is written in bash. Of course I could just read.csv for the spreadsheet, but the real datasets are more than 3 GB, that's why I'm using RSQLite (which is really awesome!). Also, I could get rid of the "" in the csv file (the csv file has only numbers, but it is easier for my to use identifiers such as v1 as character strings), but I thought I'd first see whether there is a different way to solve this issue. Thanks! Stephan <-- bash$ more example.csv bash$ echo -e "\"001074034\",90,1,7,89,12\n\"001074034\",90,1,1,90,12\n\"001074034\",90,1,2,90,12\n\"001074034\",90,1,3,90,12" > example.csv bash$ echo "create table t(v1,v2,v3,v4,v5,v6);" > example.sql bash$ sqlite3 example.db < example.sql bash$ echo -e ".separator , \n.import example.csv t" | sqlite3 example.db bash$ R> library(RSQLite)Loading required package: DBI> example.db <- dbConnect(SQLite(),"example.db") > x <- dbGetQuery(example.db,"select * from t") > xv1 v2 v3 v4 v5 v6 1 "001074034" 90 1 7 89 12 2 "001074034" 90 1 1 90 12 3 "001074034" 90 1 2 90 12 4 "001074034" 90 1 3 90 12> x$v1[1] "\"001074034\"" "\"001074034\"" "\"001074034\"" "\"001074034\"" --> Only the codes: <-- more example.csv echo -e "\"001074034\",90,1,7,89,12\n\"001074034\",90,1,1,90,12\n\"001074034\",90,1,2,90,12\n\"001074034\",90,1,3,90,12" > example.csv echo "create table t(v1,v2,v3,v4,v5,v6);" > example.sql sqlite3 example.db < example.sql echo -e ".separator , \n.import example.csv t" | sqlite3 example.db R library(RSQLite) example.db <- dbConnect(SQLite(),"example.db") x <- dbGetQuery(example.db,"select * from t") x x$v1 --> -- ----------------------- Stephan Lindner University of Michigan
Gabor Grothendieck
2009-Mar-30 18:51 UTC
[R] Importing csv file with character values into sqlite3 and subsequent problem in R / RSQLite
There are some examples of reading files into sqlite on the sqldf home page: http://sqldf.googlecode.com On Mon, Mar 30, 2009 at 12:19 PM, Stephan Lindner <lindners at umich.edu> wrote:> Dear all, > > > I'm trying to import a csv file into sqlite3 and from there into > R. Everything looks fine exepct that R outputs the character values in > an odd fashion: they are shown as "\"CHARACTER\"" instead of > "CHARACTER", but only if I show the character variable as a > vector. Does someone know why this happens? Below is a sample > code. The first part is written in bash. Of course I could just > read.csv for the spreadsheet, but the real datasets are more than 3 > GB, that's why I'm using RSQLite (which is really awesome!). Also, I > could get rid of the "" in the csv file (the csv file has only > numbers, but it is easier for my to use identifiers such as v1 as > character strings), but I thought I'd first see whether there is a > different way to solve this issue. > > > Thanks! > > > ? ? ? ?Stephan > > > <-- > > bash$ more example.csv > bash$ echo -e "\"001074034\",90,1,7,89,12\n\"001074034\",90,1,1,90,12\n\"001074034\",90,1,2,90,12\n\"001074034\",90,1,3,90,12" > example.csv > bash$ echo "create table t(v1,v2,v3,v4,v5,v6);" > example.sql > bash$ sqlite3 example.db < example.sql > bash$ echo -e ".separator , \n.import example.csv t" | sqlite3 example.db > bash$ R >> library(RSQLite) > Loading required package: DBI >> example.db <- dbConnect(SQLite(),"example.db") >> x <- dbGetQuery(example.db,"select * from t") >> x > ? ? ? ? ? v1 v2 v3 v4 v5 v6 > 1 "001074034" 90 ?1 ?7 89 12 > 2 "001074034" 90 ?1 ?1 90 12 > 3 "001074034" 90 ?1 ?2 90 12 > 4 "001074034" 90 ?1 ?3 90 12 > >> x$v1 > ?[1] "\"001074034\"" "\"001074034\"" "\"001074034\"" "\"001074034\"" > > --> > > > Only the codes: > > > <-- > > more example.csv > echo -e "\"001074034\",90,1,7,89,12\n\"001074034\",90,1,1,90,12\n\"001074034\",90,1,2,90,12\n\"001074034\",90,1,3,90,12" > example.csv > echo "create table t(v1,v2,v3,v4,v5,v6);" > example.sql > sqlite3 example.db < example.sql > echo -e ".separator , \n.import example.csv t" | sqlite3 example.db > R > > library(RSQLite) > example.db <- dbConnect(SQLite(),"example.db") > x <- dbGetQuery(example.db,"select * from t") > x > x$v1 > > --> > > > > > -- > ----------------------- > Stephan Lindner > University of Michigan > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >