Note: I did raise report the issue below to r-sig-db at r-project.org, but didn't see any reply. I'm hoping somebody on r-help can help me devise a workaround for a problem I'm having with RODB: I use RODBC to read and write a good deal of data to SQL Server and I'd be extremely grateful if anyone has found a workaround in order to be able to write dataframes to SQL Server using RODBC dynamically created SQL tables and read the data from those tables, or indeed any arbitrary SQL Server table with "float" datatypes and end up with numeric columns instead of "factor" columns in a dataframe in R. I have found that when RODBC creates a Microsoft SQL Server data table from a dataFrame using sqlSave(....append=FALSE), RODBC uses the SQL "float" datatype to store R numeric data in a dynamically-created table on the server. However, when RODBC reads any SQL Server "float" datatype from SQL Server via sqlQuery it interprets float columns as "factor" data. I created a standalone sample below to demonstrate the odd behavior of RODBC that I hope to overcome: # Assuming the reader has access to SQL Server the code below is self-contained and repeatable # I believe it demonstrates unexpected and undesirable behavior in RODBC library(RODBC) library(fPortfolio) library(timeSeries) head(SWX.RET$SBI) str(SWX.RET$SBI) mydata<-as.timeSeries(SWX.RET) head(mydata) df2beSavedByRODBC =as.data.frame(mydata) str(df2beSavedByRODBC) # shows the numeric data in the dataframe # # data.frame': 1916 obs. of 6 variables: # $ SBI : num -0.002088 -0.000105 -0.00136 0.000419 0 ... # $ SPI : num -0.03439 -0.01041 0.01212 0.02246 0.00211 ... ... # Let's save the dataframe to SQL Server: dbconn<-odbcDriverConnect(connection="Driver={SQL Server};server=_YOURSERVERNAMEHER_;database=_YOURDBNAME_;Trusted_Connection=True;"); sqlSave(channel=dbconn,dat=df2beSavedByRODBC,tablename="testTable",rownames=TRUE,append=FALSE,addPK=FALSE,verbose=FALSE) # The sqlSave above works very well. The new table is create in the Microsoft SQL database and the ddl for the table is: # # [dbo].[testTable]( # [rownames] [varchar](255) NULL, # [SBI] [float] NULL, # [SPI] [float] NULL, # [SII] [float] NULL, # [LP25] [float] NULL, # [LP40] [float] NULL, # [LP60] [float] NULL # ) # The numeric values from the dataframe are stored as float (i.e. numeric) in SQL server -- good! ## now let's read back the data RODBC stored in SQL server from a SQL table RODBC created: sqlString = "select * from testTable" dataFrameFromDB = sqlQuery(dbconn,sqlString,errors=TRUE); str(dataFrameFromDB) # # 'data.frame': 1916 obs. of 7 variables: # $ rownames: Factor w/ 1916 levels "2000-01-04","2000-01-05",..: 1 2 3 4 5 6 7 8 9 10 ... # $ SBI : Factor w/ 1742 levels "-0.00041080415489958",..: 349 42 161 1418 828 48 49 1419 1024 135 ... # $ SPI : Factor w/ 1848 levels "-0.0020169904194276",..: 445 48 970 883 1187 377 1157 1065 951 1840 ... ... #********* RODBC wrote numeric data to SQL Server as float, but read the same data back as Factor ! ******** I could use some help to create a robust and flexible workaround for RODBC's asymmetric treatment of numeric data. If there were some way to force RODBC sqlQuery to interpret all SQL Server float datatypes as numeric my problem would be solved. FWIW: RODBC does interpret the SQL Server "real" datatype as numeric. Thank you, Andrew [[alternative HTML version deleted]]
Bos, Roger
2014-Oct-03 18:04 UTC
[R] Workaround for RODBC asymmetric numeric data treatment
Andrew, I ran your code using my SQL Server database and it seems like it worked okay for me, in that I end up with "num" data types when I read the data back in. So it may be a setting on your database. I don't claim to know which one. BTW, I had to install 5 or 6 separate packages to get fPortfolio to load. Anyone know why install.packages("fPortfolio",repos="http://R-Forge.R-project.org") can't install all the dependencies automatically? Thanks, Roger> library(RODBC)> library(fPortfolio)Loading required package: timeSeries Attaching package: ?timeSeries? The following object is masked from ?package:zoo?: time<- Loading required package: fBasics Rmetrics Package fBasics Analysing Markets and calculating Basic Statistics Copyright (C) 2005-2014 Rmetrics Association Zurich Educational Software for Financial Engineering and Computational Science Rmetrics is free software and comes with ABSOLUTELY NO WARRANTY. https://www.rmetrics.org --- Mail to: info at rmetrics.org Attaching package: ?fBasics? The following object is masked from ?package:TTR?: volatility Loading required package: fAssets Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) : there is no package called ?DEoptimR? Error: package ?fAssets? could not be loaded> source(.trPaths[5], echo=TRUE, max.deparse.length=150)> library(RODBC)> library(fPortfolio)Loading required package: fAssets Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : there is no package called ?sn? Error: package ?fAssets? could not be loaded> library(RODBC) > library(fPortfolio)Loading required package: fAssets Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : there is no package called ?sn? Error: package ?fAssets? could not be loaded> library(timeSeries) > head(SWX.RET$SBI)Error in head(SWX.RET$SBI) : error in evaluating the argument 'x' in selecting a method for function 'head': Error: object 'SWX.RET' not found> library(fPortfolio)Loading required package: fAssets Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : there is no package called ?sn? Error: package ?fAssets? could not be loaded> library(fPortfolio)Loading required package: fAssets Rmetrics Package fAssets Analysing and Modeling Financial Assets Copyright (C) 2005-2014 Rmetrics Association Zurich Educational Software for Financial Engineering and Computational Science Rmetrics is free software and comes with ABSOLUTELY NO WARRANTY. https://www.rmetrics.org --- Mail to: info at rmetrics.org Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : there is no package called ?slam? Error: package or namespace load failed for ?fPortfolio?> library(fPortfolio)Package Rsolnp (1.14) loaded. To cite, see citation("Rsolnp") Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : there is no package called ?kernlab? Error: package or namespace load failed for ?fPortfolio?> library(fPortfolio)Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : there is no package called ?rneos? Error: package or namespace load failed for ?fPortfolio?> library(fPortfolio)Rmetrics Package fPortfolio Portfolio Optimization Copyright (C) 2005-2014 Rmetrics Association Zurich Educational Software for Financial Engineering and Computational Science Rmetrics is free software and comes with ABSOLUTELY NO WARRANTY. https://www.rmetrics.org --- Mail to: info at rmetrics.org> head(SWX.RET$SBI)[1] -0.0020881194 -0.0001045205 -0.0013597617 0.0004185852 0.0000000000 -0.0010467917> str(SWX.RET$SBI)num [1:1916] -0.002088 -0.000105 -0.00136 0.000419 0 ...> source(.trPaths[5], echo=TRUE, max.deparse.length=150)> mydata<-as.timeSeries(SWX.RET)> head(mydata)GMT SBI SPI SII LP25 LP40 LP60 2000-01-04 -0.0020881194 -0.034390059 0.00001367381 -0.011994298 -0.018013035 -0.026155259 2000-01-05 -0.0001045205 -0.010408271 -0.00495530624 -0.003657054 -0.005837489 -0.009011403 2000-01-06 -0.0013597617 0.012119128 0.00381289851 -0.001323897 -0.001644737 -0.002395959 2000-01-07 0.0004185852 0.022461656 -0.00061621046 0.007714991 0.011660151 0.017062613 2000-01-10 0.0000000000 0.002107677 0.00238057889 0.003029081 0.004565523 0.006948020 2000-01-11 -0.0010467917 -0.002773654 -0.00029384531 -0.002422531 -0.003142903 -0.004183466> source(.trPaths[5], echo=TRUE, max.deparse.length=150)> df2beSavedByRODBC =as.data.frame(mydata)> str(df2beSavedByRODBC)'data.frame': 1916 obs. of 6 variables: $ SBI : num -0.002088 -0.000105 -0.00136 0.000419 0 ... $ SPI : num -0.03439 -0.01041 0.01212 0.02246 0.00211 ... $ SII : num 0.0000137 -0.0049553 0.0038129 -0.0006162 0.0023806 ... $ LP25: num -0.01199 -0.00366 -0.00132 0.00771 0.00303 ... $ LP40: num -0.01801 -0.00584 -0.00164 0.01166 0.00457 ... $ LP60: num -0.02616 -0.00901 -0.0024 0.01706 0.00695 ...> sqlSave(xf, dat=df2beSavedByRODBC,tablename="testTable",rownames=TRUE,append=FALSE,addPK=FALSE,verbose=FALSE) > sqlString = "select * from testTable" > dataFrameFromDB = sqlQuery(xf, sqlString,errors=TRUE); > str(dataFrameFromDB)'data.frame': 1916 obs. of 7 variables: $ rownames: chr "2000-01-04" "2000-01-05" "2000-01-06" "2000-01-07" ... $ SBI : num -0.002088 -0.000105 -0.00136 0.000419 0 ... $ SPI : num -0.03439 -0.01041 0.01212 0.02246 0.00211 ... $ SII : num 0.0000137 -0.0049553 0.0038129 -0.0006162 0.0023806 ... $ LP25 : num -0.01199 -0.00366 -0.00132 0.00771 0.00303 ... $ LP40 : num -0.01801 -0.00584 -0.00164 0.01166 0.00457 ... $ LP60 : num -0.02616 -0.00901 -0.0024 0.01706 0.00695 ... *************************************************************** This message and any attachments are for the named person's use only. This message may contain confidential, proprietary or legally privileged information. No right to confidential or privileged treatment of this message is waived or lost by an error in transmission. If you have received this message in error, please immediately notify the sender by e-mail, delete the message, any attachments and all copies from your system and destroy any hard copies. You must not, directly or indirectly, use, disclose, distribute, print or copy any part of this message or any attachments if you are not the intended recipient. -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Andrew Sent: Friday, October 03, 2014 1:40 PM To: r-help at r-project.org Subject: [R] Workaround for RODBC asymmetric numeric data treatment Note: I did raise report the issue below to r-sig-db at r-project.org, but didn't see any reply. I'm hoping somebody on r-help can help me devise a workaround for a problem I'm having with RODB: I use RODBC to read and write a good deal of data to SQL Server and I'd be extremely grateful if anyone has found a workaround in order to be able to write dataframes to SQL Server using RODBC dynamically created SQL tables and read the data from those tables, or indeed any arbitrary SQL Server table with "float" datatypes and end up with numeric columns instead of "factor" columns in a dataframe in R. I have found that when RODBC creates a Microsoft SQL Server data table from a dataFrame using sqlSave(....append=FALSE), RODBC uses the SQL "float" datatype to store R numeric data in a dynamically-created table on the server. However, when RODBC reads any SQL Server "float" datatype from SQL Server via sqlQuery it interprets float columns as "factor" data. I created a standalone sample below to demonstrate the odd behavior of RODBC that I hope to overcome: # Assuming the reader has access to SQL Server the code below is self-contained and repeatable # I believe it demonstrates unexpected and undesirable behavior in RODBC library(RODBC) library(fPortfolio) library(timeSeries) head(SWX.RET$SBI) str(SWX.RET$SBI) mydata<-as.timeSeries(SWX.RET) head(mydata) df2beSavedByRODBC =as.data.frame(mydata) str(df2beSavedByRODBC) # shows the numeric data in the dataframe # # data.frame': 1916 obs. of 6 variables: # $ SBI : num -0.002088 -0.000105 -0.00136 0.000419 0 ... # $ SPI : num -0.03439 -0.01041 0.01212 0.02246 0.00211 ... ... # Let's save the dataframe to SQL Server: dbconn<-odbcDriverConnect(connection="Driver={SQL Server};server=_YOURSERVERNAMEHER_;database=_YOURDBNAME_;Trusted_Connection=True;"); sqlSave(channel=dbconn,dat=df2beSavedByRODBC,tablename="testTable",rownames=TRUE,append=FALSE,addPK=FALSE,verbose=FALSE) # The sqlSave above works very well. The new table is create in the Microsoft SQL database and the ddl for the table is: # # [dbo].[testTable]( # [rownames] [varchar](255) NULL, # [SBI] [float] NULL, # [SPI] [float] NULL, # [SII] [float] NULL, # [LP25] [float] NULL, # [LP40] [float] NULL, # [LP60] [float] NULL # ) # The numeric values from the dataframe are stored as float (i.e. numeric) in SQL server -- good! ## now let's read back the data RODBC stored in SQL server from a SQL table RODBC created: sqlString = "select * from testTable" dataFrameFromDB = sqlQuery(dbconn,sqlString,errors=TRUE); str(dataFrameFromDB) # # 'data.frame': 1916 obs. of 7 variables: # $ rownames: Factor w/ 1916 levels "2000-01-04","2000-01-05",..: 1 2 3 4 5 6 7 8 9 10 ... # $ SBI : Factor w/ 1742 levels "-0.00041080415489958",..: 349 42 161 1418 828 48 49 1419 1024 135 ... # $ SPI : Factor w/ 1848 levels "-0.0020169904194276",..: 445 48 970 883 1187 377 1157 1065 951 1840 ... ... #********* RODBC wrote numeric data to SQL Server as float, but read the same data back as Factor ! ******** I could use some help to create a robust and flexible workaround for RODBC's asymmetric treatment of numeric data. If there were some way to force RODBC sqlQuery to interpret all SQL Server float datatypes as numeric my problem would be solved. FWIW: RODBC does interpret the SQL Server "real" datatype as numeric. Thank you, Andrew [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.