Frederik Vanrenterghem
2012-Jan-03 20:13 UTC
[R] sqldf and not converting integers to floating point in SQLite
Hi, I have following 2 tables: Table 1: POSTAL | VALUE 1000|49 1010|100 1020|50 Table 2: INSEE | POSTAL A|1000 B|1000 C|1010 D|1020 I would like to convert this to the following: INSEE | VALUE_SPREAD A|24.5 B|24.5 C|100 D|50 I can achieve this with a nested SQL query (through counting the number of POSTAL that belong to any given INSEE, and diving the value of the postal in that INSEE by that number). library(sqldf) table1 <- read.csv("c:/R/table1.csv", sep=";") table2 <- read.csv("c:/R/table2.csv", sep=";") table3 <- sqldf("select table2.INSEE, table1.VALUE / counts.nPostals as value_spread from table1, table2,(select POSTAL, count(INSEE) as nPostals from table2 group by POSTAL) counts where table1.POSTAL counts.POSTAL and table1.POSTAL=table2.POSTAL") Unfortunately, the value I'm working with is an integer. In SQLite, this results in the computed value also not being a float - so it gets rounded up or down. In this case, I'm getting 24 for A & B instead of 24.5. Is there a way to take care of this using other R concepts, avoiding that problem (for instance using melt & cast)? Thanks, Frederik
jim holtman
2012-Jan-03 21:51 UTC
[R] sqldf and not converting integers to floating point in SQLite
try this:> library(sqldf) > table1 <- read.csv(text = "POSTAL | VALUE+ 1000|49 + 1010|100 + 1020|50", sep="|")> table2 <- read.csv(text = "INSEE | POSTAL+ A|1000 + B|1000 + C|1010 + D|1020", sep="|")> table3 <- sqldf("+ select table2.INSEE + , 1.0 * table1.VALUE / counts.nPostals as value_spread + from table1 + , table2 + ,(select POSTAL + , count(INSEE) as nPostals + from table2 + group by POSTAL) counts + where table1.POSTAL = counts.POSTAL + and table1.POSTAL=table2.POSTAL + ")> table3INSEE value_spread 1 A 24.5 2 B 24.5 3 C 100.0 4 D 50.0>On Tue, Jan 3, 2012 at 3:13 PM, Frederik Vanrenterghem <frederik at vanrenterghem.biz> wrote:> Hi, > > I have following 2 tables: > > Table 1: > POSTAL | VALUE > 1000|49 > 1010|100 > 1020|50 > > Table 2: > INSEE | POSTAL > A|1000 > B|1000 > C|1010 > D|1020 > > I would like to convert this to the following: > > INSEE | VALUE_SPREAD > A|24.5 > B|24.5 > C|100 > D|50 > > I can achieve this with a nested SQL query (through counting the > number of POSTAL that belong to any given INSEE, and diving the value > of the postal in that INSEE by that number). > > library(sqldf) > table1 <- read.csv("c:/R/table1.csv", sep=";") > table2 <- read.csv("c:/R/table2.csv", sep=";") > table3 <- sqldf("select table2.INSEE, table1.VALUE / counts.nPostals > as value_spread from table1, table2,(select POSTAL, count(INSEE) as > nPostals from table2 group by POSTAL) counts where table1.POSTAL > counts.POSTAL and table1.POSTAL=table2.POSTAL") > > Unfortunately, the value I'm working with is an integer. In SQLite, > this results in the computed value also not being a float - so it gets > rounded up or down. In this case, I'm getting 24 for A & B instead of > 24.5. > > Is there a way to take care of this using other R concepts, avoiding > that problem (for instance using melt & cast)? > > Thanks, > Frederik > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.