Frederik Vanrenterghem
2012-Jan-03 20:13 UTC
[R] sqldf and not converting integers to floating point in SQLite
Hi,
I have following 2 tables:
Table 1:
POSTAL | VALUE
1000|49
1010|100
1020|50
Table 2:
INSEE | POSTAL
A|1000
B|1000
C|1010
D|1020
I would like to convert this to the following:
INSEE | VALUE_SPREAD
A|24.5
B|24.5
C|100
D|50
I can achieve this with a nested SQL query (through counting the
number of POSTAL that belong to any given INSEE, and diving the value
of the postal in that INSEE by that number).
library(sqldf)
table1 <- read.csv("c:/R/table1.csv", sep=";")
table2 <- read.csv("c:/R/table2.csv", sep=";")
table3 <- sqldf("select table2.INSEE, table1.VALUE / counts.nPostals
as value_spread from table1, table2,(select POSTAL, count(INSEE) as
nPostals from table2 group by POSTAL) counts where table1.POSTAL counts.POSTAL
and table1.POSTAL=table2.POSTAL")
Unfortunately, the value I'm working with is an integer. In SQLite,
this results in the computed value also not being a float - so it gets
rounded up or down. In this case, I'm getting 24 for A & B instead of
24.5.
Is there a way to take care of this using other R concepts, avoiding
that problem (for instance using melt & cast)?
Thanks,
Frederik
jim holtman
2012-Jan-03 21:51 UTC
[R] sqldf and not converting integers to floating point in SQLite
try this:> library(sqldf) > table1 <- read.csv(text = "POSTAL | VALUE+ 1000|49 + 1010|100 + 1020|50", sep="|")> table2 <- read.csv(text = "INSEE | POSTAL+ A|1000 + B|1000 + C|1010 + D|1020", sep="|")> table3 <- sqldf("+ select table2.INSEE + , 1.0 * table1.VALUE / counts.nPostals as value_spread + from table1 + , table2 + ,(select POSTAL + , count(INSEE) as nPostals + from table2 + group by POSTAL) counts + where table1.POSTAL = counts.POSTAL + and table1.POSTAL=table2.POSTAL + ")> table3INSEE value_spread 1 A 24.5 2 B 24.5 3 C 100.0 4 D 50.0>On Tue, Jan 3, 2012 at 3:13 PM, Frederik Vanrenterghem <frederik at vanrenterghem.biz> wrote:> Hi, > > I have following 2 tables: > > Table 1: > POSTAL | VALUE > 1000|49 > 1010|100 > 1020|50 > > Table 2: > INSEE | POSTAL > A|1000 > B|1000 > C|1010 > D|1020 > > I would like to convert this to the following: > > INSEE | VALUE_SPREAD > A|24.5 > B|24.5 > C|100 > D|50 > > I can achieve this with a nested SQL query (through counting the > number of POSTAL that belong to any given INSEE, and diving the value > of the postal in that INSEE by that number). > > library(sqldf) > table1 <- read.csv("c:/R/table1.csv", sep=";") > table2 <- read.csv("c:/R/table2.csv", sep=";") > table3 <- sqldf("select table2.INSEE, table1.VALUE / counts.nPostals > as value_spread from table1, table2,(select POSTAL, count(INSEE) as > nPostals from table2 group by POSTAL) counts where table1.POSTAL > counts.POSTAL and table1.POSTAL=table2.POSTAL") > > Unfortunately, the value I'm working with is an integer. In SQLite, > this results in the computed value also not being a float - so it gets > rounded up or down. In this case, I'm getting 24 for A & B instead of > 24.5. > > Is there a way to take care of this using other R concepts, avoiding > that problem (for instance using melt & cast)? > > Thanks, > Frederik > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.