Perttu Muurimäki
2001-Jan-17 09:06 UTC
[R] Huge memory consumption with foreign and RPgSQL
I know this is something R isn't meant to do well, but I tried it anyway :)

I have an SPSS data file (31 MB). When I converted it to an R object with read.spss("datafile.sav"), I ended up with a .RData file that was 229 MB. Is this considered normal?

Then I tried to dump that object into a database with the RPgSQL package function db.write.table(object). (Memory ran out the first time I tried to convert the SPSS file into an R object, so I was quite prepared for the database manoeuvre: I increased the swap size to 2500 MB, working on Linux.) The process kept going and going and getting bigger and bigger. After 6 hours and 30 minutes I aborted it; at that point the process had grown to 1400 MB. Again, is this considered normal? And furthermore, am I likely to succeed if I'm patient enough?

-perttu-

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe" (in the "body", not the subject!)
To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
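The size jump described here (31 MB to 229 MB, roughly a factor of 7.4) can be reproduced in miniature without any SPSS file: SPSS can store small values in a single byte, while R holds numeric data as 8-byte doubles. A minimal, self-contained sketch with synthetic data:

```r
# Synthetic demonstration (no SPSS file needed): one-byte storage versus
# R's 8-byte doubles. R's "raw" type uses 1 byte per value, like a
# byte-packed field in the source file.
n <- 1e6L
one_byte <- as.raw(sample(0:255, n, replace = TRUE))  # 1 byte per value
as_dbl   <- as.double(as.integer(one_byte))           # 8 bytes per value
ratio <- as.numeric(object.size(as_dbl)) / as.numeric(object.size(one_byte))
round(ratio, 1)  # close to 8 for large n
```

So a ~7-8x blow-up on import is plausible whenever compactly stored fields become doubles.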
Peter Dalgaard BSA
2001-Jan-17 11:10 UTC
[R] Huge memory consumption with foreign and RPgSQL
Perttu Muurimäki <Perttu.Muurimaki at Helsinki.Fi> writes:

> I know this is something R isn't meant to do well but I tried it anyway :)
>
> I have this SPSS-datafile (size 31 MB). When I converted it to a R object
> with read.spss("datafile.sav") I ended up with a .RData-file which was 229
> MB big. Is this considered normal?

Doesn't sound completely unreasonable: if all your fields fit in a single byte to begin with and get converted to double in the process, you'll have an inflation by a factor of 8.

> Then I tried to dump that object into a database with RPgSQL-package
> function db.write.table(object) (Memory ran out first time I tried to
> convert SPSS-file into a R-object so I was quite prepared for the
> database manouvre ; I increased the size of swap (working with linux) to
> 2500 MB) The process kept going and going and getting bigger and bigger.
> After 6 hours and 30 minutes I aborted it. At that time the process had
> grown into 1400 MB:s. Again, is this considered normal? And further more,
> am I likely to succeed if I'm patient enough?

This, however, sounds a bit excessive, although I wouldn't know exactly what goes on inside RPgSQL... If it is converting every field in the entire data frame to string form before sending it to the database, then I might understand. Might it be possible to send it in smaller blocks?

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
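The "smaller blocks" suggestion can be sketched generically. The helper below is hypothetical (not part of RPgSQL): `write_block` stands in for whatever appends rows to the database; here it merely counts rows so the loop runs without a server.

```r
# Hypothetical sketch of writing a data frame in fixed-size slices rather
# than in one shot; write_block() is a stand-in for a real database append
# (e.g. an RPgSQL call), so only one slice is sent per call.
chunked_write <- function(df, block_size, write_block) {
  starts <- seq(1L, nrow(df), by = block_size)
  for (s in starts) {
    e <- min(s + block_size - 1L, nrow(df))
    write_block(df[s:e, , drop = FALSE])  # send one slice per iteration
  }
}

rows_seen <- 0L
chunked_write(data.frame(x = 1:10), 3L,
              function(block) rows_seen <<- rows_seen + nrow(block))
rows_seen  # all 10 rows arrive, in four blocks of at most 3
```

If the bottleneck is per-row string conversion buffered all at once, slicing like this bounds the intermediate memory to one block at a time.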