Izmirlian, Grant (NIH/NCI)
2005-Nov-19 00:27 UTC
[R] Batchjob creates small object but large workspace ???
I ran into an interesting problem that I thought I had solved. I ran a batch job with "--no-save", electing to save only the objects that were in the workspace before the job started plus one reasonably small object created as the result of the job (~19 KB). During the course of the job several large objects are generated, but they are not among the things saved. The puzzle is that the .RData file ends up being about 250 MB larger than it was previously. Inside R, "object.size" returns a reasonably accurate estimate, 19 KB, yet somehow there is hidden junk.

I am trying, as I write, a solution to the problem: before the "save" command at the bottom of the batch file I delete the unneeded large objects and then call gc(). My thought was that the save operation needs to generate a large temporary file and then copy only the parts that are requested to be saved, and that this takes nearly all of my 1 GB of system memory, so the poor "gc" program is shoved off the stack (or some semblance of this reasoning, at least).

Oh... (not so) great news: the job just finished and it looks like my idea was incorrect. So I am open to suggestions!

.RData before run:      88951444 bytes
.RData after run:      345671147 bytes
size of object saved:     190588 bytes

By the way, I have good reason to trust the 19 KB estimate, since the information contained in the object fits on about three screens.
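
For reference, here is roughly what the end of the batch script looks like after my attempted fix. The object names below are placeholders, not the actual ones from my job:

    ## snapshot of the workspace contents at the top of the script,
    ## before any work is done
    objects.before <- ls()

    ## ... the job: creates several large intermediates plus a small result ...
    big.intermediate <- matrix(rnorm(5e6), ncol = 100)  # placeholder for a large object
    result <- colMeans(big.intermediate)                # placeholder for the ~19 KB result

    ## what I tried before the final save: drop the large intermediates
    ## and force garbage collection
    rm(big.intermediate)
    gc()

    ## save only the pre-existing objects and the small result
    save(list = c(objects.before, "result"), file = ".RData")

Even with the rm() and gc() in place, the resulting .RData has the inflated size reported above.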