I am using BackgrounDRb to resize images after they are accepted into our system. However, it has become clear that this creates a memory leak. I'm not sure exactly where the leak exists, but I don't think it's in my own code, as it is presently being used in production and has resized thousands of images without a leak occurring, thanks to calling GC.start after every RMagick image read.

I have it working like this. First of all, I changed the start script to include environment.rb, and therefore the ENTIRE Rails environment, because my Image model is an engine plugin. Then I have it perform this upon acceptance:

  MiddleMan.new_worker :class => :resize_worker, :args => self.id

And here are the contents of ResizeWorker:

  class ResizeWorker < BackgrounDRb::Rails
    def do_work(args)
      image = Image.find(args)
      image.resize_all
      terminate
    end
  end

If I run this on a few dozen images simultaneously, I watch in top as backgroundrb slowly requires more and more memory. Dropping into script/console tells me two things.

For one, MiddleMan.jobs still has every job, despite my terminate() call. Second, calling destroy_worker on any of these jobs, or gc!(Time.now), does not free up any of the memory, even though the jobs disappear. GC.start also does not free up any memory.

If you have any advice on this, that would be great.

Thanks,
Joel
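For reference, a minimal sketch of the resize pattern Joel describes, with GC.start called after every RMagick Image.read; the size list, geometry handling, and the original_path/path_for helpers are assumptions for illustration, not his actual code:

  require 'RMagick'

  class Image < ActiveRecord::Base
    SIZES = { :thumb => '100x100', :medium => '640x480' }  # assumed sizes

    def resize_all
      SIZES.each { |name, geometry| resize(name, geometry) }
    end

    def resize(name, geometry)
      img = Magick::Image.read(original_path).first        # original_path: hypothetical helper
      img.change_geometry(geometry) do |cols, rows, i|
        i.resize(cols, rows).write(path_for(name))         # path_for: hypothetical helper
      end
    ensure
      img = nil
      GC.start  # the per-read GC.start Joel mentions, so RMagick's image buffers get reclaimed
    end
  end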
On Aug 7, 2006, at 3:51 PM, Joel Hayhurst wrote:

> I am using BackgrounDRb to resize images after they are accepted into our system. However, it has become clear that this creates a memory leak. [...]

Hey Joel-

Calling terminate does not immediately terminate the worker class. All it does is flag the thread as OK to terminate when a delete_worker call is made. So you will still need to call delete_worker or gc! in order to clear your worker out of @jobs. As far as actually getting the process memory back, you need to realize that even if you call GC.start, that doesn't mean that Ruby will automatically clean up all garbage. What platform are you on? And can you confirm whether the real memory or the virtual memory decreases after a delete_worker and a GC.start?

Where did you call GC.start from in your experiment? If you called it from the command line then it will not affect the DRb server. Try this: add GC.start to the gc! method in the backgroundrb.rb file in the plugin root and restart your DRb server, like this:

  def gc!(age)
    @timestamps.each do |job_key, timestamp|
      if timestamp[timestamp[:expire_type]] < age
        delete_worker(job_key)
      end
    end
    GC.start
  end

Now run a bunch of workers until memory usage grows, then run MiddleMan.gc! Time.now from script/console and wait a little bit to see if your memory usage goes down. Please report back if this solves your problem and I will add it to the distro.

Also, RMagick is known to leak sometimes, so I have been using MiniMagick myself when I have needed to do image resizing in a worker.

Let me know if none of these things solve your issue and I will try to get to the bottom of it.

Thanks
-Ezra
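A rough sketch of the MiniMagick route Ezra recommends; the function name and paths are made up, and the calls follow the MiniMagick API as I understand it (Image.open, resize, write), so treat it as an illustration rather than code from this thread. Because MiniMagick shells out to the ImageMagick command-line tools, the pixel data never lives inside the Ruby process:

  require 'mini_magick'

  def resize_with_minimagick(source, destination, geometry)
    image = MiniMagick::Image.open(source)  # works on a temporary copy of the file
    image.resize(geometry)                  # shells out to mogrify in a child process
    image.write(destination)
  end

  resize_with_minimagick('uploads/42/original.jpg', 'uploads/42/medium.jpg', '640x480')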
Joel-

OK, I think that is the behavior we are looking for. With the GC.start in the gc! method I think you will be OK. I will add an option to completely terminate a worker and call GC.start from within a worker when it is done with its job. It will be in the next release. I will experiment with RMagick and DRb myself here to see if I can get it to behave correctly. But MiniMagick works great since it shells out to the Magick command-line tools and therefore doesn't ever leak any memory. You may want to try that if you are memory constrained at all. Sometimes when a process on Linux eats up a lot of memory and tries to free it, it will not go away immediately in top. You said the virtual memory went down, but what about the real memory? Where is that at after the GC.start call?

-Ezra

On Aug 7, 2006, at 4:27 PM, Joel Hayhurst wrote:

> I'm not sure how to make it call delete_worker or gc! reliably; I'd rather just have the worker die completely when the do_work method is finished. Alternatively, a nice way to queue up calls to one worker would do just as well, but I found that making a singleton and later getting it and calling do_work(newarg) made me wait until do_work was completed in the request.
>
> Anyway -- RMagick does have a "leak" of sorts in that you must call GC.start after every Image.read() call. However, after doing this I have had no leak issues, and my resize_all method in the Image class merely calls my resize method multiple times for all of the sizes I need; there are no missing GC.start calls there. I would be willing to try MiniMagick.
>
> I added your suggestion, restarted backgroundrb and mongrel, and am presently having it resize a bunch of images. It started at around ~20 megs of virtual memory, and after resizing about 100 images it is at ~229 megs. Real memory usage is just about as high, and it is using up 11.1% of my available RAM. The platform is Gentoo Linux. Doing multiple hundreds of images I have seen it get up to around 800 megs before I stopped.
>
> OK, I just ran gc!(Time.now) with the new GC.start call, and it immediately went down to 143 megs of virtual memory. That is unexpected; I expected it to either stay the same or drop down nearer to 20 megs. [...]
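In the meantime, a hedged sketch of what Joel's worker could look like with the GC.start that Ezra says he will add done by hand at the end of do_work (this is not the promised self-terminating option, just the API already shown in this thread):

  class ResizeWorker < BackgrounDRb::Rails
    def do_work(args)
      image = Image.find(args)
      image.resize_all
      terminate   # only flags the thread as OK to remove
      GC.start    # ask Ruby to collect the RMagick garbage right away
    end
  end

The worker's entry still sits in MiddleMan.jobs until something calls delete_worker or gc!, as Ezra explains above.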
Ezra,

Similar to this, I have a case where a worker thread will be invoked by the user from a page, but the user will not interact with it any further. So this thread should run and, once it is done with its job, it should quit.

Now, is that possible without an explicit cron job that deletes older worker threads? The job may take any amount of time. If "terminate" doesn't delete the thread, then there should be a mechanism for the thread to delete itself from the worker threads (or a method we can call explicitly) once it is done with its job.

On 8/8/06, Ezra Zygmuntowicz <ezmobius at gmail.com> wrote:
> Joel-
>
> OK, I think that is the behavior we are looking for. With the GC.start in the gc! method I think you will be OK. I will add an option to completely terminate a worker and call GC.start from within a worker when it is done with its job. It will be in the next release. [...]
--
nothing much to talk
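One cron-free workaround, using only the calls already mentioned in this thread, is to sweep old workers opportunistically from the Rails side whenever a new job is enqueued. The after_create hook below is an assumption about where such a sweep might live, and whether a still-running worker survives the sweep depends on how its expire timestamp is set (see the gc! code Ezra posted):

  class Image < ActiveRecord::Base
    after_create :enqueue_resize

    def enqueue_resize
      MiddleMan.new_worker :class => :resize_worker, :args => self.id
      MiddleMan.gc!(Time.now)  # delete workers whose expire timestamp has already passed
    end
  end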
Yes, please see my [ANN] thread. I just pushed a minor new release last night that solves many issues, including this one.

-Ezra

On Aug 7, 2006, at 11:39 PM, hemant wrote:

> Similar to this, I have a case where a worker thread will be invoked by the user from a page, but the user will not interact with it any further. So this thread should run and, once it is done with its job, it should quit.
>
> Now, is that possible without an explicit cron job that deletes older worker threads? [...]