Dave Dupre
2008-Jan-03 20:10 UTC
[Backgroundrb-devel] Memory leak and long process problem
I use backgroundrb for many long tasks in my system, but I'm having issues
with one in particular. Two large tasks for me are importing people and
updating companies.

  def import_contacts(args = nil)
    thread_pool.defer(args) do |job_id|
      begin
        job = ImportJob.find(job_id)
        job.process_job
      rescue => err
        logger.error "MscWorker#import_contacts failed! #{err.class}: #{err}"
      end
    end
  end

  def update_company_from_vendor(args = nil)
    thread_pool.defer(args) do |company_id|
      begin
        company = Company.find(company_id)
        info = company.firm_info_from_vendor # webservice call to vendor
        if info && info.size == 1
          company.update_from_vendor!(Company.find_firm_info_details_from_vendor(info[0])) # webservice call to vendor
        end
      rescue => err
        logger.error "MscWorker#update_company_from_vendor failed! #{err.class}: #{err}"
      end
    end
  end

Part of import_contacts will result in many ask_work calls to
update_company_from_vendor while it is processing. Importing contacts is
heavily DB dependent, but not very code intensive. If I upload two files
with > 1000 contacts each (two ask_work calls to import_contacts), things
will progress along and then pause for 20-40 seconds. There is no DB
activity during the pause, but the backgroundrb process is using most of
the CPU (98-99%). There are no deadlock errors when things start up again,
but it really slows things down. Are you using polling somewhere?

Also, on my Mac, Activity Monitor shows only 1 thread and 1.2 GB(!!) of
memory used. I expected to see many threads due to my use of thread_pool.

Since all of my processing code is in models, it is very easy to switch to
synchronous execution. When I execute job.process_job directly (see
import_contacts), things never pause, and the ruby process never gets over
120 MB in size.

This all leaves me with a few questions:

1. It sure looks like there is a serious memory leak someplace, but I don't
think it is in my code.

2. What is the recommended method for this kind of processing? Currently, I
have a single worker that my web app calls for background tasks -- each
task is implemented with a thread pool. I don't have much need for status
reporting since I can get the status from the database. Should I change
things to dynamically create workers?

3. I should repeat that I never saw multiple threads being created even
though update_company_from_vendor was called 1500 times during one call to
import_contacts. update_company_from_vendor takes several seconds to
execute, so I know calls should have queued up.

Thoughts?
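For context, the fan-out Dave describes would come from the calling side,
roughly as sketched below. This is a minimal sketch, not taken from the
post: the worker name :msc_worker is assumed from the "MscWorker" prefix
in the log messages, and the surrounding ImportJob#process_job context is
Dave's own model code; only the ask_work option names follow backgroundrb
1.x conventions.

  # Somewhere inside ImportJob#process_job (hypothetical), each imported
  # contact's company is refreshed by queueing a second worker method.
  # This is how a single import can fan out into ~1500 ask_work calls.
  MiddleMan.ask_work(:worker        => :msc_worker,
                     :worker_method => :update_company_from_vendor,
                     :data          => company.id)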
Hi Dave,

On Jan 4, 2008 1:40 AM, Dave Dupre <gobigdave at gmail.com> wrote:

> I use backgroundrb for many long tasks in my system, but I'm having
> issues with one in particular. Two large tasks for me are importing
> people and updating companies.

So the method below is invoked from Rails using the ask_work command,
right? Are you, by any chance, passing the uploaded file itself to the
worker? If so, preferably don't do that: save the file somewhere, or in
the db, and pass its location. And why do you need thread_pool there? Do
you want concurrent execution of tasks, or do you just want a worker
queue?

> def import_contacts(args = nil)
>   thread_pool.defer(args) do |job_id|
>     begin
>       job = ImportJob.find(job_id)
>       job.process_job
>     rescue => err
>       logger.error "MscWorker#import_contacts failed! #{err.class}: #{err}"
>     end
>   end
> end

Similar doubts as with the previous worker method.

> def update_company_from_vendor(args = nil)
>   thread_pool.defer(args) do |company_id|
>     begin
>       company = Company.find(company_id)
>       info = company.firm_info_from_vendor # webservice call to vendor
>       if info && info.size == 1
>         company.update_from_vendor!(Company.find_firm_info_details_from_vendor(info[0])) # webservice call to vendor
>       end
>     rescue => err
>       logger.error "MscWorker#update_company_from_vendor failed! #{err.class}: #{err}"
>     end
>   end
> end
>
> Part of import_contacts will result in many ask_work calls to
> update_company_from_vendor while it is processing. Importing contacts is
> heavily DB dependent, but not very code intensive. If I upload two files
> with > 1000 contacts each (two ask_work calls to import_contacts), things
> will progress along and then pause for 20-40 seconds. There is no DB
> activity during the pause, but the backgroundrb process is using most of
> the CPU (98-99%). There are no deadlock errors when things start up
> again, but it really slows things down. Are you using polling somewhere?
>
> Also, on my Mac, Activity Monitor shows only 1 thread and 1.2 GB(!!) of
> memory used. I expected to see many threads due to my use of thread_pool.
>
> Since all of my processing code is in models, it is very easy to switch
> to synchronous execution. When I execute job.process_job directly (see
> import_contacts), things never pause, and the ruby process never gets
> over 120 MB in size.
>
> This all leaves me with a few questions:
>
> 1. It sure looks like there is a serious memory leak someplace, but I
> don't think it is in my code.
>
> 2. What is the recommended method for this kind of processing?
> Currently, I have a single worker that my web app calls for background
> tasks -- each task is implemented with a thread pool. I don't have much
> need for status reporting since I can get the status from the database.
> Should I change things to dynamically create workers?
>
> 3. I should repeat that I never saw multiple threads being created even
> though update_company_from_vendor was called 1500 times during one call
> to import_contacts. update_company_from_vendor takes several seconds to
> execute, so I know calls should have queued up.

Ruby uses green threads, so I don't think Activity Monitor will show
multiple created threads. Also, the thread pool only grows to its full
pool size as tasks pile up in its queue; if the queue is empty, only one
thread is actually created initially.

--
Let them talk of their oriental summer climes of everlasting
conservatories; give me the privilege of making my own summer with my own
coals.

http://gnufied.org
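Hemant's advice above -- save the upload and pass its location rather than
the file itself -- might look like the following on the Rails side. This is
a sketch under stated assumptions: the :upload param name, the tmp/imports
directory, and an ImportJob#file_path column are all hypothetical; only the
MiddleMan.ask_work option names come from backgroundrb itself.

  # Rails action: persist the upload to disk first, then send the worker
  # only a small record id. The worker process re-reads the file from
  # file_path, so no large payload travels over the backgroundrb socket.
  path = File.join(RAILS_ROOT, 'tmp', 'imports', "#{Time.now.to_i}.csv")
  FileUtils.mkdir_p(File.dirname(path))  # FileUtils is loaded by Rails
  File.open(path, 'wb') { |f| f.write(params[:upload].read) }
  job = ImportJob.create!(:file_path => path)
  MiddleMan.ask_work(:worker        => :msc_worker,
                     :worker_method => :import_contacts,
                     :data          => job.id)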
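The green-threads point is easy to demonstrate with plain Ruby, independent
of backgroundrb. Under MRI 1.8, the interpreter backgroundrb runs on, every
Ruby thread is a green thread scheduled inside the interpreter:

  # Spawn ten Ruby threads. MRI 1.8 multiplexes them all onto a single
  # native thread, so Activity Monitor still reports 1 thread for the
  # process no matter how busy thread_pool is.
  threads = (1..10).map do |n|
    Thread.new(n) do |i|
      sleep 2                  # stands in for a slow vendor webservice call
      puts "task #{i} finished"
    end
  end
  threads.each { |t| t.join }  # all ten run concurrently on one OS thread

This also means all green threads share one heap and one garbage collector.
One plausible (though unconfirmed) explanation for the 20-40 second pauses
is MRI's stop-the-world mark-and-sweep GC walking a 1.2 GB heap, which
would peg one CPU while showing no DB activity.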