Hello -

I am observing that calls to MiddleMan.worker return only after the worker has completed its work. This puzzles me, and I presume that I am doing something wrong. Can anyone make suggestions?

Snippets from my code are:

    class ReportController < SecurityController
      ...
      def create_xml_report(report,start_time)
        constraints = get_constraints(report,start_time,false)
        args = { :constraints => constraints,
                 :start_time => start_time,
                 :report_params => report
               }
        # For reasons unknown, the following seems to block!
        session[:xml_report_job] =
          MiddleMan.new_worker(:class => :xml_report_worker, :args => args)
      end

      def get_progress
        ...
        worker = MiddleMan.worker(session[job])
      end
      ...
    end

    class XmlReportWorker < BackgrounDRb::Worker::RailsBase
      include DRbUndumped
      ...
      def do_work(args)
        @progress = 0
        @report_start_time = args[:start_time]
        @report_params = args[:report_params]
        constraints = args[:constraints]
        dtd_only = args[:dtd_only]
        io = Tempfile.new('csv_report','/var/tmp')
        @path = io.path
        xml = Builder::XmlMarkup.new(:indent => 2, :target => io)
        xml.instruct!
        (dtd_only.nil? or !dtd_only) ? data(xml,constraints) : dtd(xml)
        io.close
        @progress = 100
      end
      ...
    end

Within do_work, the data method is where the bulk of the time is spent, though to my understanding, what I do in that method should not be able to impact the blocking state of the queries to get_progress. The data method does a lot of PostgreSQL interaction to extract the data needed to build the report, all of which is read-only.

Does anyone have suggestions for what may be causing the blocking situation, and/or how to figure out and fix that issue?

Thanks in advance - Marc
I see that I placed a comment in the wrong place in my posting. See in context below:

On Wed, 7 Mar 2007, Marc Evans wrote:

> Hello -
>
> I am observing that calls to MiddleMan.worker return only after the worker
> has completed its work. This puzzles me, and I presume that I am doing
> something wrong. Can anyone make suggestions?
>
> Snippets from my code are:
>
> class ReportController < SecurityController
>   ...
>   def create_xml_report(report,start_time)
>     constraints = get_constraints(report,start_time,false)
>     args = { :constraints => constraints,
>              :start_time => start_time,
>              :report_params => report
>            }
>     # For reasons unknown, the following seems to block!
>     session[:xml_report_job] =
>       MiddleMan.new_worker(:class => :xml_report_worker, :args => args)

The above line is NOT the problem line...

>   end
>
>   def get_progress
>     ...
>     worker = MiddleMan.worker(session[job])

The above line IS the line that I am finding blocks when called, until the worker is complete with its work.

>   end
>   ...
> end
>
> class XmlReportWorker < BackgrounDRb::Worker::RailsBase
>   include DRbUndumped
>   ...
>   def do_work(args)
>     @progress = 0
>     @report_start_time = args[:start_time]
>     @report_params = args[:report_params]
>     constraints = args[:constraints]
>     dtd_only = args[:dtd_only]
>     io = Tempfile.new('csv_report','/var/tmp')
>     @path = io.path
>     xml = Builder::XmlMarkup.new(:indent => 2, :target => io)
>     xml.instruct!
>     (dtd_only.nil? or !dtd_only) ? data(xml,constraints) : dtd(xml)
>     io.close
>     @progress = 100
>   end
>   ...
> end
>
> Within do_work, the data method is where the bulk of the time is
> spent, though to my understanding, what I do in that method should not be
> able to impact the blocking state of the queries to get_progress. The data
> method does a lot of PostgreSQL interaction to extract the data needed to
> build the report, all of which is read-only.
>
> Does anyone have suggestions for what may be causing the blocking
> situation, and/or how to figure out and fix that issue?
>
> Thanks in advance - Marc
> _______________________________________________
> Backgroundrb-devel mailing list
> Backgroundrb-devel at rubyforge.org
> http://rubyforge.org/mailman/listinfo/backgroundrb-devel
    # For reasons unknown, the following seems to block!
    session[:xml_report_job] =
      MiddleMan.new_worker(:class => :xml_report_worker, :args => args)

The above looks right to me. It should not matter at all what is in the worker class.

When you say it is blocking -- how do you know? What is the symptom? Is it just slow?

If you create a test worker that loops indefinitely (or for a sufficiently long time, "sleep 120" should do) -- does that block when you create a new instance with MiddleMan.new_worker?

If it is just slow to return, I find there is some lag when spawning a new process, which can make the MiddleMan.new_worker call sluggish to return. But in my experience it does not block until the worker is done. It *does* block until the process is created.

A way to speed up the new_worker calls is to set an explicit job_key to call the do_work method on a long running process, rather than starting a new process.

Mason
On Wed, 7 Mar 2007, Mason Hale wrote:

> # For reasons unknown, the following seems to block!
> session[:xml_report_job] =
>   MiddleMan.new_worker(:class => :xml_report_worker, :args => args)
>
> The above looks right to me. It should not matter at all what is in the
> worker class.

You may have seen my follow-up post, where I indicated that I put the comment above the wrong line of code. The line that is blocking is the call to MiddleMan.worker.

> When you say it is blocking -- how do you know? What is the symptom? Is it
> just slow?

The symptom is that when I try to obtain the worker in order to then obtain the progress value, the call to MiddleMan.worker blocks until the worker has completed its work.

> If you create a test worker that loops indefinitely (or for a sufficiently
> long time, "sleep 120" should do) -- does that block when you create a
> new instance with MiddleMan.new_worker?

In the example code I sent, I replaced one line in do_work as shown here:

    # (dtd_only.nil? or !dtd_only) ? data(xml,constraints) : dtd(xml)
    i=0
    while (i < 100)
      i += 10
      sleep 10
      @progress = i
    end

By doing so, I am no longer experiencing the blocking behavior. This clearly tells me that something in the data method is the direct or indirect cause.

> If it is just slow to return, I find there is some lag when spawning a new
> process, which can make the MiddleMan.new_worker call sluggish to return.
> But in my experience it does not block until the worker is done. It *does*
> block until the process is created.
>
> A way to speed up the new_worker calls is to set an explicit job_key to call
> the do_work method on a long running process, rather than starting a new
> process.
>
> Mason

Thanks for your follow-up message.

- Marc
On Wed, 7 Mar 2007, Mason Hale wrote:

> # For reasons unknown, the following seems to block!
> session[:xml_report_job] =
>   MiddleMan.new_worker(:class => :xml_report_worker, :args => args)
>
> The above looks right to me. It should not matter at all what is in the
> worker class.
>
> When you say it is blocking -- how do you know? What is the symptom? Is it
> just slow?
>
> If you create a test worker that loops indefinitely (or for a sufficiently
> long time, "sleep 120" should do) -- does that block when you create a
> new instance with MiddleMan.new_worker?
>
> If it is just slow to return, I find there is some lag when spawning a new
> process, which can make the MiddleMan.new_worker call sluggish to return.
> But in my experience it does not block until the worker is done. It *does*
> block until the process is created.
>
> A way to speed up the new_worker calls is to set an explicit job_key to call
> the do_work method on a long running process, rather than starting a new
> process.
>
> Mason

I am discovering more information on this, but still don't understand the underlying cause. Below is my data method and a method it calls, both within the scope of do_work:

    def data(xml,constraints)
      dtd(xml)
      @progress = 10
      xml.telemetry do
        telem_to_xml(CacheTelemetry,:cache_telem_to_xml,constraints,xml)
        @progress = 30
        telem_to_xml(CountTelemetry,:count_telem_to_xml,constraints,xml)
        @progress = 50
        telem_to_xml(StreamTelemetry,:stream_telem_to_xml,constraints,xml)
        @progress = 70
        telem_to_xml(PolledTelemetry,:polled_telem_to_xml,constraints,xml)
        @progress = 90
      end
    end

    def telem_to_xml(klass,xml_method,constraints,xml)
      opts = constraints.dup
      opts[:limit] = 5000
      offset = 0
      while true do
        opts[:offset] = offset
        offset += opts[:limit]
        found = klass.find_restricted(:all,opts)
        break if (found.nil? or found.size < 1)
        found.each { |dct| self.method(xml_method).call(dct,xml) ; sleep 0.001 }
      end
    end

Notice in the above the "sleep 0.001". Without that sleep, I get the blocking behavior I described in the original post. With it, I get the behavior I would expect, which is that I can retrieve the progress reasonably quickly and repeatedly.

Again, any suggestions?

- Marc
On 3/7/07, Marc Evans <Marc at softwarehackery.com> wrote:

> Notice in the above the "sleep 0.001". Without that sleep, I get the
> blocking behavior I described in the original post. With it, I get the
> behavior I would expect, which is that I can retrieve the progress
> reasonably quickly and repeatedly.
>
> Again, any suggestions?

That is very odd. My gut says it must have something to do with locking between threads. I've previously run into thread-related issues around the worker results feature, finding that it is unreliable. This smells like a similar problem.

more detail:
http://rubyforge.org/pipermail/backgroundrb-devel/2007-January/000638.html
http://backgroundrb.devjavu.com/projects/backgroundrb/ticket/43

My suggestion would be to store your worker state externally in a database or other store.

That "sleep 0.0001" works at all is weird, and I'd be hesitant to rely on it continuing to work.

Mason
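Mason's suggestion to keep worker state outside the worker could be sketched roughly as follows. This is only an illustration under stated assumptions, not BackgrounDRb API: the ProgressFile class, its methods, and the file naming are hypothetical, and a small file stands in for the "database or other store". The worker would call write as it advances, and the controller would call read without ever asking for the worker object.

```ruby
require 'tmpdir'

# Hypothetical helper (not part of BackgrounDRb): keeps one job's progress
# in a small file so a poller never has to talk to the worker itself.
class ProgressFile
  def initialize(dir, job_key)
    @path = File.join(dir, "#{job_key}.progress")
  end

  # Called by the worker. Writes to a temporary file first and renames it,
  # so a reader never sees a half-written value.
  def write(percent)
    tmp = "#{@path}.tmp"
    File.write(tmp, percent.to_s)
    File.rename(tmp, @path)
  end

  # Called by the controller. Returns nil if nothing has been recorded yet.
  def read
    File.exist?(@path) ? File.read(@path).to_i : nil
  end
end

# Usage: the directory and job key here are made up for the demonstration.
dir = Dir.mktmpdir('report_progress')
progress_file = ProgressFile.new(dir, 'xml_report_job')
progress_file.write(30)
puts progress_file.read
```

The trade-off is an extra write per progress update, in exchange for a get_progress that does not depend on how the worker's threads are scheduled.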
On Mar 7, 2007, at 1:46 PM, Mason Hale wrote:

> On 3/7/07, Marc Evans <Marc at softwarehackery.com> wrote:
> Notice in the above the "sleep 0.001". Without that sleep, I get the
> blocking behavior I described in the original post. With it, I get the
> behavior I would expect, which is that I can retrieve the progress
> reasonably quickly and repeatedly.
>
> Again, any suggestions?
>
> That is very odd. My gut says it must have something to do with
> locking between threads.
> I've previously run into thread-related issues around the worker
> results feature, finding
> that it is unreliable. This smells like a similar problem.
>
> more detail:
> http://rubyforge.org/pipermail/backgroundrb-devel/2007-January/000638.html
> http://backgroundrb.devjavu.com/projects/backgroundrb/ticket/43
>
> My suggestion would be to store your worker state externally in a
> database or other store.
>
> That "sleep 0.0001" works at all is weird, and I'd be hesitant to
> rely on it continuing to work.
>
> Mason

Actually the sleep makes sense. It would seem that your code is blocking the thread, which makes all the other threads block. Putting a sleep in there gives the scheduler time to schedule the other threads to be run.

Cheers-
-- Ezra Zygmuntowicz
-- Lead Rails Evangelist
-- ez at engineyard.com
-- Engine Yard, Serious Rails Hosting
-- (866) 518-YARD (9273)
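The pattern Ezra describes, a busy thread periodically handing control back so that another thread can run, might look like the sketch below. It is a stand-alone illustration with plain Ruby threads, not a reproduction of the BackgrounDRb stall: the busy loop is a made-up stand-in for the report marshalling, and thread scheduling in current Ruby versions differs from 1.8, so the poller would likely get turns here even without the explicit yield.

```ruby
# Shared state: the worker thread updates progress, the poller reads it.
progress = 0
samples = []

worker = Thread.new do
  1.upto(100) do |pct|
    10_000.times { |i| i * i } # stand-in for CPU-bound marshalling work
    progress = pct
    Thread.pass                # explicitly offer the CPU to other threads
  end
end

poller = Thread.new do
  # Sample the progress value for as long as the worker is running.
  while worker.alive?
    samples << progress
    sleep 0.001
  end
end

worker.join
poller.join
puts "final progress: #{progress}, polled #{samples.size} times"
```

Thread.pass is used here in place of "sleep 0" because it states the intent directly; whether either one is sufficient under BackgrounDRb on Ruby 1.8.5 is not something this sketch can confirm.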
On Wed, 7 Mar 2007, Mason Hale wrote:

> On 3/7/07, Marc Evans <Marc at softwarehackery.com> wrote:
>
>> Notice in the above the "sleep 0.001". Without that sleep, I get the
>> blocking behavior I described in the original post. With it, I get the
>> behavior I would expect, which is that I can retrieve the progress
>> reasonably quickly and repeatedly.
>>
>> Again, any suggestions?
>
> That is very odd. My gut says it must have something to do with locking
> between threads.
> I've previously run into thread-related issues around the worker results
> feature, finding
> that it is unreliable. This smells like a similar problem.
>
> more detail:
> http://rubyforge.org/pipermail/backgroundrb-devel/2007-January/000638.html
> http://backgroundrb.devjavu.com/projects/backgroundrb/ticket/43
>
> My suggestion would be to store your worker state externally in a database
> or other store.
>
> That "sleep 0.0001" works at all is weird, and I'd be hesitant to rely on it
> continuing to work.
>
> Mason

Thanks again for the follow-up. I have to agree with your observation. I have also since found that "sleep 0" removes the blocking problem, which further supports your assertions. One last thing I will try, based on other feedback about ruby 1.8.5 problems, is going back to 1.8.4 to see if that makes any difference. For reference, I am using the following:

  FreeBSD-6
  AMD Opteron
  ruby-1.8.5
  DBbackgroundRB-0.2.1

- Marc
On 3/7/07, Ezra Zygmuntowicz <ezmobius at gmail.com> wrote:

> > That "sleep 0.0001" works at all is weird, and I'd be hesitant to
> > rely on it continuing to work.
>
> Actually the sleep makes sense. It would seem that your code is
> blocking the thread which makes all the other threads block. Putting
> a sleep in there gives the scheduler time to schedule the other
> threads to be run.

Yeah, I'd buy that. Would also explain 'sleep 0' working.

Mason
On Wed, 7 Mar 2007, Ezra Zygmuntowicz wrote:

> On Mar 7, 2007, at 1:46 PM, Mason Hale wrote:
>
>> On 3/7/07, Marc Evans <Marc at softwarehackery.com> wrote:
>> Notice in the above the "sleep 0.001". Without that sleep, I get the
>> blocking behavior I described in the original post. With it, I get the
>> behavior I would expect, which is that I can retrieve the progress
>> reasonably quickly and repeatedly.
>>
>> Again, any suggestions?
>>
>> That is very odd. My gut says it must have something to do with locking
>> between threads.
>> I've previously run into thread-related issues around the worker results
>> feature, finding
>> that it is unreliable. This smells like a similar problem.
>>
>> more detail:
>> http://rubyforge.org/pipermail/backgroundrb-devel/2007-January/000638.html
>> http://backgroundrb.devjavu.com/projects/backgroundrb/ticket/43
>>
>> My suggestion would be to store your worker state externally in a database
>> or other store.
>>
>> That "sleep 0.0001" works at all is weird, and I'd be hesitant to rely on
>> it continuing to work.
>>
>> Mason
>
> Actually the sleep makes sense. It would seem that your code is
> blocking the thread which makes all the other threads block. Putting a sleep
> in there gives the scheduler time to schedule the other threads to be run.
>
> Cheers-
>
> -- Ezra Zygmuntowicz
> -- Lead Rails Evangelist
> -- ez at engineyard.com
> -- Engine Yard, Serious Rails Hosting
> -- (866) 518-YARD (9273)

So, is there a "best practice" method of dealing with this? My worker code is in a tight loop, marshalling a few million records into XML and writing the marshalled results to a file. I would have expected that each database query (I grab 5000 rows at a time) and each IO write would be a context switch candidate, but apparently not. Is there a call I should make to explicitly yield periodically?

- Marc
On Mar 7, 2007, at 2:07 PM, Marc Evans wrote:

> On Wed, 7 Mar 2007, Ezra Zygmuntowicz wrote:
>
>> On Mar 7, 2007, at 1:46 PM, Mason Hale wrote:
>>
>>> On 3/7/07, Marc Evans <Marc at softwarehackery.com> wrote:
>>> Notice in the above the "sleep 0.001". Without that sleep, I get the
>>> blocking behavior I described in the original post. With it, I
>>> get the
>>> behavior I would expect, which is that I can retrieve the progress
>>> reasonably quickly and repeatedly.
>>> Again, any suggestions?
>>> That is very odd. My gut says it must have something to do with
>>> locking between threads.
>>> I've previously run into thread-related issues around the worker
>>> results feature, finding
>>> that it is unreliable. This smells like a similar problem.
>>> more detail:
>>> http://rubyforge.org/pipermail/backgroundrb-devel/2007-January/000638.html
>>> http://backgroundrb.devjavu.com/projects/backgroundrb/ticket/43
>>> My suggestion would be to store your worker state externally in a
>>> database or other store.
>>> That "sleep 0.0001" works at all is weird, and I'd be hesitant to
>>> rely on it continuing to work.
>>> Mason
>>
>> Actually the sleep makes sense. It would seem that your code is
>> blocking the thread which makes all the other threads block.
>> Putting a sleep in there gives the scheduler time to schedule the
>> other threads to be run.
>>
>> Cheers-
>>
>> -- Ezra Zygmuntowicz
>> -- Lead Rails Evangelist
>> -- ez at engineyard.com
>> -- Engine Yard, Serious Rails Hosting
>> -- (866) 518-YARD (9273)
>
> So, is there a "best practice" method of dealing with this? My
> worker code is in a tight loop, marshalling a few million records
> into XML, outputting the marshalled results to a file. I would have
> expected that each database query (I grab 5000 rows at a time) and
> each IO write would be a context switch candidate, but apparently
> not. Is there a call I should make to explicitly yield periodically?
>
> - Marc

Marc-

It's hard to say anything about a best practice here. I have never had to do a sleep 0 to get other threads scheduled, but I am careful not to use anything that is blocking IO. Ruby's green threads are admittedly weak sauce :( If you do a blocking IO call in one thread, *all* threads will block and none will be scheduled until it finishes.

When you say marshalling millions of records to XML, I think you may just be swamping the hell out of the interpreter, and it's blocking in that thread for some reason. I think in your case putting the sleep 0 in there to get other threads scheduled is a hack at best, but I think it's the best hack in this case right now.

Cheers-
-- Ezra Zygmuntowicz
-- Lead Rails Evangelist
-- ez at engineyard.com
-- Engine Yard, Serious Rails Hosting
-- (866) 518-YARD (9273)
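On Marc's question about an explicit yield: Ruby does provide Thread.pass, which asks the scheduler to give another thread a turn. A rough sketch of how a paged loop like telem_to_xml might yield every so many records, rather than sleeping after each one, is shown below. The helper names (fetch_page, each_record_in_batches) and the in-memory array standing in for find_restricted are hypothetical, and whether this clears the stall under BackgrounDRb on Ruby 1.8.5 has not been tested here.

```ruby
# Hypothetical stand-in for klass.find_restricted(:all, opts): returns one
# page of rows from an in-memory array, or an empty array past the end.
def fetch_page(rows, limit, offset)
  rows[offset, limit] || []
end

# Pages through rows with limit/offset, hands each record to the block, and
# calls Thread.pass after every yield_every records so other threads can run.
# Returns the number of records processed.
def each_record_in_batches(rows, limit, yield_every)
  offset = 0
  seen = 0
  loop do
    page = fetch_page(rows, limit, offset)
    break if page.empty?
    page.each do |row|
      yield row
      seen += 1
      Thread.pass if (seen % yield_every).zero?
    end
    offset += limit
  end
  seen
end

# Usage with made-up data: 12 rows, pages of 5, yielding every 4 records.
rows = (1..12).to_a
collected = []
count = each_record_in_batches(rows, 5, 4) { |row| collected << row }
puts count
```

Yielding every N records instead of after each one keeps the overhead small across a few million rows, while still giving a progress poller regular chances to run.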