Mike Evans
2008-Jun-10 14:42 UTC
[Backgroundrb-devel] Backgroundrb fixes for transferring large amounts of data
Hemant

We've continued testing our application with backgroundrb and found a couple of other problems when transferring large amounts of data. Both of these problems are still present in the github version of the code.

The first problem is an exception in the Marshal.load call in the receive_data method of the Packet::MetaPimp class. The root cause is in the BinParser module, in the arm of code handling parser state 1 (reading in the data). The issue is that, at the marked line of code, the pack_data string will be at most @numeric_length bytes long because of the format string passed to the unpack call, so the equality test also holds when unconsumed bytes remain in the buffer. This results in the code dropping a chunk of data and then hitting the exception in a subsequent Marshal.load call.

    elsif @parser_state == 1
      pack_data,remaining = new_data.unpack("a#{@numeric_length}a*")
      if pack_data.length < @numeric_length
        @data << pack_data
        @numeric_length = @numeric_length - pack_data.length
      elsif pack_data.length == @numeric_length   <======== this should be "elsif remaining.length == 0"
        @data << pack_data
        extracter_block.call(@data.join)
        @data = []
        @parser_state = 0
        @length_string = ""
        @numeric_length = 0
      else
        @data << pack_data
        extracter_block.call(@data.join)
        @data = []
        @parser_state = 0
        @length_string = ""
        @numeric_length = 0
        extract(remaining,&extracter_block)
      end
    end

The second problem we hit was ask_status repeatedly returning nil. The root cause of this problem is in the read_object method of the BackgrounDRb::WorkerProxy class, when a data record is large enough to cause connection.read_nonblock to raise the Errno::EAGAIN exception multiple times. We changed the code to make sure read_nonblock is called repeatedly until the tokenizer finds a complete record, and this fixed the problem.

    def read_object
      begin
        while (true)
          sock_data = ""
          begin
            while(sock_data << @connection.read_nonblock(1023)); end
          rescue Errno::EAGAIN
            @tokenizer.extract(sock_data) { |b_data| return b_data }
          end
        end
      rescue
        raise BackgrounDRb::BdrbConnError.new("Not able to connect")
      end
    end

Regards, Mike
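To make the first bug concrete, here is a minimal stand-alone illustration (not backgroundrb code) of why the marked condition mis-fires. With an "a5a*" format the first field is capped at five bytes, so its length equals the cap both when the buffer ends exactly at the record boundary and when further bytes follow; only the remainder tells the two cases apart.

    # Stand-alone demonstration of the unpack behaviour behind the bug.
    pack_data, remaining = "hello".unpack("a5a*")
    p pack_data   # => "hello"
    p remaining   # => ""       (buffer ended exactly at the record boundary)

    pack_data, remaining = "hello world".unpack("a5a*")
    p pack_data   # => "hello"  (still exactly five bytes)
    p remaining   # => " world" (the bytes the original middle branch silently dropped)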
Hemant Kumar
2008-Jun-10 15:01 UTC
[Backgroundrb-devel] Backgroundrb fixes for transferring large amounts of data
Mike Evans wrote:
> We've continued testing our application with backgroundrb and found a
> couple of other problems when transferring large amounts of data. [...]

If you update to the latest github version of BackgrounDRb, you will find that the above is already fixed and reads are no longer non-blocking for clients (blocking reads make much more sense for clients).

Also, I have made the BinParser class iterative (for better stability, in case your data is large enough to cause stack-level-too-deep errors), but that's not yet pushed to the master version of packet. I will implement your fix tonight, when I push the changes to the master repository of packet.
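Hemant's iterative rewrite of BinParser is not shown in the thread. As a rough sketch of the idea only: the tail call on the leftover bytes (one stack frame per record in the buffer) can be replaced by a loop that consumes one header or one payload fragment per pass. The two-state framing assumed below (ASCII length digits, a ':' separator, then the payload) is for illustration and is not necessarily packet's actual wire format.

    # Rough sketch, not the actual packet source: an iterative
    # length-prefixed parser. No recursion, so a buffer containing
    # thousands of records costs no extra stack depth.
    class IterativeBinParser
      def initialize
        @state          = :length  # :length => reading the ASCII size header
        @length_string  = ""       # header digits seen so far
        @numeric_length = 0        # payload bytes still expected
        @data           = []       # payload fragments accumulated so far
      end

      # Feed a chunk of raw bytes; yields each complete payload to the block.
      def extract(new_data)
        buffer = new_data
        until buffer.empty?
          if @state == :length
            idx = buffer.index(":")
            if idx.nil?                    # header split across reads: wait
              @length_string << buffer
              break
            end
            @length_string << buffer[0, idx]
            buffer = buffer[(idx + 1)..-1]
            @numeric_length = @length_string.to_i
            @length_string  = ""
            @state = :payload
          else
            chunk, buffer = buffer.unpack("a#{@numeric_length}a*")
            @data << chunk
            @numeric_length -= chunk.length
            next unless @numeric_length.zero?  # partial payload: wait for more
            yield @data.join
            @data  = []
            @state = :length
          end
        end
      end
    end

    parser = IterativeBinParser.new
    parser.extract("5:hello3:fo") { |record| p record }  # => "hello"
    parser.extract("o")           { |record| p record }  # => "foo"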
Mike Evans
2008-Jun-10 15:06 UTC
[Backgroundrb-devel] Backgroundrb fixes for transferring large amounts of data
Hemant

Thanks for the quick response - I'll admit I hadn't resynced for a week or so! Agreed that non-blocking reads make more sense for the client.

We're still testing with a patched version of packet-0.1.5 and the svn release of backgroundrb because of the problems we hit getting RoR model objects to load properly on the worker tasks with the latest code - have you found a fix for this yet?

Mike
hemant
2008-Jun-10 17:40 UTC
[Backgroundrb-devel] Backgroundrb fixes for transferring large amounts of data
On Tue, Jun 10, 2008 at 8:36 PM, Mike Evans <mike at metaswitch.com> wrote:
> Hemant
>
> Thanks for the quick response - I'll admit I hadn't resynced for a week
> or so! Agreed that non-blocking reads make more sense for the client.

You meant blocking!

> We're still testing with a patched version of packet-0.1.5 and the svn
> release of backgroundrb because of the problems we hit getting RoR model
> objects to load properly on the worker tasks with the latest code - have
> you found a fix for this yet?

Yeah, you will find model loading behavior much more reliable in the current git version of backgroundrb. Also, I did get your fix in, so no need to worry.

Basically, there are only two ways:

1. Get that damn regexp correct.
2. Load all models, plugins and everything else explicitly, because just loading environment.rb doesn't load models by default in Rails.

I went for option #1, to keep your bdrb worker process lean. All in all, with the git versions of bdrb and packet, everything should be much smoother.
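The regexp fix Hemant chose (option #1) is not shown in the thread. For comparison, here is a rough sketch of what option #2 could look like; it assumes a Rails 2.x worker where environment.rb has already been loaded by the worker bootstrap (which defines RAILS_ROOT), and that all model files live under app/models.

    # Sketch of option #2 only (NOT what bdrb actually does; Hemant went
    # with option #1). Loading environment.rb boots Rails but does not
    # eagerly load model classes, so the worker requires each model file
    # explicitly. Assumes environment.rb is already loaded.
    Dir[File.join(RAILS_ROOT, "app", "models", "**", "*.rb")].each do |model_file|
      require_dependency model_file  # goes through Rails' loader so reloading still works
    end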