Hi guys,

I wanted to discuss the new Queueing API for those of us who are implementing an out-of-process version. In my case, I write Sidekiq [1] and would like to support the new API once Rails 4 is released. My issue is that because the API is object-oriented rather than message-oriented, implementation of out-of-process workers is difficult.

The API is Queue#push(job) where job has a run method. Ruby doesn't have a great solution for serializing a Ruby object across the wire. Marshal limits the API to Ruby solutions (which rules out RabbitMQ, et al.), JSON can't fully serialize Ruby objects (e.g. symbols) and YAML has a number of issues in practice that make it painful to use (e.g. see the monkeypatches DelayedJob has to use [2]).

So I love the simplicity of the API but think it will lead to painful implementation issues. What do you think about defining a simpler message format that can be fully serialized and deserialized via JSON / YAML / etc. instead of using a Ruby object?

mike

[1] http://mperham.github.com/sidekiq/
[2] https://github.com/collectiveidea/delayed_job/blob/master/lib/delayed/psych_ext.rb

--
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group.
To post to this group, send email to rubyonrails-core@googlegroups.com.
To unsubscribe from this group, send email to rubyonrails-core+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/rubyonrails-core?hl=en.
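To make the JSON versus Marshal trade-off in the message above concrete, here is a minimal sketch. The payload hash is a made-up example, not a format used by Sidekiq or Rails; it only shows that symbols do not survive a JSON round trip while Marshal preserves them.

```ruby
require "json"

# Hypothetical job payload; the keys and values are invented for illustration.
payload = { queue: :default, args: [:welcome, 42] }

# JSON has no symbol type, so symbol keys and values come back as strings.
via_json = JSON.parse(JSON.generate(payload))

# Marshal keeps Ruby types intact, but only a Ruby consumer can read the bytes.
via_marshal = Marshal.load(Marshal.dump(payload))

p via_json
p via_marshal
puts "JSON round trip preserved the payload: #{via_json == payload}"
puts "Marshal round trip preserved the payload: #{via_marshal == payload}"
```

The JSON copy is still usable, but the consumer has to know that `:default` arrives as `"default"`, which is the kind of constraint a message-oriented format would make explicit.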
I would agree. I would prefer a message-oriented API to a Ruby-object-oriented API.

A Ruby object can of course be serialized to a message and deserialized from a message. But the key point is that it's messages over the wire and in the database, rather than objects, and then we layer a convenient object-message mapper on top, much like an object-relational mapper or an object-document mapper. It is also much like the way we layer a Hash or an ActionDispatch::Request on top of the HTTP request, and a 3-item Array or an ActionDispatch::Response on top of the HTTP response.

Applications with message queues are, by their nature, more likely to need to interoperate with applications in other languages. Any serialized message should be interoperable across languages. That rules out Marshal, and YAML with embedded Ruby class names, but YAML without embedded Ruby class names would be fine, as would JSON, XML, MessagePack, Protocol Buffers, etc. The Rails API should not dictate the serialization format, but should permit any set of formats chosen by the application, as well as opaque binary strings.

Note that this has been a problem with Rails session cookies as well: they are not interoperable across applications written in different languages, because they are serialized from Ruby objects into bytes via Marshal.

On Saturday, April 28, 2012 10:31:27 PM UTC-4, Mike Perham wrote:
> So I love the simplicity of the API but think it will lead to painful
> implementation issues. What do you think about defining a simpler
> message format that can be fully serialized and deserialized via JSON
> / YAML / etc instead of using a Ruby object?
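The object-message mapper idea described above could look roughly like the following sketch. Everything here is hypothetical: `WelcomeEmailJob`, `to_message`, `from_message`, and the `JOBS` registry are names invented for illustration, not part of the proposed Rails API.

```ruby
require "json"

# Hypothetical job class, used only to illustrate the mapping idea.
class WelcomeEmailJob
  attr_reader :user_id

  def initialize(user_id)
    @user_id = user_id
  end

  # Object -> message: a plain hash of strings and numbers,
  # readable from any language.
  def to_message
    { "job" => "welcome_email", "args" => [user_id] }
  end

  # Message -> object: rebuild the job from the plain hash.
  def self.from_message(message)
    new(*message.fetch("args"))
  end

  def run
    "sending welcome email to user #{user_id}"
  end
end

# Registry from message name to class (an assumption of this sketch),
# so the consumer never constantizes arbitrary names from the wire.
JOBS = { "welcome_email" => WelcomeEmailJob }.freeze

# Producer side: only the message crosses the wire.
wire = JSON.generate(WelcomeEmailJob.new(7).to_message)

# Consumer side: parse the message and map it back to an object.
message = JSON.parse(wire)
job = JOBS.fetch(message.fetch("job")).from_message(message)
puts job.run
```

The serialization format is confined to the two `JSON` calls, so swapping in MessagePack or another format would not touch the job class.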
On Sat, Apr 28, 2012 at 07:31:27PM -0700, Mike Perham wrote:
> The API is Queue#push(job) where job has a run method. Ruby doesn't
> have a great solution for serializing a Ruby object across the wire.
> Marshal limits the API to Ruby solutions (which rules out RabbitMQ, et
> al), JSON can't fully serialize Ruby objects (e.g. symbols) and YAML
> has a number of issues in practice that make it painful to use (e.g.
> see the monkeypatches DelayedJob has to use [2]).

I don't understand what you mean by this. Marshal returns a string. If your producers and consumers are both Ruby, why would this rule out anything? Are you saying you want to mix languages between producers and consumers?

> So I love the simplicity of the API but think it will lead to painful
> implementation issues. What do you think about defining a simpler
> message format that can be fully serialized and deserialized via JSON
> / YAML / etc instead of using a Ruby object?

I don't think the serialization format is something that Rails should define. It's an implementation detail of the queuing system, and is something a user must understand when using a queue.

For example:

* An in-memory queue has the advantage that it doesn't require serialization, but it is volatile. Maybe that's all the programmer needs.

* A DRb-based queue can be distributed and uses Marshal, so the user doesn't need to understand serialization as much.

* An object that has references to an IO object must take precautions when being serialized.

The other problem is that the JSON format that Sidekiq requires could be different from the JSON format that some other queuing system requires.

tl;dr every queue has unique aspects regarding transactions, wire protocol, and storage facility. Users should take this into account when selecting a queue for their application's requirements.

--
Aaron Patterson
http://tenderlovemaking.com/
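The third bullet above, about objects that hold an IO reference, can be illustrated with a small sketch. The two job classes are hypothetical; the only library behavior relied on is that `Marshal.dump` raises `TypeError` for an IO, and that `marshal_dump` / `marshal_load` let an object choose what gets serialized.

```ruby
# Hypothetical job that keeps an IO handle in an instance variable.
class NaiveLogJob
  def initialize(message, io = $stderr)
    @message = message
    @io = io
  end
end

# Same job, but it takes precautions: only plain data is serialized.
class CarefulLogJob
  attr_reader :message

  def initialize(message, io = $stderr)
    @message = message
    @io = io
  end

  # Called by Marshal.dump: leave the IO handle out of the payload.
  def marshal_dump
    [@message]
  end

  # Called by Marshal.load: restore the data and reacquire a handle
  # in the consuming process.
  def marshal_load(data)
    @message = data.first
    @io = $stderr
  end

  def run
    @io.puts(@message)
  end
end

# Dumping the naive job fails because an IO cannot be marshaled.
naive_error =
  begin
    Marshal.dump(NaiveLogJob.new("hello"))
    nil
  rescue TypeError => e
    e
  end

# The careful job survives the round trip.
restored = Marshal.load(Marshal.dump(CarefulLogJob.new("hello")))

puts "naive job failed with: #{naive_error.class}"
puts "restored message: #{restored.message}"
```

Reopening `$stderr` in `marshal_load` is an assumption of the sketch; a real job would reacquire whatever resource makes sense in the worker process.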
Rails should enforce the best practices required to keep a user from making a mistake that would be hard to recover from as their application scales. An in-memory queue is good, but it's trivial to build an implementation that can't be offloaded to another process. So I would like to see this API *enforce* that all queued objects be serializable/marshal-able at a minimum. This is a best practice, and users who don't like it can trivially build their own simple queueing solution.

Regarding serializing to non-Ruby protocols, I agree this is best left as an implementation detail.

Chris

On Mon, Apr 30, 2012 at 11:48 AM, Aaron Patterson <tenderlove@ruby-lang.org> wrote:
> I don't think the serialization format is something that Rails should
> define. It's an implementation detail of the queuing system, and is
> something a user must understand when using a queue.
I agree that Rails shouldn't force a specific serialization scheme, and I definitely think that it should be message-based. So many of the problems people already encounter with queues have to do with poor design in this regard, so it'd be nice if Rails pushed the design standard forward.

On Mon, Apr 30, 2012 at 3:53 PM, Chris Eppstein <chris@eppsteins.net> wrote:
> Regarding serializing to non-ruby protocols, I agree this is best left as
> an implementation detail.
On Mon, Apr 30, 2012 at 12:53:41PM -0700, Chris Eppstein wrote:
> Rails should enforce the best practices required to keep a user from making
> a mistake that would be hard to recover from as their application scales.
> An In-Memory queue is good, but it's trivial to build an implementation
> that can't be offloaded to another process. So I would like to see this API
> *enforce* that all queued objects be serializable/marshal-able at a
> minimum. This is a best practice and users who don't like it can trivially
> build their own simple queueing solution.

I'm OK with making the default Queue enforce that an object is marshalable. It seems like a good constraint. If they really want an in-memory queue, then they can just switch to ::Queue.

> Regarding serializing to non-ruby protocols, I agree this is best left as
> an implementation detail.

:-)

--
Aaron Patterson
http://tenderlovemaking.com/
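One way the "enforce marshalable" constraint discussed above might work is sketched below. This is not the Rails implementation: `MarshalCheckedQueue`, `PrintJob`, and `ClosureJob` are hypothetical names, and the sketch simply stores the marshaled bytes in Ruby's built-in `::Queue` so that a job that cannot be dumped is rejected at push time.

```ruby
# Hypothetical default queue that only accepts marshalable jobs.
class MarshalCheckedQueue
  def initialize
    @queue = ::Queue.new # Ruby's thread-safe in-memory queue
  end

  # Store the marshaled bytes, so an unmarshalable job fails here
  # rather than later when the queue is moved out of process.
  def push(job)
    @queue.push(Marshal.dump(job))
  rescue TypeError => e
    raise ArgumentError, "job must be marshalable: #{e.message}"
  end

  # The consumer always receives a deserialized copy of the job.
  def pop
    Marshal.load(@queue.pop)
  end
end

# A job made of plain data can be marshaled.
PrintJob = Struct.new(:text) do
  def run
    text.upcase
  end
end

# A job that captures a lambda cannot, because Procs are not marshalable.
ClosureJob = Struct.new(:callback)

queue = MarshalCheckedQueue.new
queue.push(PrintJob.new("hi"))
result = queue.pop.run

rejected =
  begin
    queue.push(ClosureJob.new(-> { 1 }))
    false
  rescue ArgumentError
    true
  end

puts "result: #{result}"
puts "closure job rejected: #{rejected}"
```

Code that really wants to pass live objects around in one process can still use `::Queue` directly, as suggested above.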
Yehuda Katz

On Sat, Apr 28, 2012 at 7:31 PM, Mike Perham <mperham@gmail.com> wrote:
> The API is Queue#push(job) where job has a run method. Ruby doesn't
> have a great solution for serializing a Ruby object across the wire.
> Marshal limits the API to Ruby solutions (which rules out RabbitMQ, et
> al),

Not particularly? It just requires a Ruby consumer on the other end, which seems like an acceptable constraint.

Queues with special serialization requirements (because they cannot use a Ruby consumer) can add additional constraints on serialization, and communicate those constraints to the users of their API.