Clive
2009-Aug-15 10:08 UTC
Isn't there any performance issue when saving serialized attributes every time?
In edge Rails, serialized attributes are saved every time, whether or not they have changed:

    def update_with_dirty
      if partial_updates?
        # Serialized attributes should always be written in case they've been
        # changed in place.
        update_without_dirty(changed | (attributes.keys & self.class.serialized_attributes.keys))
      else
        update_without_dirty
      end
    end

In our app, the User model has a serialized attribute, friend_ids, which is the array of IDs of the user's friends (this used to be a friendships table, but when that table grew to tens of millions of records, we refactored it into an attribute on User). The User model in our app is saved very frequently, and many users have more than one hundred friends, so friend_ids may be around 1 kB long. Some of my colleagues are opposed to using a serialized friend_ids and suggested using a comma-separated string instead, because saving 1 kB each time would cause a performance issue.
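For illustration, the set arithmetic in update_with_dirty can be reproduced in plain Ruby without Rails. This is only a sketch: the helper name columns_to_write and the attribute lists are hypothetical stand-ins, not part of ActiveRecord.

```ruby
# Plain-Ruby sketch of the expression
#   changed | (attributes.keys & serialized_attributes.keys)
# The method name and the sample column names are hypothetical.
def columns_to_write(changed, attribute_names, serialized_names)
  # Intersection: serialized columns that exist on this record.
  # Union: add them to whatever dirty tracking already flagged.
  changed | (attribute_names & serialized_names)
end

attribute_names  = %w[id name email friend_ids]
serialized_names = %w[friend_ids]

only_name_changed = columns_to_write(%w[name], attribute_names, serialized_names)
nothing_changed   = columns_to_write([], attribute_names, serialized_names)

puts only_name_changed.inspect
puts nothing_changed.inspect
```

Even when nothing is flagged as changed, friend_ids still ends up in the list of columns to write, which is the behavior being asked about.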
Frederick Cheung
2009-Aug-15 10:15 UTC
Re: Isn't there any performance issue when saving serialized attributes every time?
Well, before Rails 2.1 partial updates didn't exist at all and the world continued to function :-)

The simple solution is to benchmark it: monkey patch things so that this does happen, measure, monkey patch it so that it doesn't, measure again.

Another possible way would be to stick the friend_ids data in its own table. This might also save you from loading it unnecessarily.

Fred
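A minimal sketch of the "measure, change, measure again" approach, using Ruby's standard Benchmark module. It is an assumption-heavy stand-in: it only times in-process serialization, not a real save against the database, and the helper time_variant and the sample data are hypothetical. A real comparison would run both variants against the actual app and database.

```ruby
require "benchmark"
require "yaml"

# Hypothetical helper: run the block `iterations` times and return
# the elapsed wall-clock time in seconds.
def time_variant(iterations)
  Benchmark.realtime { iterations.times { yield } }
end

friend_ids = (1..100).to_a   # stand-in for a user with 100 friends
name       = "clive"

# Variant A: the serialized column is prepared on every save.
always_written = time_variant(200) { [name, YAML.dump(friend_ids)] }

# Variant B: the serialized column is skipped when it has not changed.
skipped = time_variant(200) { [name] }

puts format("friend_ids written every time: %.4fs", always_written)
puts format("friend_ids skipped:            %.4fs", skipped)
```

The absolute numbers depend on the machine and Ruby version, so only the difference between the two variants measured in the same run is meaningful.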
clive
2009-Aug-15 12:39 UTC
Re: Isn't there any performance issue when saving serialized attributes every time?
The world continued to function because few people used Rails on high-profile websites at that time. Our app handles more than 50 million HTTP requests per day, so we have had to hack a lot of Rails plugins that have performance issues under such load :(

The patch proposed here would fix this issue:
https://rails.lighthouseapp.com/projects/8994/tickets/2764-supporting-partial-updates-for-serialized-columns

I hope it will be merged into edge Rails.
Matt Jones
2009-Aug-15 15:32 UTC
Re: Isn't there any performance issue when saving serialized attributes every time?
On Aug 15, 8:39 am, clive wrote:
> The patch proposed here would fix this issue:
> https://rails.lighthouseapp.com/projects/8994/tickets/2764-supporting...
> I hope it will be merged to edge rails

No, actually, it won't. Unless you're claiming that the bottleneck is in transmitting the data to the DB, that patch still converts every serialized attribute to YAML almost every time that the record is saved (note the calls to object_to_yaml). Serialized fields that are changed will get converted *again*. So if anything, the overhead may be worse than before...

I also find the idea of taking one of the primary parts of a social networking site (friend linking) and making it opaque through serialization to be very odd. Isn't the whole point of the "social media" fad to be analyzing and drawing conclusions from the network graph? Meh.

--Matt Jones
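The cost described above can be illustrated with a generic snapshot-comparison sketch. This is not the code from the ticket; the class SnapshotTracker and its methods are hypothetical, and it only shows why detecting an in-place change by comparing YAML requires serializing the value on every check, changed or not.

```ruby
require "yaml"

# Hypothetical change detector: remembers the YAML of the value as loaded
# and compares it with freshly generated YAML on each check.
class SnapshotTracker
  attr_reader :dump_calls

  def initialize(value)
    @dump_calls = 0
    @snapshot = dump(value)
  end

  # Must serialize the current value even when nothing changed.
  def changed?(value)
    dump(value) != @snapshot
  end

  private

  def dump(value)
    @dump_calls += 1
    YAML.dump(value)
  end
end

friend_ids = [1, 2, 3]
tracker = SnapshotTracker.new(friend_ids)

unchanged = tracker.changed?(friend_ids)  # no change, but one more dump
friend_ids << 4                           # in-place mutation
mutated = tracker.changed?(friend_ids)    # change detected, another dump

puts "unchanged check: #{unchanged}, after mutation: #{mutated}"
puts "YAML dumps performed: #{tracker.dump_calls}"
```

In this sketch a changed value would then be serialized once more to produce the text actually written to the database, which corresponds to the "converted *again*" remark.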
clive
2009-Aug-16 02:45 UTC
Re: Isn't there any performance issue when saving serialized attributes every time?
Transforming to and from YAML consumes CPU cycles, but storing a large field in the database consumes network bandwidth and disk I/O :( In this case, I am afraid my colleague's solution is right: save that list as a comma-separated string.

Your second point is right; I will consider not storing the friend list later. But besides the friend list, there are other fields that need to be stored as serialized attributes.

Thanks
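As a rough illustration of the two storage formats being compared, the following plain-Ruby sketch encodes the same list both ways and checks that each round-trips. The sample IDs are made up, and real sizes depend on how many digits the IDs have.

```ruby
require "yaml"

# Hypothetical sample: 100 six-digit friend IDs.
friend_ids = (100_001..100_100).to_a

# Format 1: YAML, as produced for a serialized attribute.
yaml_text = YAML.dump(friend_ids)

# Format 2: a plain comma-separated string.
csv_text = friend_ids.join(",")

# Both formats must decode back to the original array of integers.
from_yaml = YAML.load(yaml_text)
from_csv  = csv_text.split(",").map(&:to_i)

puts "YAML bytes:            #{yaml_text.bytesize}"
puts "comma-separated bytes: #{csv_text.bytesize}"
```

The comma-separated form is smaller because each element costs one separator character instead of the leading "- " and trailing newline YAML uses for a block sequence. Either way the whole field is still written whenever it is saved, so the size difference only matters if the byte count itself is the bottleneck.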