I have made additional progress with regard to performance. My latest data:

configuration 1: r121.latest
configuration 2: r121p.latest

page                             c1 real   c2 real   c1 r/s   c2 r/s   c1/c2
/empty/index                     6.75525   1.71983    148.0    581.5    3.93
/welcome/index                   6.89044   1.89244    145.1    528.4    3.64
/rezept/index                    4.99573   1.97025    200.2    507.5    2.54
/rezept/myknzlpzl                4.99592   1.96929    200.2    507.8    2.54
/rezept/show/713                18.72658   4.52896     53.4    220.8    4.13
/rezept/cat/Hauptspeise         22.67481   4.79957     44.1    208.4    4.72
/rezept/cat/Hauptspeise?page=5  23.03755   4.86045     43.4    205.7    4.74
/rezept/letter/G                21.66753   4.84487     46.2    206.4    4.47

I have not upgraded my patches yet, this will take some time. So please refrain from applying them right now.

Some observations:

Session handling is really fast now (as can be seen from the empty/welcome requests), and has moved to the end of my list of things to improve.

Reading things from the DB accounts for only 2% on "cat" and "letter" requests. I tried optimizing by using eager loading, but this made these requests about 10 times slower. Instead I have resorted to piggy-back data and modified find and pagination to deal with it properly. Patches pending.

On the other hand, creating the request parameter hash takes around 5.6% on cat and letter and almost 20% on the empty request. IMO, this is too much. I have looked at the code, but there seems to be no simple way of optimizing it. Maybe a C extension and dropping CGI support in favor of fcgi would be worthwhile. But 200 requests per second is pretty fast already, so maybe this should be postponed for a 1.x release.

I have experimented with a modified GC algorithm, but it turned out that manual control of GC is still faster. Hopefully Ruby 2 will improve this.

-- stefan
Stefan Kaes wrote:
> On the other hand, creating the request parameter hash takes around 5.6%
> on cat and letter and almost 20% on the empty request.

I have brought it down to 8% on empty. Among other things, I dropped usage of HashWithIndifferentAccess. Can someone explain to me why HashWithIndifferentAccess has to be used for the @params hash? As far as I could deduce it, only routing returns symbols as hash keys. Couldn't that be changed?

Regards, Stefan

configuration 1: r121.latest
configuration 2: r121p.freitag2

page                             c1 real   c2 real   c1 r/s   c2 r/s   c1/c2
/empty/index                     6.75525   1.65293    148.0    605.0    4.09
/welcome/index                   6.89044   1.83338    145.1    545.4    3.76
/rezept/index                    4.99573   1.91278    200.2    522.8    2.61
/rezept/myknzlpzl                4.99592   1.91279    200.2    522.8    2.61
/rezept/show/713                18.72658   4.47052     53.4    223.7    4.19
/rezept/cat/Hauptspeise         22.67481   4.75549     44.1    210.3    4.77
/rezept/cat/Hauptspeise?page=5  23.03755   4.80336     43.4    208.2    4.80
/rezept/letter/G                21.66753   4.79739     46.2    208.4    4.52
On Saturday 14 May 2005 08:48, Stefan Kaes wrote:
> I have brought it down to 8% on empty. Among other things, I dropped
> usage of HashWithIndifferentAccess. Can someone explain to me why
> HashWithIndifferentAccess has to be used for the @params hash? As far as
> I could deduce it, only routing returns symbols as hash keys. Couldn't
> that be changed?

Sure it could be, but a lot of people use params[:key] instead of params['key'].

Why is HashWithIndifferentAccess so slow?

-- Nicholas Seckar aka. Ulysses
> Sure it could be, but a lot of people use params[:key] instead of
> params['key']

This is also the default coming from the scaffold generator.

-- Tobi
http://www.snowdevil.ca - Snowboards that don't suck
http://www.hieraki.org - Open source book authoring
http://blog.leetsoft.com - Technical weblog
On Saturday 14 May 2005 17:43, Tobias Luetke wrote:
> This is also the default coming from the scaffold generator.

Plus, it's just plain sexy.

-- Nicholas Seckar aka. Ulysses
> I have not upgraded my patches yet, this will take some time. So
> please refrain from applying them right now.

Cool stuff indeed. Could you rename the patches that need more work to [XPATCH]? That way they won't show up in the list of stuff ready to apply, but only on the Needy Patches list.

-- David Heinemeier Hansson
http://www.loudthinking.com -- Broadcasting Brain
http://www.basecamphq.com -- Online project management
http://www.backpackit.com -- Personal information manager
http://www.rubyonrails.com -- Web-application framework
Nicholas Seckar wrote:
> On Saturday 14 May 2005 08:48, Stefan Kaes wrote:
>
>> I have brought it down to 8% on empty. Among other things, I dropped
>> usage of HashWithIndifferentAccess. Can someone explain to me why
>> HashWithIndifferentAccess has to be used for the @params hash? As far as
>> I could deduce it, only routing returns symbols as hash keys. Couldn't
>> that be changed?
>
> Sure it could be, but a lot of people use params[:key] instead of
> params['key']
>
> Why is HashWithIndifferentAccess so slow?

First, building a HashWithIndifferentAccess is slow because the whole hash passed in needs to be traversed recursively (including any hashes among its values) and new hashes have to be constructed. This in itself is slow, and produces garbage as well.

Second, when you access h[:key], Ruby checks whether 'key' exists in the global symbol hash table and then passes the symbol value for 'key' to the [] method of HashWithIndifferentAccess. The implementation of [] then converts the :key parameter back into a string (which involves creating a new string), and uses it to access h['key']. So in fact every access gets slower, since 2 hash lookups are performed instead of 1, and nothing has been gained in terms of memory consumption either.

Moreover, the semantics are a bit dubious too: if you call HashWithIndifferentAccess.new({:key => v1, 'key' => v2}), which value will be assigned to 'key'/:key; v1 or v2? With the current implementation of HashWithIndifferentAccess.new, the result depends on the internals of the Hash.each implementation, which clearly leaves a lot to be desired. In fact, you have just been lucky that everything seems to work. If you want to resolve the ambiguity, the creation of a HashWithIndifferentAccess will get slower again.

All in all, I think the extra flexibility gained is not worth the increased implementation cost.

-- stefan
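The construction and lookup costs described above can be made concrete with a small sketch. IndifferentHash below is a hypothetical stand-in written for illustration, not the Rails source; it only assumes the behaviour described in the message: every entry is copied (recursing into nested hashes) when the hash is built, and a symbol key is converted to a string on each access.

```ruby
# Hypothetical stand-in for illustration only; not the Rails implementation.
class IndifferentHash < Hash
  # Building the hash walks every entry and rebuilds nested hashes,
  # which is the construction cost described in the message above.
  def initialize(source = {})
    super()
    source.each { |key, value| self[key] = value }
  end

  # Every write normalizes the key to a string.
  def []=(key, value)
    super(normalize_key(key), normalize_value(value))
  end

  # Every read with a symbol allocates a fresh string before the real lookup.
  def [](key)
    super(normalize_key(key))
  end

  private

  def normalize_key(key)
    key.is_a?(Symbol) ? key.to_s : key
  end

  def normalize_value(value)
    value.is_a?(Hash) && !value.is_a?(IndifferentHash) ? IndifferentHash.new(value) : value
  end
end

params = IndifferentHash.new({ 'id' => '713', :filter => { :letter => 'G' } })
puts params[:id]                # same entry as params['id']
puts params['filter'][:letter]  # nested hashes are converted too

# Both keys collapse into the single string key 'key'; which value survives
# depends on the order in which the source hash is iterated.
clash = IndifferentHash.new({ :key => 'v1', 'key' => 'v2' })
puts clash.size
```

Resolving the clash deterministically would need an extra check per key during construction, which is the additional cost mentioned at the end of the message.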
On May 14, 2005, at 11:26, Nicholas Seckar wrote:
> Sure it could be, but a lot of people use params[:key] instead of
> params['key']

If you're going to support just one, I'd really favor the symbol form. It's faster, easier to type, and somehow _tighter_.

But if we start optimizing this kind of thing, then having the :action => 'fred' style parameters will have to go too: after all we're constructing a Hash object for no good reason.

Cheers

Dave
Stefan Kaes wrote:
> Nicholas Seckar wrote:
>
>> Why is HashWithIndifferentAccess so slow?

Another reason: hash methods are built-in C functions, whereas HashWithIndifferentAccess methods are not.

-- stefan
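A rough way to see the effect of routing a lookup through a method written in Ruby is a micro-benchmark like the one below. RubyLevelHash is a hypothetical class made up for this sketch, and the timings depend heavily on the interpreter version, so no particular numbers are claimed.

```ruby
require 'benchmark'

# Hypothetical subclass: the override does nothing useful, it only routes
# the lookup through a method written in Ruby before reaching the C code.
class RubyLevelHash < Hash
  def [](key)
    super(key)
  end
end

native  = { 'id' => '713' }
wrapped = RubyLevelHash.new
wrapped['id'] = '713'

iterations = 200_000
Benchmark.bm(12) do |bm|
  bm.report('native []')  { iterations.times { native['id'] } }
  bm.report('wrapped []') { iterations.times { wrapped['id'] } }
end
```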
On 5/16/05, Stefan Kaes <skaes@gmx.net> wrote:
> Stefan Kaes wrote:
>
> > Nicholas Seckar wrote:
> >
> >> Why is HashWithIndifferentAccess so slow?
>
> Another reason: hash methods are built in C functions, whereas
> HashWithIndifferentAccess methods are not.

Is there another way to implement the functionality? Perhaps always trying a straight lookup first, falling back to the typecasting stuff iff it fails?

-- Cheers

Koz
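The "straight lookup first" idea could look roughly like this. LookupFirstHash is a hypothetical name and the sketch assumes keys are stored as strings; it is one possible reading of the suggestion, not a proposed patch.

```ruby
# Hypothetical sketch of "try the plain lookup first"; not Rails code.
# Keys are stored as strings; symbols are only converted on a miss.
class LookupFirstHash < Hash
  def [](key)
    value = super(key)                       # native lookup, no conversion
    return value unless value.nil? && key.is_a?(Symbol)
    super(key.to_s)                          # fallback: pay for to_s only here
  end
end

params = LookupFirstHash.new
params['id'] = '713'
puts params['id']   # served by the first lookup
puts params[:id]    # first lookup misses, fallback converts the symbol
```

With string keys stored, every symbol access takes the slow path, so this variant favors callers who use strings; storing symbols internally would reverse that trade-off.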
>> Another reason: hash methods are built in C functions, whereas
>> HashWithIndifferentAccess methods are not.
>
> Is there another way to implement the functionality? Perhaps always
> trying a straight lookup first, falling back to the typecasting stuff
> iff it fails?

That sounds like a good idea. If we could convert the params access to use symbol keys internally, then the expensive translation would only be necessary in case one didn't use symbols. And in that case, we could use a .intern or something to turn the string into a symbol.

-- David Heinemeier Hansson
David Heinemeier Hansson wrote:
>>> Another reason: hash methods are built in C functions, whereas
>>> HashWithIndifferentAccess methods are not.
>>
>> Is there another way to implement the functionality? Perhaps always
>> trying a straight lookup first, falling back to the typecasting stuff
>> iff it fails?
>
> That sounds like a good idea. If we could convert the params access
> to use symbol keys internally, then the expensive translation would
> only be necessary in case one didn't use symbols. And in that case,
> we could use a .intern or something to turn the string into a symbol.

This will require rewriting and testing query_parameters and request_parameters and HashWithIndifferentAccess. And all values returned from the cgi parts are in fact strings, which would need to be converted into symbols. This will add even more overhead than the current implementation. In fact, any Ruby class implementation will always be slower than using a native hash of (string, value) pairs.

I propose to use strings and normal hashes and avoid HashWithIndifferentAccess altogether; you can't get faster and simpler than that. People who insist on using symbols can add a 'before action' in application_controller.rb that does @params = @params.with_indifferent_access. Or one could add a global option to ActionController::Base to specify that with_indifferent_access should be used, which is probably faster than the first method.

I really don't understand the need for writing @params[:id] instead of @params['id']. What is gained? It is really only an alternate notation which is one character shorter and requires more computing resources. Why do you want to keep it? I don't get it.

-- stefan
On May 16, 2005, at 10:29 AM, Stefan Kaes wrote:
> I propose to use strings and normal hashes and avoid
> HashWithIndifferentAccess altogether; you can't get faster and simpler
> than that. People who insist on using symbols can add a 'before
> action' in application_controller.rb that does @params =
> @params.with_indifferent_access. Or one could add a global option to
> ActionController::Base to specify that with_indifferent_access
> should be used, which is probably faster than the first method.
>
> I really don't understand the need for writing @params[:id] instead of
> @params['id']. What is gained? It is really only an alternate notation
> which is one character shorter and requires more computing resources.
> Why do you want to keep it? I don't get it.

Symbols are semantically far closer to keys than strings--they represent the atomic and immutable concept of "a name". I STRONGLY support the use of symbols throughout Rails whenever we're using the concept of a name. I'd like to see it used more.

The action of interning an incoming CGI parameter name happens once. Its cost is identical to that of calculating the hash of a string. Once done, there is no further inefficiency in using symbols throughout the code. If folks want to use strings instead, then there'll be an extra step.

I'll pay for this minor performance hit by offering a performance improvement of my own. Line 270 of ActionView::Base constructs an unnecessary Builder object for every single rhtml template rendered...

There: can I keep my symbols now... :)

Cheers

Dave
On 5/16/05, Dave Thomas <dave@pragprog.com> wrote:
> Symbols are semantically far closer to keys than strings--they
> represent the atomic and immutable concept of "a name". I STRONGLY
> support the use of symbols throughout Rails whenever we're using the
> concept of a name. I'd like to see it used more.

I mentioned this on IRC and thought I'd post it here as well:

I'd be happy if we dropped HashWithIndifferentAccess post-1.0 and required the use of symbol keys everywhere.

Sam
On May 16, 2005, at 11:44 AM, Sam Stephenson wrote:
> On 5/16/05, Dave Thomas <dave@pragprog.com> wrote:
>
>> Symbols are semantically far closer to keys than strings--they
>> represent the atomic and immutable concept of "a name". I STRONGLY
>> support the use of symbols throughout Rails whenever we're using the
>> concept of a name. I'd like to see it used more.
>
> I'd be happy if we dropped HashWithIndifferentAccess post-1.0 and
> required the use of symbol keys everywhere.

I agree; this would be a very nice streamlining.

-- Ryan Platte
On Monday 16 May 2005 08:18, David Heinemeier Hansson wrote:
> That sounds like a good idea. If we could convert the params access
> to use symbol keys internally, then the expensive translation would
> only be necessary in case one didn't use symbols. And in that case,
> we could use a .intern or something to turn the string into a symbol.

The problem with using symbol keys internally is that symbols are never GC'ed. So I could give you memory problems by sending a bunch of unused query variables on each request.

If we could figure out how to GC some symbols, we could do that for the query params. I was hoping deletion from Symbol.all_symbols might do the trick, but no dice.

-- Nicholas Seckar aka. Ulysses
On May 16, 2005, at 11:55 AM, Nicholas Seckar wrote:
> The problem with using symbol keys internally is that symbols are
> never GC'ed.
> So I could give you memory problems by sending a bunch of un-used query
> variables on each request.

I thought symbols were immediate values. We don't worry about using integers, so why should we worry about Symbols?

Cheers

Dave
On May 16, 2005, at 11:04 AM, Dave Thomas wrote:
>
> On May 16, 2005, at 11:55 AM, Nicholas Seckar wrote:
>
>> The problem with using symbol keys internally is that symbols are
>> never GC'ed.
>> So I could give you memory problems by sending a bunch of un-used
>> query variables on each request.
>
> I thought symbols were immediate values. We don't worry about using
> integers, so why should we worry about Symbols?
>
> Cheers
>
> Dave

Symbols are each internalized into Ruby's symbol table, which is a bunch of strings. Each symbol you create adds another entry to that table.

I would *much* rather use symbols than strings, for various reasons, but Nicholas raises a good point. I wonder how difficult it would be to exploit the fact that Ruby doesn't GC symbols, or even allow you to explicitly "make room" in the symbol table.

- Jamis
On May 16, 2005, at 12:15 PM, Jamis Buck wrote:
>> I thought symbols were immediate values. We don't worry about using
>> integers, so why should we worry about Symbols?
>
> Symbols are each internalized into Ruby's symbol table, which is a
> bunch of strings. Each symbol you create adds another entry to that
> table.
>
> I would *much* rather use symbols than strings, for various reasons,
> but Nicholas raises a good point. I wonder how difficult it would be
> to exploit the fact that Ruby doesn't GC symbols, or even allow you to
> explicitly "make room" in the symbol table.

Sure, but the set of symbols in an application is presumably fixed: saying :dave 1,000,000 times only adds 'dave' once to the table.

Dave
On May 16, 2005, at 11:37 AM, Dave Thomas wrote:
>
> On May 16, 2005, at 12:15 PM, Jamis Buck wrote:
>
>>> I thought symbols were immediate values. We don't worry about
>>> using integers, so why should we worry about Symbols?
>>
>> Symbols are each internalized into Ruby's symbol table, which is a
>> bunch of strings. Each symbol you create adds another entry to
>> that table.
>>
>> I would *much* rather use symbols than strings, for various
>> reasons, but Nicholas raises a good point. I wonder how difficult
>> it would be to exploit the fact that Ruby doesn't GC symbols, or
>> even allow you to explicitly "make room" in the symbol table.
>
> Sure, but the set of symbols in an application is presumably
> fixed: saying :dave 1,000,000 times only adds 'dave' once to the
> table.
>
> Dave

The exploit Nicholas pointed out is where someone passes a bunch of arbitrary query parameters to your app. Each of them is converted to a symbol and never deleted. If someone were to do this enough, sending enough query parameters, eventually your app would run out of memory.

True, in _normal_ usage, this is not a concern. But I'm a little worried, now, about how many such requests it would take before someone was able to cause your app to go belly up.

- Jamis
On May 16, 2005, at 1:01 PM, Jamis Buck wrote:
> The exploit Nicholas pointed out is where someone passes a bunch of
> arbitrary query parameters to your app. Each of them is converted to a
> symbol and never deleted. If someone were to do this enough, sending
> enough query parameters, eventually your app would run out of memory.
>
> True, in _normal_ usage, this is not a concern. But I'm a little
> worried, now, about how many such requests it would take before
> someone was able to cause your app to go belly up.

Wouldn't you test that the column name given by the user was valid? In that case you'd just be reusing an existing symbol.

I'm not sure I can envisage a case where you'd generate arbitrary symbols in response to user requests.

Dave
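One way to read the validation idea above is an allow-list: only parameter names the application already knows are turned into symbols, so a request cannot introduce new ones. The helper name and the KNOWN_PARAMS list are assumptions made up for this sketch, not anything in Rails.

```ruby
# Hypothetical helper: only names the application already knows about are
# turned into symbols; everything else stays a string, so unknown parameter
# names never reach String#to_sym.
KNOWN_PARAMS = %w[id page letter].freeze

def symbolize_known_keys(raw_params, known = KNOWN_PARAMS)
  raw_params.each_with_object({}) do |(key, value), result|
    result[known.include?(key) ? key.to_sym : key] = value
  end
end

p symbolize_known_keys({ 'id' => '713', 'bogus_123' => 'x' })
```

The price is a membership test per key on every request, which is exactly the kind of per-request overhead the thread is trying to remove.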
On May 16, 2005, at 12:09 PM, Dave Thomas wrote:
>
> On May 16, 2005, at 1:01 PM, Jamis Buck wrote:
>
>> The exploit Nicholas pointed out is where someone passes a bunch of
>> arbitrary query parameters to your app. Each of them is converted
>> to a symbol and never deleted. If someone were to do this enough,
>> sending enough query parameters, eventually your app would run out
>> of memory.
>>
>> True, in _normal_ usage, this is not a concern. But I'm a little
>> worried, now, about how many such requests it would take before
>> someone was able to cause your app to go belly up.
>
> Wouldn't you test that the column name given by the user was valid?
> In that case you'd just be reusing an existing symbol.
>
> I'm not sure I can envisage a case where you'd generate arbitrary
> symbols in response to user requests.

Heh, I think we've been talking past each other. :) Let me be a bit more explicit about my own assumptions.

The current implementation is safe, because, as you said, the symbol is only created when you reference it. However, Sam (I think it was) suggested just dropping the HashWithIndifferentAccess and always accessing the parameters via symbol keys. In that case, Rails would have to explicitly convert all string keys to symbols, whether they were used or not, and that's where the scenario I was describing would come into play. It could be done lazily, but I wonder if that would be any better (performance-wise) than HashWithIndifferentAccess?

Sorry, not enough sleep, I guess. :)

- Jamis
Dave Thomas wrote:
>
> On May 16, 2005, at 10:29 AM, Stefan Kaes wrote:
>
>> I propose to use strings and normal hashes and avoid
>> HashWithIndifferentAccess altogether; you can't get faster and
>> simpler than that. People who insist on using symbols can add a
>> 'before action' in application_controller.rb that does @params =
>> @params.with_indifferent_access. Or one could add a global option to
>> ActionController::Base to specify that with_indifferent_access
>> should be used, which is probably faster than the first method.
>>
>> I really don't understand the need for writing @params[:id] instead
>> of @params['id']. What is gained? It is really only an alternate
>> notation which is one character shorter and requires more computing
>> resources. Why do you want to keep it? I don't get it.
>
> Symbols are semantically far closer to keys than strings--they
> represent the atomic and immutable concept of "a name". I STRONGLY
> support the use of symbols throughout Rails whenever we're using the
> concept of a name. I'd like to see it used more.
>
> The action of interning an incoming CGI parameter name happens once.
> Its cost is identical to that of calculating the hash of a string.
> Once done, there is no further inefficiency in using symbols
> throughout the code. If folks want to use strings instead, then
> there'll be an extra step.
>
> I'll pay for this minor performance hit by offering a performance
> improvement of my own. Line 270 of ActionView::Base constructs an
> unnecessary Builder object for every single rhtml template rendered...

This has long been fixed by one of my patches.

> There: can I keep my symbols now... :)

Sure. Never suggested you couldn't :-)

But the doc says:

    [RW] params
    Holds a hash of all the GET, POST, and Url parameters passed to the
    action. Accessed like @params["post_id"] to get the post_id. No type
    casts are made, so all values are returned as strings.

No mention of symbols at all. And all examples use "key", never :key. So you were using hidden implementation details. Shame on you ;-)

Anyway, either symbols or strings could be used. But the HashWithIndifferentAccess has to go, IMO.

-- stefan
Dave Thomas wrote:
>
> On May 14, 2005, at 11:26, Nicholas Seckar wrote:
>
>> Sure it could be, but a lot of people use params[:key] instead of
>> params['key']
>
> If you're going to support just one, I'd really favor the symbol form.
> It's faster, easier to type, and somehow _tighter_.
>
> But if we start optimizing this kind of thing, then having the :action
> => 'fred' style parameters will have to go too: after all we're
> constructing a Hash object for no good reason.

Not really, it depends on how often :action => 'fred' gets evaluated. If it isn't a hot spot, then there is no need to optimise it.

Please note that all of my proposals are based on measuring performance of a real application. I didn't browse the source and say: my, this looks inefficient, let's change it.

-- stefan
Jamis Buck wrote:
>
> I would *much* rather use symbols than strings, for various reasons,
> but Nicholas raises a good point. I wonder how difficult it would be
> to exploit the fact that Ruby doesn't GC symbols, or even allow you
> to explicitly "make room" in the symbol table.

It is easily exploited: one can write a simple script to bombard the server with new requests, each of them creating many new symbols, by feeding them into POST data, for example. So using symbols will allow very simple DoS attacks.

-- stefan
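The difference between repeating one name and sending many new names can be observed with Symbol.all_symbols. The helper below is a hypothetical illustration, and the names it interns are made up. Note that Ruby 2.2 and later can garbage-collect symbols created at run time with to_sym, so the permanent growth discussed here applies to the interpreter of the time; on a current Ruby the table still grows while the symbols are referenced.

```ruby
# Hypothetical helper: interns every name and reports how much the
# interpreter's symbol table grew while doing so.
def intern_and_measure(names)
  before  = Symbol.all_symbols.size
  symbols = names.map(&:to_sym)      # one new symbol per previously unseen name
  growth  = Symbol.all_symbols.size - before
  [symbols, growth]
end

# Repeating one name reuses the same symbol...
_, growth_same = intern_and_measure(Array.new(1_000) { 'dave' })
# ...while unique names, as an attacker could send them, add one entry each.
_, growth_unique = intern_and_measure(Array.new(1_000) { |i| "bogus_param_#{i}" })

puts "same name:    #{growth_same} new symbols"
puts "unique names: #{growth_unique} new symbols"
```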
On 5/17/05, Sam Stephenson <sstephenson@gmail.com> wrote:
> On 5/16/05, Dave Thomas <dave@pragprog.com> wrote:
> > Symbols are semantically far closer to keys than strings--they
> > represent the atomic and immutable concept of "a name". I STRONGLY
> > support the use of symbols throughout Rails whenever we're using the
> > concept of a name. I'd like to see it used more.
>
> I mentioned this on IRC and thought I'd post it here as well:
>
> I'd be happy if we dropped HashWithIndifferentAccess post-1.0 and
> required the use of symbol keys everywhere.

I'd only be happy with this if it was in a 2.0 branch. Once we release 1.0 we're committing to keeping the thing API compatible throughout the 1.x release. We can't ask our users to just "change all your strings to symbols" when we make a new release.

Similarly, I don't think we can turn off symbols this late in the game. Whatever the docs say, *lots* of people are using them.

-- Cheers

Koz
> Similarly, I don't think we can turn off symbols this late in the
> game. Whatever the docs say, *lots* of people are using them.

This is definitely a 2.0 discussion if it involves any API changes. We're more or less at a freeze with the current line. Backwards incompatible changes should have extremely good reasons.

But we should definitely start collecting ideas for 2.0, so that we don't lose the good ones.

-- David Heinemeier Hansson