Christian Niles
2007-Jul-07 21:36 UTC
[Betternestedset-talk] Benchmarks for update_conditions.patch
I spent some time this afternoon developing benchmarks for the patch I submitted yesterday. I tested both MySQL and PostgreSQL, using topic names from dmoz.org. The results are pretty striking:

# 667 Topics (3 Levels Deep)
* MySQL, unpatched: 12.5s
* MySQL, patched: 12.1s
* PgSQL, unpatched: 1457.7s
* PgSQL, patched: 14.1s

# 8308 Topics (4 Levels Deep)
* MySQL, unpatched: 660.1s
* MySQL, patched: 533.8s
* PgSQL, unpatched: 226469.1s (estimated*)
* PgSQL, patched: 478.4s

As you can see, there is a modest improvement for MySQL, but a *huge* difference for PostgreSQL. I had thought that it was due to the indices, but now I'm sure it's because PostgreSQL rewrites every row matched by an UPDATE statement, even rows whose values don't change. As a result, every time an object is moved in the tree, every single row gets rewritten to disk, and performance deteriorates quickly.

It's worth noting that my tests add Topics in depth-first order, so each UPDATE should only need to update a few rows. The results therefore showcase the best possible improvement. The extra conditions won't help at all when the database actually needs to change every row in the table.

I've uploaded copies of the test applications here:

http://unit12.net/hot_topics.tar.gz
http://unit12.net/hot_topics_patched.tar.gz

The tests can be run using script/runner like so:

  $ ./script/runner ./script/benchmark_bns.rb

The data used is in db/topics.yml, which is generated (via rake) from the data in db/dmoz_topics.txt:

  $ rake db/topics.yml

The db/dmoz_topics.txt file was produced from the Open Directory (dmoz.org) data at:

http://rdf.dmoz.org/rdf/structure.rdf.u8.gz

I just grep'd the uncompressed file for the <Topic> start tags.

best,
christian.

* This is a rough estimate based on calculating the number of rows changed during the entire script execution. Since each UPDATE causes every row to be rewritten, creating N topics updates (N^2 - N)/2 rows in total. So creating 8308 topics would cause a little over 155 times more row updates than creating 667 topics.
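The arithmetic behind that footnote can be checked with a few lines of Ruby. This is only a sketch of the estimate as described: the method and variable names are made up for illustration, and it assumes, as the footnote does, that unpatched running time scales linearly with the number of rows rewritten.

```ruby
# Sketch of the footnote's estimate. Assumption (from the footnote): each
# UPDATE rewrites every existing row, so creating n topics rewrites
# (n^2 - n) / 2 rows in total. Names here are illustrative only.
def rows_rewritten(n)
  (n * n - n) / 2
end

small = rows_rewritten(667)   # rows rewritten in the 667-topic run
large = rows_rewritten(8308)  # rows rewritten in the 8308-topic run

# How many times more row updates the larger run performs.
ratio = large.to_f / small

# Measured unpatched PgSQL time for 667 topics, taken from the results above.
measured_small_seconds = 1457.7

# Scale the measured time by the ratio, assuming time is proportional to
# the number of rows rewritten.
estimated_large_seconds = measured_small_seconds * ratio

puts format("rows rewritten: %d vs %d (ratio %.1f)", small, large, ratio)
puts format("estimated unpatched PgSQL time: %.0f s", estimated_large_seconds)
```

The ratio comes out a little over 155, and scaling the measured 1457.7s by it gives a figure in line with the estimated 226469.1s reported for the unpatched 8308-topic run.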
Jean-Christophe Michel
2007-Jul-16 08:47 UTC
[Betternestedset-talk] Benchmarks for update_conditions.patch
Hi Christian,

On 7 Jul 07 at 23:36, Christian Niles wrote:
> I spent some time this afternoon developing benchmarks for the patch
> I submitted yesterday. I tested both MySQL and PostgreSQL, using
> topic names from dmoz.org. The results are pretty striking:
>
> As you can see, there is a modest improvement for MySQL, but a *huge*
> difference for PostgreSQL.

I'll try to find time to review your changes and integrate them. Thanks a lot for contributing!

Jean-Christophe Michel
--
symetrie.com
Better Nested Set for rails: http://opensource.symetrie.com/trac/better_nested_set