Olly Betts
2008-Jun-04 22:47 UTC
[Xapian-devel] [Xapian-commits] 10683: trunk/xapian-core/ trunk/xapian-core/common/ trunk/xapian-core/queryparser/ trunk/xapian-core/tests/
On Wed, Jun 04, 2008 at 01:48:37PM +0100, richard wrote:> queryparser/queryparser.lemony: Fix various cases where queries > were constructed pair-wise within a loop, which leads to O(N*N) > scaling behaviour (because each intermediate query construction > is O(M) where M is the size of that query, and there are N of > them).Um, the real bug is in query construction then! Pairwise construction shouldn't depend on the size of the subqueries (it used to, but I thought that had already been fixed). That aside, it may be more efficient to build up subqueries in an array and construct the query object in one (as your patch does), particularly in the case where we're using new to create the query object. But we really should fix this in Query, not just work around the issue in QueryParser. Cheers, Olly
Richard Boulton
2008-Jun-05 06:21 UTC
[Xapian-devel] [Xapian-commits] 10683: trunk/xapian-core/ trunk/xapian-core/common/ trunk/xapian-core/queryparser/ trunk/xapian-core/tests/
Olly Betts wrote:> On Wed, Jun 04, 2008 at 01:48:37PM +0100, richard wrote: >> queryparser/queryparser.lemony: Fix various cases where queries >> were constructed pair-wise within a loop, which leads to O(N*N) >> scaling behaviour (because each intermediate query construction >> is O(M) where M is the size of that query, and there are N of >> them). > > Um, the real bug is in query construction then! Pairwise construction > shouldn't depend on the size of the subqueries (it used to, but I > thought that had already been fixed).I haven't looked at the code in any detail, but I think the problem is that each pairwise construction causes a full copy of the subquery (perhaps due to the simplification part of the construction). Perhaps the end_construction() part needs to be made lazy to avoid calling this step repeatedly.> That aside, it may be more efficient to build up subqueries in an array > and construct the query object in one (as your patch does), particularly > in the case where we're using new to create the query object. But we > really should fix this in Query, not just work around the issue in > QueryParser.The test for checking the behaviour of QueryParser should be applicable to Query with a few modifications, so we at least have an easy way to check the performance now. :) -- Richard