Hello,

I have found that sorting with sort_by_value is very slow. With 16800 matching results and only the first 10 returned, the sorted query takes about 25ms. Without sorting, returning 10 results takes only about 0.3ms.

How can I make the sort faster?
On Wed, Mar 26, 2014 at 10:09:15AM +0800, ???? wrote:
> Hello, I have found that the use of sort_by_value very slow.
> 16800 result, return to the previous 10, sorting takes about 25ms.

25ms is "very slow"?

> And if you do not sort, returns 10, need only about 0.3ms.

It's not that sorting 10 results is taking 24.7ms - we need to find the top 10 results when sorting by value, which is generally a different 10 to when sorting by relevance.

Part of the difference is that when sorting by relevance, Xapian has various clever optimisations it can apply which reduce the number of documents which need to be considered, but most of these aren't applicable when sorting by value. So really, it's sorting by relevance that is very fast.

Another factor is that when sorting by value we need to actually get the value for each matching document so that we can sort them, which is extra data to read. We may save from not reading the document lengths, but that is much smaller than most values.

> How to make the sort faster?

You don't say what version of Xapian you're using, or what the values are you're sorting on, so it's hard to give very specific advice. But a couple of general points:

You definitely want to use a backend with values streams (so chert or newer, which means Xapian 1.2.x).

If you can store the values to sort on more compactly, that will help. E.g. use Xapian::sortable_serialise() rather than an ASCII string like "0001234.5678" if the sort keys are floating point numbers.

Cheers,
    Olly
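To illustrate the advice above about storing sort keys compactly: value slots are compared as byte strings, so the encoding decides both the ordering and the amount of data read per document. The sketch below uses only the standard library. The helpers encode_decimal, encode_padded and encode_be32 are hypothetical names invented for this illustration and are not part of Xapian. In particular, encode_be32 is a stand-in that only handles non-negative integers; it is not the encoding that Xapian::sortable_serialise() produces.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Plain decimal text, as "%d" would produce. Byte-wise order is wrong:
// "9" compares greater than "10" because '9' > '1'.
std::string encode_decimal(std::uint32_t n) {
    return std::to_string(n);
}

// Zero-padded decimal text, fixed width 10 (enough for any uint32_t).
// Byte-wise order matches numeric order, but each key costs 10 bytes.
std::string encode_padded(std::uint32_t n) {
    char buf[16];
    std::snprintf(buf, sizeof(buf), "%010u", static_cast<unsigned>(n));
    return buf;
}

// Hypothetical compact key (assumption: non-negative integers only).
// Four big-endian bytes: byte-wise order matches numeric order.
std::string encode_be32(std::uint32_t n) {
    std::string out(4, '\0');
    for (int i = 3; i >= 0; --i) {
        out[i] = static_cast<char>(n & 0xFF);
        n >>= 8;
    }
    return out;
}

// Checks the three orderings described in the comments above.
void check_orderings() {
    assert(!(encode_decimal(9) < encode_decimal(10)));  // wrong order
    assert(encode_padded(9) < encode_padded(10));       // right, 10 bytes
    assert(encode_be32(9) < encode_be32(10));           // right, 4 bytes
}
```

Calling check_orderings() from main() exercises all three cases. For floating point keys, Xapian::sortable_serialise() is the function the reply recommends; the fixed-width integer form here only shows why a compact, order-preserving encoding is preferable to ASCII text.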
Thank you very much for your reply. My English is not very good, so maybe I did not describe the problem clearly.

I only need 10 results returned, but the code seems to loop over all 16800 matches, and that is what takes the time. Is it possible to read all the values in one go, sort them, and then return the first 10?

I'm using version 1.2.17. My code looks like this:

Index:

    // db is a Xapian::WritableDatabase opened elsewhere.
    for (int i = 0; i < 16800; i++) {
        Xapian::Document doc;
        char buf[256];
        snprintf(buf, sizeof(buf), "%d", i);
        doc.add_term("term");
        doc.add_value(200, buf);  // sort key stored in value slot 200
        Xapian::docid docid = db.add_document(doc);
    }

Search:

    Xapian::Database db("../Data..");
    Xapian::Enquire eq(db);
    Xapian::Query query("term");
    eq.set_sort_by_value(200);
    eq.set_query(query);
    boost::timer::cpu_timer st;
    Xapian::MSet mset = eq.get_mset(0, 10);

My actual code also uses AND queries, and the database is 25G, so the search takes 25ms. How should I use the API to make sorting faster?

Thanks!
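On the question of reading every value and then returning only the first 10: the sketch below is a standard-library illustration of that selection step, not Xapian's matcher. It assumes the 16800 keys are already in memory and compares them as byte strings, which is how value slots are ordered. The function names make_decimal_keys and first_k_by_bytes are hypothetical and exist only for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Build the same sort keys as the indexing loop above: "%d" of 0 .. n-1.
std::vector<std::string> make_decimal_keys(int n) {
    std::vector<std::string> keys;
    if (n <= 0) return keys;
    keys.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        char buf[256];
        std::snprintf(buf, sizeof(buf), "%d", i);
        keys.push_back(buf);
    }
    return keys;
}

// Hypothetical stand-in for "sort by value, return the first k".
// std::partial_sort has to examine every key, but only the k smallest
// end up fully ordered, so this is cheaper than sorting everything.
std::vector<std::string> first_k_by_bytes(std::vector<std::string> keys,
                                          std::size_t k) {
    k = std::min(k, keys.size());
    std::partial_sort(keys.begin(),
                      keys.begin() + static_cast<std::ptrdiff_t>(k),
                      keys.end());
    keys.resize(k);
    return keys;
}
```

Calling first_k_by_bytes(make_decimal_keys(16800), 10) returns "0", "1", "10", "100", "1000", "10000", "10001", "10002", "10003", "10004". Two things follow from this. Every one of the 16800 keys has to be looked at even though only 10 are returned, which matches the explanation given earlier in the thread. And keys written with "%d" are ordered as text rather than as numbers, so a fixed-width or serialised encoding is needed if numeric order is intended.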