Olly Betts
2017-Dec-18 22:40 UTC
How to get the serialise score returned in Xapian::KeyMaker->operator().
On Sat, Dec 16, 2017 at 10:11:40PM +0000, Olly Betts wrote:
> Unfortunately the sort key isn't currently exposed via the public API.
> It's available internally and it seems like it ought to be accessible
> but there's no accessor method for it - I can add one but that won't
> help for existing releases.

I've added MSetIterator::get_sort_key() to master in
9f807b83ab61a943a355a9ff6733299eab8e6bb1, and backported to the
RELEASE/1.4 branch in 93ea6216fe8141d6223c869c6bccb039414db0fa, so this
should be in 1.4.6 when that's released.

Cheers,
    Olly
张少华
2018-Jan-15 12:55 UTC
How to get the serialise score returned in Xapian::KeyMaker->operator().
Thanks for your reply.

In our case, we want to compute a weight from the user's properties (age, gender, price preference) and the products' properties (price, comment count, purchase amounts across genders or age ranges). So our weight function is complex: whether we use KeyMaker or PostingSource, six to eight value slots will be used.

But we find that calling doc.get_value(slot) several times separately in each search makes retrieving the results slow.

Now we want to construct a forward index (using an unordered_map) which uses the docid as the key and whose value contains the slot values we need. The forward index will be built when our application starts. Then we can get all the values we use at once, and we don't need to call sortable_unserialise().

Do you have any suggestions about this, or is there some other way to make our search faster?

By the way, our application scenario is described here:
https://lists.xapian.org/pipermail/xapian-discuss/2018-January/009579.html

Cheers,
Zhang

At 2017-12-19 06:40:52, "Olly Betts" <olly at survex.com> wrote:
>On Sat, Dec 16, 2017 at 10:11:40PM +0000, Olly Betts wrote:
>> Unfortunately the sort key isn't currently exposed via the public API.
>> It's available internally and it seems like it ought to be accessible
>> but there's no accessor method for it - I can add one but that won't
>> help for existing releases.
>
>I've added MSetIterator::get_sort_key() to master in
>9f807b83ab61a943a355a9ff6733299eab8e6bb1, and backported to the
>RELEASE/1.4 branch in 93ea6216fe8141d6223c869c6bccb039414db0fa, so this
>should be in 1.4.6 when that's released.
>
>Cheers,
>    Olly
Olly Betts
2018-Jan-16 19:25 UTC
How to get the serialise score returned in Xapian::KeyMaker->operator().
On Mon, Jan 15, 2018 at 08:55:26PM +0800, 张少华 wrote:
> In our case, we want to get a weight using the user' properties(age,
> gender, price preference) and products' properties(price, comment
> count, purchased amount among different gender or range of age). So
> our weight function is complex, no matter we use KeyMaker or
> PostingSource, six to eight values in slot will be used.
>
> But we find that using doc.get_value(slot) several times separately in
> each search makes getting result slowly.

Each value slot is stored as a separate chunked stream, so fetching many
of them will increase the work required.

> Now we want to construct a forward index (using unordered map) which
> uses docid as key and its value contains the slot values we need, also
> the forward index will be constructed while we starting our
> application. Then we can get the values we used at the same time, and
> we need not to use sortable_unserialise().

I'm not sure unordered_map is the best choice here - the values will be
accessed in increasing docid order, and something that has better
locality of access will probably be faster due to caching
considerations.

Unless you have a lot of large gaps in your docids, I'd consider just
using a vector indexed by the docid. Even with a few unused entries,
you save all the hash table overhead that unordered_map will add.

> Do you have some suggestions about this or is there some other way to
> make our search faster?

You could also serialise all the data you want for weighting into a
single value slot. If you have the RAM for it and don't mind the
start-up time overhead, I'd imagine your approach would be faster.

Cheers,
    Olly
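
The two suggestions above (one combined value per document, and a vector indexed by docid) could be sketched roughly as follows. This is only an illustration and deliberately contains no Xapian calls: the Features struct, its field names, pack_features(), unpack_features() and ForwardIndex are all hypothetical, and the fixed-width host-endian layout is an assumption made for brevity rather than anything Xapian defines. At start-up the application would presumably read the combined string once per document with doc.get_value(slot) and feed it to unpack_features().

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical per-document features used by the weighting function.
struct Features {
    double price = 0.0;
    double comment_count = 0.0;
    double purchases_male = 0.0;
    double purchases_female = 0.0;
};

// Pack every feature into one string so it can live in a single value slot.
// Assumption: fixed-width, host-endian doubles. This is not portable across
// architectures and is not Xapian's sortable_serialise() format.
std::string pack_features(const Features& f) {
    const double fields[4] = {f.price, f.comment_count,
                              f.purchases_male, f.purchases_female};
    return std::string(reinterpret_cast<const char*>(fields), sizeof(fields));
}

// Reverse of pack_features(). Returns false if the string has the wrong size.
bool unpack_features(const std::string& s, Features& out) {
    double fields[4];
    if (s.size() != sizeof(fields)) return false;
    std::memcpy(fields, s.data(), sizeof(fields));
    out.price = fields[0];
    out.comment_count = fields[1];
    out.purchases_male = fields[2];
    out.purchases_female = fields[3];
    return true;
}

// Forward index stored as a vector indexed directly by docid. Docids start
// at 1, so entry 0 is never used. Gaps in the docid range simply stay marked
// as absent.
class ForwardIndex {
  public:
    explicit ForwardIndex(std::uint32_t last_docid)
        : entries_(static_cast<std::size_t>(last_docid) + 1),
          present_(static_cast<std::size_t>(last_docid) + 1, false) {}

    void set(std::uint32_t docid, const Features& f) {
        assert(docid != 0 && docid < entries_.size());
        entries_[docid] = f;
        present_[docid] = true;
    }

    // Returns nullptr for docids that are out of range or were never set.
    const Features* find(std::uint32_t docid) const {
        if (docid >= entries_.size() || !present_[docid]) return nullptr;
        return &entries_[docid];
    }

  private:
    std::vector<Features> entries_;
    std::vector<bool> present_;
};
```

Because matching documents are visited in increasing docid order, lookups through find() walk the vector mostly sequentially, which is the locality advantage over a hash table mentioned above. The cost is one Features entry of memory for every docid up to the highest one, including unused ones.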