On 5/2/2016 9:03 PM, Olly Betts wrote:
> On Fri, Apr 22, 2016 at 12:23:15PM -0400, Alex Aminoff wrote:
>> I did some digging and found a thread from 2011 talking about how to
>> subclass Xapian::PostingSource in order to incorporate the date or
>> recency of a document in its weighting:
>>
>> http://thread.gmane.org/gmane.comp.search.xapian.general/8849/focus=8856
>>
>> As in that thread, I want to be clear that I don't want to sort by
>> date, but rather incorporate date information into the score by
>> which I sort the results. I may be able to stumble around and figure
>> this out, but I wonder if any current xapian users have done
>> something like this and how did it work out?
>
> I know some people have done recency boosting along the lines of that
> thread, but they don't seem to be speaking up about their experiences.
>
> I've not done this directly myself, but the main trick is probably
> finding a suitable amount to boost by, so that the relevancy from
> recency and relevance from content combine in a balanced way.
>

Right. My plan was to put in a configurable knob, then try different
values and let some end users play with it until they were happy(er).

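To make the knob idea concrete, here is a minimal sketch of what I have in mind, assuming the recency boost decays exponentially with document age and is simply added to the content weight. None of this is Xapian API: recency_boost, combined_score, max_boost and half_life_days are hypothetical names, and the decay shape is just one plausible choice.

```cpp
#include <cmath>

// Hypothetical helper, not part of Xapian: extra weight for a document
// that is age_days old.  max_boost is the tunable knob (the boost given
// to a brand-new document); half_life_days sets how quickly it decays.
double recency_boost(double age_days, double max_boost, double half_life_days) {
    if (age_days < 0) age_days = 0;  // treat future-dated documents as new
    return max_boost * std::exp2(-age_days / half_life_days);
}

// Assumption: the boost is added to the content relevance weight, so
// max_boost has to be chosen on the same scale as typical content weights.
double combined_score(double content_weight, double age_days,
                      double max_boost, double half_life_days) {
    return content_weight + recency_boost(age_days, max_boost, half_life_days);
}
```

With max_boost = 2.0 and half_life_days = 365, a brand-new document gets the full 2.0 added to its content weight, a one-year-old document gets 1.0, and a ten-year-old document gets almost nothing. Setting max_boost to 0 turns the recency effect off entirely, which gives end users an easy baseline to compare against.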
>> We are a perl shop, but I guess I will need to figure out some C++
>> in order to do this?
>
> Currently some work would be needed to pull this off in Perl.
>

Huh. Sounds fairly complex.

> Search::Xapian doesn't wrap PostingSource, so for 1.2.x you'd need
> to write XS wrappers for this class, which isn't trivial if you want to
> be able to subclass in Perl.
>

Perhaps I am not understanding the basic concept, but I was figuring we
would just write a subclass of PostingSource in C++ that does what we
want, and not bother with the Perl bindings. Is that not possible? I
realize that ideally we would develop the general solution and share our
code with the community, but I assume that would be more work.

> The new SWIG-based Perl bindings in 1.3.x wrap PostingSource, but don't
> currently support subclassing in Perl (because SWIG's support for doing
> so in Perl was added more recently). Enabling it is probably fairly
> easy.
>
> However, some of the details of the SWIG-based Perl bindings may change
> before they're declared stable in 1.4.x:
>
> https://trac.xapian.org/ticket/523
>
> That's one of the last two bugs blocking 1.4.0, and I'm currently
> working on the other one. As noted in that ticket, we might bump that
> one for 1.4.0, but it'll be a high priority to address in early 1.4.x.
>
> So it really depends what timescale you're looking at for getting this
> implemented.

We are not in a rush. The nominal deadline is to have a new search
engine up and in production by NBER's centennial in 2020.

 - Alex