thr3ads.net - Xapian devel - Omega: Missing support for newer weighting schemes [Apr 2017]

If this information is useful, please help other people find it:
Share via:

Vivek Pal

2017-Apr-12 10:51 UTC

Omega: Missing support for newer weighting schemes

> Each scheme already has a human-readable name, and Xapian::Registry
> can map that to an "examplar" object of the right type, so we
> could take a string like "bm25 1 0.8", see the first word is
"bm25"
> and get a BM25Weight object, then call parse_params("1 0.8") on
it to
> create the correct Weight object (broadly similar to how unserialise()
> is handled).
Hi Olly -- the following piece of tested code in omega/weight.cc hopefully
achieves what we intend to do. It works fine for all tests. Please let me
know what you think.

  if (startswith(scheme, "pl2")) {
     const char *p = scheme.c_str() + 3;
     if (*p == '\0') {
         enq.set_weighting_scheme(Xapian::BM25Weight());
         return;
     }
     if (C_isspace(*p)) {
       Xapian::Registry reg;
       const Xapian::Weight * wt
reg.get_weighting_scheme("Xapian::PL2Weight");
       enq.set_weighting_scheme(*wt->set_parameter_values(p));
       return;
     }
  }

Although, I'm still having a hard time trying to figure out how it's
possible
to achieve what we do with the above code by just:

  enq.set_weighting_scheme(Xapian::Weight::parse_params(scheme));

in omega like you mentioned previously.

Also, turns out that parse_params method is identical to unserialise() method
in each Weight subclass so why not simply use unserialise method rather than
implementing the same functionality that it provides under a different name
like parse_params to avoid code duplication? In fact, the following code
works just fine for all tests:

  enq.set_weighting_scheme(*wt->unserialise(p));

Thanks,
Vivek

Vivek Pal

2017-Apr-12 10:56 UTC

head link

Omega: Missing support for newer weighting schemes

> enq.set_weighting_scheme(Xapian::BM25Weight());
Sorry; it should be Xapian::PL2Weight() here.
> enq.set_weighting_scheme(*wt->set_parameter_values(p));
set_parameter_values is basically parse_params method; an attempt to
find a better name for parse_params.

Olly Betts

2017-Apr-13 01:07 UTC

head link

Omega: Missing support for newer weighting schemes

On Wed, Apr 12, 2017 at 04:21:23PM +0530, Vivek Pal
wrote:> Also, turns out that parse_params method is identical to unserialise()
method
> in each Weight subclass so why not simply use unserialise method rather
than
> implementing the same functionality that it provides under a different name
> like parse_params to avoid code duplication?
They're conceptually similar, but unserialise() takes a binary serialisation
of the parameters (so it can easily and compactly pass floating point values
exactly), whereas here we need to parse an ASCII string representation of
the parameters.
> In fact, the following code works just fine for all tests:
> 
>   enq.set_weighting_scheme(*wt->unserialise(p));
Currently omega has no test coverage for parsing of weighting schemes, so the
tests passing doesn't mean this works correctly.

(We only added any tests of the omega CGI about 18 months ago, and at present
we really only have tests of changes since then - feature tests for new
features and regression tests for bug fixes).

Cheers,
    Olly

Olly Betts

2017-Apr-13 01:08 UTC

head link

Omega: Missing support for newer weighting schemes

On Wed, Apr 12, 2017 at 04:26:32PM +0530, Vivek Pal
wrote:> > enq.set_weighting_scheme(*wt->set_parameter_values(p));
> 
> set_parameter_values is basically parse_params method; an attempt to
> find a better name for parse_params.
It's not a setter either - that method name suggests it is setting the
values on wt, but it actually returns a new object.

Cheers,
    Olly

Vivek Pal

2017-Apr-13 19:02 UTC

head link

Omega: Missing support for newer weighting schemes

>> Also, turns out that parse_params method is identical to unserialise()
method
>> in each Weight subclass so why not simply use unserialise method rather
than
>> implementing the same functionality that it provides under a different
name
>> like parse_params to avoid code duplication?
>
> They're conceptually similar, but unserialise() takes a binary
serialisation
> of the parameters (so it can easily and compactly pass floating point
values
> exactly), whereas here we need to parse an ASCII string representation of
> the parameters.
Ah, yes indeed. Thanks for clearing that up.
>> In fact, the following code works just fine for all tests:
>>
>>   enq.set_weighting_scheme(*wt->unserialise(p));
>
> Currently omega has no test coverage for parsing of weighting schemes, so
the
> tests passing doesn't mean this works correctly.
>
> (We only added any tests of the omega CGI about 18 months ago, and at
present
> we really only have tests of changes since then - feature tests for new
> features and regression tests for bug fixes).
Oh, didn't realise there were no tests for this functionality.
Probably, the next thing
would be to write some tests afterwards.

Thanks,
Vivek

Xapian devel - Apr 2017 - Omega: Missing support for newer weighting schemes

Omega: Missing support for newer weighting schemes

Omega: Missing support for newer weighting schemes

Omega: Missing support for newer weighting schemes

Omega: Missing support for newer weighting schemes

Omega: Missing support for newer weighting schemes