On Sun, Apr 09, 2017 at 11:34:07PM +0530, Vivek Pal wrote:> > Each scheme already has a human-readable name, and Xapian::Registry > > can map that to an "examplar" object of the right type, so we > > could take a string like "bm25 1 0.8", see the first word is "bm25" > > and get a BM25Weight object, then call parse_params("1 0.8") on it to > > create the correct Weight object (broadly similar to how unserialise() > > is handled). > > If I followed correctly, since the set_weighting_scheme method in > omega/weight.cc already does exactly that, do you suggest adding a > parse_params method in each weighting scheme class to create the specific > object? -- in which case the set_weighting_scheme method in omega/weight.cc > would then use parse_params method to create the specific object with params > in the "scheme" string. E.g. - > > if (startswith(scheme, "bm25")) { > const char *p = scheme.c_str() + 4; > if (*p == '\0') { > enq.set_weighting_scheme(Xapian::BM25Weight()); > return; > } > if (C_isspace(*p)) { > Xapian::BM25Weight wt = Xapian::BM25Weight::parse_params(p); > enq.set_weighting_scheme(wt); > return; > } > }No, use Xapian::Registry to find the weighting scheme from the name like how Weight::unserialise() does (otherwise every caller would need code similar to that above). The code in omega would just be: enq.set_weighting_scheme(Xapian::Weight::parse_params(scheme)); There's probably a much better name than parse_params() though. Cheers, Olly
> No, use Xapian::Registry to find the weighting scheme from the name > like how Weight::unserialise() does (otherwise every caller would need > code similar to that above).Okay, I looked into Xapian::Registry and it seems you are referring to using the get_weighting_scheme method? (which expects a string e.g. "bm25" i.e. the name of a weighting scheme.) And, Weight::unserialise() looks like throwing a Xapian::UnimplementedError error. I wonder if you meant the same method?> The code in omega would just be: > > enq.set_weighting_scheme(Xapian::Weight::parse_params(scheme));I wonder if we could do something more like: enq.set_weighting_scheme(Xapian::Registry::get_weighting_scheme(name).parse_params(params)); where, Xapian::Registry::get_weighting_scheme(name) returns a weighting scheme object e.g. BM25Weight object and then calling parse_params method on that to return a BM25Weight object now with parameters values as found in params string. It should work given that we first separate the "scheme" string into two substrings namely "name" and "params" as used above. Also, we'd check if params is not null e.g. in the case of scheme = "bm25" and then we'd just call: enq.set_weighting_scheme(Xapian::Registry::get_weighting_scheme(name)); Also, regarding Xapian::Weight::parse_params(scheme) -- I thought we were to define parse_params method in each Weight subclass e.g. BM25Weight, LMWeight etc.?> There's probably a much better name than parse_params() though.Yes, I agree. May be get_parameterized_weighting_scheme sound better? (if it isn't too long to read :)) Thanks, Vivek
> Each scheme already has a human-readable name, and Xapian::Registry > can map that to an "examplar" object of the right type, so we > could take a string like "bm25 1 0.8", see the first word is "bm25" > and get a BM25Weight object, then call parse_params("1 0.8") on it to > create the correct Weight object (broadly similar to how unserialise() > is handled).Hi Olly -- the following piece of tested code in omega/weight.cc hopefully achieves what we intend to do. It works fine for all tests. Please let me know what you think. if (startswith(scheme, "pl2")) { const char *p = scheme.c_str() + 3; if (*p == '\0') { enq.set_weighting_scheme(Xapian::BM25Weight()); return; } if (C_isspace(*p)) { Xapian::Registry reg; const Xapian::Weight * wt reg.get_weighting_scheme("Xapian::PL2Weight"); enq.set_weighting_scheme(*wt->set_parameter_values(p)); return; } } Although, I'm still having a hard time trying to figure out how it's possible to achieve what we do with the above code by just: enq.set_weighting_scheme(Xapian::Weight::parse_params(scheme)); in omega like you mentioned previously. Also, turns out that parse_params method is identical to unserialise() method in each Weight subclass so why not simply use unserialise method rather than implementing the same functionality that it provides under a different name like parse_params to avoid code duplication? In fact, the following code works just fine for all tests: enq.set_weighting_scheme(*wt->unserialise(p)); Thanks, Vivek
On Mon, Apr 10, 2017 at 11:47:36PM +0530, Vivek Pal wrote:> > No, use Xapian::Registry to find the weighting scheme from the name > > like how Weight::unserialise() does (otherwise every caller would need > > code similar to that above). > > Okay, I looked into Xapian::Registry and it seems you are referring to using > the get_weighting_scheme method? (which expects a string e.g. "bm25" i.e. the > name of a weighting scheme.)Yes, you pass it a string and it gives you the registered object with that name.> And, Weight::unserialise() looks like throwing a Xapian::UnimplementedError > error. I wonder if you meant the same method?That's the default implementation - each subclass overrides that with an actual implementation (at least if it wants to work with remote databases).> > The code in omega would just be: > > > > enq.set_weighting_scheme(Xapian::Weight::parse_params(scheme)); > > I wonder if we could do something more like: > > enq.set_weighting_scheme(Xapian::Registry::get_weighting_scheme(name).parse_params(params)); > > where, Xapian::Registry::get_weighting_scheme(name) returns a weighting scheme > object e.g. BM25Weight object and then calling parse_params method on that > to return a BM25Weight object now with parameters values as found in params > string.That doesn't work as written because you need a registry object to call get_weighting_scheme() on (it's not a static method). But the idea is to have a static method of Xapian::Weight to do essentially that. It probably needs to take an optional Registry object like Query::unserialise() does so the user can add their own custom weighting schemes in: static const Query unserialise(const std::string & serialised, const Registry & reg = Registry());> It should work given that we first separate the "scheme" string into two > substrings namely "name" and "params" as used above.That ought to be handled by parse_params().> Also, we'd check if params is not null e.g. in the case of scheme = "bm25" > and then we'd just call: > > enq.set_weighting_scheme(Xapian::Registry::get_weighting_scheme(name));This shouldn't be necessary, though it does avoid pointlessly cloning the weight object from the registry in the case of an empty parameter string. It means all weighting schemes would need to support being specified with no parameters - I guess there's probably always a sane default set of parameters.> Also, regarding Xapian::Weight::parse_params(scheme) -- I thought we were to > define parse_params method in each Weight subclass e.g. BM25Weight, LMWeight > etc.?Sorry, I was misusing that name for the static "dispatcher" method as well as the virtual method implementing the parameter parsing.> > There's probably a much better name than parse_params() though. > > Yes, I agree. May be get_parameterized_weighting_scheme sound better? (if it > isn't too long to read :))That makes it sound like a getter, whereas it's really a factory method. Also we use British English spellings in the API so it'd be "parameterised" (we could provide inline aliases, though we haven't for other cases such as "serialise" and "miles_to_metres" and nobody has moaned so far). Cheers, Olly