On Mon, Apr 10, 2017 at 11:47:36PM +0530, Vivek Pal wrote:> > No, use Xapian::Registry to find the weighting scheme from the name > > like how Weight::unserialise() does (otherwise every caller would need > > code similar to that above). > > Okay, I looked into Xapian::Registry and it seems you are referring to using > the get_weighting_scheme method? (which expects a string e.g. "bm25" i.e. the > name of a weighting scheme.)Yes, you pass it a string and it gives you the registered object with that name.> And, Weight::unserialise() looks like throwing a Xapian::UnimplementedError > error. I wonder if you meant the same method?That's the default implementation - each subclass overrides that with an actual implementation (at least if it wants to work with remote databases).> > The code in omega would just be: > > > > enq.set_weighting_scheme(Xapian::Weight::parse_params(scheme)); > > I wonder if we could do something more like: > > enq.set_weighting_scheme(Xapian::Registry::get_weighting_scheme(name).parse_params(params)); > > where, Xapian::Registry::get_weighting_scheme(name) returns a weighting scheme > object e.g. BM25Weight object and then calling parse_params method on that > to return a BM25Weight object now with parameters values as found in params > string.That doesn't work as written because you need a registry object to call get_weighting_scheme() on (it's not a static method). But the idea is to have a static method of Xapian::Weight to do essentially that. It probably needs to take an optional Registry object like Query::unserialise() does so the user can add their own custom weighting schemes in: static const Query unserialise(const std::string & serialised, const Registry & reg = Registry());> It should work given that we first separate the "scheme" string into two > substrings namely "name" and "params" as used above.That ought to be handled by parse_params().> Also, we'd check if params is not null e.g. in the case of scheme = "bm25" > and then we'd just call: > > enq.set_weighting_scheme(Xapian::Registry::get_weighting_scheme(name));This shouldn't be necessary, though it does avoid pointlessly cloning the weight object from the registry in the case of an empty parameter string. It means all weighting schemes would need to support being specified with no parameters - I guess there's probably always a sane default set of parameters.> Also, regarding Xapian::Weight::parse_params(scheme) -- I thought we were to > define parse_params method in each Weight subclass e.g. BM25Weight, LMWeight > etc.?Sorry, I was misusing that name for the static "dispatcher" method as well as the virtual method implementing the parameter parsing.> > There's probably a much better name than parse_params() though. > > Yes, I agree. May be get_parameterized_weighting_scheme sound better? (if it > isn't too long to read :))That makes it sound like a getter, whereas it's really a factory method. Also we use British English spellings in the API so it'd be "parameterised" (we could provide inline aliases, though we haven't for other cases such as "serialise" and "miles_to_metres" and nobody has moaned so far). Cheers, Olly
>> > The code in omega would just be: >> > >> > enq.set_weighting_scheme(Xapian::Weight::parse_params(scheme)); >> >> I wonder if we could do something more like: >> >> enq.set_weighting_scheme(Xapian::Registry::get_weighting_scheme(name).parse_params(params)); >> >> where, Xapian::Registry::get_weighting_scheme(name) returns a weighting scheme >> object e.g. BM25Weight object and then calling parse_params method on that >> to return a BM25Weight object now with parameters values as found in params >> string. > > That doesn't work as written because you need a registry object to call > get_weighting_scheme() on (it's not a static method).Yes, right. That was just to convey the basic high level idea with something crude, although later realized the mistake when dived into the details.> But the idea is to have a static method of Xapian::Weight to do > essentially that. It probably needs to take an optional Registry object > like Query::unserialise() does so the user can add their own custom > weighting schemes in: > > static const Query unserialise(const std::string & serialised, > const Registry & reg = Registry()); > >> It should work given that we first separate the "scheme" string into two >> substrings namely "name" and "params" as used above. > > That ought to be handled by parse_params().Thanks Olly for clarifying. So, we want to define a static parse_params method needs to defined be in Xapian::Weight that would take "scheme" and a Registry object to do the following set of things: 1. get an appropriate weighting scheme object by looking at the first word in "scheme" string, 2. call parse_params method defined in that particular weighting scheme subclass with just parameter string (e.g. parse_params("1.0 0.5")) which creates and returns the weighting scheme object. 3. then ultimately return the weighting scheme object to the code in omega: enq.set_weighting_scheme(Xapian::Weight::parse_params(scheme)); This is my current understanding of the main idea here but if you see something odd then please let me know. Thanks, Vivek
On Fri, Apr 14, 2017 at 12:31:08AM +0530, Vivek Pal wrote:> Thanks Olly for clarifying. So, we want to define a static parse_params > method needs to defined be in Xapian::Weight that would take "scheme" and > a Registry object to do the following set of things: > > 1. get an appropriate weighting scheme object by looking at the first > word in "scheme" string, > > 2. call parse_params method defined in that particular weighting scheme > subclass with just parameter string (e.g. parse_params("1.0 0.5")) which > creates and returns the weighting scheme object. > > 3. then ultimately return the weighting scheme object to the code in omega: > > enq.set_weighting_scheme(Xapian::Weight::parse_params(scheme)); > > This is my current understanding of the main idea here but if you see > something odd then please let me know.Yes, that's the idea. On Fri, Apr 14, 2017 at 12:31:58AM +0530, Vivek Pal wrote:> >> > enq.set_weighting_scheme(*wt->set_parameter_values(p)); > >> > >> set_parameter_values is basically parse_params method; an attempt to > >> find a better name for parse_params. > > > > It's not a setter either - that method name suggests it is setting the > > values on wt, but it actually returns a new object. > > I'm kind of running out of names for this method. Would you like to suggest > one?My best idea so far is "create" for the static method which takes the full string and an optional Registry. We probably don't want to use the same name for the virtual method which takes just the parameter part, especially as both would be callable with a single string (I did use "parse_params" to refer to both, but that was mostly because I got mixed up as to which I was talking about). Perhaps "create_from_parameters" for that one? It's conceivably useful to call from user code (if you want to fix the weighting scheme and just allow the parameters to be specified) but the longer name is a bit of a signal that it's less commonly used. On Fri, Apr 14, 2017 at 12:32:26AM +0530, Vivek Pal wrote:> > Currently omega has no test coverage for parsing of weighting schemes, so the > > tests passing doesn't mean this works correctly. > > > > (We only added any tests of the omega CGI about 18 months ago, and at present > > we really only have tests of changes since then - feature tests for new > > features and regression tests for bug fixes). > > Oh, didn't realise there were no tests for this functionality. > Probably, the next thing would be to write some tests afterwards.Yes, that'd be great. Cheers, Olly