On Sat, Apr 08, 2017 at 09:11:22PM +0100, James Aylett wrote:> On 8 Apr 2017, at 19:15, Vivek Pal <vivekpal.dtu at gmail.com> wrote: > > >> and the details of which weighting schemes were available in which version > >> isn't a key part of the $set command itself. > > > > Do you suggest dropping that piece of information out? Since the reason behind > > adding that is unclear to me so I'm not entirely sure if it would be a good > > idea. > > No, we shouldn't lose it completely. But I think that omegascript should > document the language and the functions, and the omega documentation > should cover things like exactly which weighting schemes are available. > Right now, omega doesn't have particularly coherent documentation, and I > think finding a better home for the weighting scheme details might help.We actually talked about adding a mechanism so Weight classes could provide a scheme-specific parsing of parameters from a human-readable text string. Currently there's code in omega to handle this, and also in examples/quest.cc and a third version in the evaluation module - if we've implemented this 3 times ourselves in modules we've built on the C++ API that seems a strong hint that this functionality might belong in the API instead. Each scheme already has a human-readable name, and Xapian::Registry can map that to an "examplar" object of the right type, so we could take a string like "bm25 1 0.8", see the first word is "bm25" and get a BM25Weight object, then call parse_params("1 0.8") on it to create the correct Weight object (broadly similar to how unserialise() is handled). Then we can document the available schemes and the parameters they take in one place and refer to that from omega, quest and the evaluation module. Cheers, Olly
> Each scheme already has a human-readable name, and Xapian::Registry > can map that to an "examplar" object of the right type, so we > could take a string like "bm25 1 0.8", see the first word is "bm25" > and get a BM25Weight object, then call parse_params("1 0.8") on it to > create the correct Weight object (broadly similar to how unserialise() > is handled).If I followed correctly, since the set_weighting_scheme method in omega/weight.cc already does exactly that, do you suggest adding a parse_params method in each weighting scheme class to create the specific object? -- in which case the set_weighting_scheme method in omega/weight.cc would then use parse_params method to create the specific object with params in the "scheme" string. E.g. - if (startswith(scheme, "bm25")) { const char *p = scheme.c_str() + 4; if (*p == '\0') { enq.set_weighting_scheme(Xapian::BM25Weight()); return; } if (C_isspace(*p)) { Xapian::BM25Weight wt = Xapian::BM25Weight::parse_params(p); enq.set_weighting_scheme(wt); return; } } I didn't test the code above so there could be small errors, but hopefully it gets across the general idea. Thanks, Vivek
On Sun, Apr 09, 2017 at 11:34:07PM +0530, Vivek Pal wrote:> > Each scheme already has a human-readable name, and Xapian::Registry > > can map that to an "examplar" object of the right type, so we > > could take a string like "bm25 1 0.8", see the first word is "bm25" > > and get a BM25Weight object, then call parse_params("1 0.8") on it to > > create the correct Weight object (broadly similar to how unserialise() > > is handled). > > If I followed correctly, since the set_weighting_scheme method in > omega/weight.cc already does exactly that, do you suggest adding a > parse_params method in each weighting scheme class to create the specific > object? -- in which case the set_weighting_scheme method in omega/weight.cc > would then use parse_params method to create the specific object with params > in the "scheme" string. E.g. - > > if (startswith(scheme, "bm25")) { > const char *p = scheme.c_str() + 4; > if (*p == '\0') { > enq.set_weighting_scheme(Xapian::BM25Weight()); > return; > } > if (C_isspace(*p)) { > Xapian::BM25Weight wt = Xapian::BM25Weight::parse_params(p); > enq.set_weighting_scheme(wt); > return; > } > }No, use Xapian::Registry to find the weighting scheme from the name like how Weight::unserialise() does (otherwise every caller would need code similar to that above). The code in omega would just be: enq.set_weighting_scheme(Xapian::Weight::parse_params(scheme)); There's probably a much better name than parse_params() though. Cheers, Olly