> > I'm not clear on what the OFFSET really represents. Could you
> > please explain a bit?
>
> Omega paginates results (as does Xapian's MSet, internally). So if
> you're displaying the second page of results, you'll need to know
> that when building training data. It's affected by TOPDOC and also
> by the <>[# CGI variables, but internally to omega there's one
> variable it's mapped onto.
>
> In omegascript, you can find this using $topdoc.

Thanks for the explanation. Understood now.

> > In the end, we will have two files it seems -- one created from the
> > query template containing separate entries for each executed search
> > as per the format you described previously and another containing
> > query IDs and click URLs logged using a different template?
>
> Yes, that's right. I recommend logging Xapian docids instead of click
> URLs for the reason previously discussed.

Yes, I'll use docids instead of click URLs as you recommend.

Now for the first step, i.e. logging separate entries for each executed
search from the query template, I wanted to know whether I should modify
the existing log command or implement a separate one. That said, I think
a new command would give us a certain level of flexibility for
achieving our purpose.

> > It doesn't seem to comply with the format
> > mentioned in its documentation as it expects two arguments but
> > we provide only one here i.e. query.log and what does this
> > argument mean?
>
> The [] means that the second parameter is optional. Indeed, the
> documentation says:
>
>   ENTRY defaults to a format similar to the Common Log Format
>   used by webservers.

Thanks, it's clear to me now. I hadn't come across the convention that
parameters in square brackets are optional.
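
The role of the offset when building training data can be sketched as follows. This is only an illustration, not Omega code: it assumes the offset (the value $topdoc reports) counts from 0 and that the number of hits per page is fixed. The function names `page_number` and `absolute_rank` are hypothetical.

```python
def page_number(topdoc, hits_per_page):
    """Return the 1-based page number for a 0-based result offset."""
    if hits_per_page <= 0:
        raise ValueError("hits_per_page must be positive")
    return topdoc // hits_per_page + 1


def absolute_rank(topdoc, position_on_page):
    """Return the 0-based rank of a hit in the full result list.

    position_on_page is the hit's 0-based position on the displayed page.
    """
    return topdoc + position_on_page


# Second page with 10 hits per page: the offset is 10, so the fourth
# hit shown on that page is at rank 13 in the full result list.
print(page_number(10, 10), absolute_rank(10, 3))
```

Without the offset, a click on the fourth hit of page two would be indistinguishable from a click on the fourth hit of page one.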
On 6 Jun 2017, at 12:45, Vivek Pal <vivekpal.dtu at gmail.com> wrote:

> Now for the first step i.e. logging separate entries for each executed
> search from the query template, I wanted to know if I should modify
> the existing log command or implement a separate one? Although,
> I think if we implement a new one we'll have a certain level of flexibility
> for achieving our purpose.

There's a lot of flexibility already, because the log format is just
omegascript. So I don't think you need to implement a new command to
achieve this. (Although you might need a command to generate the query
id. It depends on how you're going to do that.)

> > The [] means that the second parameter is optional.
>
> Thanks, it's clear to me now. I didn't come across the fact that
> the parameters in square brackets are assumed to be optional.

That's actually a fairly common syntax (for instance, at a bash prompt
type "help"), so useful to know :)

J

-- 
  James Aylett
  devfort.com, spacelog.org, tartarus.org/james/
> There's a lot of flexibility already, because the log format is just
> omegascript. So I don't think you need to implement a new command to
> achieve this. (Although you might need a command to generate the query
> id. It depends on how you're going to do that.)

OK, I'll try adapting the existing log command to achieve the kind of
logging we want.

As for the command to generate unique query IDs, I've been thinking of
treating this as a hashing problem: we provide the query text as input
and get a unique ID as output. Coming up with a 100% collision-free
hashing scheme for this task is something we'd need to consider first,
though. Other caveats include the maximum length of the generated ID
string, and whether we should strip leading whitespace from the query
text so that "essentially the same" queries aren't recorded as
different entries in the log file.

What do you suggest?

Thanks,
Vivek
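
The hashing idea above might look something like the following. This is a minimal sketch and not anything in Omega. The names `normalize_query` and `query_id` are hypothetical, the choice of SHA-256 and a 16-character ID are assumptions, and collapsing internal whitespace goes a step beyond the leading-whitespace stripping mentioned in the message. A truncated digest is not collision-free, only very unlikely to collide for a realistic number of distinct queries.

```python
import hashlib


def normalize_query(text):
    """Strip leading/trailing whitespace and collapse internal runs.

    Assumption: queries differing only in whitespace count as the same.
    """
    return " ".join(text.split())


def query_id(text, length=16):
    """Return a fixed-length hex ID derived from the normalized query.

    Uses the first `length` hex characters of the SHA-256 digest, so
    the ID length is bounded regardless of how long the query is.
    """
    normalized = normalize_query(text)
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return digest[:length]


# Queries that differ only in whitespace map to the same ID.
print(query_id("  xapian omega") == query_id("xapian  omega"))
```

Because the ID depends only on the query text, repeated runs of the same query share an ID. If each executed search needs its own entry, something extra (a timestamp, say) would have to be mixed into the input.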