Hi all,

I used Xapian to search a database with 1.5 million products for an online webshop. We are cross-referencing this with a relational database to do further refining, which is far from optimal. This decision was made because we did not have the time to investigate Xapian further.

Now we want to add more refining options, e.g. by language, cover language and release date. How can I know which languages are available within a result set? If I index the release date, how can I know which of the following groups are present in my result set?

- not released
- will be released within the next 30 days
- released in the last 30 days
- released more than 30 days ago

I hope somebody can help me out. I would greatly appreciate it.

Cheers,
James
On Tue, Mar 24, 2009 at 05:49:48PM +0100, james cauwelier wrote:
> Hi all,
>
> I used Xapian to search a database with 1.5 million products for an online
> webshop. We are cross referencing this with a relational database to do
> further refining which is far from optimal. This decision was made because
> we did not have the time to investigate Xapian further.
>
> Now we want to add more refining options, eg. by language, cover language
> and release date. How can I know which languages are available within a
> result set? If I index the release date, how can I know which of the
> following groups are present in my result set?
>
> - not released
> - will be released within the next 30 days
> - released in the last 30 days
> - released more than 30 days ago
>
> I hope somebody can help me out. I would greatly appreciate it.

The easiest approach is to run the query 4 times with extra restrictions to cover each of those cases, and check if you get any results for each. Try it - it might be faster than you expect!

Alternatively, there's code in the matchspy branch to look for facets which are stored associated with items in your result set, which might help you.

With trunk (or the 1.0 release series), the only other approach that I can think of is to get lots of results (say, the top 1000) and check their document data (outside Xapian) to see if they fall into your categories.

--
Richard
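As a rough illustration of the "run the query 4 times" suggestion, the sketch below computes one inclusive date range per group and asks a caller-supplied function whether the restricted query returns anything. It makes several assumptions that are not in the thread: release dates are stored as YYYYMMDD strings, "not released" is taken to mean more than 30 days in the future, and `release_ranges`, `groups_present` and the `has_matches` callback are hypothetical names. The callback stands in for whatever code runs the original query restricted to a date range; no Xapian call is shown here.

```python
from datetime import date, timedelta

FMT = "%Y%m%d"  # assumed storage format for the release date


def release_ranges(today):
    """Return {group name: (first, last)} as inclusive YYYYMMDD strings."""
    window = timedelta(days=30)
    one_day = timedelta(days=1)
    return {
        "released more than 30 days ago":
            ("00000000", (today - window - one_day).strftime(FMT)),
        "released in the last 30 days":
            ((today - window).strftime(FMT), today.strftime(FMT)),
        "will be released within the next 30 days":
            ((today + one_day).strftime(FMT), (today + window).strftime(FMT)),
        # Assumption: "not released" means further away than the 30-day window.
        "not released":
            ((today + window + one_day).strftime(FMT), "99999999"),
    }


def groups_present(has_matches, today):
    """Run one restricted query per group and keep the groups with any hit.

    has_matches(first, last) is a hypothetical callback that re-runs the
    user's query limited to release dates in [first, last] and returns
    True if at least one document matches.
    """
    return [name
            for name, (first, last) in release_ranges(today).items()
            if has_matches(first, last)]
```

Because the bounds are fixed-width digit strings, plain string comparison orders them the same way as the dates they represent, which is what a string-based range restriction relies on.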
On Tue, Mar 24, 2009 at 05:49:48PM +0100, james cauwelier wrote:
> Now we want to add more refining options, eg. by language, cover language
> and release date. How can I know which languages are available within a
> result set? If I index the release date, how can I know which of the
> following groups are present in my result set?
>
> - not released
> - will be released within the next 30 days
> - released in the last 30 days
> - released more than 30 days ago

One way of doing this is to place the release date in a value, and configure the QueryParser to use the value range syntax. See <http://xapian.org/docs/valueranges.html> for more detail. I believe all the functionality should work fine from PHP5.

For languages, if you stash the language of a document in the document data, you can grab that out of each match to find out the languages in (part of) the match set, although that's a bit messy. There may be a better solution to this.

J

--
James Aylett
talktorex.co.uk - xapian.org - uncertaintydivision.org
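The document-data idea for languages could look something like the sketch below. It only covers the tallying step and assumes, purely for illustration, that each document's data is a block of `key=value` lines containing a `language` field; that layout and the helper names `parse_fields` and `languages_in_results` are hypothetical. The input is the list of document data strings already fetched from the top matches.

```python
from collections import Counter


def parse_fields(doc_data):
    """Parse 'key=value' lines (an assumed layout) into a dict."""
    fields = {}
    for line in doc_data.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that have no '=' at all
            fields[key.strip()] = value.strip()
    return fields


def languages_in_results(doc_data_list):
    """Count how often each language occurs in the fetched matches."""
    counts = Counter()
    for doc_data in doc_data_list:
        language = parse_fields(doc_data).get("language")
        if language:
            counts[language] += 1
    return counts
```

The result only describes the matches that were actually fetched, so a language that first appears beyond that window would be missed, which is the "(part of) the match set" caveat above.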