Hello,

We would like to create Google- or Firefox-like "search hints". If someone types "abc", the search system should offer some possible hints.

I think Firefox does it by indexing 3-character pieces of the domain name. If you enter part of a name, you get some hints.

Thank you very much
Marcus
2010/1/18 double <ninive at gmx.at>:
> Hello,
>
> We would like to create Google or Firefox like "search hints".
> If someone types "abc", the search system should name
> some possible hints.
>
> I think, Firefox does it by indexing 3-characters of the domain
> name. If you enter parts, you get some hints.

Hi Marcus,

I've done this by using an auxiliary database of suggestions, indexing them with partial terms, e.g.

"Sigourney Weaver" -> "s", "si", "sig", ... "sigourney wea", ...

Then you can just create a single-term query out of the user's partial input. A simple refinement to this would be to use spelling correction in case the user mistypes.

There's also FLAG_PARTIAL in Xapian's QueryParser (http://shrunk.net/83d4d091), which is supposed to do much the same thing and doesn't require creating a special database, but I haven't tried it.

regards,
Tom
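The partial-term indexing Tom describes can be sketched in plain Python. This is only an illustration of the idea, not Xapian code: an in-memory dict stands in for the auxiliary database, and the names `prefixes`, `build_index`, and `suggest` are hypothetical.

```python
from collections import defaultdict


def prefixes(text):
    """Yield every leading substring of the lowercased suggestion."""
    normalized = text.lower()
    for end in range(1, len(normalized) + 1):
        yield normalized[:end]


def build_index(suggestions):
    """Map each prefix to the set of suggestions that start with it.

    Assumption: a dict stands in for the auxiliary suggestions database.
    """
    index = defaultdict(set)
    for suggestion in suggestions:
        for prefix in prefixes(suggestion):
            index[prefix].add(suggestion)
    return index


def suggest(index, partial):
    """Look up the user's partial input as a single term."""
    return sorted(index.get(partial.lower(), ()))


index = build_index(["Sigourney Weaver", "Sigmund Freud", "Simon Pegg"])
print(suggest(index, "sig"))  # ['Sigmund Freud', 'Sigourney Weaver']
```

The trade-off is index size: every suggestion contributes one term per character, which is why this lives in a separate, small database of suggestions rather than in the main index.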
If you just want to partially match the end of a word, then FLAG_PARTIAL works great (we use it to good effect for much the same thing).

However, if you want to match substrings within a term, where 'abc' matches against 'fooabcbar', then you will need to take an approach similar to what double suggested. The book "Managing Gigabytes" has a good description of a solution where you split terms into 'bigrams' so that, for example, the term 'fooabcbar' becomes

[$f fo oo oa ab bc cb ba ar r$]

where $ indicates the beginning or end of a word. Then you can split your search term similarly, either with or without the $ marker depending on your needs. For pure substring searching, 'abc' becomes ['ab' 'bc'], which matches two of the bigrams in the original term.

MG also goes into some detail on how you can use this method to do pretty nice general wildcard searching like *abc*, f?o*, *abcba?, etc.

-Mike
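The bigram splitting above can be sketched in plain Python as follows. This is a rough illustration rather than the book's algorithm or anything Xapian provides: the function names are hypothetical, the index is an in-memory dict, and the final substring check is an added assumption, needed because a term can contain all of the query's bigrams without containing the query itself.

```python
from collections import defaultdict


def bigrams(term, mark_boundaries=True):
    """Split a term into overlapping 2-character pieces.

    With mark_boundaries, '$' is added at both ends of the word.
    """
    padded = f"${term}$" if mark_boundaries else term
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def build_bigram_index(terms):
    """Map each bigram to the set of terms that contain it."""
    index = defaultdict(set)
    for term in terms:
        for gram in bigrams(term):
            index[gram].add(term)
    return index


def substring_search(index, terms, query):
    """Find terms containing query as a substring."""
    grams = bigrams(query, mark_boundaries=False)
    if grams:
        # Candidates must contain every bigram of the query.
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    else:
        # A one-character query has no bigrams, so every term is a candidate.
        candidates = set(terms)
    # Bigrams can match out of order (e.g. 'bcab' has 'ab' and 'bc'),
    # so confirm each candidate with a real substring test.
    return sorted(t for t in candidates if query in t)


terms = ["fooabcbar", "bcab", "abcde", "xyz"]
index = build_bigram_index(terms)
print(bigrams("fooabcbar"))
print(substring_search(index, terms, "abc"))  # ['abcde', 'fooabcbar']
```

Keeping the $ markers on the query side (for example, searching for the bigrams of '$abc') would restrict matches to terms that begin with the query.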
Michel Pelletier wrote:
> If you just want to partially match the end of a word, then
> FLAG_PARTIAL works awesome (we use it to great effect for much the
> same thing).

Yes, impressive indeed. Even when I search for just "s", how can it be that fast?

Thanks
Marcus
Michel Pelletier wrote:
> If you just want to partially match the end of a word, then
> FLAG_PARTIAL works awesome (we use it to great effect for much the
> same thing).

Is there a way to sort the results by frequency of occurrence, so that the most-used words come first?

Thanks
Marcus