Jim Lynch
2006-Feb-15 12:38 UTC
[Xapian-discuss] Trouble with German language indexing/searching
I've indexed a number of documents in German with scriptindex. I apparently have a character set problem, because I can't find any terms that include characters >0x7f. Is there a way to list all the terms in the database to see whether they were indexed properly? None of the "top terms" seem to include any special characters.

I think I'm sending the characters correctly. Here is a sample query:

&P=darüber

Thanks,
Jim.
Jim Lynch
2006-Feb-15 12:52 UTC
[Xapian-discuss] Trouble with German language indexing/searching
I need to modify that somewhat. Some of the words containing special characters are found; I just hadn't discovered them before I sent the first email. The sample query in my first email is an example of one that failed, even though other words with that same special character were found.

I turned stemming off when indexing (scriptindex --stemmer=none). I can't imagine Omega stemming that word, but maybe it does.

Jim.
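
One way to check which bytes a query like the sample one actually carries is to percent-encode the word under each candidate character set and compare the results with what the browser sends. The following is a minimal sketch using only the Python standard library. It assumes the query word is "darüber" (the archived query shows a "?" where the special character was), and the helper name `encoded_query` is hypothetical, introduced here for illustration only.

```python
from urllib.parse import quote

# Assumption: the "?" in the archived query stands for "ü" (U+00FC).
word = "dar\u00fcber"


def encoded_query(term: str, charset: str) -> str:
    """Return the P= query parameter with `term` percent-encoded in `charset`."""
    return "P=" + quote(term, encoding=charset)


# In ISO-8859-1 the character is the single byte 0xFC;
# in UTF-8 it is the two-byte sequence 0xC3 0xBC.
latin1_query = encoded_query(word, "iso-8859-1")
utf8_query = encoded_query(word, "utf-8")

print(latin1_query)
print(utf8_query)
```

If the form the browser submits does not match the encoding the documents were in when they were indexed, the query term and the indexed term are different byte strings, so a lookup for that term would not be expected to match.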