thr3ads.net - search: "henearkrxern"

Displaying 4 results from an estimated 4 matches for "henearkrxern".

2007 Jul 09

Xapian pubmeet

Hi all, A few of us have been discussing whether we should have a Xapian social gathering of some kind. The current idea is to meet up in a pub in London some time in autumn for drinks and food. However, all of this really depends on who might be able to come! It would be a chance to meet other Xapian enthusiasts in an informal social setting and talk about all things search-related (and

2011 Apr 21

Chinese segmentation

Hello, I have finished reading the papers, and I think it is time to design my project. The first step will be to determine whether the input characters are Chinese. I see from a past post that cjk-tokenizer deals only with UTF-8 and Unicode, but I have also seen other encodings such as GBK and Big5. Should I just deal with UTF-8 and Unicode?

2007 Jun 05

Chinese, Japanese, Korean Tokenizer.

Hi, I am looking for a Chinese, Japanese and Korean tokenizer that can be used to tokenize terms for CJK languages. I am not very familiar with these languages; however, I think that they can contain one or more words in one symbol, which makes it more difficult to tokenize them into searchable terms. Lucene has CJK Tokenizer ... and I am looking around to see if there is some open source that we

2007 Sep 16

Document clustering module?

Hi, I am implementing some document clustering algorithms in the xapian core. I would like to know if this kind of module will be considered to be incorporated into the core release. Or is there already some document clustering module that is just not open-sourced yet? Best, Yung-chung Lin
