2019 Mar 07
3
Ask for advice on exact requirements to fix #699 mixed CJK numbers
...time, I would fix the code in the next commit.) If it would be better to create a pull request, please tell me. (Below is my explanation of the code, in case the code is not clear to read.) The current code only supports cases where mixed Chinese numbers are embedded in the CJK characters that are sent to CJKNgramIterator. In those cases it extracts the whole number as one token instead of as 1-grams. The code was added to operator++ of CJKNgramIterator in cjk-tokenizer.cc, to minimize changes to the existing code and to avoid harming modularity. The current implementation passes the test cases below: ...
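
To make the behaviour described above concrete, here is a minimal sketch of the idea, not the actual patch. It assumes input that has already been decoded to code points, and the names `is_number_char` and `tokenize` as well as the small numeral set are hypothetical, chosen only for illustration. It also emits only 1-grams for ordinary characters, which is a simplification of what a real n-gram iterator would produce.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: treats ASCII digits and a small, illustrative set of
// Chinese numerals as "number characters". The real patch may use another set.
static bool is_number_char(char32_t c) {
    if (c >= U'0' && c <= U'9') return true;
    static const std::u32string numerals = U"〇零一二三四五六七八九十百千万亿";
    return numerals.find(c) != std::u32string::npos;
}

// Hypothetical stand-in for the iterator's advance step: ordinary characters
// become 1-grams, while each maximal run of number characters becomes a
// single token.
std::vector<std::u32string> tokenize(const std::u32string& text) {
    std::vector<std::u32string> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        if (is_number_char(text[i])) {
            // Extend j to the end of the current number run.
            std::size_t j = i;
            while (j < text.size() && is_number_char(text[j])) ++j;
            assert(j > i);  // the run is never empty, so the loop always advances
            tokens.push_back(text.substr(i, j - i));
            i = j;
        } else {
            tokens.push_back(text.substr(i, 1));
            ++i;
        }
    }
    return tokens;
}
```

With this sketch, a mixed number such as 2千3百 embedded in surrounding CJK text comes out as one token, while the characters around it are still emitted one at a time.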