search for: 6122dcdccc97

Displaying 2 results from an estimated 2 matches for "6122dcdccc97".

2024 Jan 08
1
Possible bug using FLAG_WORD_BREAKS with fullwidth Unicode codepoints
...takana and Hangul codepoints as unbroken script and treats Latin characters, numbers, symbols and punctuation as broken script. There's a couple of unit tests that check for this. diff --git a/xapian-core/queryparser/word-breaker.cc b/xapian-core/queryparser/word-breaker.cc index 8108523ccd53..6122dcdccc97 100644 --- a/xapian-core/queryparser/word-breaker.cc +++ b/xapian-core/queryparser/word-breaker.cc @@ -102,8 +102,10 @@ is_unbroken_script(unsigned p) 0xF900 - 1, 0xFAFF, // FE30..FE4F; CJK Compatibility Forms 0xFE30 - 1, 0xFE4F, - // FF00..FFEF; Halfwidth and Fullwidt...
2024 Jan 07
1
Possible bug using FLAG_WORD_BREAKS with fullwidth Unicode codepoints
On Thu, Jan 04, 2024 at 05:50:22PM +0100, Robert Stepanek wrote: > Since I am undecided yet if and how to fix this in Xapian I haven't > come up with a pull request. Because trac currently is offline, I > could not file a bug. I hope it's OK to post my analysis here first, > I'll be happy to follow up reporting that bug proper later (should we > conclude that it actually