Displaying 2 results from an estimated 2 matches for "tr29".
2012 Jul 13
1
Need Suggestions for Sentence Breaking Implementation
...n.wikipedia.org/wiki/Sentence_boundary_disambiguation >
2. There are not many solutions available in C/C++. Almost all existing
ones are in either Python or Java.
3. There's a sentence boundary detection algorithm defined by the Unicode
Standard. It's described at <
http://www.unicode.org/reports/tr29/#Sentence%5FBoundaries >
4. An existing C++ API that does this is the BreakIterator class, documented here -
< http://icu-project.org/apiref/icu4c/classBreakIterator.html > .
Here's a line from its doc: "The text boundary positions are found
according to the rules described in Unico...
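
To make the idea of rule-based sentence breaking concrete, here is a minimal, self-contained C++ sketch. It is NOT the Unicode TR29 algorithm and it does not use ICU's BreakIterator; it is a deliberately simplified, ASCII-only stand-in that only borrows the general shape of the rules (break after a terminator plus optional closing punctuation and whitespace, but not when the next word starts with a lowercase letter). The function name split_sentences is hypothetical and exists only for this illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of ICU or any other library.
// Simplified, ASCII-only sentence splitter. Assumptions:
//   - sentence terminators are '.', '!' and '?'
//   - closing quotes and ')' directly after a terminator stay with the sentence
//   - no break when the terminator is not followed by whitespace ("3.14")
//   - no break when the next word starts with a lowercase letter ("e.g. foo")
std::vector<std::string> split_sentences(const std::string& text) {
    std::vector<std::string> out;
    const std::size_t n = text.size();
    std::size_t start = 0;  // start of the sentence being collected
    std::size_t i = 0;
    while (i < n) {
        const char c = text[i];
        if (c != '.' && c != '!' && c != '?') {
            ++i;
            continue;
        }
        // Absorb a run of terminators and closing punctuation.
        std::size_t j = i + 1;
        while (j < n && (text[j] == '.' || text[j] == '!' || text[j] == '?' ||
                         text[j] == '"' || text[j] == '\'' || text[j] == ')')) {
            ++j;
        }
        // Skip the whitespace that follows.
        std::size_t k = j;
        while (k < n && std::isspace(static_cast<unsigned char>(text[k]))) {
            ++k;
        }
        const bool at_end = (k == n);
        const bool spaced = (k > j);
        const bool next_lower =
            !at_end && std::islower(static_cast<unsigned char>(text[k]));
        if (at_end || (spaced && !next_lower)) {
            out.push_back(text.substr(start, j - start));
            start = k;  // next sentence begins after the whitespace
        }
        i = k;
    }
    // Keep any trailing text that has no terminator.
    if (start < n) {
        out.push_back(text.substr(start));
    }
    return out;
}
```

Calling split_sentences("Hello world. How are you? Fine, e.g. very well.") yields three sentences, with "e.g." kept inside the last one. Abbreviations followed by a capitalised word (such as "Mr. Smith") are still split incorrectly, and non-ASCII text is not handled at all, which is the kind of case a full rule set or a library like ICU is meant to cover.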
2005 Dec 17
0
[OT] Unicode tokenization for Ferret
I wonder, do we (eventually) have a working Ruby implementation of this?
http://www.unicode.org/reports/tr29/
This might come in bloody useful not only for Ferret but for the
"excerpt" helper as well.
--
Julian ''Julik'' Tarkhanov
me at julik.nl