Displaying 11 results from an estimated 11 matches for "halfwidth".
2024 Jan 08
1
Possible bug using FLAG_WORD_BREAKS with fullwidth Unicode codepoints
...suming the latter is valid, just removing this block (or removing the
> parts of it which are Lu or Ll) should fix the problem as then
> tokenisation will switch mode - I tried this and it fixes your case at
> least:
Removing the whole block will cause word-breaker to not correctly handle halfwidth Katakana, such as "??????????" which it would treat as a single term, whereas it should be two: ??????and ????).
My pull request causes word-breaker to only handle halfwidth Katakana and Hangul codepoints as unbroken script and treats Latin characters, numbers, symbols and punctuation a...
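
The distinction drawn in this message, treating only the halfwidth Katakana and Hangul parts of the Halfwidth and Fullwidth Forms block as unbroken script, can be illustrated with a minimal Python sketch. The ranges follow the Unicode chart for U+FF00..U+FFEF; the function name and the exact cut-offs are assumptions for illustration and are not taken from the pull request, whose contents are not shown here.

```python
# Hypothetical helper, not Xapian code: pick out the parts of the
# Halfwidth and Fullwidth Forms block (U+FF00..U+FFEF) that belong to
# scripts written without spaces between words.

HALFWIDTH_KATAKANA = (0xFF66, 0xFF9F)  # halfwidth Katakana letters and sound marks
HALFWIDTH_HANGUL = (0xFFA0, 0xFFDC)    # halfwidth Hangul letters (range has unassigned gaps)


def is_unbroken_halfwidth(cp: int) -> bool:
    """Return True if cp is a halfwidth Katakana or Hangul codepoint."""
    for lo, hi in (HALFWIDTH_KATAKANA, HALFWIDTH_HANGUL):
        if lo <= cp <= hi:
            return True
    return False


if __name__ == "__main__":
    # U+FF76 halfwidth Katakana KA, U+FFA1 halfwidth Hangul KIYEOK,
    # U+FF21 fullwidth Latin A, U+FF10 fullwidth digit zero.
    for cp in (0xFF76, 0xFFA1, 0xFF21, 0xFF10):
        print(f"U+{cp:04X}", is_unbroken_halfwidth(cp))
```

Under this sketch, fullwidth Latin letters, digits, symbols and punctuation fall outside both ranges and would be left to the ordinary tokenisation rules.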
2005 Jan 25
1
CODA vs. BOA discrepancy
...and Welch
        Stationarity  start      p-value
        test          iteration
b[1]    passed        1          0.2649
b[2]    passed        1          0.6709
b[3]    passed        1          0.6376
b[4]    passed        1          0.3673
tau     passed        1          0.1944
sigma   passed        1          0.0725

        Halfwidth  Mean      Halfwidth
        test
b[1]    passed     -39.800   0.303994
b[2]    passed       0.714   0.003505
b[3]    passed       1.297   0.010317
b[4]    passed      -0.153   0.004025
tau     passed       0.106   0.000918
sigma   passed       3.193   0.014986
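
For reference, the halfwidth test in coda's heidel.diag compares the ratio of the confidence-interval halfwidth to the mean against a tolerance eps (0.1 by default). A minimal Python sketch that recomputes those ratios from the CODA values above; the variable and function names are chosen here for illustration only.

```python
# (mean, halfwidth) pairs copied from the CODA output above.
results = {
    "b[1]": (-39.800, 0.303994),
    "b[2]": (0.714, 0.003505),
    "b[3]": (1.297, 0.010317),
    "b[4]": (-0.153, 0.004025),
    "tau": (0.106, 0.000918),
    "sigma": (3.193, 0.014986),
}

EPS = 0.1  # default tolerance of coda's heidel.diag


def halfwidth_ratio(mean: float, halfwidth: float) -> float:
    """Relative halfwidth: size of the interval halfwidth against the mean."""
    return abs(halfwidth / mean)


def passes(mean: float, halfwidth: float, eps: float = EPS) -> bool:
    """The halfwidth test is passed when the ratio is below eps."""
    return halfwidth_ratio(mean, halfwidth) < eps


if __name__ == "__main__":
    for name, (mean, hw) in results.items():
        print(f"{name:6s} ratio={halfwidth_ratio(mean, hw):.4f} passed={passes(mean, hw)}")
```

Every ratio computed this way is well below 0.1, which agrees with the "passed" entries reported by CODA.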
BOA -- Heidelberger and Welch
Stationarity Test Keep Discard...
2024 Jan 07
1
Possible bug using FLAG_WORD_BREAKS with fullwidth Unicode codepoints
...ian-core/queryparser/word-breaker.cc
index 8108523ccd53..4fabc23f4b56 100644
--- a/xapian-core/queryparser/word-breaker.cc
+++ b/xapian-core/queryparser/word-breaker.cc
@@ -103,7 +103,7 @@ is_unbroken_script(unsigned p)
// FE30..FE4F; CJK Compatibility Forms
0xFE30 - 1, 0xFE4F,
// FF00..FFEF; Halfwidth and Fullwidth Forms
- 0xFF00 - 1, 0xFFEF,
+ //0xFF00 - 1, 0xFFEF,
// 1AFF0..1AFFF; Kana Extended-B
// 1B000..1B0FF; Kana Supplement
// 1B100..1B12F; Kana Extended-A
If we're fixing it this way we should check this list for other
instances of this (and doing this would probably reveal if...
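
The table touched by this diff lists each block as a pair of boundaries, start - 1 and end. One way such a table can be searched, shown here as a Python sketch and not as a copy of Xapian's implementation, is a binary search whose result index is odd exactly when the codepoint falls inside a listed range. The function name and the shortened table are assumptions for illustration.

```python
from bisect import bisect_left

# Shortened, illustrative boundary table: alternating (start - 1, end) values,
# sorted in ascending order.
BOUNDARIES = [
    0xFE30 - 1, 0xFE4F,  # FE30..FE4F; CJK Compatibility Forms
    0xFF00 - 1, 0xFFEF,  # FF00..FFEF; Halfwidth and Fullwidth Forms
]


def in_listed_range(cp: int) -> bool:
    """Return True if cp lies inside one of the ranges in BOUNDARIES.

    bisect_left gives the index of the first boundary >= cp. That index is
    odd when the boundary is a range end, meaning cp is past the matching
    start - 1 but not past the end.
    """
    return bisect_left(BOUNDARIES, cp) % 2 == 1


if __name__ == "__main__":
    for cp in (0xFE2F, 0xFE30, 0xFE50, 0xFF21, 0xFFF0):
        print(f"U+{cp:04X}", in_listed_range(cp))
```

With the FF00..FFEF pair commented out, as in the diff, a lookup of this kind would no longer report fullwidth Latin letters such as U+FF21 as unbroken script, but it would also stop reporting halfwidth Katakana.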
2011 Apr 04
2
gap.barplot doesn't support data arrays?
...th no y-tics and bars stretching downwards, as if
all the values were negative:
> twogrp2<-array(twogrp, dim=c(2,5))
> gap.barplot(twogrp2,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20),ylab="Group values",main="Barplot with gap")
Error in rect(xtics[bigones] - halfwidth, botgap, xtics[bigones] + halfwidth, :
  cannot mix zero-length and non-zero-length coordinates
However, the main title and axis labels do appear correctly.
Are data arrays unsupported for gap.barplot, or am I missing something?
Thanks,
Drew Steen
2005 Nov 15
2
y-axis in histograms
Dear R-list,
I have some data to present as histograms, so I used hist(...).
A few values account for almost 80% of the frequencies (800 in total),
and some other values have low frequencies (5-10 in total) that I want
to emphasize. I therefore want to "cut" the y-axis at 100, but I don't
know how to do this.
Thanks in advance,
Michael Graber
2024 Jan 09
1
Possible bug using FLAG_WORD_BREAKS with fullwidth Unicode codepoints
On Mon, Jan 08, 2024 at 02:01:46PM +0100, Robert Stepanek wrote:
> Removing the whole block will cause word-breaker to not correctly
> handle halfwidth Katakana, such as "??????????" which it would treat
> as a single term, whereas it should be two: ??????and ????).
>
> My pull request causes word-breaker to only handle halfwidth Katakana
> and Hangul codepoints as unbroken script and treats Latin characters,
> numbers,...
2024 Jan 04
1
Possible bug using FLAG_WORD_BREAKS with fullwidth Unicode codepoints
I think I found a bug in Xapian 1.5 when using FLAG_WORD_BREAKS for input that contains characters in Unicode Halfwidth and Fullwidth Forms (https://unicode.org/charts/PDF/UFF00.pdf).
Since I am undecided yet if and how to fix this in Xapian I haven't come up with a pull request. Because trac currently is offline, I could not file a bug. I hope it's OK to post my analysis here first, I'll be happy to fo...
2007 Feb 13
0
libswfdec/jpeg libswfdec/swfdec_image.c
...ee(tmp);
+  free(tmp_u);
+  free(tmp_v);
+  return (unsigned char *)argb_image;
+}
+
+unsigned char *
+get_argb_420 (JpegDecoder *dec)
+{
+  uint32_t *tmp;
+  uint8_t *tmp_u;
+  uint8_t *tmp_v;
+  uint8_t *tmp1;
+  uint32_t *argb_image;
+  uint8_t *yp, *up, *vp;
+  uint32_t *argbp;
+  int j;
+  int halfwidth;
+  int halfheight;
+
+  halfwidth = (dec->width + 1)>>1;
+  tmp = malloc (4 * dec->width * dec->height);
+  tmp_u = malloc (dec->width);
+  tmp_v = malloc (dec->width);
+  tmp1 = malloc (halfwidth);
+  argb_image = malloc (4 * dec->width * dec->height);
+
+  yp = dec->...
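
The expression (dec->width + 1)>>1 in the patch rounds half the width up, so an odd width still gets enough samples; in a 4:2:0 layout that is presumably the length of a subsampled chroma row. A small Python check of that arithmetic, with a helper name invented for illustration:

```python
def half_rounded_up(width: int) -> int:
    """Half of width, rounded up, using the same shift as the C code."""
    return (width + 1) >> 1


if __name__ == "__main__":
    for width in (1, 2, 5, 8):
        print(width, half_rounded_up(width))
```

A plain width >> 1 would give 0 for a width of 1 and 2 for a width of 5, one sample short in both cases.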
2008 Jun 17
2
[Bug 16395] New: glib abort for "double free or corruption" in jpeg code
...yp = (uint8_t *) 0xaae1fa0 'I' <repeats 96 times>, " "
up = (uint8_t *) 0x9c1b220 '\177' <repeats 200 times>...
vp = (uint8_t *) 0xa974e68 '\203' <repeats 200 times>...
argbp = (uint32_t *) 0x9132aa0
j = 250
halfwidth = 1
#7 0xb1aeb485 in jpeg_decoder_get_argb_image (dec=0x0) at
jpeg_rgb_decoder.c:89
No locals.
#8 0xb1aeb4fe in jpeg_decode_argb (data=0xafb9a57 "????", length=515,
image=0xbfd7edc8, width=0xa8c85a0, height=0xa8c85a4) at jpeg_rgb_decoder.c:63
dec = (JpegDecoder *) 0xabb8d58...
2015 Mar 20
0
Wine release 1.7.39
...rning incorrect HRESULT on unsupported interfaces.
d3drm/tests: Add tests for invalid interfaces in IDirect3DRM::QueryInterface.
d3drm/tests: Remove dynamic loading in d3drm.c.
d3drm/tests: Remove dynamic loading in vector.c.
Akihiro Sagawa (5):
msvcrt: Fix _ismbckata() for Halfwidth Katakana characters.
msvcrt: Add _mbctohira implementation.
msvcrt: Add _mbctokata implementation.
winmm/tests: Add notify flag tests for MPEGVideo driver.
mciqtz32: Fix notify flag behavior.
Alexandre Julliard (23):
server: Don't report completion at all in the M...
2007 Jun 05
7
Chinese, Japanese, Korean Tokenizer.
Hi,
I am looking for a Chinese, Japanese and Korean tokenizer that can
be used to tokenize terms for CJK languages. I am not very familiar
with these languages; however, I think that in these languages a single
symbol can contain one or more words, which makes it more difficult to
tokenize text into searchable terms.
Lucene has a CJK Tokenizer ... and I am looking around to see if there is some
open source that we