thr3ads.net - similar to: "Unicode in C++"

Displaying 20 results from an estimated 10000 matches similar to: "Unicode in C++"

Pull requests: CJK words and Snippet generator

2016 Sep 19

Olly, sorry for my delayed reply. On Mon, 12 Sep 2016, at 05:32, Olly Betts wrote: > On Wed, Sep 07, 2016 at 02:30:16PM +0200, rsto at paranoia.at wrote: > > On Tue, Sep 6, 2016, at 09:16, Olly Betts wrote: > > > I think my main concerns are about efficiency [...] > > For the proposed term coverage, the implementation looks up and inserts > > terms into a map. That

Pull requests: CJK words and Snippet generator

2016 Sep 07

On Tue, Sep 6, 2016, at 09:16, Olly Betts wrote: > I think my main concerns are about efficiency (since that's a major > motivation for the current implementation, so slowing it down would be > annoying), and whether we can just make this the standard behaviour > rather than adding an option. The current implementation is O(n) and I took care to keep it at that. For the proposed term

Possible bug using FLAG_WORD_BREAKS with fullwidth Unicode codepoints

2024 Jan 10

On Tue, Jan 9, 2024, at 3:28 AM, Olly Betts wrote: > Thanks, that looks good - now merged. Thanks! > Did you already check the other ranges for cased letters? I can but if > you have already there's not much point. I did not. If you find time, that'd be great. Otherwise I can make room for it in the next days. > > The fullwidth "????? ??????" tests suggests to

Possible bug using FLAG_WORD_BREAKS with fullwidth Unicode codepoints

2024 Jan 09

On Mon, Jan 08, 2024 at 02:01:46PM +0100, Robert Stepanek wrote: > Removing the whole block will cause word-breaker to not correctly > handle halfwidth Katakana, such as "??????????" which it would treat > as a single term, whereas it should be two: ??????and ????). > > My pull request causes word-breaker to only handle halfwidth Katakana > and Hangul codepoints as

Fwd: win32-clipboard and Unicode zero bytes

2006 Feb 02

Hi all, I'm forwarding this message from Brian Marick. If you run this test script and then paste the results into a Unicode-aware text editor, you'll notice that it only prints one character instead of three. I tried changing the strlen to _tcslen and strcpy to _tcscpy, but that didn't help. I mucked around a bit with the MultiByteToWideChar function, too, but
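One plausible reading of the symptom above, offered as an assumption because the test script itself is not shown here: UTF-16 text made of ASCII characters has a zero byte after every character, so a byte-oriented length function in the style of strlen stops after the first one. The Python sketch below only emulates that behaviour; `c_strlen` and `wide_strlen` are hypothetical helpers for illustration, not part of win32-clipboard or the Windows API.

```python
def c_strlen(buf: bytes) -> int:
    """Emulate C strlen: count bytes before the first zero byte."""
    end = buf.find(b"\x00")
    return len(buf) if end == -1 else end


def wide_strlen(buf: bytes) -> int:
    """Emulate wcslen on UTF-16LE data: count 16-bit units before a 0x0000 unit."""
    units = 0
    for i in range(0, len(buf) - 1, 2):
        if buf[i] == 0 and buf[i + 1] == 0:
            break
        units += 1
    return units


# Hypothetical sample: "abc" as UTF-16LE followed by a 16-bit terminator.
utf16_text = "abc".encode("utf-16-le") + b"\x00\x00"
narrow_len = c_strlen(utf16_text)   # stops at the zero byte right after "a"
wide_len = wide_strlen(utf16_text)  # walks all three 16-bit code units
```

If the length is measured in bytes up to the first zero, only the first character survives a copy, which would match "one character instead of three".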

Unicode HOWTO?

2006 Jun 25

I am disappointed about the (seeming) lack of Unicode support in Rails. Is there a howto about working around the most important limitations? For example, figuring out the length of an entered word: "???".length() will return 6, not 3. So if there is a maximum number of characters a user is allowed to enter, this won't work as it should -- it treats strings as byte arrays instead
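The 6-versus-3 discrepancy described above is what a byte count gives for three two-byte UTF-8 characters. As a minimal sketch, and assuming the input is valid UTF-8, code points can be counted by skipping continuation bytes (those of the form 10xxxxxx). The sample word and the helper name `utf8_char_count` are assumptions for illustration; the original characters were lost from the snippet.

```python
def utf8_char_count(data: bytes) -> int:
    """Count code points in valid UTF-8 by skipping continuation bytes (10xxxxxx)."""
    return sum(1 for b in data if (b & 0xC0) != 0x80)


sample = "äöü"                      # hypothetical stand-in for the lost word
raw = sample.encode("utf-8")        # each of these letters takes two bytes
byte_length = len(raw)              # what a byte-array view of the string reports
char_length = utf8_char_count(raw)  # what a user would call the length
```

Note that this counts code points, not user-perceived characters; combining sequences would still count as more than one.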

ICU

2006 Apr 10

I've just been looking at ICU with an eye to reworking the Unicode queryparser patch to use it. A few things have jumped out so far which make me wonder if it's the best option. I don't really know what the alternatives are though (currently QueryParser uses glib's Unicode routines). The first is that there seems to be bad version skew. Ubuntu breezy (the latest release) has

win32-dir, unicode

2006 May 26

Hi, I've got a preliminary version of the pure Ruby version of win32-dir in CVS. However, I was hoping to work out the Unicode issue. Run this:
    from = "C:\\test"
    to = "?????"
    Dir.mkdir(from) unless File.exists?(from)
    Dir.create_junction(to, from)
It works, but my explorer (and dos) window shows the name garbled. I don't think it's a font

Plotmath on Fedora 31 broken with with pango >= 1.44 - workarounds?

2020 Mar 30

On Monday 30 March 2020 at 15:24 +1300, Paul Murrell wrote: > Hi > > I have created an R branch that contains a potential fix ... > > https://svn.r-project.org/R/branches/R-symfam/ > > This allows, for example, ... > > cairo_pdf(symbolfamily="OpenSymbol") > > ... to specify that the OpenSymbol family should be used as the > "symbol" font

Possible bug using FLAG_WORD_BREAKS with fullwidth Unicode codepoints

2024 Jan 08

On Sun, Jan 7, 2024, at 7:45 PM, Olly Betts wrote: > I've restarted trac. I now created a pull request: https://github.com/xapian/xapian/pull/329 Should I create a trac issue, too? > Assuming the latter is valid, just removing this block (or removing the > parts of it which are Lu or Ll) should fix the problem as then > tokenisation will switch mode - I tried this and it fixes

Ask for advice on exact requirements to fix #699 mixed CJK numbers

2019 Mar 07

I am working on "#699 Better tokenisation of mixed CJK numbers", and have implemented a partial patch for Chinese for this ticket. The current code works well with the special test cases, and all tests in xapian-core still pass. But I'm unsure about the exact requirements: how much performance we can afford to pay for enabling more cases, and if there are better methods to

# chkconfig: kill at run level 3

2010 Dec 04

In the control script of my daemon in /etc/init.d, I have
    # chkconfig: 35 97 3
The result of this is that I have links:
    /etc/rc.d/rc1.d/K03...
    /etc/rc.d/rc3.d/S97...
    /etc/rc.d/rc5.d/S97...
As mentioned in a previous thread, my complex daemon throws an exception when I shut down. Perhaps things might be better if I had:
    /etc/rc.d/rc3.d/K03...
Might this be a good idea? If so,

PathFindExtension and wide strings

2008 Nov 10

Hi, What's happening here?
    require 'windows/path'
    require 'windows/unicode'
    include Windows::Path
    include Windows::Unicode
    file_a = 'bar.txt'
    file_w = multi_to_wide(file_a)
    p PathFindExtensionA(file_a) # '.txt' => OK
    p PathFindExtensionW(file_w) # '.' => WRONG
Is Ruby chopping the

[LLVMdev] LLVM supports Unicode?

2011 Aug 28

Hi, Jo! I'm trying to create a new programming language, and I want it to have Unicode support (to read and manipulate source code and string literals correctly). In addition, my programming language supports "string interpolation", and these interpolations contain tiny snippets of code, like expressions or variable names. So, I need to read each char, separating

[LLVMdev] LLVM supports Unicode?

2011 Aug 28

On 28.08.2011 at 20:02, geovanisouza92 at gmail.com wrote: > Hi, Jo! > > I'm trying to create a new programming language, and I want it to have > Unicode support (to read and manipulate source code and > string literals correctly). > > In addition, my programming language supports "string interpolation", > and these interpolations contain tiny

[LLVMdev] LLVM supports Unicode?

2011 Aug 28

On 28.08.2011 at 16:02, geovanisouza92 at gmail.com wrote: > Well, do you have any idea how I can correctly implement Unicode in C/C++? What do you mean by "implement in C/C++"? If you mean adding libraries to C/C++ that correctly deal with Unicode: that's nothing you do with a compiler infrastructure. And probably duplicate work, since Unicode libraries already exist. If

Increased memory requirements on 1.2

2006 Nov 08

I just recently upgraded a Rails app of mine to run on edge (and the 1-2-pre-release branch) and I noticed my fcgis required roughly 6-8MB more memory after just a couple of requests. For example, each fcgi on edge would start around 40MB and rise to ~46MB after a couple of requests. I downgraded my app back to 1.1.6 and each fcgi would start at around 33MB and rise to ~38MB. As a result of the

Can't read table encoded in Unicode (R-2.8.1)

2009 Apr 18

Hi all, I have problems reading Unicode (UTF-16) coded tables in R 2.8.1 under Windows Vista. Imagine the following table:
    a    b    c    d
    X    1,2  1,3  1,4
    Y    2,2  2,3  2,4
    Z    3,2  3,3  3,4
Usually I would use the following code to read the table:
    t = read.table("test.txt", header=T, sep="\t", dec=",")
This works well if I create the table

Plotmath on Fedora 31 broken with with pango >= 1.44 - workarounds?

2020 Mar 30

On Tuesday 31 March 2020 at 10:14 +1300, Paul Murrell wrote: > Hi > > On 30/03/20 11:12 pm, Nicolas Mailhot wrote: > > On Monday 30 March 2020 at 15:24 +1300, Paul Murrell wrote: > > > Hi > > > > > > I have created an R branch that contains a potential fix ... > > > > > > https://svn.r-project.org/R/branches/R-symfam/ > > >

How do I get substring of utf-8 string?

2006 Mar 21

I'm trying to get a substring from a UTF-8 encoded string (say, the first 50 characters of the string). String#[0..49] would give me the first 50 bytes, not 50 characters. I know there is the jcode library, but it only lets you count the number of characters in a UTF-8 string. The unicode gem doesn't seem to help much. The unicode_hacks gem seems to solve the problem, but it also seems to
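The bytes-versus-characters gap described above can be closed by cutting only at code point boundaries. The Python sketch below is one way to do that on raw UTF-8 bytes, assuming the input is valid UTF-8; `utf8_prefix` is a hypothetical helper name and is not the API of jcode, the unicode gem, or unicode_hacks.

```python
def utf8_prefix(data: bytes, max_chars: int) -> bytes:
    """Return the longest prefix of valid UTF-8 `data` holding at most `max_chars` code points."""
    seen = 0
    for i, b in enumerate(data):
        if (b & 0xC0) != 0x80:       # this byte starts a new code point
            if seen == max_chars:
                return data[:i]      # cut just before the extra code point
            seen += 1
    return data                      # fewer than max_chars code points in total


# Hypothetical sample text containing two-byte characters.
text = "héllo wörld".encode("utf-8")
first_five = utf8_prefix(text, 5).decode("utf-8")
```

Slicing the byte string at a fixed offset instead could split a multi-byte character in half, which is the failure the post describes. As with any code point count, combining sequences may still be separated from their base character.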
