similar to: Unicode in C++

Displaying 20 results from an estimated 10000 matches similar to: "Unicode in C++"

2016 Sep 19
2
Pull requests: CJK words and Snippet generator
Olly, sorry for my delayed reply. On Mon, 12 Sep 2016, at 05:32, Olly Betts wrote: > On Wed, Sep 07, 2016 at 02:30:16PM +0200, rsto at paranoia.at wrote: > > On Tue, Sep 6, 2016, at 09:16, Olly Betts wrote: > > > I think my main concerns are about efficiency [...] > > For the proposed term coverage, the implementation looks up and inserts > > terms into a map. That
2016 Sep 07
2
Pull requests: CJK words and Snippet generator
On Tue, Sep 6, 2016, at 09:16, Olly Betts wrote: > I think my main concerns are about efficiency (since that's a major > motivation for the current implementation, so slowing it down would be > annoying), and whether we can just make this the standard behaviour > rather than adding an option. The current implementation is O(n) and I took care to keep it at that. For the proposed term
2024 Jan 10
2
Possible bug using FLAG_WORD_BREAKS with fullwidth Unicode codepoints
On Tue, Jan 9, 2024, at 3:28 AM, Olly Betts wrote: > Thanks, that looks good - now merged. Thanks! > Did you already check the other ranges for cased letters? I can but if > you have already there's not much point. I did not. If you find time, that'd be great. Otherwise I can make room for it in the next days. > > The fullwidth "????? ??????" tests suggests to
2024 Jan 09
1
Possible bug using FLAG_WORD_BREAKS with fullwidth Unicode codepoints
On Mon, Jan 08, 2024 at 02:01:46PM +0100, Robert Stepanek wrote: > Removing the whole block will cause word-breaker to not correctly > handle halfwidth Katakana, such as "??????????" which it would treat > as a single term, whereas it should be two: ??????and ????). > > My pull request causes word-breaker to only handle halfwidth Katakana > and Hangul codepoints as
2006 Feb 02
5
Fwd: win32-clipboard and Unicode zero bytes
Hi all, I'm forwarding this message from Brian Marick. If you run this test script and then paste the results into a Unicode-aware text editor, you'll notice that it only prints one character instead of three. I tried changing the strlen to _tcslen and strcpy to _tcscpy, but that didn't help. I mucked around a bit with the MultiByteToWideChar function, too, but
2006 Jun 25
7
Unicode HOWTO?
I am disappointed about the (seeming) lack of Unicode support in Rails. Is there a howto about working around the most important limitations? For example, figuring out the length of an entered word: "???".length() will return 6, not 3. So if there is a maximum number of characters a user is allowed to enter, this won't work as it should -- it treats strings as byte arrays instead
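The byte-versus-character mismatch described in this snippet can be reproduced outside Rails. Below is a minimal Python sketch, not the poster's code. The original three-character word was lost to encoding damage in the archive, so the Cyrillic sample is a stand-in, chosen on the assumption that each character takes two bytes in UTF-8. The helper name `within_limit` is hypothetical.

```python
# Stand-in sample: the original word did not survive in the archive.
# Each of these three Cyrillic letters occupies 2 bytes in UTF-8.
word = "\u043c\u0438\u0440"

utf8_bytes = word.encode("utf-8")

byte_length = len(utf8_bytes)  # what a byte-oriented length() reports
char_length = len(word)        # number of Unicode code points


def within_limit(text: str, max_chars: int) -> bool:
    """Check user input against a limit counted in code points, not bytes."""
    return len(text) <= max_chars
```

Counting code points is still only an approximation of what a user perceives as a character, because a combining sequence spans several code points. For a simple input-length check it is usually closer to the intent than a byte count.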
2006 Apr 10
1
ICU
I've just been looking at ICU with an eye to reworking the unicode queryparser patch to use it. A few things have jumped out so far which make me wonder if it's the best option. I don't really know what the alternatives are though (currently QueryParser uses glib's unicode routines). The first is that there seems to be bad version skew. Ubuntu breezy (the latest release) has
2006 May 26
13
win32-dir, unicode
Hi, I've got a preliminary version of the pure Ruby version of win32-dir in CVS. However, I was hoping to work out the Unicode issue. Run this: from = "C:\\test" to = "?????" Dir.mkdir(from) unless File.exists?(from) Dir.create_junction(to, from) It works, but my explorer (and dos) window shows the name garbled. I don't think it's a font
2020 Mar 30
2
Plotmath on Fedora 31 broken with with pango >= 1.44 - workarounds?
On Monday, 30 March 2020 at 15:24 +1300, Paul Murrell wrote: > Hi > > I have created an R branch that contains a potential fix ... > > https://svn.r-project.org/R/branches/R-symfam/ > > This allows, for example, ... > > cairo_pdf(symbolfamily="OpenSymbol") > > ... to specify that the OpenSymbol family should be used as the > "symbol" font
2024 Jan 08
1
Possible bug using FLAG_WORD_BREAKS with fullwidth Unicode codepoints
On Sun, Jan 7, 2024, at 7:45 PM, Olly Betts wrote: > I've restarted trac. I now created a pull request: https://github.com/xapian/xapian/pull/329 Should I create a trac issue, too? > Assuming the latter is valid, just removing this block (or removing the > parts of it which are Lu or Ll) should fix the problem as then > tokenisation will switch mode - I tried this and it fixes
2019 Mar 07
3
Ask for advice on exact requirements to fix #699 mixed CJK numbers
I am working on "#699 Better tokenisation of mixed CJK numbers", and have implemented a partial patch covering Chinese for this ticket. The current code works well with the special test cases, and all tests in xapian-core still pass. But I'm unsure about the exact requirements: how much performance we can afford to pay for enabling more cases, and whether there are better methods to
2010 Dec 04
3
# chkconfig: kill at run level 3
In the control script of my daemon in /etc/init.d, I have # chkconfig: 35 97 3 The result of this is that I have links: /etc/rc.d/rc1.d/K03... /etc/rc.d/rc3.d/S97... /etc/rc.d/rc5.d/S97... As mentioned in a previous thread, my complex daemon throws an exception when I shut down. Perhaps things might be better if I had: /etc/rc.d/rc3.d/K03... Might this be a good idea? If so,
2008 Nov 10
4
PathFindExtension and wide strings
Hi, What's happening here? require 'windows/path' require 'windows/unicode' include Windows::Path include Windows::Unicode file_a = 'bar.txt' file_w = multi_to_wide(file_a) p PathFindExtensionA(file_a) # '.txt' => OK p PathFindExtensionW(file_w) # '.' => WRONG Is Ruby chopping the
2011 Aug 28
0
[LLVMdev] LLVM supports Unicode?
Hi, Jo! I'm trying create a new programming language, and I want that it have Unicode support (support for read and manipulate rightly the source-code and string literals). But, in addition, my programming language supports "string interpolation" string, and in these interpolations, tiny snippets of code, like expressions, or variable names. So, I need read each char, separating
2011 Aug 28
2
[LLVMdev] LLVM supports Unicode?
On 28.08.2011 at 20:02, geovanisouza92 at gmail.com wrote: > Hi, Jo! > > I'm trying create a new programming language, and I want that it have > Unicode support (support for read and manipulate rightly the source-code and > string literals). > > But, in addition, my programming language supports "string interpolation" > string, and in these interpolations, tiny
2011 Aug 28
4
[LLVMdev] LLVM supports Unicode?
On 28.08.2011 at 16:02, geovanisouza92 at gmail.com wrote: > Well, have you any idea about how I can implement rightly Unicode in C/C++? What do you mean by "implement in C/C++"? If you mean adding libraries to C/C++ that correctly deal with Unicode: that's nothing you do with a compiler infrastructure. And probably duplicate work, since Unicode libraries already exist. If
2006 Nov 08
14
Increased memory requirements on 1.2
I just recently upgraded a rails app of mine to run on edge (and the 1-2-pre-release branch) and I noticed my fcgis required roughly 6-8MBs more memory after just a couple requests. For example, each fcgi on edge would start around 40MB and rise to ~46MBs after a couple requests. I downgraded my app back to 1.1.6 and each fcgi would start at around 33MBs and rise to ~38MB. As a result of the
2009 Apr 18
1
Can't read table encoded in Unicode (R-2.8.1)
Hi all, I have problems reading Unicode (UTF-16) coded tables in R 2.8.1 under Windows Vista. Imagine the following table: a b c d X 1,2 1,3 1,4 Y 2,2 2,3 2,4 Z 3,2 3,3 3,4 Usually I would use the following code to read the table: t = read.table("test.txt", header=T, sep="\t",dec=",") This works well if I create the table
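For comparison, here is a minimal Python sketch of reading the same kind of file. It is not the poster's R code and does not diagnose the R problem. It assumes the file is tab-separated, as in the `read.table` call above, that numbers use a decimal comma, and that the UTF-16 file starts with a byte order mark. The function name `read_utf16_table` is hypothetical.

```python
import csv


def read_utf16_table(path):
    """Read a tab-separated UTF-16 file whose numbers use decimal commas.

    Returns the header row and a dict mapping each row label (first column)
    to its numeric values.
    """
    # The "utf-16" codec consumes the byte order mark and uses it to pick
    # the byte order, so the BOM never leaks into the first header cell.
    with open(path, encoding="utf-16", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        rows = {}
        for fields in reader:
            label, values = fields[0], fields[1:]
            # Convert "1,2" to 1.2 by swapping the decimal comma for a point.
            rows[label] = [float(value.replace(",", ".")) for value in values]
    return header, rows
```

The point of the sketch is that the encoding is declared when the file is opened, so the parser only ever sees decoded text.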
2020 Mar 30
2
Plotmath on Fedora 31 broken with with pango >= 1.44 - workarounds?
On Tuesday, 31 March 2020 at 10:14 +1300, Paul Murrell wrote: > Hi > > On 30/03/20 11:12 pm, Nicolas Mailhot wrote: > > On Monday, 30 March 2020 at 15:24 +1300, Paul Murrell wrote: > > > Hi > > > > > > I have created an R branch that contains a potential fix ... > > > > > > https://svn.r-project.org/R/branches/R-symfam/ > > >
2006 Mar 21
2
How do I get substring of utf-8 string?
I'm trying to get a substring from a UTF-8 encoded string (say, the first 50 characters of the string). String#[0..49] would give me the first 50 bytes, not 50 characters. I know there is the jcode library, but it only lets you count the number of characters in a UTF-8 string. The unicode gem doesn't seem to help much. The unicode_hacks gem seems to solve the problem, but it also seems to
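The difference between slicing bytes and slicing characters can be sketched in Python. This is an illustration of the general technique (decode, slice by code point, re-encode), not the approach taken by any of the Ruby gems mentioned. The function names `first_chars` and `naive_byte_slice_is_valid` are hypothetical.

```python
def first_chars(utf8_bytes: bytes, count: int) -> bytes:
    """Return the UTF-8 encoding of the first `count` code points."""
    text = utf8_bytes.decode("utf-8")      # bytes -> sequence of code points
    return text[:count].encode("utf-8")    # slice by code point, re-encode


def naive_byte_slice_is_valid(utf8_bytes: bytes, count: int) -> bool:
    """Report whether cutting at `count` bytes leaves well-formed UTF-8."""
    try:
        utf8_bytes[:count].decode("utf-8")
        return True
    except UnicodeDecodeError:
        # The cut landed in the middle of a multi-byte sequence.
        return False
```

A byte slice is only safe when the cut happens to fall on a character boundary. Slicing the decoded text avoids the problem, although it still counts code points rather than user-perceived characters.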