thr3ads.net - similar to: "Pull requests: CJK words and Snippet generator"

Displaying 20 results from an estimated 1100 matches similar to: "Pull requests: CJK words and Snippet generator"

Pull requests: CJK words and Snippet generator

2016 Jul 29

Hi James, thanks for the feedback. On Thu, Jul 28, 2016, at 00:22, James Aylett wrote: > This sounds great! I know sufficiently little about CJK that I won't > try to comment on that at all :) I've just opened a pull request for the CJK tokenizer: https://github.com/xapian/xapian/pull/114 > I wonder if we can arrange suitable defaults to use your > implementation with the

Pull requests: CJK words and Snippet generator

2016 Aug 03

Hi, On Fri, Jul 29, 2016, at 13:45, James Aylett wrote: > On Fri, Jul 29, 2016 at 12:12:25PM +0200, rsto at paranoia.at wrote: > > The FastMail snippet generator was written when MSet didn't create > > snippets. I'll first compare both implementations to see if there is a > > good reason for them to coexist, or might just as well merge any > > additional

Pull requests: CJK words and Snippet generator

2016 Sep 19

Olly, sorry for my delayed reply. On Mon, 12 Sep 2016, at 05:32, Olly Betts wrote: > On Wed, Sep 07, 2016 at 02:30:16PM +0200, rsto at paranoia.at wrote: > > On Tue, Sep 6, 2016, at 09:16, Olly Betts wrote: > > > I think my main concerns are about efficiency [...] > > For the proposed term coverage, the implementation looks up and inserts > > terms into a map. That

Pull requests: CJK words and Snippet generator

2016 Aug 05

On Thu, Aug 4, 2016, at 15:08, James Aylett wrote: > On Wed, Aug 03, 2016 at 08:17:05PM +0200, rsto at paranoia.at wrote: > > I'll notify you when the CJK pull request passes Travis. > > That's great, thanks! Alright, after lots of fiddling with .travis.yml I finally made the pull request build on Travis' trusty image: https://github.com/xapian/xapian/pull/114 I have

Pull requests: CJK words and Snippet generator

2016 Aug 18

Hi, On Thu, Aug 11, 2016, at 13:19, rsto at paranoia.at wrote: > The CJK word segmentation and snippet pull requests both pass Travis > since middle/end of last week. Did you find time to look at them? Just checking in: have you found time to look at the PRs? It'd be nice to know a tentative timeline, so I can plan whether to build the next features on top of our local fork or the upstream PRs.

Pull requests: CJK words and Snippet generator

2016 Aug 03

On Wed, Aug 3, 2016, at 19:26, James Aylett wrote: > On Wed, Aug 03, 2016 at 06:54:32PM +0200, rsto at paranoia.at wrote: > > Oddly enough, the pull request causes Travis to break for clang but not > > for gcc [1]. That's because the clang build process fails for the test > > 'querypairwise1' [2], which AFAIK I didn't touch at all. Is that a > > known

Pull requests: CJK words and Snippet generator

2016 Sep 07

On Tue, Sep 6, 2016, at 09:16, Olly Betts wrote: > I think my main concerns are about efficiency (since that's a major > motivation for the current implementation, so slowing it down would be > annoying), and whether we can just make this the standard behaviour > rather than adding an option. The current implementation is O(n) and I took care to keep it at that. For the proposed term

Pull requests: CJK words and Snippet generator

2016 Dec 14

I haven't had a chance to look at the patch and won't be able to do so before January. Its design description sounds promising, though. The snippet generator code linked to by Bron contains mostly the same code as in my pull request, with two exceptions: it adds a flag to make the generator return the empty string for snippets without any matching terms, and it includes a fix to a possible

GSOC 2011- CJK Support

2011 Apr 07

Hello everyone, I am Yongzhi Zhang, a Chinese student. I'm interested in CJK Support (also known as Chinese, Japanese, and Korean Support). I have 6 years of experience in software development (C/C++ and Java). I want to work on the project "CJK Support". I come from Beijing, China, and Chinese is my native language, which is an advantage for "CJK Support". I have fixed a bug for

Ask for advice on exact requirements to fix #699 mixed CJK numbers

2019 Mar 07

I am working on "#699 Better tokenisation of mixed CJK numbers", and have implemented a partial patch for Chinese for this ticket. The current code works well with special test cases, and all tests in xapian-core still pass. But I'm unsure about the exact requirements of the ticket, about how much performance we can afford to give up to cover more cases, and whether there are better methods to

Icon or CJK fonts in MENU TITLE, is that possible in the future ?

2006 Sep 27

First I would like to say thank you to HPA for providing some really nice features in recent syslinux versions. As for new functions, I actually have another radical idea: since we are in Asia, most of the users here would like to see some local fonts in the syslinux/pxelinux menu. I am wondering whether, in the future, the syslinux/pxelinux menu could support CJK fonts or icons?

patch - Some CJK codepoints are also punctuation

2013 Mar 13

-- Greg. -------------- next part -------------- A non-text attachment was scrubbed... Name: xapian-some-cjk-codepoints-are-also-punctuation.patch Type: text/x-patch Size: 1499 bytes Desc: not available URL: <http://lists.xapian.org/pipermail/xapian-devel/attachments/20130313/4da8b0f9/attachment.bin>

Ask for advice on exact requirements to fix #699 mixed CJK numbers

2019 Mar 09

Thanks for your patience. I'm still confused about what I should do next. If it's not worth changing anything here because it's a rare case, I'm sorry for opening my PR on GitHub before getting a reply; maybe you need to close it on GitHub. Alternatively, should I optimize the current code by replacing the set with a static array? Or roll back the current modification to cjk-tokenizer and try to do some work with the

Sweave, cairo_pdf, CJK, ghostscript

2011 Oct 22

I have had some fun in the last few days trying to put together an annotated map of China with R and some public GIS data: http://sourceforge.net/projects/outmodedbonsai/files/snpMatrix%20next/1.17.7.11/China_Choropleth_Maps.pdf/download It is done, and rather nice... there are a few issues: - the default pdf() device cannot do CJK with embedded fonts - and cairo_pdf() is not hooked up to

Pull requests: CJK words and Snippet generator

2016 Dec 13

On Tue, Oct 04, 2016 at 10:37:49AM +1100, Bron Gondwana wrote: > Robert is in Australia visiting the FastMail office to co-work with us for a > couple of months, and I'd love to get this Xapian integration work done > during this time. We're also looking to release Cyrus IMAPd version 3.0 some > time in the next few months, and it would be great to not depend on too many >

Chinese, Japanese, Korean Tokenizer.

2007 Jun 05

Hi, I am looking for a Chinese, Japanese and Korean tokenizer that can be used to tokenize terms for CJK languages. I am not very familiar with these languages; however, I think that they can contain one or more words in one symbol, which makes it more difficult to tokenize text into searchable terms. Lucene has CJK Tokenizer ... and I am looking around to see if there is some open source that we
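
For readers unfamiliar with the problem raised in this message, the following is a minimal Python sketch of overlapping-bigram tokenization, a common dictionary-free way to turn unsegmented CJK text into searchable terms. The function names `is_han` and `cjk_bigrams` are hypothetical, and checking only the basic CJK Unified Ideographs block is a simplifying assumption; this is not the code of Lucene's or Xapian's tokenizer.

```python
def is_han(ch: str) -> bool:
    """Return True for code points in the CJK Unified Ideographs block (U+4E00..U+9FFF)."""
    return 0x4E00 <= ord(ch) <= 0x9FFF


def cjk_bigrams(text: str) -> list[str]:
    """Emit overlapping two-character terms for each run of Han characters.

    A run of length one is emitted as a single-character term.
    Non-Han characters only act as separators in this sketch.
    """
    terms = []
    run = []
    for ch in text + "\0":  # the sentinel is not Han, so it flushes the final run
        if is_han(ch):
            run.append(ch)
            continue
        if len(run) == 1:
            terms.append(run[0])
        else:
            # zip pairs each character with its successor; empty runs yield nothing
            terms.extend(a + b for a, b in zip(run, run[1:]))
        run = []
    return terms
```

Calling `cjk_bigrams("中文分词")` yields one term per adjacent character pair, so a query for any two-character word inside the run can match without a word list, at the cost of a larger index.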

patch to add cairo support to Sweave (Re: Sweave, cairo_pdf, CJK, ghostscript)

2011 Oct 22

It was as easy as I thought it was half a day ago - here is a patch against R trunk to add cairo support to the Sweave driver, an example Sweave input, and the resulting output. A few more notes: - obviously the documentation needs to be updated... a bit more work to do. - some check to make sure "cairo" and "pdf" are not both set would be nice, as well as checking

More than two font in a plot

2010 Jun 29

Hi there, I am a Chinese R user. I hope to display Chinese characters in a plot, and then save it in PostScript format. I have read the article titled "Non-Standard Fonts in PostScript and PDF Graphics", especially the section about CJK fonts. I also tried the code: > pdf("chinese.pdf", width=3, height=1) > grid.text("\u4F60\u597D", y=2/3,

Possible bug using FLAG_WORD_BREAKS with fullwidth Unicode codepoints

2024 Jan 04

I think I found a bug in Xapian 1.5 when using FLAG_WORD_BREAKS for input that contains characters in Unicode Halfwidth and Fullwidth Forms (https://unicode.org/charts/PDF/UFF00.pdf). Since I am undecided yet if and how to fix this in Xapian I haven't come up with a pull request. Because trac currently is offline, I could not file a bug. I hope it's OK to post my analysis here first,

Chinese segmentation

2011 Apr 21

Hello, I have finished reading the papers, and I think it is time to design my project. The first step will be to determine whether the input characters are Chinese. I saw in a past post that cjk-tokenizer deals only with UTF-8 and Unicode, but I see there are other encodings such as GBK and Big5. I am wondering: should I just deal with UTF-8 and Unicode?
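
One possible way to approach the question in this message, sketched in Python under stated assumptions: decode legacy encodings such as GBK or Big5 to Unicode at the input boundary, then test code points. The helper names `to_unicode` and `contains_han` are hypothetical, the caller is assumed to know the source encoding, and only two Unicode blocks are checked. Note that Han characters are shared by Chinese, Japanese and Korean text, so this detects the script, not the language.

```python
def to_unicode(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes in a caller-supplied encoding (e.g. "gbk" or "big5") to a Unicode string."""
    return raw.decode(encoding)


def contains_han(text: str) -> bool:
    """Return True if any character lies in CJK Unified Ideographs or its Extension A block."""
    return any(
        0x4E00 <= ord(ch) <= 0x9FFF      # CJK Unified Ideographs
        or 0x3400 <= ord(ch) <= 0x4DBF   # CJK Unified Ideographs Extension A
        for ch in text
    )
```

With this split, everything after `to_unicode` only ever sees Unicode strings, which is one reason a tokenizer might choose to handle UTF-8 and Unicode alone.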
