similar to: Apply the google summer code (additional idea)

Displaying 20 results from an estimated 2000 matches similar to: "Apply the google summer code (additional idea)"

2011 Apr 01
0
Applying the Google Summer code
Hi all: Glad to meet you!! My name's Zhang Fan, a PhD student from Nankai University, China. I have been doing information retrieval research for 3 years. I have several papers published in top-tier computer science conferences such as WSDM, VLDB, CIKM and ACL. I have many years of coding experience and have participated in several projects about search engines. I want to take part in
2011 Apr 07
1
About the Summer code
Hi Olly: I have updated my proposal, could you give me more advice? http://www.google-melange.com/gsoc/proposal/review/google/gsoc2011/fan/1 Thank you! -- My Homepage: http://sites.google.com/site/zhfan555/ PhD Student at Nankai U and Intern at MSRA
2007 Jun 05
7
Chinese, Japanese, Korean Tokenizer.
Hi, I am looking for a Chinese, Japanese and Korean tokenizer that can be used to tokenize terms for CJK languages. I am not very familiar with these languages; however, I think a single symbol in these languages can contain one or more words, which makes it more difficult to tokenize into searchable terms. Lucene has CJK Tokenizer ... and I am looking around to see if there is some open source that we
2011 Apr 07
1
GSOC 2011- CJK Support
Hello, everyone, I am Yongzhi Zhang, a Chinese student. I'm interested in CJK Support (also known as Chinese, Japanese, and Korean Support). I have 6 years of experience in software development (C/C++ and Java). I want to work on this project, "CJK Support". I come from Beijing, China. Chinese is my native language. This is my advantage for "CJK Support". I have fixed a bug for
2016 Jul 26
2
Pull requests: CJK words and Snippet generator
Hi, The Cyrus IMAP mail server uses Xapian as search engine. Recently, FastMail has sponsored implementation of two Xapian features: CJK word splitting and a generator for search snippets. I've been working on both features and we would be happy to get them merged into Xapian master. The CJK word tokenizer uses the word segmentation algorithms of the International Components for Unicode
2016 Sep 19
2
Pull requests: CJK words and Snippet generator
Olly, sorry for my delayed reply. On Mon, 12 Sep 2016, at 05:32, Olly Betts wrote: > On Wed, Sep 07, 2016 at 02:30:16PM +0200, rsto at paranoia.at wrote: > > On Tue, Sep 6, 2016, at 09:16, Olly Betts wrote: > > > I think my main concerns are about efficiency [...] > > For the proposed term coverage, the implementation looks up and inserts > > terms into a map. That
2019 Mar 07
3
Ask for advice on exact requirements to fix #699 mixed CJK numbers
I am working on "#699 Better tokenisation of mixed CJK numbers", and have implemented a partial patch for Chinese for this ticket. The current code works well with special test cases, and all tests in xapian-core still pass. But I'm unsure about the exact requirements: how much performance we can afford to pay to handle more cases, and whether there are better methods to
2016 Sep 07
2
Pull requests: CJK words and Snippet generator
On Tue, Sep 6, 2016, at 09:16, Olly Betts wrote: > I think my main concerns are about efficiency (since that's a major > motivation for the current implementation, so slowing it down would be > annoying), and whether we can just make this the standard behaviour > rather than adding an option. The current implementation is O(n) and I took care to keep it at that. For the proposed term
2016 Jul 29
3
Pull requests: CJK words and Snippet generator
Hi James, thanks for the feedback. On Thu, Jul 28, 2016, at 00:22, James Aylett wrote: > This sounds great! I know sufficiently little about CJK that I won't > try to comment on that at all :) I've just opened a pull request for the CJK tokenizer: https://github.com/xapian/xapian/pull/114 > I wonder if we can arrange suitable defaults to use your > implementation with the
2010 Jun 29
5
More than two font in a plot
Hi there, I am a Chinese R user. I hope to display Chinese characters in a plot, and then save it in PostScript format. I have read the article titled "Non-Standard Fonts in PostScript and PDF Graphics", especially the section about CJK fonts. I also tried the code: > pdf("chinese.pdf", width=3, height=1) > grid.text("\u4F60\u597D", y=2/3,
2011 Apr 21
2
Chinese segmentation
Hello, I have finished reading the papers, and I think it is time to design my project. The first step will be to determine whether the input characters are Chinese. I saw in a past post that cjk-tokenizer only deals with UTF-8 and Unicode, but I see there are other encodings such as GBK and Big5. I am wondering whether I should just deal with UTF-8 and Unicode?
2006 Sep 27
3
Icon or CJK fonts in MENU TITLE, is that possible in the future ?
First I would like to say thank you to HPA for providing some really nice features in recent syslinux versions. As for new functions, I actually have another radical idea: since we are in Asia, most users here would like to see local fonts in the syslinux/pxelinux menu. I am wondering whether it would be possible, in the future, for the syslinux/pxelinux menu to support CJK fonts or icons?
2016 Aug 18
2
Pull requests: CJK words and Snippet generator
Hi, On Thu, Aug 11, 2016, at 13:19, rsto at paranoia.at wrote: > The CJK word segmentation and snippet pull requests both pass Travis > since middle/end of last week. Did you find time to look at them? Just checking in: have you found time to look at the PRs? It'd be nice to know a tentative timeline, so I can plan whether to build the next features on top of our local fork or the upstream PRs.
2011 Oct 26
1
set different font family for strings in mtext or text?
Hi there, Is it possible to set different font families for strings in mtext or text? For example, on the Windows platform with the windows() device: plot(1:10, type = "n") text(5,5, "Chinese (English)") #Chinese for Chinese characters it will give the correct Chinese and English characters with two different font families, i.e., English characters in the default sans family, and Chinese
2013 Mar 13
2
patch - Some CJK codepoints are also punctuation
-- Greg. -------------- next part -------------- A non-text attachment was scrubbed... Name: xapian-some-cjk-codepoints-are-also-punctuation.patch Type: text/x-patch Size: 1499 bytes Desc: not available URL: <http://lists.xapian.org/pipermail/xapian-devel/attachments/20130313/4da8b0f9/attachment.bin>
2016 Aug 05
2
Pull requests: CJK words and Snippet generator
On Thu, Aug 4, 2016, at 15:08, James Aylett wrote: > On Wed, Aug 03, 2016 at 08:17:05PM +0200, rsto at paranoia.at wrote: > > I'll notify you when the CJK pull request passes Travis. > > That's great, thanks! Alright, after lots of fiddling with .travis.yml I finally made the pull request build on Travis' trusty image: https://github.com/xapian/xapian/pull/114 I have
2019 Mar 09
2
Ask for advice on exact requirements to fix #699 mixed CJK numbers
Thanks for your patience. I'm still unsure what I should do next. If it's not worth changing anything here because it's a rare case, I apologize for opening the PR on GitHub before the reply; you may need to close it there. Otherwise, should I optimize the current code by replacing the set with a static array? Or roll back the current modification to cjk-tokenizer and try to do some work with the
2019 Jan 02
2
Broken wiki
Hi all, I've made a minor change (changed one CJK character) on https://wiki.centos.org/zh-tw/SpecialInterestGroup and https://wiki.centos.org/zh/SpecialInterestGroup , and now the whole CentOS wiki is broken. Can someone undo the changes to fix the wiki? I've maintained the wiki's Chinese translation for 7 years, and this is not the first time the same thing has happened to
2012 May 18
1
UTF-16 input and read.delim/scan
Hi all, I am running 64-bit R 2.15.0 on windows 7. I am trying to use read.delim to read from a file that has 2-byte unicode (CJK) characters. Here is an example of the data (it is tab-delimited if that gets messed up): HITId HITTypeId Title 2Q69Z6KW4ZMAGKKFRT6Q4ONO6MJF68 2LVJ1LY58B72OP36GNBHH16YF7RS7Z 看看句子,写写想法 请看以下的句子,再回答问 So read.delim (code below) doesn't read in correctly. It reads
2011 Oct 22
3
Sweave, cairo_pdf, CJK, ghostscript
I have had some fun in the last few days trying to put together an annotated map of China with R and some public GIS data: http://sourceforge.net/projects/outmodedbonsai/files/snpMatrix%20next/1.17.7.11/China_Choropleth_Maps.pdf/download It is done, and rather nice... there are a few issues: - the default pdf() device cannot do CJK with embedded fonts - and cairo_pdf() is not hooked up to