Displaying 20 results from an estimated 4000 matches similar to: "How to set environment variable XAPIAN_CJK_NGRAM?"
2018 Feb 13
2
How to set environment variable XAPIAN_CJK_NGRAM?
Olly, Thanks a lot!
I installed Xapian 1.2.25 on Ubuntu 14.04. How do I set the environment variable XAPIAN_CJK_NGRAM? I'm a newbie to Xapian.
Best wishes,
Peter
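An environment variable only needs to be present in the environment of the process that runs the indexer. The following is a minimal sketch in Python of launching a command with XAPIAN_CJK_NGRAM set. The value "1" and the helper name run_with_cjk_ngram are assumptions made for illustration, and the child process here is a stand-in that only echoes the variable back, not a real Xapian indexer.

```python
import os
import subprocess
import sys


def run_with_cjk_ngram(command):
    """Run `command` with XAPIAN_CJK_NGRAM=1 added to a copy of the environment.

    The parent's own environment is left untouched; only the child sees
    the extra variable. The value "1" is an assumption for this sketch.
    """
    env = dict(os.environ)
    env["XAPIAN_CJK_NGRAM"] = "1"
    return subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )


if __name__ == "__main__":
    # Stand-in for the real indexer command: a child Python process that
    # prints the variable it received.
    child = [
        sys.executable,
        "-c",
        "import os; print(os.environ.get('XAPIAN_CJK_NGRAM'))",
    ]
    print(run_with_cjk_ngram(child).stdout.strip())
```

In practice the stand-in child command would be replaced by whatever command line starts the indexer on the system in question.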
At 2018-02-12 20:00:02, xapian-discuss-request at lists.xapian.org wrote:
>Send Xapian-discuss mailing list submissions to
> xapian-discuss at lists.xapian.org
>
>To subscribe or unsubscribe via the World Wide Web,
2016 Jul 26
2
Pull requests: CJK words and Snippet generator
Hi,
The Cyrus IMAP mail server uses Xapian as search engine. Recently,
FastMail has sponsored implementation of two Xapian features: CJK word
splitting and a generator for search snippets. I've been working on both
features and we would be happy to get them merged into Xapian master.
The CJK word tokenizer uses the word segmentation algorithms of the
International Components for Unicode
2016 Jul 29
3
Pull requests: CJK words and Snippet generator
Hi James,
thanks for the feedback.
On Thu, Jul 28, 2016, at 00:22, James Aylett wrote:
> This sounds great! I know sufficiently little about CJK that I won't
> try to comment on that at all :)
I've just opened a pull request for the CJK tokenizer:
https://github.com/xapian/xapian/pull/114
> I wonder if we can arrange suitable defaults to use your
> implementation with the
2016 Aug 03
2
Pull requests: CJK words and Snippet generator
Hi,
On Fri, Jul 29, 2016, at 13:45, James Aylett wrote:
> On Fri, Jul 29, 2016 at 12:12:25PM +0200, rsto at paranoia.at wrote:
> > The FastMail snippet generator was written when MSet didn't create
> > snippets. I'll first compare both implementations to see if there is a
> > good reason for them to coexist, or whether I might just as well merge any
> > additional
2016 Aug 03
2
Pull requests: CJK words and Snippet generator
On Wed, Aug 3, 2016, at 19:26, James Aylett wrote:
> On Wed, Aug 03, 2016 at 06:54:32PM +0200, rsto at paranoia.at wrote:
> > Oddly enough, the pull request causes Travis to break for clang but not
> > for gcc [1]. That's because the clang build process fails for the test
> > 'querypairwise1' [2], which AFAIK I didn't touch at all. Is that a
> > known
2018 Feb 13
1
XAPIAN_CJK_NGRAM can work
Olly,
That's very kind of you to help me. When I ran "env XAPIAN_CJK_NGRAM=1 indexer-command" to index Eprints again, it could search Chinese at the character level. But it is not so good at the word or phrase level. ICU would be better than the CJK n-grams. I hope ICU support will be available soon!
Best wishes,
Peter
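To illustrate why character-level matching can work while word-level results stay imprecise, here is a rough sketch of overlapping character n-grams. It is not Xapian's actual tokenizer; the function name char_ngrams and the choice of unigrams plus bigrams are assumptions made for this example only.

```python
def char_ngrams(text, max_n=2):
    """Return overlapping character n-grams (n = 1..max_n) in text order.

    Whitespace is dropped first. No dictionary is consulted, so the
    n-grams ignore where real word boundaries fall.
    """
    chars = [c for c in text if not c.isspace()]
    terms = []
    for i in range(len(chars)):
        for n in range(1, max_n + 1):
            if i + n <= len(chars):
                terms.append("".join(chars[i:i + n]))
    return terms


if __name__ == "__main__":
    # "中文搜索" is two words: 中文 (Chinese) and 搜索 (search).
    print(char_ngrams("中文搜索"))
    # → ['中', '中文', '文', '文搜', '搜', '搜索', '索']
```

The bigram 文搜 straddles the boundary between the two words, so terms of this kind match text fragments that no reader would treat as a word. A segmenter that knows word boundaries would avoid producing such terms.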
At 2018-02-13 11:44:46, "Olly Betts" <olly at survex.com> wrote:
>On
2016 Sep 07
2
Pull requests: CJK words and Snippet generator
On Tue, Sep 6, 2016, at 09:16, Olly Betts wrote:
> I think my main concerns are about efficiency (since that's a major
> motivation for the current implementation, so slowing it down would be
> annoying), and whether we can just make this the standard behaviour
> rather than adding an option.
The current implementation is O(n) and I took care to keep it at that.
For the proposed term
2016 Sep 19
2
Pull requests: CJK words and Snippet generator
Olly, sorry for my delayed reply.
On Mon, 12 Sep 2016, at 05:32, Olly Betts wrote:
> On Wed, Sep 07, 2016 at 02:30:16PM +0200, rsto at paranoia.at wrote:
> > On Tue, Sep 6, 2016, at 09:16, Olly Betts wrote:
> > > I think my main concerns are about efficiency [...]
> > For the proposed term coverage, the implementation looks up and inserts
> > terms into a map. That
2008 Jan 26
0
CentOS-announce Digest, Vol 35, Issue 17
Send CentOS-announce mailing list submissions to
centos-announce at centos.org
To subscribe or unsubscribe via the World Wide Web, visit
http://lists.centos.org/mailman/listinfo/centos-announce
or, via email, send a message with subject or body 'help' to
centos-announce-request at centos.org
You can reach the person managing the list at
centos-announce-owner at centos.org
When
2011 Dec 14
0
CentOS-announce Digest, Vol 82, Issue 9
2011 Dec 22
0
CESA-2011:1815 Moderate CentOS 6 icu Update
CentOS Errata and Security Advisory 2011:1815 Moderate
Upstream details at : https://rhn.redhat.com/errata/RHSA-2011-1815.html
The following updated files have been uploaded and are currently
syncing to the mirrors: ( sha256sum Filename )
i386:
9e45679b2ebfea059981581acf182ef869923bec036e585a04ad740820492f71 icu-4.2.1-9.1.el6_2.i686.rpm
2015 Mar 11
0
CEBA-2015:0664 CentOS 6 icu FASTTRACK BugFix Update
CentOS Errata and Bugfix Advisory 2015:0664
Upstream details at : https://rhn.redhat.com/errata/RHBA-2015-0664.html
The following updated files have been uploaded and are currently
syncing to the mirrors: ( sha256sum Filename )
i386:
5d2c4f6d550c3b6a9e39cfdfc188aea83045ac0cc11e443d5e9dd4043f0da157 icu-4.2.1-11.el6.i686.rpm
f43cf14fae0a49eaabeb1f41f7410bf441ae0b3ec384bf13292a16c58004b1f3
2020 Mar 25
0
CESA-2020:0896 Important CentOS 6 icu Security Update
CentOS Errata and Security Advisory 2020:0896 Important
Upstream details at : https://access.redhat.com/errata/RHSA-2020:0896
The following updated files have been uploaded and are currently
syncing to the mirrors: ( sha256sum Filename )
i386:
2ca2af516c855425dff52c4ca4e3d7333751162a57c42f5e5b26684c51cfd4f2 icu-4.2.1-15.el6_10.i686.rpm
2020 Oct 19
0
v2.3.11.3 solr plugin search via MUA fails to match accented ascii characters; cmd line exec of `doveadm fts lookup` PANICs (assertion failed)
On 19/10/2020 19:02, PGNet Dev wrote:
> On 10/19/20 9:48 AM, John Fawcett wrote:
>> --with-icu should be sufficient; actually, on CentOS 7 I got libicu
>> compiled in without setting the explicit flag.
>
>> Here's my ldd, which is under /usr/local/lib/dovecot
>>
>> ldd /usr/local/lib/dovecot/libdovecot-fts.so
>
> noted. as suspected. thx.
>
>>
2016 Aug 05
2
Pull requests: CJK words and Snippet generator
On Thu, Aug 4, 2016, at 15:08, James Aylett wrote:
> On Wed, Aug 03, 2016 at 08:17:05PM +0200, rsto at paranoia.at wrote:
> > I'll notify you when the CJK pull request passes Travis.
>
> That's great, thanks!
Alright, after lots of fiddling with .travis.yml I finally made the pull
request build on Travis' trusty image:
https://github.com/xapian/xapian/pull/114
I have
2008 Jan 25
0
CESA-2008:0090 Important CentOS 5 x86_64 icu Update
CentOS Errata and Security Advisory 2008:0090 Important
Upstream details at : https://rhn.redhat.com/errata/RHSA-2008-0090.html
The following updated files have been uploaded and are currently
syncing to the mirrors: ( md5sum Filename )
x86_64:
32bfc5e35b1c9b0bccff785f7da87c52 icu-3.6-5.11.1.x86_64.rpm
f8038551ba743133544429c4f040bc9f libicu-3.6-5.11.1.i386.rpm
2009 Jun 26
0
CESA-2009:1122 Moderate CentOS 5 x86_64 icu Update
CentOS Errata and Security Advisory 2009:1122 Moderate
Upstream details at : https://rhn.redhat.com/errata/RHSA-2009-1122.html
The following updated files have been uploaded and are currently
syncing to the mirrors: ( md5sum Filename )
x86_64:
d363e35d3b60546de87ad57ded6f81f1 icu-3.6-5.11.4.x86_64.rpm
a5fdb39aff45db3134a5466614d62af2 libicu-3.6-5.11.4.i386.rpm
2011 Dec 14
0
CESA-2011:1815 Moderate CentOS 5 x86_64 icu Update
CentOS Errata and Security Advisory 2011:1815 Moderate
Upstream details at : https://rhn.redhat.com/errata/RHSA-2011-1815.html
The following updated files have been uploaded and are currently
syncing to the mirrors: ( sha256sum Filename )
x86_64:
ff8072bc90d53597468560bbb41dbae288239fbc060f26c3cf4cbb7082f25f86 icu-3.6-5.16.1.x86_64.rpm
2020 Mar 25
0
CESA-2020:0897 Important CentOS 7 icu Security Update
CentOS Errata and Security Advisory 2020:0897 Important
Upstream details at : https://access.redhat.com/errata/RHSA-2020:0897
The following updated files have been uploaded and are currently
syncing to the mirrors: ( sha256sum Filename )
x86_64:
2ca61582c7174625804c4b53eee75b38c60c2bb5c9a674b1b614fa0914467dee icu-50.2-4.el7_7.x86_64.rpm
2016 Dec 14
2
Pull requests: CJK words and Snippet generator
I haven't had a chance to look at the patch and won't be able to do so
before January. Its design description sounds promising, though.
The snippet generator code linked to by Bron contains mostly the same
code as in my pull request, with two exceptions: it adds a flag to make
the generator return the empty string for snippets without any matching
terms. And it includes a fix to a possible