Displaying 20 results from an estimated 10000 matches similar to: "Windows: accessing an index with non-ASCII path"
2020 Jun 04
2
xapian-core and Windows non-ASCII paths
Hi,
I am attaching a patch against the xapian-core 1.4 branch.
On Windows with MSVC (probably MinGW too, but I did not test it), it allows
xapian-core to create and use an index located at a path containing arbitrary
Unicode characters. As far as I could see, this does not work with the
current code and, judging from the question I asked on xapian-discuss, nobody
seems to have an obvious external solution
2016 Apr 11
2
Xapian 1.3.5 snapshot performance and index size
Olly Betts writes:
> On Sun, Apr 10, 2016 at 04:47:01PM +0200, Jean-Francois Dockes wrote:
> > Some might notice the 50% index size increase. Excessive index size is
> > already one relatively rare, but recurring complaint. Except if I did
> > something wrong: I'm actually quite surprised by it.
>
> Did you try compacting the resulting databases?
>
>
2016 Apr 12
2
Xapian 1.3.5 snapshot performance and index size
Olly Betts writes:
> On Mon, Apr 11, 2016 at 09:54:36AM +0200, Jean-Francois Dockes wrote:
> > The question which remains for me is if I should run xapian-compact
> > after an initial indexing operation. I guess that this depends on the
> > amount of expected updates and that there is no easy answer ?
>
> I think it's not obvious whether it's a good plan
2016 Jan 14
3
Strange index consistency issue
Olly Betts writes:
> On Sun, Jan 10, 2016 at 02:53:14AM +0000, Bob Cargill wrote:
> > I am the recoll user mentioned in the first post above. I still have a copy
> > of the (potentially) corrupted index and I did the requested testing.
> >
> > I ran delve -t '' ./xapiandb on the index and it returned a very long list
> > of document IDs, separated
2019 Feb 02
0
Amount of writes during index creation
This is quite possibly part of the underlying write explosion that we ran into when we wrote:
https://fastmail.blog/2014/12/01/email-search-system/
Which now almost 5 years on, has been running like a champion! We're really pleased with how well it works. Xapian reads from multiple databases are really easy, and the immediate writes onto tmpfs and daily compacts work really well. We also
2012 Dec 23
1
Fwd: Re: Another use for Recoll/Xapian? - AI/Eliza
People,
I sent this note to JF at Recoll and he suggested asking here (his
response below) - any suggestions?
Thanks,
Phil.
-------- Original Message --------
Subject: Re: Another use for Recoll? - AI/Eliza
Date: 2012-12-23 19:22
From: jf at dockes.org
To: <phil at pricom.com.au>
Philip Rhoades writes:
> Jean,
>
> I have been using Recoll happily for some time now but I
2016 Apr 10
2
Xapian 1.3.5 snapshot performance and index size
Hi,
I ran some tests with Recoll to compare Xapian 1.2.22 and 1.3.5 performance.
I mostly used two relatively small document sets (realistic/typical recoll
data subsets).
The first set is a 2.2 GB mbox folder, with approximately 56K messages in
275 files, producing approximately 64K documents (because of attachments).
The second set is an 11 GB folder with 5300 PDF files in it (random PDFs
2019 Jan 31
4
Amount of writes during index creation
Olly Betts writes:
> On Mon, Jan 21, 2019 at 03:25:01PM +0100, Jean-Francois Dockes wrote:
> > I have had a problem report from a Recoll user about the amount of writes
> > during index creation.
> >
> > https://opensourceprojects.eu/p/recoll1/tickets/67/
> >
> > The issue is that the index is on SSD and that the amount of writes is
> >
2018 Sep 14
3
How to make database build threaded?
On 14/09/2018 at 09:30, Jean-Francois Dockes wrote:
> Hi,
>
> You may be interested by how Recoll does it:
>
> https://www.lesbonscomptes.com/recoll/idxthreads/threadingRecoll.html
>
> A few things in the document are slightly obsolete (esp. the last
> paragraph: recollindex now does use vfork()), but it's overall quite close
> to how the current indexer works.
2024 Mar 15
1
Using multiple temporary indexes during updates
Hi,
I have been playing at converting the index update stage of the Recoll indexer to use
multiple temporary indexes and a final merge.
This yields an improvement factor of almost 3 (on my quad-core CPU), for the total
indexing time for "easy" files like HTML pages. This is nice (!) and I wanted to share my
admiration for the "compact()" method.
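The overall shape of "several temporary indexes, then one final merge" can be sketched as below. This is only a toy illustration and not Recoll's or Xapian's code: the inverted index is a plain Python dict, and the names `build_partial_index`, `merge_indexes` and `index_in_parallel` are hypothetical. It shows the structure of the approach, not the speed-up reported above.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_partial_index(docs):
    """Build a toy inverted index (term -> set of doc ids) for one shard."""
    index = defaultdict(set)
    for doc_id, text in docs:
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def merge_indexes(partials):
    """Final merge step: union the posting sets of all temporary indexes."""
    merged = defaultdict(set)
    for partial in partials:
        for term, ids in partial.items():
            merged[term] |= ids
    return merged


def index_in_parallel(docs, workers=4):
    """Split the documents into shards, index each shard separately, merge."""
    shards = [docs[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(build_partial_index, shards))
    return merge_indexes(partials)


docs = [(1, "xapian index merge"), (2, "recoll index"), (3, "merge step")]
final = index_in_parallel(docs, workers=2)
print(sorted(final["index"]))  # [1, 2]
```

Each worker writes only to its own temporary index, so no locking is needed until the single-threaded merge at the end.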
If someone is interested in a
2019 Feb 03
2
Amount of writes during index creation
Bron Gondwana writes:
> This is quite possibly part of the underlying write explosion that we ran into when we wrote:
>
> https://fastmail.blog/2014/12/01/email-search-system/
>
> Which now almost 5 years on, has been running like a champion! We're really pleased with how well it works. Xapian reads from multiple databases are really easy, and the immediate writes onto
2019 Jan 21
2
Amount of writes during index creation
Hi,
I have had a problem report from a Recoll user about the amount of writes
during index creation.
https://opensourceprojects.eu/p/recoll1/tickets/67/
The issue is that the index is on SSD and that the amount of writes is
significant compared to the SSD life expectancy (index size > 250 GB).
From the numbers he supplied, it seems to me that the total amount of block
writes is roughly
2016 Jan 14
2
Strange index consistency issue
Olly Betts <olly <at> survex.com> writes:
>
> On Thu, Jan 14, 2016 at 11:04:29AM +0100, Jean-Francois Dockes wrote:
> > Olly Betts writes:
> > > On Sun, Jan 10, 2016 at 02:53:14AM +0000, Bob Cargill wrote:
> > > > I will look into the bug you listed to see if it might be related. If there
> > > > is anything else that I can do, please
2014 May 04
2
Xapian::Document and threads
Hi,
While investigating very infrequent crashes in the Recoll indexer, I have
come to a very basic question: is it safe to pass a copy of a
Xapian::Document from thread to thread (multiple threads queue documents,
another thread updates the index)?
I don't seem to get directly into trouble while doing this, but I don't see
anything either in the RefCntr implementation which would
2017 May 17
2
Xapian 1.4.3 "Db block overwritten - are there multiple writers?"
Hi,
I have a user reporting the following error during recoll indexing:
flush() failed: Db block overwritten - are there multiple writers?
"flush() failed" is from recoll, the rest is, I think the text of the Xapian
exception.
This is with Xapian 1.4.3 on Linux (I asked for more details, should be
coming).
I don't think that I've ever seen this error, and I also
2017 Dec 08
2
xapian 1.4 performance issue
Olly Betts writes:
> On Thu, Dec 07, 2017 at 10:29:09AM +0100, Jean-Francois Dockes wrote:
> > Recoll builds snippets by partially reconstructing documents out of index
> > contents.
> >
> [...]
> >
> > The specific operation which has become slow is opening many term position
> > lists, each quite short.
>
> The difference will actually
2017 May 22
2
Xapian 1.4.3 "Db block overwritten - are there multiple writers?"
Olly Betts writes:
> On Wed, May 17, 2017 at 09:08:32PM +0200, Jean-Francois Dockes wrote:
> > I have a user reporting the following error during recoll indexing:
> >
> > flush() failed: Db block overwritten - are there multiple writers?
> >
> > "flush() failed" is from recoll, the rest is, I think the text of the Xapian
> > exception.
2016 Dec 29
2
NEAR non-leaf subqueries
Hi,
Xapian 1.2 supports a query like:
(A OR B) NEAR (C OR D)
and distributes the factors to create something like:
(A NEAR 2 C) OR (A NEAR 2 D) OR (B NEAR 2 C) OR (B NEAR 2 D)
Xapian 1.4 rejects such a query with the error message:
OP_NEAR and OP_PHRASE only currently support leaf subqueries
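For illustration, the distribution mentioned above pairs every term of the left-hand OR with every term of the right-hand OR, so that each NEAR has only leaf subqueries. The sketch below builds the expanded query as a string; it does not use the Xapian API, and `distribute_near` and its `window` parameter are hypothetical names chosen for this example.

```python
from itertools import product


def distribute_near(left_terms, right_terms, window=2):
    """Expand (l1 OR l2 ...) NEAR (r1 OR r2 ...) into an OR of leaf NEAR pairs."""
    pairs = product(left_terms, right_terms)
    return " OR ".join(f"({l} NEAR {window} {r})" for l, r in pairs)


expanded = distribute_near(["A", "B"], ["C", "D"])
print(expanded)
# (A NEAR 2 C) OR (A NEAR 2 D) OR (B NEAR 2 C) OR (B NEAR 2 D)
```

The number of NEAR clauses is the product of the two term counts, so this grows quickly when both sides are large.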
Because Recoll expands the terms to their stem siblings at query time, its
NEAR queries
2018 Sep 13
2
How to make database build threaded?
Hi everybody,
I'm the author of a small C++11 program called XDGSearch. The source
code is hosted on GitHub; for a quick overview you can visit this link:
https://github.com/frank67/XDGSearch/blob/master/README.md
I'm writing to the mailing list because I'd like to split the database
build process across several threads. Is it possible? If you are a C++
programmer you can take a look at
2013 Apr 23
2
Metaflac UTF-8 fixes
Hopefully the last patch from me to UTF-8 issues.
Metaflac can now print all console supported characters from tags on the
screen. It also fixes metaflac to be able to import its own exports back
without non-ascii characters getting mutilated. And --no-utf8-convert
now works properly with import and export commands.
I updated my Windows binary archive with these changes for any
interested