2010/1/21 Marlon Baculio <mbaculio at hotmail.com>:
> That leaves about 300MB for Xapian and the rest of the Linux OS. The main
> UI will be a Google style search box.
> 0. How would you configure Xapian for such low memory systems (e.g. how
> many readers, flush threshold for writer)?
Totally depends on the load, and the size of documents. To work out
the flush threshold, I'd probably do an index run, logging the number
of documents processed and watching the memory use in "top", and set
the flush threshold to the number processed when the indexer memory
use reaches about half the available memory (so space is left for
the OS disk block cache, and the other processes).
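As a rough sketch of that procedure, assuming you have logged (documents processed, indexer memory in bytes) pairs during a trial index run: the function name and the sample figures below are made up for illustration, and only the "about half the available memory" rule comes from the advice above.

```python
MB = 1024 * 1024


def pick_flush_threshold(samples, available_bytes, fraction=0.5):
    """Return the largest logged document count whose memory use stayed
    within `fraction` of the available memory, or None if none did.

    `samples` is an iterable of (docs_processed, indexer_memory_bytes)
    pairs in the order they were logged during the trial run.
    """
    budget = available_bytes * fraction
    threshold = None
    for docs, memory in samples:
        if memory > budget:
            break  # the indexer has grown past the budget; stop here
        threshold = docs
    return threshold


# Hypothetical log from a trial run, read off "top" by hand.
trial_run = [
    (1000, 40 * MB),
    (2000, 80 * MB),
    (3000, 120 * MB),
    (4000, 160 * MB),
]

# About 300MB is available, so the budget is about 150MB.
print(pick_flush_threshold(trial_run, 300 * MB))
```

The resulting number can then be set as XAPIAN_FLUSH_THRESHOLD in the indexer's environment before the writable database is opened.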
> 1. Will file handle limitation be a problem for multithreaded Xapian
> reader?
Depends on search load. Each reader keeps about 5 filehandles open,
so multiply that by the number of concurrent readers you want. If it
comes close to the per-process fd limit, you've got a problem.
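To put numbers on that, here is a minimal sketch of the arithmetic. The figure of 5 filehandles per reader is the approximate one given above; the reserve of 64 descriptors for sockets, logs and so on is purely an assumption, so adjust it for your own process.

```python
import resource

FDS_PER_READER = 5  # approximate figure, as stated above


def readers_within_limit(fd_limit, reserved_fds=64,
                         fds_per_reader=FDS_PER_READER):
    """How many concurrent readers fit under `fd_limit` descriptors,
    keeping `reserved_fds` back for everything else the process opens."""
    return max(0, (fd_limit - reserved_fds) // fds_per_reader)


def readers_for_this_process(reserved_fds=64):
    """Same calculation, using this process's soft RLIMIT_NOFILE.
    Returns None if the limit is reported as unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return None
    return readers_within_limit(soft, reserved_fds)


if __name__ == "__main__":
    print(readers_for_this_process())
```

If the number of concurrent readers you want is anywhere near the result, either raise the limit or cap the number of readers.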
> 2. What are advantages of multiprocess readers (compared to multithreaded)
> aside from crash isolation?
I can't think of any significant ones off the top of my head. You
can't access a reader concurrently from multiple threads, so it
doesn't make much difference to Xapian whether the reader is in a
separate thread or a separate process.
You might find it easier to pool connections for reuse if you use
threads, but a process pool is perfectly feasible in theory too.
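For the threaded case, a pool might look something like the sketch below. The ReaderPool class is hypothetical, not part of Xapian; it just hands each reader to one thread at a time, which is the constraint described above. The factory is a stand-in so the example runs without Xapian installed.

```python
import queue
from contextlib import contextmanager


class ReaderPool:
    """Fixed-size pool that lends each reader to one thread at a time."""

    def __init__(self, open_reader, size):
        # open_reader is any zero-argument callable returning a new reader.
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(open_reader())

    @contextmanager
    def reader(self, timeout=None):
        # Blocks until a reader is free (raises queue.Empty on timeout).
        r = self._idle.get(timeout=timeout)
        try:
            yield r
        finally:
            self._idle.put(r)  # always hand the reader back to the pool


if __name__ == "__main__":
    # Stand-in factory; with the Python bindings this would be something
    # like lambda: xapian.Database("/path/to/db") (path is hypothetical).
    pool = ReaderPool(object, size=4)
    with pool.reader() as db:
        print("got reader", db)
```

The pool size also bounds the number of open filehandles, since no more than `size` readers ever exist.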
The readers should have very little memory overhead - they don't cache
anything, leaving that up to the OS disk cache (so it'll be shared
between all readers automatically).
--
Richard