On Mon, Jul 26, 2021 at 09:21:05AM +0000, Eric Wong
wrote:> I'm also curious if crit-bit trees can be suited as an on-disk
> structure for Xapian: https://cr.yp.to/critbit.html
>
[...]>
> Downsides:
>
> - iteration requires recursion, which is bound by
> key length; so arbitrarily long keys will hurt.
FWIW, Xapian doesn't need arbitrarily long keys (the current backends
are limited to 255).
> - affected by locality and fragmentation (but AFAIK all large
> data/memory structures do, and we have xapian-compact).
With the current backends released space is tracked and reused. A
B+-tree runs at a certain proportion of space unused to give its
performance guarantees; xapian-compact allows you to eliminate that
slack and also slack space used to support searching the current version
while a new version is being added. In normal use this space just gets
reused but it's useful to be able to eliminate it if you're building a
database which then won't be updated again; it also allows you to
reclaim space after a large deletion which doesn't otherwise happen as
the tables never normally shrink in size (but that space would get reused
if you add more data after the deletion).
It seems problematic if you had to run xapian-compact to reuse
space *at all*, so some sort of freespace handling would really need to
be added.
> I haven't thought much about how transactions would work with
> on-disk formats, though.
That's an essential requirement. Ideally it should support an arbitrary
number of older versions so we can avoid DatabaseModifiedError.
Maybe you could have multiple roots with partially shared trees below
and copy-on-write, probably with reference counting of sub-trees to
allow reclaiming space. It's all rather getting away from the essential
simplicity of the approach though, and the overhead is no longer as low
as that of the simple design.
So overall I'm a bit dubious about replacing the current B+-tree with a
crit-bit tree, but would certainly encourage experimentation if someone
wants to prototype this or other ideas for backend storage.
However...
The current in-development backend (honey) uses a flat basic store
structure, with an index into it.
The plan is to have multiple flat levels overlaid to provide versioning,
with a rolling merge of levels to spread the I/O load out much more
evenly over time than the current backends. It also makes indexing I/O
less random access.
I think crit-bit trees (or some other variety of PATRICIA tree) might
work well for the index here. The index points into the flat store
which contains (key,value) pairs, so the full key is already being
stored on the "external node". Each table (and its index) is
append-only, so I don't think there would be freespace to track if I
follow how insertions work.
> I encountered my own problems having too many shards:
> https://lists.xapian.org/pipermail/xapian-discuss/2020-August/009823.html
Oh sorry, I forgot I said I'd come up with a patch to avoid the separate
postlist table lookup to get the wdf upper bound. I'll reply to that
thread soon.
Cheers,
Olly