I hadn't noticed any interactive slowdown, but when I got around to
running the notmuch performance suite, there seems to be some noticeable
slowdown with the glass backend (default in Xapian 1.3.5) compared to
chert (using Xapian 1.2.22).

These tests are on an older i7 with 12G of RAM and an SSD. I'm
reasonably confident they are CPU bound. One curious thing is the
increase in system time in the glass case. It also looks like the glass
backend is doing a lot more I/O, which could be related.

The current notmuch performance corpus has about 200k documents,
totalling about 3.5G.

Unfortunately each number here represents only a single run. I did
rerun the tests with the glass backend, and the variation was reasonably
small.

Chert
=====

T00-new.sh: Testing notmuch new [0.4 large]
                        Wall(s)  Usr(s)  Sys(s)  Res(K)  In/Out(512B)
  Initial notmuch new    669.06  639.78   21.42  323684  3576/9360440
  notmuch new #2           0.46    0.00    0.00    8240  3568/200
  notmuch new #3           0.01    0.00    0.00    7916  0/8
  notmuch new #4           0.01    0.01    0.00    8008  0/8
  notmuch new #5           0.01    0.00    0.00    8040  0/8
  notmuch new #6           0.01    0.00    0.00    8040  0/8

T01-dump-restore.sh: Testing dump and restore [0.4 large]
                        Wall(s)  Usr(s)  Sys(s)  Res(K)  In/Out(512B)
  load nmbug tags          5.85    2.64    0.10   11280  1376/40496
  dump *                   7.45    6.51    0.94   25272  104/27928
  restore *                7.55    7.15    0.39    8180  0/0

T02-tag.sh: Testing tagging [0.4 large]
                        Wall(s)  Usr(s)  Sys(s)  Res(K)  In/Out(512B)
  tag * +new_tag         200.13  183.01    7.08   38628  264/1664552
  tag * +existing_tag      0.00    0.00    0.00    8356  0/0
  tag * -existing_tag    153.47  145.00    4.02   34928  0/1626320
  tag * -missing_tag       0.00    0.00    0.00    8252  0/0

Glass
=====

T00-new.sh: Testing notmuch new [0.4 large]
                        Wall(s)  Usr(s)  Sys(s)  Res(K)  In/Out(512B)
  Initial notmuch new    949.53  697.05  206.98  277436  1290744/21767856
  notmuch new #2           2.12    0.01    0.02    8204  2552/160
  notmuch new #3           0.01    0.00    0.00    8216  0/8
  notmuch new #4           0.01    0.00    0.00    8192  0/8
  notmuch new #5           0.01    0.00    0.00    8216  0/8
  notmuch new #6           0.01    0.00    0.00    8144  0/8

T01-dump-restore.sh: Testing dump and restore [0.4 large]
                        Wall(s)  Usr(s)  Sys(s)  Res(K)  In/Out(512B)
  load nmbug tags         10.78    4.06    3.59   11376  600/39832
  dump *                   7.44    6.52    0.91   25296  0/27928
  restore *                7.74    7.24    0.48    8740  0/0

T02-tag.sh: Testing tagging [0.4 large]
                        Wall(s)  Usr(s)  Sys(s)  Res(K)  In/Out(512B)
  tag * +new_tag         481.78  278.80  196.89   39448  0/1897360
  tag * +existing_tag      0.00    0.00    0.00    8496  0/0
  tag * -existing_tag    449.58  242.65  202.74   35456  0/2073520
  tag * -missing_tag       0.00    0.00    0.00    8440  0/0
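
To put the chert and glass figures above side by side, here is a small sketch that computes the glass/chert ratio for wall-clock and system time on the four tests that do real work. The numbers are copied from the tables; the dictionary layout and the `slowdown` helper are illustrative names, not part of the notmuch performance suite, and since each figure comes from a single run the ratios are only indicative.

```python
# (wall, usr, sys) in seconds, copied from the single-run tables above.
chert = {
    "Initial notmuch new": (669.06, 639.78, 21.42),
    "load nmbug tags": (5.85, 2.64, 0.10),
    "tag * +new_tag": (200.13, 183.01, 7.08),
    "tag * -existing_tag": (153.47, 145.00, 4.02),
}
glass = {
    "Initial notmuch new": (949.53, 697.05, 206.98),
    "load nmbug tags": (10.78, 4.06, 3.59),
    "tag * +new_tag": (481.78, 278.80, 196.89),
    "tag * -existing_tag": (449.58, 242.65, 202.74),
}


def slowdown(test):
    """Return (wall ratio, sys ratio) of glass relative to chert for one test."""
    c_wall, _, c_sys = chert[test]
    g_wall, _, g_sys = glass[test]
    return g_wall / c_wall, g_sys / c_sys


if __name__ == "__main__":
    for test in chert:
        wall_ratio, sys_ratio = slowdown(test)
        print(f"{test:22s} wall x{wall_ratio:.2f}   sys x{sys_ratio:.1f}")
```

The wall-clock ratios stay within a small single-digit factor, while the system-time ratios are much larger, which is the "curious" increase mentioned above.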
On Thu, Apr 07, 2016 at 08:56:46AM -0300, David Bremner wrote:
> I hadn't noticed any interactive slowdown, but when I got around to
> running the notmuch performance suite, there seems to be some noticeable
> slowdown with the glass backend (default in Xapian 1.3.5) compared to
> chert (using xapian 1.2.22)

Some of this is pretty much expected, though other parts I don't
entirely understand.

One of the big changes in glass is how the position table is structured.
In chert, it is ordered by (document,term) but in glass that has been
changed to (term,document). This change makes a huge difference to
phrase searches in cases where a lot of phrase data is needed, but it
has an indexing time cost - adding a new document can no longer just
append a load of entries to the position table, but instead we need to
buffer up the changes, and then merge the entries within the existing
table.

The trade-off isn't ideal for everyone, but the cases of slow phrase
searches were a real pain point that needed addressing. The plan is to
optimise indexing speed in other ways to regain this loss - some of that
has been done but there's a lot more to do still.

So the T00-new.sh numbers make sense - there's more work to do, and
we need to read existing positional data more to insert the new stuff,
so the increased reads and writes make sense.

But guessing at what the other two tests do, I wouldn't expect them to
be affected by this.

I'm also a bit puzzled by how glass can manage not to read any data
for "dump *", and several tests seem to not read or write anything
for either backend. What exactly are the "In/Out" numbers?

Cheers,
    Olly
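
The append-versus-merge point about the position table can be illustrated with a toy model. This is only a sketch under loud assumptions: the table is modelled as a sorted list of Python tuples, whereas the real chert and glass tables are B-trees with encoded byte-string keys, and the document ids, terms and the `insertion_points` helper are made up for illustration.

```python
import bisect


def insertion_points(existing_keys, new_keys):
    """Return the index in the sorted table where each new key would land."""
    table = sorted(existing_keys)
    return [bisect.bisect_left(table, key) for key in new_keys]


# Toy corpus (hypothetical): two indexed documents, then a third is added.
docs = {1: ["apple", "mail", "zebra"], 2: ["apple", "mail"]}
new_docid, new_terms = 3, ["apple", "mail", "zebra"]

# Chert-like ordering: keys sort by (document, term).
by_doc = [(d, t) for d, terms in docs.items() for t in terms]
# Glass-like ordering: keys sort by (term, document).
by_term = [(t, d) for d, terms in docs.items() for t in terms]

append_like = insertion_points(by_doc, [(new_docid, t) for t in new_terms])
merge_like = insertion_points(by_term, [(t, new_docid) for t in new_terms])

if __name__ == "__main__":
    # With (document, term) keys every new entry lands at the end of the table.
    print("(document,term):", append_like, "table size", len(by_doc))
    # With (term, document) keys the new entries are spread through the table.
    print("(term,document):", merge_like, "table size", len(by_term))
```

In the (document,term) model every new key sorts after all existing ones because the new document id is the largest, so the update is a pure append. In the (term,document) model each entry lands next to the existing entries for its term, which is why changes have to be buffered and merged into the table.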
Olly Betts <olly at survex.com> writes:

> So the T00-new.sh numbers make sense - there's more work to do, and
> we need to read existing positional data more to insert the new stuff,
> so the increased reads and writes make sense.
>
> But guessing at what the other two tests do, I wouldn't expect them to
> be affected by this.

The non-optimized-away cases of T02-tag just add a term to, or delete a
term from, each document that has the term Tmail.

> I'm also a bit puzzled by how glass can manage not to read any data
> for "dump *", and several tests seem to not read or write anything
> for either backend. What exactly are the "In/Out" numbers?

That's just the output from

    /usr/bin/time -f '%e\t%U\t%S\t%M\t%I/%O'

The manual describes them as "number of file system inputs/outputs".
From looking at the source, they correspond to the ru_inblock and
ru_oublock fields from the getrusage call. AFAIU, that means the number
of non-cached reads/writes.

d
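
For anyone wanting to look at the same counters without /usr/bin/time, here is a minimal sketch using Python's standard `resource` module (Unix only), which exposes the ru_inblock and ru_oublock fields of getrusage. The `measure` function and its result keys are illustrative names, not part of notmuch or its test suite. It assumes no other child processes are reaped while the command runs, since RUSAGE_CHILDREN accumulates over all waited-for children.

```python
import resource
import subprocess
import sys
import time


def measure(cmd):
    """Run cmd and report wall/user/system time plus block I/O counts."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "wall": wall,
        "usr": after.ru_utime - before.ru_utime,
        "sys": after.ru_stime - before.ru_stime,
        # ru_maxrss is a high-water mark over all children, not a delta.
        "maxrss": after.ru_maxrss,
        # Block operations that actually reached the file system layer.
        "inblock": after.ru_inblock - before.ru_inblock,
        "oublock": after.ru_oublock - before.ru_oublock,
    }


if __name__ == "__main__":
    # Trivial stand-in command; substitute the command to be measured.
    result = measure([sys.executable, "-c", "pass"])
    print("{wall:.2f}\t{usr:.2f}\t{sys:.2f}\t{maxrss}\t{inblock}/{oublock}".format(**result))
```

A command whose input is already in the page cache can report 0 for inblock here, which would be consistent with the 0 entries in the In/Out column above.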