Ron Kass
2007-Oct-23 20:10 UTC
[Xapian-discuss] Search performance issues and profiling/debugging search?
Hi All..

We are running a small search-test script to test the speed of searches. We iterate through a list of 2500 searches, executing each one. After a while we started noticing that the speed of recurring searches is not always faster than the original. We started running each search three times in a row. We were surprised to see that many times, sequentially identical searches take as much time as the original or even more, which doesn't make much sense considering things should be cached in such cases.

Some examples:

16/7500: Searching for: no recip...************************* : [Time: 1.053 sec] [Results: 25 of 18924]
17/7500: Searching for: no recip...************************* : [Time: 1.053 sec] [Results: 25 of 18862]
18/7500: Searching for: no recip...************************* : [Time: 1.655 sec] [Results: 25 of 17197]
...
52/7500: Searching for: coupons...************************* : [Time: 0.106 sec] [Results: 25 of 78205]
53/7500: Searching for: coupons...************************* : [Time: 0.106 sec] [Results: 25 of 79775]
54/7500: Searching for: coupons...************************* : [Time: 0.688 sec] [Results: 25 of 72030]
...
172/7500: Searching for: microsoft...************************* : [Time: 0.368 sec] [Results: 25 of 245689]
173/7500: Searching for: microsoft...************************* : [Time: 0.29 sec] [Results: 25 of 235657]
174/7500: Searching for: microsoft...************************* : [Time: 1.453 sec] [Results: 25 of 233288]
...
478/7500: Searching for: map of the world...************************* : [Time: 15.703 sec] [Results: 25 of 1355]
479/7500: Searching for: map of the world...************************* : [Time: 10.004 sec] [Results: 25 of 1230]
480/7500: Searching for: map of the world...************************* : [Time: 8.902 sec] [Results: 25 of 1117]

Especially with the first and last examples, we notice that the search for "no recip" takes more than one second regardless of how many times we execute it, and "map of the world" constantly takes over 8 seconds.

Some notes about the environment of the search test:
* We open the database only once, to save time.
* Hardware: quad Xeon, 16GB memory.
* Data is spread across 10 databases over 5xSATA2 drives.
* The database contains 150M docs, each about 3K.
* The server is not doing anything besides the search testing.

Also:
* The asterisks in the debug output represent fetching document data, score, matching words and other details from the mset items.
* Estimates vary, although it's exactly the same search done right one after the other with no changes to the DB (no data added). This is not really a big issue.

My questions/concerns:
1. Why would a sequential search take as much time? Shouldn't all the data be cached, as the query is already "warmed up"? Why would a third search for "microsoft" take so long, not to mention "map of the world"?
2. Even for the fastest queries, we very rarely see a repeated search taking less than 0.1 seconds, even those with the simplest of terms. Does that make sense?
3. For the hardware we use for testing (16GB RAM, quad Xeon, 5xSATA2 [500GB each] disks), how much would you expect it to "carry", assuming a typical document is 3K and we store positional, stemmed information with 12-16 values per item and a short integer number in the document data?
4. Does anyone have ideas on how we can profile things to see why we have such performance issues? We would be happy to dig deeper to find out where the problems arise from.

Looking forward to hearing your feedback..
Ron
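For reference, here is a minimal sketch (Python bindings) of the kind of timing loop described above; the database paths, query list, stemming setup and 25-hit page size are illustrative assumptions, not the actual test script.

import time
import xapian

# Hypothetical paths; the real test combines several databases into one search.
paths = ["/fts/Database_%d" % i for i in (1, 2, 4, 6, 7, 8, 9)]
db = xapian.Database(paths[0])
for path in paths[1:]:
    db.add_database(xapian.Database(path))    # search all databases together

qp = xapian.QueryParser()
qp.set_stemmer(xapian.Stem("english"))
qp.set_stemming_strategy(xapian.QueryParser.STEM_SOME)
qp.set_database(db)

enquire = xapian.Enquire(db)

def timed_search(text):
    start = time.time()
    enquire.set_query(qp.parse_query(text))
    mset = enquire.get_mset(0, 25)            # top 25 hits
    for m in mset:                            # the "asterisk" phase: per-hit fetches
        m.document.get_data()
    return time.time() - start, mset.get_matches_estimated()

for query in ["no recip", "coupons", "microsoft", "map of the world"]:
    for run in (1, 2, 3):                     # each search three times in a row
        elapsed, estimate = timed_search(query)
        print("%s (run %d): %.3f sec, est. %d" % (query, run, elapsed, estimate))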
Richard Boulton
2007-Oct-23 22:26 UTC
[Xapian-discuss] Search performance issues and profiling/debugging search?
Ron Kass wrote:
> * Estimates vary, although it's exactly the same search done right one after the other with no changes to the DB (no data added). This is not really a big issue.

This is the issue which looks oddest to me, however (though there are other oddities). If I understood you correctly, each search is performed on the same Database object, without reopening the Database object between each search. This should result in exactly the same results (and estimates) for each repeated search, since a Database object accesses a snapshot of the underlying database. Since this isn't happening, there must be something we don't understand about your setup, and this is the first thing to resolve.

Are you using the network (ie, "remote") backend? If so, the problem could be that some of the database connections are timing out occasionally.

Is the database mounted locally, or over a network filing system?

If the database isn't being modified while searches are in progress, and/or Database objects aren't being reopened between searches, something very weird is happening. Try doing the search on just a single database, and see if you get the same effects. If not, add in databases gradually until you do.

Other than that, it would be useful to separate timings for the search (ie, the get_mset() call) from the calls which fetch the document data, to see where most of the time is being spent.

> 4. Does anyone have ideas on how we can profile things to see why we have such performance issues? We would be happy to dig deeper to find out where the problems arise from.

oprofile is your best bet for seeing where the time is being spent.

--
Richard
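To make the last suggestion concrete, a small sketch (Python bindings; the helper name is hypothetical) that times the get_mset() call separately from the loop fetching the document data. It expects an existing xapian.Enquire and a parsed xapian.Query.

import time

def timed_search_split(enquire, query, pagesize=25):
    """Return (match_time, fetch_time, estimate) for a single search."""
    enquire.set_query(query)

    t0 = time.time()
    mset = enquire.get_mset(0, pagesize)   # the match itself
    match_time = time.time() - t0

    t0 = time.time()
    for m in mset:                         # the per-hit document fetches
        m.document.get_data()
    fetch_time = time.time() - t0

    return match_time, fetch_time, mset.get_matches_estimated()

If match_time stays high on repeated runs the cost is in the matcher itself; if fetch_time dominates, the time is going into per-document I/O.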
Ron Kass
2007-Oct-24 03:04 UTC
[Xapian-discuss] Search performance issues and profiling/debugging search?
Hi Richard & Olly.

Yes, the database is NOT being reopened. The databases are accessed locally, not over the net. No data is being added to the databases while testing the searches.

Indeed I found the estimates to be strange. I wasn't sure why, but was more concerned with the performance and the fact that things were not "warming up". I don't know anything yet about oprofile; I will have to dig deeper there (any pointers would be handy).

What was not clear from your answers is whether it makes sense that second (and third) time searches take that long. Also, we are getting 15+ and sometimes 50+ second searches on some simple (non-phrased) search terms (like [map of the world]: [Time: 15.703 sec] [Results: 25 of 1355]; [Time: 10.004 sec] [Results: 25 of 1230]; [Results: 25 of 1117]; and there are ones that take even more time).

I'm attaching a few links to the outputs of more detailed debug runs.

First I ran a test on all 8 databases (1, 2, 4, 6, 7, 8, 9 and 10):
http://www.pidgintech.com/other/fts/test/test_1-2-4-6-7-8-9-10.txt
This resulted in a segmentation fault.

Then I removed database 10 and reran:
http://www.pidgintech.com/other/fts/test/test_1-2-4-6-7-8-9.txt
This ran successfully. However, estimates varied: [16044], [17314], [17219]. Overall times were quite long, both for the original search and for the sequential searches: [Total Time: 0.96 sec], [Total Time: 0.96 sec], [Total Time: 0.941 sec].

Then I ran it on each of the databases separately. Note that by this time it is likely (or at least I thought it should be) that the blocks are cached in memory from the previous test (of 1, 2, 4, 6, 7, 8 and 9).

http://www.pidgintech.com/other/fts/test/test_isolated_1.txt
Summary for DB #1: Estimates fixed on [3015]; times did not speed up for sequential searches: [Total Time: 0.187 sec], [Total Time: 0.183 sec], [Total Time: 0.199 sec]

http://www.pidgintech.com/other/fts/test/test_isolated_2.txt
Summary for DB #2: Estimates fixed on [789]; times became faster, although the first time is extremely long considering the single DB and its size: [Total Time: 0.891 sec], [Total Time: 0.082 sec], [Total Time: 0.083 sec]

http://www.pidgintech.com/other/fts/test/test_isolated_4.txt
Summary for DB #4: Estimates fixed on [3217]; times became faster, although the first time is extremely long and the others are still long (>0.2 sec) considering the single DB: [Total Time: 1.18 sec], [Total Time: 0.215 sec], [Total Time: 0.212 sec]

http://www.pidgintech.com/other/fts/test/test_isolated_6.txt
Summary for DB #6: Estimates varied [4447], [3023], [3023]; times were extremely long in the first two searches, and even the last one is still long (>0.2 sec): [Total Time: 1.961 sec], [Total Time: 4.299 sec], [Total Time: 0.204 sec]

http://www.pidgintech.com/other/fts/test/test_isolated_7.txt
Summary for DB #7: Estimates varied [4954], [4954], [3593]; times varied and were relatively long overall: [Total Time: 0.81 sec], [Total Time: 0.429 sec], [Total Time: 0.779 sec]

http://www.pidgintech.com/other/fts/test/test_isolated_8.txt
Summary for DB #8: Estimates varied [805], [562], [562]; times varied and were relatively long for the first two tries: [Total Time: 0.431 sec], [Total Time: 1.077 sec], [Total Time: 0.063 sec]

http://www.pidgintech.com/other/fts/test/test_isolated_9.txt
Summary for DB #9: Estimates fixed on [3415]; times became faster, although the first time is extremely long and the others are still long (>0.2 sec): [Total Time: 3.137 sec], [Total Time: 0.222 sec], [Total Time: 0.212 sec]

Now I tried #10 (the one which, when added to a group with add_database, causes the seg fault), just in case.. and, on its own, it worked.
http://www.pidgintech.com/other/fts/test/test_isolated_10.txt
Summary for DB #10: Estimates fixed on [3415]; times were overall fast: [Total Time: 0.108 sec], [Total Time: 0.072 sec], [Total Time: 0.075 sec]

The following is the output of delve on the 8 databases, including delve on the specific terms (no, recip), to try to figure out why we are getting a segmentation fault when database 10 is combined with the others. Note that it manages to run through the delve. I wonder where the segmentation fault is coming from. It happens during the get of the MSET.

[root@fts1 maintenance]# delve /fts/Database_1/
number of documents = 20810000
average document length = 204.054
[root@fts1 maintenance]# delve /fts/Database_2/
number of documents = 6020000
average document length = 203.334
[root@fts1 maintenance]# delve /fts/Database_4/
number of documents = 22365000
average document length = 204.573
[root@fts1 maintenance]# delve /fts/Database_6/
number of documents = 21875000
average document length = 204.567
[root@fts1 maintenance]# delve /fts/Database_7/
number of documents = 24480000
average document length = 205.822
[root@fts1 maintenance]# delve /fts/Database_8/
number of documents = 3870000
average document length = 209.001
[root@fts1 maintenance]# delve /fts/Database_9/
number of documents = 23870000
average document length = 205.659
[root@fts1 maintenance]# delve /fts/Database_10/
number of documents = 5325000
average document length = 202.337
[root@fts1 maintenance]# delve /fts/Database_10/ -1 -t Zno | tail -n 5
5324926
5324929
5324952
5324991
5324995
[root@fts1 maintenance]# delve /fts/Database_10/ -1 -t Zrecip | tail -n 5
5321034
5324077
5324081
5324178
5324179

We have used very minimal code, stripped of all the "other stuff" we had in it before. It simply opens the DB once, executes the search and times things. This is very strange (and frustrating).. I hope I gave enough details. If not, please let me know.

Since in some cases, on the nodes that show variance in estimates, two searches show the same estimate while the other is different, it may be that this variance exists on all the nodes if tried enough times. I think that speed is overall very poor, especially for sequential searches, considering it should all be in the OS cache.

Last notes.. Olly..
1. Regarding your stable-sort theory, if it's something we can test, let me know how.
2. What undefined value might anything depend on?
3. I don't know the testsuite and valgrind and what you refer to regarding that.. I'll have a look into that and into oprofile. If you have any pointers, that would help.

Looking forward to your thoughts..
Ron

Richard Boulton wrote:
> Ron Kass wrote:
>> * Estimates vary, although it's exactly the same search done right one after the other with no changes to the DB (no data added). This is not really a big issue.
>
> This is the issue which looks oddest to me, however (though there are other oddities). If I understood you correctly, each search is performed on the same Database object, without reopening the Database object between each search. This should result in exactly the same results (and estimates) for each repeated search, since a Database object accesses a snapshot of the underlying database. Since this isn't happening, there must be something we don't understand about your setup, and this is the first thing to resolve.
>
> Are you using the network (ie, "remote") backend? If so, the problem could be that some of the database connections are timing out occasionally.
>
> Is the database mounted locally, or over a network filing system?
>
> If the database isn't being modified while searches are in progress, and/or Database objects aren't being reopened between searches, something very weird is happening. Try doing the search on just a single database, and see if you get the same effects. If not, add in databases gradually until you do.
>
> Other than that, it would be useful to separate timings for the search (ie, the get_mset() call) from the calls which fetch the document data, to see where most of the time is being spent.
>
>> 4. Does anyone have ideas on how we can profile things to see why we have such performance issues? We would be happy to dig deeper to find out where the problems arise from.
>
> oprofile is your best bet for seeing where the time is being spent.

Olly Betts wrote:
> On Tue, Oct 23, 2007 at 10:25:47PM +0100, Richard Boulton wrote:
>> Ron Kass wrote:
>>> * Estimates vary, although it's exactly the same search done right one after the other with no changes to the DB (no data added). This is not really a big issue.
>>
>> This is the issue which looks oddest to me, however (though there are other oddities).
>
> It would be interesting to work out why this is happening. I suspect doing so will also reveal why the timings vary.
>
>> If I understood you correctly, each search is performed on the same Database object, without reopening the Database object between each search. This should result in exactly the same results (and estimates) for each repeated search, since a Database object accesses a snapshot of the underlying database.
>
> I have a theory. We sort containers of postlists when building the postlist tree from the query. If we don't always use a stable sort, then the exact tree built may vary between runs, which I think could easily affect both the estimated number of matches and also how much work the query requires (and so how long it takes).
>
> If this is happening, we should see if using a stable sort for such cases imposes any overhead. I suspect it wouldn't, since this is a one-off cost per query, and we're sorting a small number of items (at most the number of terms in the query).
>
> Another possibility is that there's some dependence on an undefined value. The testsuite is clean under valgrind, but I'm sure there's code which doesn't have testsuite coverage.
>
> Ron: assuming this is on Linux, if you run a few of these varying cases under valgrind, does it report any problems?
>
> Cheers,
> Olly
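As a concrete example of the valgrind suggestion, a run of the stripped-down test might look like the line below. The script name search_test.py is hypothetical; plain memcheck (the default tool) is what reports use of uninitialised values, and the run will be many times slower than normal. Note that CPython itself can trigger benign warnings unless its valgrind suppressions file is used.

[root@fts1 maintenance]# valgrind --tool=memcheck python search_test.py

Any "Conditional jump or move depends on uninitialised value(s)" warnings pointing into the Xapian library would support the undefined-value theory.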
Ron Kass
2007-Oct-24 17:14 UTC
[Xapian-discuss] Search performance issues and profiling/debugging
Hi..

Alex: Undoubtedly, XenSource is the cause of the OProfile problem. We looked into a patch for that, but there is one only for version 3 and we are using version 4 (and, yes, we are using the "official" XenSource one). As for the disks, each of the 6 disks is defined as an LVM volume, where one is allocated with 80GB for system/log and the other 5 are allocated fully (465GB of the 465.5GB) to the databases. Every two databases sit on the same disk. No other instance (since none exist) or Dom0 is using the 5 disks of the databases. Hope that clarifies the disk mapping.

James: While anything we put on the server complicates things a bit, Xen is not really the issue here, I believe. If the problem was Xen related, why would a scheduling problem affect "no recip" in such a consistent way, even after compacting the databases and moving them around? If Xapian used DMA, undocumented interrupts or something else out of the ordinary, I could understand why it would be something to look into first, but what makes you think that Xen in the mix can explain the variation in estimates, the strange performance issues with specific queries only, and the other strange things we see? We will certainly try to profile things, even test without Xen if we can't profile on it. Again, being the only VM instance running on that machine, there is little scheduling to do and no competition over IO and other resources. But even if there were, why would it be so constant on the "no recip" search? It doesn't make much sense unless we are missing something. We indeed tested things well over 3 times. This is why I picked "no recip" as a search: it is constantly performing badly even when searched a second or third time right after the first (see debug output). Below are stats from 100 runs:

Chris: We will try to test it without Xen as well later on. Keep in mind that to do so we will have to move aside 10 databases of 50GB each, reinstall the machine and move the databases back into place. We would do it first thing if we believed it's Xen's issue, although if we can't profile things we might do this anyway (or test it on a different machine).

Olly: Sorry, we removed the old Database10 after compressing it. Since then we didn't see the seg fault. We will keep a close eye and contact you as soon as we see such an error again.

Best regards,
Ron.

Alexandre Gauthier wrote:
> Chris Good a écrit :
>> Ron Kass wrote:
>>> Not sure what you mean by "other VMs could well be confusing your results". We use XenServer on this machine, but we have only one instance (DomU), and only this instance is running everything locally. So there are no other VMs to confuse things, and even if there were, they have nothing to do with the VM we run the test on or with the test itself. (Can you clarify what you mean?)
>>
>> If you have multiple VMs sharing the same hardware then activity on one will obviously affect the performance on other VMs. Since you're running a lone DomU, other DomUs aren't going to be competing for resources, but it's possible that something in Dom0 is getting swapped in and running.
>>
>> How are you accessing your drives: is DomU accessing the raw devices, or is it mapped via virtual files from Dom0?
>>
>> Is it possible to run these tests either directly from Dom0 or, even better, with a non-xen kernel?
>>
>> Given your current configuration of a single VM, xen isn't adding anything, so removing it would eliminate any side-effects of it. I also suspect that it would cure your oprofile issue.
>>
>> Chris
>
> Sorry to intrude, but if I may offer some insight, the Dom0 instance in a Xen set-up is just as paravirtualized as a DomU -- it just has control access to memory inside DomUs, and offers the drivers back-end interfaces. The Dom0 and DomUs both run on top of the Xen kernel.
>
> Also, if he is running a commercial Xen from XenSource, he won't have access to the Dom0, which is a custom Frankenstein mix of SuSE and RHEL with no other purpose but to control the DomUs, a bit like ESX.
>
> The question of the DomU's disk mapping is still valid, and I'd be curious to hear the answer. I also think Xen is responsible for the oprofile troubles; I get that on a Debian DomU as well.
>
> I hope this vaguely helps...
>
> Alex

James Aylett wrote:
> On Wed, Oct 24, 2007 at 04:04:22PM +0200, Ron Kass wrote:
>> Although we should never rule out something completely without checking, I believe quite strongly that the issues we are seeing are not coming from Xen, as per this instance it is a regular dedicated Linux (CentOS 5) machine and the resources are fully dedicated to it.
>
> It seems to me that there are two distinct problems. You have some queries that are underperforming, which with some profiling will expose either something unusual about your database or code, or a bottleneck or optimisation problem in Xapian.
>
> The other is the variation. I agree with Chris that adding Xen into the mix is complicating matters considerably. Things like IO scheduling, for instance, become harder in even the best virtualisation systems. It's bad enough that a single instance of an OS can suddenly start doing things you don't expect, even with no other significant userspace clients :-/
>
> Out of interest, are your figures averages of multiple runs? If not, I'd be interested in seeing 1st, 2nd and 3rd query times (broken down as Olly suggests), but with mean & sd over say 100 runs.
>
> (Apologies if you have done that - I've been trying to follow this thread closely, but an explosion of posts has combined with a busy period at my end :-)
>
> J

Chris Good wrote:
> Ron Kass wrote:
>> I believe quite strongly that the issues we are seeing are not coming from Xen, as per this instance it is a regular dedicated Linux (CentOS 5) machine and the resources are fully dedicated to it.
>
> I'd still encourage you to give it a go, if only to rule it out and let you run oprofile. Running inside Xen certainly shouldn't affect your match sets, but the diskwriter process kicking in could fully explain some of the timing variances that you've seen when re-running queries.

Olly Betts wrote:
>> Anyway, we have actually used xapian-compress on the databases to see if it helps. It appears to have got rid of the segmentation fault error on database 10, but the slowness and the variations in estimates still exist.
>
> A seg fault is clearly a bug somewhere, and I'd really like to know where. Do you still have the un-compacted database, or if not can you recreate it? If so, please rerun the test on it under gdb as I requested in my previous mail!
>
> Cheers,
> Olly
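For the gdb run Olly asks for, a sketch of what it could look like, assuming the un-compacted Database_10 can be restored and reusing the hypothetical search_test.py mentioned earlier. gdb stops on the SIGSEGV, and bt prints the backtrace to post to the list; a useful backtrace needs Xapian built with debugging symbols and not stripped.

[root@fts1 maintenance]# gdb --args python search_test.py
(gdb) run
(gdb) bt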