Hello,

I'm developing a search system for our file servers, and as a basic way to handle file permissions I decided to split things up into multiple index databases. This seemed to be the easiest and cleanest way to do it.

I implemented it and always tested it with smaller indexes (fewer than 1000 docs per index). Everything went fine, and search performance over multiple databases was great.

But now I have indexed a larger part of the filesystem into one index, which grew to over 6000 documents. Searching only this index is still fast as hell. But a combined search with just one other database makes searching a real pain. Most of the queries I tested took about 15 seconds to finish, with some queries taking 30 seconds or even longer.

Of course I expected it to be a bit slower, but just adding those 100 docs from another database to the search makes it far too slow to work with.

Is there a way I can improve search performance over multiple databases? Has anyone dealt with this issue before? Or did I make a wrong architectural choice at the beginning?

Thanks in Advance,
Alain
Richard Boulton
2012-Aug-15 09:37 UTC
[Xapian-discuss] multiple database search performance
This is not what I'd expect at all; searching across multiple databases has a small overhead but shouldn't really be significantly slower than searching the component databases. Either you're doing something wrong, or there's a bug. Let's try and work out which.

For starters:

When you say you're searching across multiple databases, are you searching across them locally, or using the remote database protocol?

What OS and version of Xapian are you using?

-- 
Richard
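
One way to narrow this down is to time the same query against each database on its own and then against the combination. The following is only a rough sketch in Python, not code from the original setup: the helper names (time_search, compare_timings, make_xapian_search) and the example paths are hypothetical, and it assumes the Xapian Python bindings are installed and that the databases are opened locally.

```python
import time


def time_search(search, repeats=3):
    """Return the best wall-clock time (seconds) over `repeats` calls to search()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        search()
        best = min(best, time.perf_counter() - start)
    return best


def compare_timings(searches, repeats=3):
    """Time each named search callable and return {name: seconds}."""
    return {name: time_search(fn, repeats) for name, fn in searches.items()}


def make_xapian_search(paths, query_string, limit=10):
    """Build a callable that runs one query over the databases at `paths`.

    Assumption: local databases and the Xapian Python bindings.
    """
    import xapian  # imported lazily so the timing helpers work without it

    db = xapian.Database()
    for path in paths:
        db.add_database(xapian.Database(path))

    parser = xapian.QueryParser()
    parser.set_database(db)
    query = parser.parse_query(query_string)

    def run():
        enquire = xapian.Enquire(db)
        enquire.set_query(query)
        return enquire.get_mset(0, limit)

    return run
```

With two hypothetical index paths, usage could look like this:

    big, small = "/srv/index/big", "/srv/index/small"
    timings = compare_timings({
        "big only": make_xapian_search([big], "report"),
        "small only": make_xapian_search([small], "report"),
        "combined": make_xapian_search([big, small], "report"),
    })

If "combined" is far slower than the sum of the other two, that points at the multi-database path rather than at either index.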