I use Omega to index and search an archive of magazine and ebook PDFs, etc. I also have a wiki (in MediaWiki) that I wanted to include in that index. In case it's of any use to anybody, I've adapted dbi2omega to export the pages from MediaWiki and shared it on GitHub - search for mediawiki2omega. It doesn't do anything very clever, but it might save someone the time of figuring out the MediaWiki database and the scriptindex fields (there are sketches of both at the end of this message). Feel free to correct me if I've not understood the Xapian fields properly!

I'm sure it could be improved - for example: doing something with categories; exporting the Talk:, User:, etc. namespaces; stripping or converting the wiki markup in some way. But what it does now works just great for me. I wasn't looking to replace the (awful) built-in MediaWiki search - I use SphinxSearch for that, which is vastly better. I just wanted a single search point for finding those nuggets I knew I had hidden away somewhere.

As an aside, why are the scriptindex field definitions kept in a separate file? Couldn't they go inline before the data (rather like column names in the header line of a CSV)? When parsing the data stream, a line such as caption:xxx would be read as a field definition, and caption=xxx as text for indexing - not difficult, surely? It would mean a converter such as mediawiki2omega could generate a single stream that could just be piped into scriptindex, without needing a separate script file for the field definitions. There's a mock-up of what I mean at the very end.
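In case it helps anyone picture the scriptindex side, the index script ends up looking roughly like this. The field names here are illustrative rather than exactly what mediawiki2omega emits, but the actions (field=, index=S, boolean=Q, unique=Q, truncate=) are real scriptindex ones:

    url : field=url boolean=Q unique=Q
    title : field=title index=S
    text : index truncate=200 field=sample

Each exported page then becomes one record in the dump file - name=value lines, with a blank line ending the record:

    url=https://wiki.example.org/wiki/Main_Page
    title=Main Page
    text=Welcome to the wiki...

You then run scriptindex <database> <script file> <dump file>, which is exactly the two-file arrangement my aside above grumbles about.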
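And to save anyone spelunking the MediaWiki schema themselves, the heart of the export is just one join from page to the latest revision's text. The real script is Perl (being derived from dbi2omega), but here's a minimal from-memory sketch of the same idea in Python, assuming pymysql and the pre-1.31 schema where revision.rev_text_id still points at the text table (newer MediaWiki versions route this through the slots/content tables instead, so adjust accordingly):

    # Sketch: dump MediaWiki pages in scriptindex's name=value record format.
    # Assumes pymysql and the pre-1.31 MediaWiki schema (page -> revision -> text);
    # hostname, credentials and the wiki URL are placeholders.
    import pymysql

    conn = pymysql.connect(host="localhost", user="wiki",
                           password="secret", database="wikidb")

    QUERY = """
        SELECT p.page_title, t.old_text
          FROM page p
          JOIN revision r ON r.rev_id = p.page_latest
          JOIN text t ON t.old_id = r.rev_text_id
         WHERE p.page_namespace = 0
    """

    def emit(field, value):
        # As I understand scriptindex's dump format, a newline in a value
        # is escaped by starting each continuation line with '='.
        lines = value.split("\n")
        print(f"{field}={lines[0]}")
        for line in lines[1:]:
            print(f"={line}")

    with conn.cursor() as cur:
        cur.execute(QUERY)
        for title, text in cur.fetchall():
            # MediaWiki stores these columns as binary, so decode if needed.
            title = title.decode() if isinstance(title, bytes) else title
            text = text.decode() if isinstance(text, bytes) else text
            emit("url", f"https://wiki.example.org/wiki/{title}")
            emit("title", title.replace("_", " "))
            emit("text", text)
            print()  # blank line terminates the record

It writes to stdout, so you can capture the output to a file and hand that to scriptindex alongside the script file.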
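As for the inline-definitions idea, the self-describing stream I'm imagining would look something like this (a pure mock-up - scriptindex doesn't accept this today):

    title:field=title index=S
    text:index

    title=Some magazine article
    text=Body text to index...

Definition lines use a colon, just as the script file does now, and data lines use an equals sign, so the parser could tell them apart without any extra framing.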