Hi.

I am one of the developers of Swish-e (http://swish-e.org/), an indexing/search tool similar to Xapian. I am currently researching future development directions for the Swish-e project.

Three of our most often-requested features are UTF-8 support, incremental indexing, and large (multimillion) doc sets, all of which seem to be ably handled in the Xapian library. So one possible direction for us might be to use the Xapian db as our backend. Swish-e would be much like Omega in that regard, except that we use libxml2 to parse HTML/XML, etc., along with some other features like defining which characters constitute a "word" and so forth.

My question is whether there exists a C API for Xapian, or more precisely, whether anyone has successfully used Xapian from within a C program. I realize that it is possible to use C++ libraries with C, but at a cursory glance, it doesn't look as if the xapian-core C++ source is defined with 'extern "C"' or any of the other common access points for C programs.

A change to Swish-e to use Xapian would mean a total re-write for Swish-e, so we could do C++, but it would be good to know that upfront.

All thoughts appreciated.

cheers,
--
Peter Karman . http://peknet.com/ . peter at peknet.com

On Thu, Sep 08, 2005 at 08:06:02AM -0500, Peter Karman wrote:
> My question is whether there exists a C API for Xapian, or more precisely,
> whether anyone has successfully used Xapian from within a C program. I
> realize that it is possible to use C++ libraries with C, but at a cursory
> glance, it doesn't look as if the xapian-core C++ source is defined with
> 'extern "C"' or any of the other common access points for C programs.

There isn't a C API. The C++ API is almost entirely class based, so any fully featured C API would inevitably involve passing around opaque handles which are in reality pointers to C++ objects. That always feels clumsy to me - I think it makes more sense to just use C++ instead of trying to write OO code in C.

A restricted C API might be feasible - if you only allowed one WritableDatabase and one Document to exist at a time for example. But I'm not sure it's effort well spent really.

One option would be to write your own thin C++ wrapper around Xapian which exports a simple C API with all the features you actually need for your project. I took roughly this approach for the gmane indexer to allow me to bolt Xapian onto an existing C indexer quickly.

> A change to Swish-e to use Xapian would mean a total re-write for
> Swish-e, so we could do C++, but it would be good to know that
> upfront.
>
> All thoughts appreciated.

Another approach which comes to mind is to use something like python. That rather rules out much hope of code reuse though. At least with C++, most existing C code could be reused with only minor changes, or by compiling it as C and wrapping 'extern "C" {...}' around the header.

Cheers,
Olly
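
For readers following the thread, the "thin C++ wrapper which exports a simple C API" idea might look roughly like the sketch below. It is only an illustration of the opaque-handle pattern, not the gmane indexer's code: every swx_* name is hypothetical, and the backend is an in-memory stand-in so the example compiles without Xapian installed. In a real wrapper the struct would presumably hold the Xapian database object instead.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// --- C-visible interface (hypothetical names; would live in a header) ---
extern "C" {
typedef struct swx_index swx_index;  // opaque handle: C callers never see the fields

swx_index *swx_open(const char *path);
int swx_add_document(swx_index *idx, const char *data,
                     const char *const *terms, std::size_t nterms);
std::size_t swx_doc_count(const swx_index *idx);
void swx_close(swx_index *idx);
}

// --- C++ side: the handle is really a pointer to this object ---
// ASSUMPTION: in-memory stand-in for the real backend, used so the sketch
// is self-contained. A real wrapper would keep the Xapian database here.
struct swx_index {
    std::string path;
    std::vector<std::string> docs;                            // document data by id
    std::map<std::string, std::vector<std::size_t>> postings; // term -> doc ids
};

// C++ exceptions must never propagate into C callers, so each entry point
// catches everything and reports failure through its return value.
extern "C" swx_index *swx_open(const char *path) {
    if (!path) return nullptr;
    try {
        std::unique_ptr<swx_index> idx(new swx_index);
        idx->path = path;
        return idx.release();
    } catch (...) {
        return nullptr;
    }
}

// Returns 0 on success, -1 on bad arguments or internal failure.
extern "C" int swx_add_document(swx_index *idx, const char *data,
                                const char *const *terms, std::size_t nterms) {
    if (!idx || !data || (nterms && !terms)) return -1;
    for (std::size_t i = 0; i < nterms; ++i)
        if (!terms[i]) return -1;  // validate before changing any state
    try {
        const std::size_t docid = idx->docs.size();
        idx->docs.push_back(data);
        for (std::size_t i = 0; i < nterms; ++i)
            idx->postings[terms[i]].push_back(docid);
        return 0;
    } catch (...) {
        return -1;
    }
}

extern "C" std::size_t swx_doc_count(const swx_index *idx) {
    return idx ? idx->docs.size() : 0;
}

extern "C" void swx_close(swx_index *idx) {
    delete idx;  // deleting a null pointer is a no-op
}
```

Existing C code would include only the declarations (normally guarded with #ifdef __cplusplus in a shared header) and call swx_open, swx_add_document and swx_close, while the final program is linked with the C++ compiler so the C++ runtime is available.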