Hi,

I have read the concurrency webpage from the Xapian documentation:

http://getting-started-with-xapian.readthedocs.io/en/latest/concepts/concurrency.html

But it is still not clear to me how to ensure thread-safety when using libxapian (C++ API). Usually when doing multi-threading, many threads can read the same variable concurrently without locking, provided none of the threads modifies the variable.

1) Does this apply to Xapian as well? I.e. can many threads search the database provided each thread uses its own Xapian database object and none of the threads updates/indexes the database?

2) What about when one thread indexes a new document? Do I need to ensure (by locking) that no other thread reads or writes the database until the first thread has finished indexing?

Thanks,
Kim
On 8 Feb 2018, at 15:18, Kim Walisch <kim.walisch at gmail.com> wrote:

> I have read the concurrency webpage from the Xapian documentation:
>
> http://getting-started-with-xapian.readthedocs.io/en/latest/concepts/concurrency.html
>
> But it is still not clear to me how to ensure thread-safety when using
> libxapian (C++ API).
>
> 1) Can many thread search the
> database provided each thread uses their own Xapian database object and
> none of the threads updates/indexes the database?

This is covered explicitly in the document you linked to:

> Xapian doesn’t maintain any global state, so you can safely use Xapian in a multi-threaded program provided you don’t share objects between threads. In practice this restriction is often not a problem - each thread can create its own xapian.Database object, and everything will work fine.

So the answer is 'yes'.

> 2) What about when one thread indexes a new document? Do I need to
> ensure (by locking) that no other thread reads or write the database until
> the first thread has finished indexing?

This isn't a problem either; Xapian takes care of lock management for you. It's covered in a different part of the manual:

https://getting-started-with-xapian.readthedocs.io/en/latest/concepts/indexing/databases.html#concurrent-access

> Currently, all the backends only support a single writer existing at a given time; attempting to open another writer on the same database will throw xapian.DatabaseLockError to indicate that it wasn’t possible to acquire a lock. Multiple concurrent readers are supported (in addition to the writer).

Note the caution there about the number of versions that can exist in Xapian's MVCC, and how to deal with DatabaseModifiedError.

J

--
James Aylett, occasional troublemaker & project governance
xapian.org
Awesome, thanks!
On Thu, Feb 08, 2018 at 04:18:12PM +0100, Kim Walisch wrote:

> But it is still not clear to me how to ensure thread-safety when using
> libxapian (C++ API). Usually when doing multi-threading many threads can
> read the same variable concurrently without locking provided none of the
> threads modifies the variable.

That's true for simple types, but breaks down for classes because they may have mutable members - e.g. for caching values computed lazily:

    class FactorialFactory {
      private:
        mutable int r = -1;  // cached result; negative means "not computed yet"
        mutable int n = 0;   // argument the cached result belongs to

      public:
        FactorialFactory() {}

        int calc(int v) const {
            if (r < 0 || n != v) {
                // Cache miss: recompute and overwrite the mutable members.
                n = v;
                r = n;
                for (int i = n - 1; i > 1; --i) {
                    r *= i;
                }
            }
            return r;
        }
    };

It's not safe to concurrently call f.calc() from different threads, even though conceptually calc() is a read-only method.

Cheers,
Olly
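For comparison, one common way to make such a lazily cached const method safe is to guard the mutable members with a mutex. The sketch below is only an illustration of that general technique. The class name LockedFactorial is hypothetical, and this is not how Xapian itself is implemented (Xapian simply asks you not to share objects between threads).

```cpp
#include <cassert>
#include <mutex>

// Hypothetical thread-safe variant of the cached factorial example.
class LockedFactorial {
    mutable std::mutex m;            // guards both cached members
    mutable int cached_n = -1;       // -1 means nothing is cached yet
    mutable long long cached_r = 1;  // factorial of cached_n

  public:
    long long calc(int v) const {
        assert(v >= 0);  // precondition: factorial of a non-negative value
        std::lock_guard<std::mutex> lock(m);
        if (cached_n != v) {
            // Cache miss: recompute while holding the lock, so no other
            // thread can observe a half-updated cache.
            long long r = 1;
            for (int i = 2; i <= v; ++i) r *= i;
            cached_r = r;
            cached_n = v;
        }
        return cached_r;
    }
};
```

A single LockedFactorial can then be shared between threads, at the cost of serialising every call to calc(). Giving each thread its own object, as recommended for Xapian, avoids that cost.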
Hi,

We have a search service based on Xapian. When receiving a request, we create several threads, each of which builds a Xapian query and searches the corresponding database, and then we merge all the results together.

In some cases we use Xapian::Query::MatchAll to create the query, but it always ends in a segmentation fault (core dumped). It looks like some pointers are freed twice.

This is the function that we use to create the Xapian query:

    std::shared_ptr<Xapian::Query> constructQuery() {
        Xapian::Query black_list("BLt1");
        return std::make_shared<Xapian::Query>(Xapian::Query::OP_AND_NOT, Xapian::Query::MatchAll, black_list);
    }

I have written a demo that reproduces the problem; see this link for details:

https://github.com/xiangqianzsh/xapian_leaning/tree/master/matchall_coredump

How can we solve this problem?
On Mon, Apr 09, 2018 at 12:49:11AM +0800, 张少华 wrote:

> In some case, we use Xapian::Query::MatchAll to create the query, but
> it always has Segmentation fault (core dumped). It looks like some
> pointers are double freed.

Xapian::Query::MatchAll is a static object, but that means that it's a Xapian object that can get used from multiple threads concurrently which we don't support. The result is that the reference count can get updated by more than one thread at the same time and get out of step.

I don't currently see a nice way to fix this while providing the same API, but there is at least a simple workaround - just use Xapian::Query(std::string()) instead of Xapian::Query::MatchAll in multi-threaded code.

Xapian::Query::MatchNothing is also a static object, but in that case the internal object is NULL so there isn't a reference count and it should actually be safe in multi-threaded code.

> This is the function that we use to create xapian-query.
>
> std::shared_ptr<Xapian::Query> constructQuery() {
>
> Xapian::Query black_list("BLt1");
>
> return std::make_shared<Xapian::Query>(Xapian::Query::OP_AND_NOT, Xapian::Query::MatchAll, black_list);
>
> }

This isn't connected with the problem here, but there's no need to use std::shared_ptr<> here if you're just wanting to reference count the object as Xapian::Query is itself a reference counted pointer to an internal implementation object.

Cheers,
Olly
Hi,

Many functions in Xapian can be traced when they are called, using LOGCALL_CTOR/LOGCALL_VOID/..., and I want to output the debug information.

So I first installed Xapian with logging enabled, using the following commands:

```
cd xapian-core-1.4.5  # xapian source code downloaded
./configure --enable-log=yes --enable-assertions=yes
make
sudo make install
```

But there isn't any debug information when I run xapian-core-1.4.5/examples/simplesearch.cc

Also, I noticed that -DXAPIAN_REALLY_NO_DEBUG_LOG is added in xapian-core-1.4.5/Makefile, so I deleted this option, but then I could not compile successfully.

How can I output the debug information?
James Aylett
2018-Jun-09 09:54 UTC
output debug information of LOGCALL_CTOR/LOGCALL_VOID/...
On 9 Jun 2018, at 07:37, 张少华 <xiangqianzsh at 163.com> wrote:

> Many functions in xapian can be traced when they are called using LOGCALL_CTOR/LOGCALL_VOID/... and I want to output the debug information.

In xapian-core/HACKING in the source code there's a section on this. As well as passing --enable-log to configure, you also need to:

 * set XAPIAN_DEBUG_LOG to be the path to a file that you would like debugging output to be appended to, or to the special value ``-`` to indicate that you would like debugging output to be sent to stderr. Unless XAPIAN_DEBUG_LOG is set, no debug logging will be performed. Occurrences of %p in XAPIAN_DEBUG_LOG will be replaced with the current process-id.

> Also, I notice -DXAPIAN_REALLY_NO_DEBUG_LOG is added in xapian-core-1.4.5/Makefile, so I deleted this options, but I can not compile successfully then.

That symbol is to stop debug calls being available when compiling things that use Xapian. It's used in common/debuglog.h, which says:

// In places where we include library code in non-library contexts, we can't
// have debug logging enabled, as the support functions aren't visible

which sounds like exactly what you've encountered.

J

--
James Aylett
devfort.com — spacelog.org — tartarus.org/james/