Hi everybody,

I'm the author of a small C++11 program called XDGSearch. The source code is hosted on GitHub; for a quick overview you can visit this link:
https://github.com/frank67/XDGSearch/blob/master/README.md

I'm writing to the mailing list because I'd like to split the database build process across multiple threads. Is that possible? If you are a C++ programmer, you can take a look at the populateDB() function at this link:
https://github.com/frank67/XDGSearch/blob/597380bfd7de94857cef08c95e5e31392807a7cb/indexer.cpp#L85
which is the heart of the database creation process.

I'd also be happy to find somebody who wants to get involved in XDGSearch development :)

Thanks in advance for any answer, best regards

--
Franco Martelli
Franco Martelli writes:
> Hi everybody,
> I'm the author of a small C++11 program called XDGSearch. The source
> code is hosted on Github, for a quick overview you can visit this link
> https://github.com/frank67/XDGSearch/blob/master/README.md
> I'm writing to the mailing list because I'd like to make the database
> build process splitted in more thread. Is it possible? If you are a C++
> programmer you can take a look at the populateDB() function at this link
> https://github.com/frank67/XDGSearch/blob/597380bfd7de94857cef08c95e5e31392807a7cb/indexer.cpp#L85
> which is the heart of the databases creation process.
> I'd be also happy if I find somebody that want to be involved in the
> XDGSearch development :)
> Thanks in advance for any answer, best regards
>
> --
> Franco Martelli

Hi,

You may be interested in how Recoll does it:

https://www.lesbonscomptes.com/recoll/idxthreads/threadingRecoll.html

A few things in the document are slightly obsolete (especially the last paragraph: recollindex now does use vfork()), but it's overall quite close to how the current indexer works.

jfd
On 14/09/2018 at 09:30, Jean-Francois Dockes wrote:
> Hi,
>
> You may be interested by how Recoll does it:
>
> https://www.lesbonscomptes.com/recoll/idxthreads/threadingRecoll.html
>
> A few things in the document are slightly obsolete (esp. the last
> paragraph: recollindex now does use vfork()), but it's overall quite close
> to how the current indexer works.
>
> jfd
>

Thanks for your answer. Briefly, the answer is no:

> The Xapian library index updating code is not designed for multi-threading and must stay protected from multiple accesses.

Just for evaluation purposes, could you point me to the code where Recoll parallelizes "Data extraction and Conversion" and "Term generation"?

Thanks in advance, best regards

--
Franco Martelli
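
For readers following the thread, the constraint quoted above (index updates must stay protected from concurrent access, while extraction and term generation can be parallelized) can be illustrated with a minimal C++11 sketch. This is not Recoll's or XDGSearch's actual code, and it uses a plain mutex rather than whatever staging Recoll uses. FakeIndex, extractTerms() and indexAll() are hypothetical names invented for the illustration; FakeIndex merely stands in for a writer that is not thread-safe, and no Xapian API is called.

```cpp
#include <atomic>
#include <cstddef>
#include <map>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for an index writer that is NOT thread-safe.
// A real program would wrap its database handle here instead.
struct FakeIndex {
    std::map<std::string, std::vector<std::string>> docs;
    void addDocument(const std::string& id,
                     const std::vector<std::string>& terms) {
        docs[id] = terms;
    }
};

// Hypothetical CPU-bound step: split a document's text into
// whitespace-separated terms. It touches no shared state, so it is
// safe to run in several threads at once.
std::vector<std::string> extractTerms(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> terms;
    std::string word;
    while (in >> word) terms.push_back(word);
    return terms;
}

// Index every (id, text) pair using nThreads workers.
// Term extraction runs in parallel; index updates are serialized.
void indexAll(const std::vector<std::pair<std::string, std::string>>& inputs,
              FakeIndex& index, unsigned nThreads) {
    if (nThreads == 0) nThreads = 1;
    std::atomic<std::size_t> next(0);  // next input position to claim
    std::mutex indexMutex;             // guards every call into the index

    auto worker = [&]() {
        for (;;) {
            const std::size_t i = next.fetch_add(1);
            if (i >= inputs.size()) return;  // nothing left to claim

            // Parallel part: no lock held during extraction.
            std::vector<std::string> terms = extractTerms(inputs[i].second);

            // Serialized part: only one thread updates the index at a time.
            std::lock_guard<std::mutex> lock(indexMutex);
            index.addDocument(inputs[i].first, terms);
        }
    };

    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nThreads; ++t) pool.emplace_back(worker);
    for (std::thread& th : pool) th.join();
}
```

Calling indexAll(inputs, index, 4) on a vector of (id, text) pairs fills index.docs with one entry per input. Whether this helps in practice depends on how much time goes into extraction compared with the serialized index update, which this sketch does not measure.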