Vishesh Handa
2014-Mar-25 14:33 UTC
[Xapian-discuss] High memory usage when replacing a document
Hey guys,

I've been trying to debug some really high memory usage that we have been experiencing when replacing a document. The document in question was produced by passing a 25+ MB text file through the term generator.

    $ delve . -r 11021 -1 | wc -l
    1019413

Piping this output to a text file gives a file of about 19 MB.

When indexing this document, the memory used by the Xapian db skyrockets to about 400 MB (RAM, not disk space). I've run it through massif (file attached; I would recommend opening it in massif-visualizer).

The main offenders seem to be the following:

1. 50 MB - std::strings in ChertTermList
2. 77 MB - Document::add_posting seems to have some internal std::map. I'm guessing this is its internal list of terms, though 77 MB seems like a LOT.
3. 46 MB - ChertWritableDatabase::add_freq_delta
4. 46 MB - ChertWritableDatabase::update_mod_plist

(1) seems to be reading the terms from the database and keeping them in memory. (3) and (4) are probably related to when we create the WritableDatabase and replace the document.

Are there any tips, or variables I can configure, to trim this memory usage down?

-- 
Vishesh Handa
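
As a rough sanity check on the figures above, the heap numbers can be divided by the term count to get a per-term cost. The sketch below only redoes that arithmetic. It assumes "MB" means 10^6 bytes and that the `wc -l` result is approximately the number of distinct terms in the document (delve may print a header line, so the count could be off by one or so). The names `per_term` and `summarize` are made up for this illustration and are not part of Xapian.

```python
# Figures quoted in the message; MB is ASSUMED to mean 10**6 bytes.
MB = 10**6
TERM_LINES = 1_019_413   # `delve . -r 11021 -1 | wc -l`, treated as the term count
DUMP_BYTES = 19 * MB     # size of the delve output saved to a text file
TOTAL_BYTES = 400 * MB   # approximate peak memory reported in the message

# The four heap consumers listed in the message.
HEAP_BYTES = {
    "std::strings in ChertTermList": 50 * MB,
    "Document::add_posting std::map": 77 * MB,
    "ChertWritableDatabase::add_freq_delta": 46 * MB,
    "ChertWritableDatabase::update_mod_plist": 46 * MB,
}


def per_term(nbytes, terms=TERM_LINES):
    """Average number of bytes attributed to each term."""
    return nbytes / terms


def summarize():
    """Return per-term costs and the share of the peak the four items explain."""
    costs = {name: per_term(size) for name, size in HEAP_BYTES.items()}
    listed = sum(HEAP_BYTES.values())
    return {
        "avg_dump_line_bytes": per_term(DUMP_BYTES),
        "per_term_bytes": costs,
        "listed_bytes": listed,
        "listed_share_of_peak": listed / TOTAL_BYTES,
    }


if __name__ == "__main__":
    report = summarize()
    print(f"average delve line: {report['avg_dump_line_bytes']:.1f} bytes")
    for name, cost in report["per_term_bytes"].items():
        print(f"{name}: {cost:.1f} bytes per term")
    print(f"listed items cover {report['listed_share_of_peak']:.0%} of the peak")
```

Under these assumptions, each of the four items costs somewhere between roughly 45 and 76 bytes per term, against an average delve line of under 20 bytes. So the totals look large mainly because the document has about a million distinct terms, and the four listed items together account for only a little over half of the 400 MB peak.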