Hi,

I'm testing an Irix instance of 0.8.1. I have a number of documents
indexed, here is a directory listing:

/usr/local/lib/oasis/closed22:
total 2700
-rw-------    1 oasis    user          10 Jul  1 12:33 meta
-rw-------    1 oasis    user     1245184 Sep  1 07:52 position_DB
-rw-------    1 oasis    user          30 Sep  1 07:52 position_baseA
-rw-------    1 oasis    user          29 Sep  1 07:51 position_baseB
-rw-------    1 oasis    user      950272 Sep  1 07:52 postlist_DB
-rw-------    1 oasis    user          29 Sep  1 07:52 postlist_baseA
-rw-------    1 oasis    user          19 Sep  1 07:51 postlist_baseB
-rw-------    1 oasis    user       81920 Sep  1 07:52 record_DB
-rw-------    1 oasis    user          16 Sep  1 07:52 record_baseA
-rw-------    1 oasis    user          14 Sep  1 07:51 record_baseB
-rw-------    1 oasis    user      409600 Sep  1 07:52 termlist_DB
-rw-------    1 oasis    user          20 Sep  1 07:52 termlist_baseA
-rw-------    1 oasis    user          18 Sep  1 07:51 termlist_baseB
-rw-------    1 oasis    user       32768 Sep  1 07:52 value_DB
-rw-------    1 oasis    user          15 Sep  1 07:52 value_baseA
-rw-------    1 oasis    user          14 Sep  1 07:51 value_baseB

I added 145 documents with scriptindex, as shown:

Processing 22219.dat Wed Sep 1 07:54:58 2004
records (added, replaced, deleted) = (145, 0, 0)
Finished 22219.dat Wed Sep 1 07:55:46 2004

The documents were contained in a single file; a wc is shown:

wc closed22/22220.dat
18459 95858 720455 closed22/22220.dat

Does this sound like it is performing OK? It seems a little slow to me,
but I'd like another opinion.

Thanks,
Jim.

On Wed, Sep 01, 2004 at 11:01:09AM -0400, Jim Lynch wrote:
> I'm testing an Irix instance of 0.8.1. I have a number of documents
> indexed, here is a directory listing:
>
> /usr/local/lib/oasis/closed22:
> [...]

How many documents are already indexed? Running
"delve /usr/local/lib/oasis/closed22" will tell you.

> Does this sound like it is performing OK? It seems a little slow to me,
> but I'd like another opinion.

145 documents in just under a minute does sound slow, but I don't know
what the hardware is, or whether the machine is doing other tasks.

Also, is the index script using "unique"? There will be some overhead in
that. If you're building from scratch and know the source data has no
duplicates, you could use a variant of the index script without the
unique action and see if that helps.

Incidentally, 0.8.2 will be significantly faster in general, even
without increasing the number of documents per flush.

Cheers,
Olly
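
To put a rough number on "just under a minute", here is a small Python sketch that turns the two scriptindex timestamps from the original message into a throughput figure. The function name `throughput` is made up for illustration. It also assumes that the 720455 bytes reported by wc describe the same batch as the timed run, which the message does not confirm (the log names 22219.dat, the wc output names 22220.dat).

```python
from datetime import datetime

# Timestamp layout as printed in the scriptindex output quoted above.
FMT = "%a %b %d %H:%M:%S %Y"


def throughput(start, end, docs, size_bytes):
    """Return (elapsed seconds, documents/second, bytes/second)."""
    elapsed = (datetime.strptime(end, FMT)
               - datetime.strptime(start, FMT)).total_seconds()
    return elapsed, docs / elapsed, size_bytes / elapsed


# Figures copied from the original message. Pairing the byte count with
# this run is an assumption, since the two file names differ.
elapsed, docs_per_sec, bytes_per_sec = throughput(
    "Wed Sep 1 07:54:58 2004",
    "Wed Sep 1 07:55:46 2004",
    docs=145,
    size_bytes=720455,
)

print(f"{elapsed:.0f} s elapsed, {docs_per_sec:.2f} docs/s, "
      f"{bytes_per_sec / 1024:.1f} KiB/s")
```

Running the same calculation on a second run with "unique" removed from the index script would give a like-for-like comparison of the overhead.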