Kevin Duraj
2011-May-13 16:55 UTC
[Xapian-discuss] Xapian Index 253 million documents = 704G
I just built my largest single Xapian index, with 253 million unique documents, on a single server using a single hard disk, less than 8G of RAM, and a single 2.0 GHz processor. I do not see any decrease in search performance between my indexes of 100 million and 250 million documents, which indicates that Xapian scales well, and it looks like I can easily push a single index toward 300 million documents.

You can check it yourself at: http://myhealthcare.com/

number of documents = 253717716
average document length = 35670.3
document length lower bound = 1
document length upper bound = 181656
highest document id ever used = 253717716

total 704G
-rw-r--r-- 1 kevin kevin   28 2011-05-13 08:30 iamchert
-rw-r--r-- 1 kevin kevin   14 2011-05-13 03:28 position.baseA
-rw-r--r-- 1 kevin kevin 718K 2011-05-13 08:30 position.baseB
-rw-r--r-- 1 kevin kevin 359G 2011-05-13 08:30 position.DB
-rw-r--r-- 1 kevin kevin   14 2011-05-12 17:22 postlist.baseA
-rw-r--r-- 1 kevin kevin 167K 2011-05-13 02:26 postlist.baseB
-rw-r--r-- 1 kevin kevin  84G 2011-05-13 02:26 postlist.DB
-rw-r--r-- 1 kevin kevin   14 2011-05-13 02:26 record.baseA
-rw-r--r-- 1 kevin kevin 301K 2011-05-13 03:02 record.baseB
-rw-r--r-- 1 kevin kevin 151G 2011-05-13 03:02 record.DB
-rw-r--r-- 1 kevin kevin   14 2011-05-13 03:02 termlist.baseA
-rw-r--r-- 1 kevin kevin 224K 2011-05-13 03:28 termlist.baseB
-rw-r--r-- 1 kevin kevin 112G 2011-05-13 03:28 termlist.DB

Thanks,
Kevin Duraj
http://myhealthcare.com
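
For readers who want to sanity-check the figures above, here is a rough sketch of the arithmetic in Python. It is not part of the original measurements. It assumes the G suffix printed by ls means GiB (2**30 bytes) and that each size was rounded by ls, so the results are only approximate. The variable names are made up for illustration.

```python
# Sizes of the four large chert tables, copied from the listing above.
# Assumption: "G" means GiB and each value was rounded by ls.
table_sizes_gib = {
    "position": 359,
    "postlist": 84,
    "record": 151,
    "termlist": 112,
}
doc_count = 253_717_716      # "number of documents" from the stats above
reported_total_gib = 704     # "total 704G" from the listing

# Sum of the per-table sizes; differs slightly from the reported total
# because each figure was rounded independently.
summed_gib = sum(table_sizes_gib.values())

# Approximate on-disk footprint per document, in bytes.
bytes_per_doc = reported_total_gib * 2**30 / doc_count

# Fraction of the summed size taken by each table.
share = {name: size / summed_gib for name, size in table_sizes_gib.items()}

print(f"sum of tables: {summed_gib} GiB (ls total: {reported_total_gib} GiB)")
print(f"approx. bytes on disk per document: {bytes_per_doc:.0f}")
for name, frac in sorted(share.items(), key=lambda kv: -kv[1]):
    print(f"{name:9s} {frac:6.1%}")
```

Under these assumptions the index costs roughly 3 KB of disk per document, and the position table accounts for about half of the total.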