I have a program which traverses directories and builds a list of files
in those directories. The hierarchy being traversed consists of
approximately 750,000 files scattered among approximately 4100
directories. I ran it under three configurations:

1. A regular Linux system with a 15k RPM SAS drive.
2. Directly on the RAID under my gluster installation.
3. GlusterFS 3.1.0 / RDMA / Distributed / native FUSE clients.

I did the experiment twice in a row on each configuration. Results:

1. 90 seconds, then 7 seconds.
2. 74 seconds, then 4 seconds.
3. 4678 seconds, then 4648 seconds.

Any suggestions about why my gluster installation is so much slower
than the regular file systems, and how I can speed things up?

Here is pseudo-code for the traversal:

    push the root onto a stack
    while stack not empty
        curdir = pop the stack
        opendir(curdir)
        foreach diritem in curdir (ignoring . and ..)
            stat diritem
            if diritem is a directory, push diritem onto the stack
            else put diritem and its size into output
        closedir(curdir)

Thanks!

.. Lana (lana.deere at gmail.com)
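For reference, the pseudo-code above can be written as a small self-contained Python sketch. (This is an illustration of the traversal, not my actual program. It uses os.scandir, which exposes the type/size information the kernel already returned with the directory entry, so it can often avoid issuing a separate stat() per file; on a network filesystem each extra stat() is a round trip, which is likely a big part of why this workload is so slow over FUSE.)

```python
import os

def list_files(root):
    """Return (path, size) pairs for every non-directory entry under root,
    using an explicit stack instead of recursion, as in the pseudo-code."""
    results = []
    stack = [root]
    while stack:
        curdir = stack.pop()
        # os.scandir does the opendir/readdir loop; '.' and '..' are
        # skipped automatically, and closedir happens on exiting the block.
        with os.scandir(curdir) as it:
            for entry in it:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                else:
                    # entry.stat() uses cached data where the OS provides it,
                    # rather than a fresh stat() call for every file.
                    size = entry.stat(follow_symlinks=False).st_size
                    results.append((entry.path, size))
    return results
```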
For gluster 3.1.0 on a large hierarchy I reported that it took these
times to build a list of files:

> 3. 4678 seconds then 4648 seconds.

For gluster 3.1.1 on the same hierarchy, having done "gluster volume
reset" after the upgrade, I got 640 seconds, then 90 seconds. I'm
guessing this improvement is due to the stat-prefetch translator, but
whether or not my guess is correct, this is a nice speedup.

.. Lana (lana.deere at gmail.com)

On Fri, Nov 5, 2010 at 3:03 PM, Lana Deere <lana.deere at gmail.com> wrote:
> I have a program which traverses directories and builds a list of
> files in those directories. The hierarchy being traversed consists
> of approximately 750,000 files scattered among approximately 4100
> directories. I ran it under three configurations:
[...]
> 3. GlusterFS 3.1.0 / RDMA / Distributed / native fuse clients.
[...]
> I did the experiment twice in a row on each configuration. Results:
> 3. 4678 seconds then 4648 seconds.