kwestneat@datadirectnet.com
2007-Jun-08 14:36 UTC
[Lustre-devel] [Bug 12092] requests timeout under load
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link:
https://bugzilla.lustre.org/show_bug.cgi?id=12092

Hello,

I'm running into a similar problem I think. I'm running a test where I untar the kernel source on 9 clients, each client untarring into its own directory on the filesystem. The problem I'm running into is that after a while there is so much congestion that requests start timing out and stuff starts to fail.

Here's what it looks like on the management node:

[root@dcm tmp]# for x in `seq 1 5`; do echo run $x; /usr/bin/time pdsh -w xfer[08-16] '/home/kwestnea/md.test2'; done 2>&1 | tee /tmp/md.test2.run16
run 1
xfer11: tar: linux-2.6.21.3/sound/core/seq/oss/seq_oss_writeq.c: Cannot change ownership to uid 0, gid 0: Input/output error
xfer12: tar: linux-2.6.21.3/net/ipv6/addrconf_core.c: Cannot change ownership to uid 0, gid 0: Input/output error
xfer15: tar: linux-2.6.21.3/net/llc/llc_conn.c: Cannot change ownership to uid 0, gid 0: Input/output error
xfer13: tar: linux-2.6.21.3/scripts/kconfig/mconf.c: Cannot change ownership to uid 0, gid 0: Input/output error
xfer11: tar: Error exit delayed from previous errors
xfer13: tar: Error exit delayed from previous errors
xfer15: tar: Error exit delayed from previous errors
xfer12: tar: Error exit delayed from previous errors
0.04user 0.01system 27:53.78elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6819minor)pagefaults 0swaps