J Alejandro Medina
2011-Sep-12 01:09 UTC
[Lustre-discuss] Problems with multiple lustre filesystems
Hi to all,

Our organization recently configured two Lustre filesystems on a Linux cluster. Both filesystems are connected to the same 10GbE VLAN. We have tested both with IOzone and other benchmarking software without errors.

When copying data from one filesystem to the other, we see excessive broadcast traffic. The network slows to a crawl until both filesystems stop responding.

If we test the filesystems separately, we do not see this behavior.

Any ideas?

--
J. Alejandro Medina
THIELL Stephane
2011-Sep-13 16:20 UTC
[Lustre-discuss] Problems with multiple lustre filesystems
J Alejandro Medina wrote:
> When copying data from one filesystem to the other we experience
> excessive broadcast messages. The network crawls down to its knees
> until both filesystems stop responding.
>
> If we test both filesystems separately we do not see this behavior.

One idea is to reduce your Lustre client max_cached_mb value (a per-filesystem value). By default, it is set to 2/3 of available system memory, so it is not optimal when mounting multiple Lustre filesystems on the same node, especially when copying data from one to the other.

See /proc/fs/lustre/llite/*/max_cached_mb

HTH,
Stephane Thiell
CEA
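
As an illustration of the suggestion above, here is a minimal Python sketch that reads the current max_cached_mb value for each mounted Lustre client and computes an even split of the 2/3-of-memory default across several filesystems. The function names, the even-split policy, and the assumption that each file holds a single value are illustrative and not part of Lustre; the file format differs between Lustre versions, writing requires root, and whether the setting takes effect depends on the Lustre version in use.

```python
import glob
import os

# Path pattern taken from the message above; everything else is hypothetical.
DEFAULT_PROC_ROOT = "/proc/fs/lustre/llite"


def read_max_cached_mb(proc_root=DEFAULT_PROC_ROOT):
    """Return {client_instance: raw file text} for each mounted filesystem."""
    values = {}
    pattern = os.path.join(proc_root, "*", "max_cached_mb")
    for path in sorted(glob.glob(pattern)):
        instance = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            # Returned as text: the exact format varies between versions.
            values[instance] = f.read().strip()
    return values


def per_filesystem_budget_mb(total_ram_mb, n_filesystems):
    """Split the 2/3-of-RAM default evenly across the mounted filesystems.

    The even split is an assumption for illustration, not a Lustre rule.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    if n_filesystems < 1:
        raise ValueError("n_filesystems must be at least 1")
    return (total_ram_mb * 2 // 3) // n_filesystems


def write_max_cached_mb(instance, value_mb, proc_root=DEFAULT_PROC_ROOT):
    """Write a new limit for one client instance (needs root on a real node)."""
    path = os.path.join(proc_root, instance, "max_cached_mb")
    with open(path, "w") as f:
        f.write(str(int(value_mb)))
```

For example, on a node with 48000 MB of memory and two Lustre mounts, per_filesystem_budget_mb(48000, 2) yields 16000 MB per filesystem, so the two caches together stay within the single-filesystem default.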
gregoire.pichon at bull.net
2011-Sep-14 07:10 UTC
[Lustre-discuss] Problems with multiple lustre filesystems
> From: THIELL Stephane <stephane.thiell at cea.fr>
>
> J Alejandro Medina wrote:
> > When copying data from one filesystem to the other we experience
> > excessive broadcast messages. The network crawls down to its knees
> > until both filesystems stop responding.
> >
> > If we test both filesystems separately we do not see this behavior.
> An idea could be to reduce your lustre client max_cached_mb value
> (per-filesystem value). By default, it is set to 2/3 of available system
> memory so it's not optimal when mounting multiple lustre filesystems on
> the same node, especially when copying data from one to the other.
>
> see /proc/fs/lustre/llite/*/max_cached_mb

Looking at the code (Lustre 2.0), it appears that the max_cached_mb tunable has no effect. I have found LU-141, "port lustre client page cache shrinker back to clio", which tracks the problem.

--
Grégoire PICHON
Bull