Hi List,

on our MDS we noticed that all memory seems to be used. (And it's not
just normal buffers/cache as far as I can tell.)

When we put load on the machine, for example by starting rsync on a few
clients, generating file lists to copy data from Lustre to local disks,
or just by running an MDT backup locally using dd/gzip to copy an LVM
snapshot to a remote server, kswapd starts using a lot of CPU time,
sometimes up to 100% of one CPU core.

This is on a Lustre 1.6.7.2.ddn3.5 based file system with about 200TB;
the MDT is 800GB with 200M inodes, ACLs enabled.

The memory seems to be mostly used by the kernel, and according to
slabtop quite a lot of it is in ldlm_locks and ldlm_resources. Some
details are below, but the main question we now have is whether or not
this is normal and expected. Is there a tunable to restrict Lustre to a
bit less slab memory than it currently uses? Will adding more memory to
this machine solve the problem that there is not enough memory left to
run normal processes, or will it just delay the occurrences of this?

Kind regards,
Frederik

Memory details:
<snip>
[root@cs04r-sc-mds01-01 proc]# free
             total       used       free     shared    buffers     cached
Mem:      16497436   16146416     351020          0     257624      17836
-/+ buffers/cache:   15870956     626480
Swap:      2031608     322768    1708840

[root@cs04r-sc-mds01-01 proc]# cat /proc/meminfo
MemTotal:     16497436 kB
MemFree:        352004 kB
Buffers:        256084 kB
Cached:          17688 kB
SwapCached:     149544 kB
Active:         200764 kB
Inactive:       255344 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:     16497436 kB
LowFree:        352004 kB
SwapTotal:     2031608 kB
SwapFree:      1708840 kB
Dirty:             268 kB
Writeback:           0 kB
AnonPages:      182272 kB
Mapped:          17528 kB
Slab:         15248816 kB
PageTables:       6984 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  10280324 kB
Committed_AS:  1321284 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    330740 kB
VmallocChunk: 34359394255 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

[root@cs04r-sc-mds01-01 proc]# slabtop --once | head -15
 Active / Total Objects (% used)    : 30350433 / 38705406 (78.4%)
 Active / Total Slabs (% used)      : 3801362 / 3801369 (100.0%)
 Active / Total Caches (% used)     : 114 / 168 (67.9%)
 Active / Total Size (% used)       : 12325021.07K / 14610074.85K (84.4%)
 Minimum / Average / Maximum Object : 0.02K / 0.38K / 128.00K

    OBJS   ACTIVE  USE OBJ SIZE   SLABS OBJ/SLAB CACHE SIZE NAME
15657800 14362022  91%    0.50K 1957225        8   7828900K ldlm_locks
10165900  9719990  95%    0.38K 1016590       10   4066360K ldlm_resources
 3650979  1038530  28%    0.06K   61881       59    247524K size-64
 3646620  3159662  86%    0.12K  121554       30    486216K size-128
 3099906   863841  27%    0.21K  172217       18    688868K dentry_cache
 1679436   859267  51%    0.83K  419859        4   1679436K ldiskfs_inode_cache
  460725   133164  28%    0.25K   30715       15    122860K size-256
  122440    65022  53%    0.09K    3061       40     12244K buffer_head

--
Frederik Ferner
Computer Systems Administrator        phone: +44 1235 77 8624
Diamond Light Source Ltd.             mob:   +44 7917 08 5110
(Apologies in advance for the lines below. Some bits are a legal
requirement and I have no control over them.)
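A minimal sketch of how the DLM lock cache is usually bounded on 1.6-era
systems, assuming lctl set_param with wildcards is available in this build
(otherwise the same values can be echoed into
/proc/fs/lustre/ldlm/namespaces/*/lru_size); the value 400 is purely
illustrative, not a recommendation:

    # On each client: a fixed (non-zero) lru_size caps how many locks that
    # client keeps cached, which in turn bounds the ldlm_locks and
    # ldlm_resources slabs the MDS has to hold on its behalf.
    lctl set_param ldlm.namespaces.*.lru_size=400

    # On the MDS: watch which slab caches grow while the rsync/backup load runs.
    watch -n 5 'slabtop --once | head -15'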
Does anyone have information on the next Lustre training event?

~Lawrence
On Tuesday, August 24, 2010, Frederik Ferner wrote:
> Hi List,
>
> on our MDS we noticed that all memory seems to be used. (And it's not
> just normal buffers/cache as far as I can tell.)
>
> When we put load on the machine, for example by starting rsync
> on a few clients, generating file lists to copy data from Lustre to
> local disks, or just by running an MDT backup locally using dd/gzip to
> copy an LVM snapshot to a remote server, kswapd starts using a lot of
> CPU time, sometimes up to 100% of one CPU core.
>
> This is on a Lustre 1.6.7.2.ddn3.5 based file system with about 200TB;
> the MDT is 800GB with 200M inodes, ACLs enabled.

Did you recompile it, or did you use the binaries from my home page (or
those you got from CV)?

This could be an LRU auto-resize problem; auto-resize is disabled in the
DDN builds, but as our 1.6 releases didn't include a patch for that, you
would need to specify the correct configure options if you recompiled it
yourself.

Another reason might be bug 22771, although that should only come up on
an MDS with more memory than you have.

Cheers,
Bernd

--
Bernd Schubert
DataDirect Networks
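A rough sketch of the kind of rebuild being referred to, with LRU
auto-resize compiled out; the kernel source path is a placeholder for
whatever patched server kernel is actually in use, and the exact option
set should be taken from the DDN build rather than from this example:

    # Hypothetical server rebuild; --disable-lru-resize turns off dynamic
    # LRU sizing at compile time, matching the DDN binary builds.
    cd lustre-1.6.7.2
    ./configure --with-linux=/usr/src/kernels/<patched-server-kernel> \
                --disable-lru-resize
    make rpms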
Hi Lawrence,

We will hold this training in Jakarta, Indonesia on 13 - 17 September 2010.

Regards,
Andry
http://www.optimacomputer.com

-----Original Message-----
From: Lawrence Sorrillo <sorrillo at jlab.org>
Sender: lustre-discuss-bounces at lists.lustre.org
Date: Tue, 24 Aug 2010 14:42:27
To: <lustre-discuss at lists.lustre.org>
Subject: [Lustre-discuss] Seeking Lustre Training
Hi Bernd,

thanks for your reply.

Bernd Schubert wrote:
> On Tuesday, August 24, 2010, Frederik Ferner wrote:
>> on our MDS we noticed that all memory seems to be used. (And it's not
>> just normal buffers/cache as far as I can tell.)
>>
>> When we put load on the machine, for example by starting rsync
>> on a few clients, generating file lists to copy data from Lustre to
>> local disks, or just by running an MDT backup locally using dd/gzip to
>> copy an LVM snapshot to a remote server, kswapd starts using a lot of
>> CPU time, sometimes up to 100% of one CPU core.
>>
>> This is on a Lustre 1.6.7.2.ddn3.5 based file system with about 200TB;
>> the MDT is 800GB with 200M inodes, ACLs enabled.
>
> Did you recompile it, or did you use the binaries from my home page (or
> those you got from CV)?

This is a recompiled Lustre version to include the patch from bug 22820.

> This could be an LRU auto-resize problem; auto-resize is disabled in the
> DDN builds, but as our 1.6 releases didn't include a patch for that, you
> would need to specify the correct configure options if you recompiled it
> yourself.

I guess it's likely that I have not specified the correct option. So the
binaries on your home page are compiled with '--disable-lru-resize'?
Any other options that you used?

> Another reason might be bug 22771, although that should only come up on
> an MDS with more memory than you have.

I had a look at that bug; we have a default stripe count of 1, so the
striping data should fit into the inode. On the other hand we use ACLs
in quite a few places, so it seems we might hit this bug if we increase
the memory from the current 16GB, correct?

Cheers,
Frederik

--
Frederik Ferner
Computer Systems Administrator        phone: +44 1235 77 8624
Diamond Light Source Ltd.             mob:   +44 7917 08 5110
(Apologies in advance for the lines below. Some bits are a legal
requirement and I have no control over them.)
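For what it's worth, a quick way to see whether a client is currently
running with a fixed or a dynamic lock LRU, assuming the usual 1.6-era
convention that an lru_size of 0 means auto-resize is active:

    # Run on a client; 0 = dynamic (auto-resize), non-zero = fixed cap.
    lctl get_param ldlm.namespaces.*.lru_size | head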
Hello Frederik,

On Wednesday, August 25, 2010, Frederik Ferner wrote:
> Hi Bernd,
>
> thanks for your reply.
>
> Bernd Schubert wrote:
> > On Tuesday, August 24, 2010, Frederik Ferner wrote:
> >> on our MDS we noticed that all memory seems to be used. (And it's not
> >> just normal buffers/cache as far as I can tell.)
> >>
> >> When we put load on the machine, for example by starting rsync
> >> on a few clients, generating file lists to copy data from Lustre to
> >> local disks, or just by running an MDT backup locally using dd/gzip to
> >> copy an LVM snapshot to a remote server, kswapd starts using a lot of
> >> CPU time, sometimes up to 100% of one CPU core.
> >>
> >> This is on a Lustre 1.6.7.2.ddn3.5 based file system with about 200TB;
> >> the MDT is 800GB with 200M inodes, ACLs enabled.
> >
> > Did you recompile it, or did you use the binaries from my home page (or
> > those you got from CV)?
>
> This is a recompiled Lustre version to include the patch from bug 22820.
>
> > This could be an LRU auto-resize problem; auto-resize is disabled in the
> > DDN builds, but as our 1.6 releases didn't include a patch for that, you
> > would need to specify the correct configure options if you recompiled it
> > yourself.
>
> I guess it's likely that I have not specified the correct option. So the
> binaries on your home page are compiled with '--disable-lru-resize'?
> Any other options that you used?

I always enable health-write, which helps pacemaker to detect IO errors
(by monitoring /proc/fs/lustre/health_check):

    --enable-health-write

> > Another reason might be bug 22771, although that should only come up on
> > an MDS with more memory than you have.
>
> I had a look at that bug; we have a default stripe count of 1, so the
> striping data should fit into the inode. On the other hand we use ACLs
> in quite a few places, so it seems we might hit this bug if we increase
> the memory from the current 16GB, correct?

Yeah, and I think 16GB should be sufficient for the MDS.

--
Bernd Schubert
DataDirect Networks
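As an illustration of the health_check monitoring mentioned above, a
minimal sketch of the sort of test a pacemaker monitor action might run;
the actual resource agent integration is site-specific and this is only
an assumption about how it would be wired up:

    #!/bin/sh
    # Succeed only if Lustre reports itself healthy; per the note above,
    # --enable-health-write makes backing-device IO errors show up here too.
    grep -q healthy /proc/fs/lustre/health_check && exit 0
    exit 1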