Hello,

We have a Lustre 1.8.1 file system, about 60 TB in size, running on
RHEL 5 x86_64. (I can provide hardware details if anyone thinks they'd
be relevant.) We are seeing memory problems after several days of
sustained I/O into that file system. We are writing from a small number
of clients (4 - 5) at an average rate of 50 MB/s, with peaks of 350 MB/s.
We read all the data at least twice before deleting them.

During this operation, we notice the value of "buffers" reported in
'/proc/meminfo' on the OSSs involved increasing monotonically until it
apparently takes up all the system's memory - 32 GB. Then 'kswapd'
starts consuming a large amount of CPU, the load increases (100+), and
the system, including Lustre, slows to a crawl and becomes quite
useless. If we stop Lustre I/O at this point, 'kswapd' and the system
load calm down, but the "buffers" value does not decrease. Any further
I/O on the system (e.g. dd if=/dev/urandom of=/tmp/test ...) will then
cause 'kswapd' to run away again.

We have observed the monotonically increasing "buffers" condition with
non-Lustre I/O on systems running the Lustre 1.8.1 kernel
(2.6.18-128.1.14.el5_lustre.1.8.1), but we haven't gotten them to the
point where 'kswapd' goes wild.

Has anybody else seen anything like this?

David Simas
SLAC
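P.S. If anyone wants to watch the same numbers, something as simple as
this (plain /proc, nothing Lustre-specific) is enough to see the growth:

    # sample the relevant /proc/meminfo fields once a minute
    watch -n 60 'grep -E "^(MemFree|Buffers|Cached):" /proc/meminfo'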
Do you have OSS readcache on?

Check out https://bugzilla.lustre.org/show_bug.cgi?id=20778 and
https://bugzilla.lustre.org/show_bug.cgi?id=18571

David

David Simas wrote:
> During this operation, we notice the value of "buffers" reported in
> '/proc/meminfo' on the OSSs involved increasing monotonically until it
> apparently takes up all the system's memory - 32 GB. Then 'kswapd'
> starts consuming a large amount of CPU, the load increases (100+), and
> the system, including Lustre, slows to a crawl and becomes quite
> useless.
> [...]
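P.S. If readcache does turn out to be the culprit, I believe the 1.8
tunables to switch it off on each OSS are the following - do
double-check the parameter names against your build:

    # on each OSS: disable the OSS read cache
    lctl set_param obdfilter.*.read_cache_enable=0
    # optionally also the write-through cache, if the bugs above apply
    lctl set_param obdfilter.*.writethrough_cache_enable=0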
On Mon, 2009-10-12 at 17:06 -0700, David Simas wrote:
> Hello,

Hi,

> During this operation, we notice the value of "buffers" reported in
> '/proc/meminfo' on the OSSs involved increasing monotonically until it
> apparently takes up all the system's memory - 32 GB.

This would likely be OSS read cache, if you have it enabled. If you do,
you should disable it due to a potential corruption issue. Details were
given previously on this list on how to do that. Check the archives.

But that's not directly related to what you are seeing. Having
"buffers" consume all of available memory is SOP (Standard Operating
Procedure) for Linux. The philosophy is that "free" (unused) memory is
wasted memory, and as such, any memory not needed by applications or
other kernel processing is used to buffer disk I/O. The performance
benefit of that is obvious, I think.

> Then 'kswapd' starts consuming a large amount of CPU, the load
> increases (100+), and the system, including Lustre, slows to a crawl
> and becomes quite useless.

This doesn't sound normal or good.

> If we stop Lustre I/O at this point, 'kswapd' and the system load calm
> down, but the "buffers" value does not decrease.

Right. The buffers will not get "emptied" until something else needs
the memory. Again, unused memory is wasted memory.

> We have observed the monotonically increasing "buffers" condition with
> non-Lustre I/O on systems running the Lustre 1.8.1 kernel
> (2.6.18-128.1.14.el5_lustre.1.8.1),

Indeed. Filling memory with the buffer cache is a standard (i.e.
non-Lustre-specific) behaviour, and you will find the same thing on
non-Lustre kernels as well.

b.
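A quick way to confirm that those buffers really are reclaimable cache
and not a leak - assuming your kernel carries the stock drop_caches
support (it went in around 2.6.16, so the 2.6.18-based Lustre kernel
should have it):

    # flush dirty data, then drop clean pagecache, dentries and inodes
    sync
    echo 3 > /proc/sys/vm/drop_caches

If "Buffers" collapses after that, the memory was only cache, and the
real question is why reclaim struggles so badly under load.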
This sounds very much like a problem we saw before we changed the
lru_size from dynamic to a fixed size.

-- Andrew

-----Original Message-----
From: David Simas
Sent: Monday, October 12, 2009 6:07 PM
To: lustre-discuss at lists.lustre.org
Subject: [Lustre-discuss] Memory (?) problem with 1.8.1
[...]
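For reference, the LDLM LRU can be pinned from the clients; in 1.8 a
non-zero lru_size disables the dynamic resizing. Something like the
following - the value is per lock namespace, so tune it to your setup:

    # on the clients: fix the OSC lock LRUs at 400 locks each
    lctl set_param ldlm.namespaces.*osc*.lru_size=400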
We faced this problem on the Lustre servers in our cluster with a GigE
network. We found that increasing the following value in
/etc/sysctl.conf forces kswapd to kick in a lot earlier and prevents
the scenario you are describing. Our servers have only 8 GB of memory;
with 32 GB of system memory you might want to bump it up to 2 GB or
even 4 GB.

# Control the min_free_kbytes
vm.min_free_kbytes = 1048576

Hope this helps.

Nirmal
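(For completeness: the value is in kilobytes, so 1048576 is 1 GB and
2097152 would be 2 GB. It can also be changed on a running system:

    # apply immediately, without a reboot
    sysctl -w vm.min_free_kbytes=2097152
    # or, after editing /etc/sysctl.conf, reload it
    sysctl -p

Either way, keep the /etc/sysctl.conf entry so the setting survives a
reboot.)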
On 13-Oct-09, at 04:58, Brian J. Murrell wrote:
> On Mon, 2009-10-12 at 17:06 -0700, David Simas wrote:
>> Then 'kswapd' starts consuming a large amount of CPU, the load
>> increases (100+), and the system, including Lustre, slows to a crawl
>> and becomes quite useless.
>
> This doesn't sound normal or good.

There is a recent bug for memory pressure on the OSS. I believe it was
fixed for the next release.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.