Götz Waschk
2010-Mar-05 12:01 UTC
[Lustre-discuss] Extremely high load and hanging processes on a Lustre client
Hi everyone,

I have a critical problem on one of my Lustre client machines running Scientific Linux 5.4 and the patchless Lustre 1.8.2 client. After a few days of usage, some processes like cp and kswapd0 start to use 100% CPU. Only 180k of swap space is in use though.

Processes that try to access Lustre use a lot of CPU and seem to hang.

There is some output in the kernel log that I'll attach to this mail.

Do you have any idea what to test before rebooting the machine?

Regards, Götz Waschk

-- 
AL I:40: Do what thou wilt shall be the whole of the Law.

Attachments:
  kernel-log.txt.bz2 (application/x-bzip2, 23758 bytes)
  http://lists.lustre.org/pipermail/lustre-discuss/attachments/20100305/f077433b/attachment-0002.bz2
  kernel-lustre-log.txt.bz2 (application/x-bzip2, 1305 bytes)
  http://lists.lustre.org/pipermail/lustre-discuss/attachments/20100305/f077433b/attachment-0003.bz2
Bernd Schubert
2010-Mar-06 00:38 UTC
[Lustre-discuss] Extremely high load and hanging processes on a Lustre client
On Friday 05 March 2010, Götz Waschk wrote:
> Hi everyone,
>
> I have a critical problem on one of my Lustre client machines running
> Scientific Linux 5.4 and the patchless Lustre 1.8.2 client. After a
> few days of usage, some processes like cp and kswapd0 start to use
> 100% CPU. Only 180k of swap space is in use though.
>
> Processes that try to access Lustre use a lot of CPU and seem to hang.
>
> There is some output in the kernel log I'll attach to this mail.
>
> Do you have any idea what to test before rebooting the machine?

Don't reboot, but disable LRU resizing:

for i in /proc/fs/lustre/ldlm/namespaces/*; do echo 800 > ${i}/lru_size; done

At least that has helped every time we had this problem before. I had hoped it would be fixed in 1.8.2, but it seems it is not. Please open a bug report.

Thanks,
Bernd

-- 
Bernd Schubert
DataDirect Networks
Götz Waschk
2010-Mar-08 12:49 UTC
[Lustre-discuss] Extremely high load and hanging processes on a Lustre client
2010/3/6 Bernd Schubert <bs_lists at aakef.fastmail.fm>:
>> Do you have any idea what to test before rebooting the machine?
> Don't reboot, but disable LRU resizing.
> for i in /proc/fs/lustre/ldlm/namespaces/*; do echo 800 > ${i}/lru_size; done
> At least that helped all the time before when we had that problem. I hoped it
> would be fixed in 1.8.2, but seems it is not. Please open a bug report.

Hi Bernd,

thanks for your help, it worked. I have opened a bug here:
https://bugzilla.lustre.org/show_bug.cgi?id=22276

Regards, Götz Waschk
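
For anyone applying the same workaround, here is a minimal sketch of the one-liner from this thread, wrapped so that it reports the previous value of each lru_size file before overwriting it. The function name set_lru_size and its two arguments are made up for illustration and are not part of Lustre; the /proc path and the value 800 are the ones quoted above.

```shell
#!/bin/sh
# Sketch only: write a fixed LRU size into every namespace directory below
# a base directory, printing the old value first.
# set_lru_size is a hypothetical helper name, not a Lustre command.

set_lru_size() {
    base=$1    # e.g. /proc/fs/lustre/ldlm/namespaces (path from the thread)
    size=$2    # e.g. 800 (value from the thread)
    for ns in "$base"/*; do
        f="$ns/lru_size"
        [ -e "$f" ] || continue          # skip entries without an lru_size file
        old=$(cat "$f")                  # remember what was set before
        echo "$size" > "$f" || return 1  # stop on the first failed write
        echo "$ns: lru_size $old -> $size"
    done
}

# Example, as root on the affected client:
# set_lru_size /proc/fs/lustre/ldlm/namespaces 800
```

Printing the old values makes it possible to put them back later. As far as I understand the Lustre documentation, writing 0 to lru_size turns dynamic LRU resizing back on, and values written under /proc presumably do not survive a remount or reboot, so please check both against your Lustre version.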