Rafał Radecki
2012-Apr-11 09:48 UTC
[CentOS] CentOS 5 - problem with kernel/process: task blocked for more than 120 seconds.
Hi All. I have a server which hanged two times because we could not connect to the box, it was not responding. In /var/log/messages I saw: Apr 11 10:13:29 server kernel: INFO: task imap:5855 blocked for more than 120 seconds. Apr 11 10:13:29 server kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 11 10:13:29 server kernel: imap D 000124D0 1716 5855 14393 6320 5839 (NOTLB) Apr 11 10:13:29 server kernel: d1adeed0 00000082 632f304d 000124d0 000124cf 0000000e 00000000 00000009 Apr 11 10:13:29 server kernel: c750b000 632f379a 000124d0 0000074d 00000000 c750b10c c17f6d00 d2637580 Apr 11 10:13:29 server kernel: f75d3e68 00000000 00000000 d1adeecc c041ec40 00000000 d1adeed8 f75d3e50 Apr 11 10:13:29 server kernel: Call Trace: Apr 11 10:13:29 server kernel: [<c041ec40>] __wake_up+0x2a/0x3d Apr 11 10:13:29 server kernel: [<c0436d23>] prepare_to_wait+0x24/0x46 Apr 11 10:13:29 server kernel: [<f885f1e2>] log_wait_commit+0x80/0xc7 [jbd] Apr 11 10:13:29 server kernel: [<c0436bd7>] autoremove_wake_function+0x0/0x2d Apr 11 10:13:29 server kernel: [<f885a661>] journal_stop+0x195/0x1ba [jbd] Apr 11 10:13:29 server kernel: [<c0495caa>] __writeback_single_inode+0x197/0x2a3 Apr 11 10:13:29 server kernel: [<c045ee18>] do_writepages+0x2b/0x32 Apr 11 10:13:29 server kernel: [<c045a96c>] __filemap_fdatawrite_range+0x66/0x72 Apr 11 10:13:29 server kernel: [<c0496346>] sync_inode+0x19/0x24 Apr 11 10:13:29 server kernel: [<f8983009>] ext3_sync_file+0xb1/0xdc [ext3] Apr 11 10:13:29 server kernel: [<c047956c>] do_fsync+0x41/0x83 Apr 11 10:13:29 server kernel: [<c04795cb>] __do_fsync+0x1d/0x2b Apr 11 10:13:29 server kernel: [<c0404f4b>] syscall_call+0x7/0xb Apr 11 10:13:29 server kernel: ======================Apr 11 10:13:29 server kernel: INFO: task imap:6882 blocked for more than 120 seconds. Apr 11 10:13:29 server kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 11 10:13:29 server kernel: imap D 000124D0 1772 6882 14393 6999 6053 (NOTLB) Apr 11 10:13:29 server kernel: cf1d0ed0 00000082 632f379a 000124d0 000124cc 0000000e 00000000 00000009 Apr 11 10:13:29 server kernel: f52e9000 632f3f0a 000124d0 00000770 00000000 f52e910c c17f6d00 cbe44740 Apr 11 10:13:29 server kernel: f75d3e68 00000000 00000000 cf1d0ecc c041ec40 00000000 cf1d0ed8 f75d3e50 Apr 11 10:13:29 server kernel: Call Trace: Apr 11 10:13:29 server kernel: [<c041ec40>] __wake_up+0x2a/0x3d Apr 11 10:13:29 server kernel: [<c0436d23>] prepare_to_wait+0x24/0x46 Apr 11 10:13:31 server kernel: [<f885f1e2>] log_wait_commit+0x80/0xc7 [jbd] Apr 11 10:13:31 server kernel: [<c0436bd7>] autoremove_wake_function+0x0/0x2d Apr 11 10:13:31 server kernel: [<f885a661>] journal_stop+0x195/0x1ba [jbd] Apr 11 10:13:31 server kernel: [<c0495caa>] __writeback_single_inode+0x197/0x2a3 Apr 11 10:13:31 server kernel: [<c045ee18>] do_writepages+0x2b/0x32 Apr 11 10:13:31 server kernel: [<c045a96c>] __filemap_fdatawrite_range+0x66/0x72 Apr 11 10:13:31 server kernel: [<c0496346>] sync_inode+0x19/0x24 Apr 11 10:13:31 server kernel: [<f8983009>] ext3_sync_file+0xb1/0xdc [ext3] Apr 11 10:13:31 server kernel: [<c047956c>] do_fsync+0x41/0x83 Apr 11 10:13:31 server kernel: [<c04795cb>] __do_fsync+0x1d/0x2b Apr 11 10:13:31 server kernel: [<c0404f4b>] syscall_call+0x7/0xb Apr 11 10:13:31 server kernel: ====================== I have found: http://www.nico.schottelius.org/blog/reboot-linux-if-task-blocked-for-more-than-n-seconds/, so I think that the processes took all resources of the machine. What should be done to troubleshoot the situation? I have information that probably imap:5855 and imap:6882 caused the problem. How the output: Apr 11 10:13:29 server kernel: imap D 000124D0 1772 6882 14393 6999 6053 (NOTLB) Apr 11 10:13:29 server kernel: cf1d0ed0 00000082 632f379a 000124d0 000124cc 0000000e 00000000 00000009 Apr 11 10:13:29 server kernel: f52e9000 632f3f0a 000124d0 00000770 00000000 f52e910c c17f6d00 cbe44740 Apr 11 10:13:29 server kernel: f75d3e68 00000000 00000000 cf1d0ecc c041ec40 00000000 cf1d0ed8 f75d3e50 Apr 11 10:13:29 server kernel: Call Trace: Apr 11 10:13:29 server kernel: [<c041ec40>] __wake_up+0x2a/0x3d Apr 11 10:13:29 server kernel: [<c0436d23>] prepare_to_wait+0x24/0x46 Apr 11 10:13:31 server kernel: [<f885f1e2>] log_wait_commit+0x80/0xc7 [jbd] Apr 11 10:13:31 server kernel: [<c0436bd7>] autoremove_wake_function+0x0/0x2d Apr 11 10:13:31 server kernel: [<f885a661>] journal_stop+0x195/0x1ba [jbd] Apr 11 10:13:31 server kernel: [<c0495caa>] __writeback_single_inode+0x197/0x2a3 Apr 11 10:13:31 server kernel: [<c045ee18>] do_writepages+0x2b/0x32 Apr 11 10:13:31 server kernel: [<c045a96c>] __filemap_fdatawrite_range+0x66/0x72 Apr 11 10:13:31 server kernel: [<c0496346>] sync_inode+0x19/0x24 Apr 11 10:13:31 server kernel: [<f8983009>] ext3_sync_file+0xb1/0xdc [ext3] Apr 11 10:13:31 server kernel: [<c047956c>] do_fsync+0x41/0x83 Apr 11 10:13:31 server kernel: [<c04795cb>] __do_fsync+0x1d/0x2b Apr 11 10:13:31 server kernel: [<c0404f4b>] syscall_call+0x7/0xb should be read? Any howto? Best regards, Rafal Radecki.
Rafał Radecki
2012-Apr-11 09:53 UTC
[CentOS] CentOS 5 - problem with kernel/process: task blocked for more than 120 seconds.
The host is: Linux server 2.6.18-238.19.1.el5 #1 SMP Fri Jul 15 07:32:29 EDT 2011 i686 i686 i386 GNU/Linux CentOS release 5.6 (Final) dovecot 1.2.16 Best regards, R. W dniu 11 kwietnia 2012 11:48 u?ytkownik Rafa? Radecki < radecki.rafal at gmail.com> napisa?:> Hi All. > > I have a server which hanged two times because we could not connect to the > box, it was not responding. In /var/log/messages I saw: > > Apr 11 10:13:29 server kernel: INFO: task imap:5855 blocked for more than > 120 seconds. > Apr 11 10:13:29 server kernel: "echo 0 > > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > Apr 11 10:13:29 server kernel: imap D 000124D0 1716 5855 > 14393 6320 5839 (NOTLB) > Apr 11 10:13:29 server kernel: d1adeed0 00000082 632f304d 000124d0 > 000124cf 0000000e 00000000 00000009 > Apr 11 10:13:29 server kernel: c750b000 632f379a 000124d0 0000074d > 00000000 c750b10c c17f6d00 d2637580 > Apr 11 10:13:29 server kernel: f75d3e68 00000000 00000000 d1adeecc > c041ec40 00000000 d1adeed8 f75d3e50 > Apr 11 10:13:29 server kernel: Call Trace: > Apr 11 10:13:29 server kernel: [<c041ec40>] __wake_up+0x2a/0x3d > Apr 11 10:13:29 server kernel: [<c0436d23>] prepare_to_wait+0x24/0x46 > Apr 11 10:13:29 server kernel: [<f885f1e2>] log_wait_commit+0x80/0xc7 [jbd] > Apr 11 10:13:29 server kernel: [<c0436bd7>] > autoremove_wake_function+0x0/0x2d > Apr 11 10:13:29 server kernel: [<f885a661>] journal_stop+0x195/0x1ba [jbd] > Apr 11 10:13:29 server kernel: [<c0495caa>] > __writeback_single_inode+0x197/0x2a3 > Apr 11 10:13:29 server kernel: [<c045ee18>] do_writepages+0x2b/0x32 > Apr 11 10:13:29 server kernel: [<c045a96c>] > __filemap_fdatawrite_range+0x66/0x72 > Apr 11 10:13:29 server kernel: [<c0496346>] sync_inode+0x19/0x24 > Apr 11 10:13:29 server kernel: [<f8983009>] ext3_sync_file+0xb1/0xdc [ext3] > Apr 11 10:13:29 server kernel: [<c047956c>] do_fsync+0x41/0x83 > Apr 11 10:13:29 server kernel: [<c04795cb>] __do_fsync+0x1d/0x2b > Apr 11 10:13:29 server kernel: [<c0404f4b>] syscall_call+0x7/0xb > Apr 11 10:13:29 server kernel: ======================> Apr 11 10:13:29 server kernel: INFO: task imap:6882 blocked for more than > 120 seconds. > Apr 11 10:13:29 server kernel: "echo 0 > > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > Apr 11 10:13:29 server kernel: imap D 000124D0 1772 6882 > 14393 6999 6053 (NOTLB) > Apr 11 10:13:29 server kernel: cf1d0ed0 00000082 632f379a 000124d0 > 000124cc 0000000e 00000000 00000009 > Apr 11 10:13:29 server kernel: f52e9000 632f3f0a 000124d0 00000770 > 00000000 f52e910c c17f6d00 cbe44740 > Apr 11 10:13:29 server kernel: f75d3e68 00000000 00000000 cf1d0ecc > c041ec40 00000000 cf1d0ed8 f75d3e50 > Apr 11 10:13:29 server kernel: Call Trace: > Apr 11 10:13:29 server kernel: [<c041ec40>] __wake_up+0x2a/0x3d > Apr 11 10:13:29 server kernel: [<c0436d23>] prepare_to_wait+0x24/0x46 > Apr 11 10:13:31 server kernel: [<f885f1e2>] log_wait_commit+0x80/0xc7 [jbd] > Apr 11 10:13:31 server kernel: [<c0436bd7>] > autoremove_wake_function+0x0/0x2d > Apr 11 10:13:31 server kernel: [<f885a661>] journal_stop+0x195/0x1ba [jbd] > Apr 11 10:13:31 server kernel: [<c0495caa>] > __writeback_single_inode+0x197/0x2a3 > Apr 11 10:13:31 server kernel: [<c045ee18>] do_writepages+0x2b/0x32 > Apr 11 10:13:31 server kernel: [<c045a96c>] > __filemap_fdatawrite_range+0x66/0x72 > Apr 11 10:13:31 server kernel: [<c0496346>] sync_inode+0x19/0x24 > Apr 11 10:13:31 server kernel: [<f8983009>] ext3_sync_file+0xb1/0xdc [ext3] > Apr 11 10:13:31 server kernel: [<c047956c>] do_fsync+0x41/0x83 > Apr 11 10:13:31 server kernel: [<c04795cb>] __do_fsync+0x1d/0x2b > Apr 11 10:13:31 server kernel: [<c0404f4b>] syscall_call+0x7/0xb > Apr 11 10:13:31 server kernel: ======================> > I have found: > http://www.nico.schottelius.org/blog/reboot-linux-if-task-blocked-for-more-than-n-seconds/, so I think that the processes took all resources of the machine. > > What should be done to troubleshoot the situation? > I have information that probably imap:5855 and imap:6882 caused the > problem. How the output: > > Apr 11 10:13:29 server kernel: imap D 000124D0 1772 6882 > 14393 6999 6053 (NOTLB) > Apr 11 10:13:29 server kernel: cf1d0ed0 00000082 632f379a 000124d0 > 000124cc 0000000e 00000000 00000009 > Apr 11 10:13:29 server kernel: f52e9000 632f3f0a 000124d0 00000770 > 00000000 f52e910c c17f6d00 cbe44740 > Apr 11 10:13:29 server kernel: f75d3e68 00000000 00000000 cf1d0ecc > c041ec40 00000000 cf1d0ed8 f75d3e50 > Apr 11 10:13:29 server kernel: Call Trace: > Apr 11 10:13:29 server kernel: [<c041ec40>] __wake_up+0x2a/0x3d > Apr 11 10:13:29 server kernel: [<c0436d23>] prepare_to_wait+0x24/0x46 > Apr 11 10:13:31 server kernel: [<f885f1e2>] log_wait_commit+0x80/0xc7 [jbd] > Apr 11 10:13:31 server kernel: [<c0436bd7>] > autoremove_wake_function+0x0/0x2d > Apr 11 10:13:31 server kernel: [<f885a661>] journal_stop+0x195/0x1ba [jbd] > Apr 11 10:13:31 server kernel: [<c0495caa>] > __writeback_single_inode+0x197/0x2a3 > Apr 11 10:13:31 server kernel: [<c045ee18>] do_writepages+0x2b/0x32 > Apr 11 10:13:31 server kernel: [<c045a96c>] > __filemap_fdatawrite_range+0x66/0x72 > Apr 11 10:13:31 server kernel: [<c0496346>] sync_inode+0x19/0x24 > Apr 11 10:13:31 server kernel: [<f8983009>] ext3_sync_file+0xb1/0xdc [ext3] > Apr 11 10:13:31 server kernel: [<c047956c>] do_fsync+0x41/0x83 > Apr 11 10:13:31 server kernel: [<c04795cb>] __do_fsync+0x1d/0x2b > Apr 11 10:13:31 server kernel: [<c0404f4b>] syscall_call+0x7/0xb > should be read? Any howto? > > Best regards, > Rafal Radecki. >