Hi,

I'm relatively new to OCFS2 and I am using it in conjunction with DRBD.

I'm hitting issues when the mounted file system is used heavily. The process doing the reading/writing sometimes hangs and gets killed by the kernel. E.g.

[925741.227267] INFO: task rsync:29141 blocked for more than 120 seconds.
[925741.238300] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[925741.260379] rsync D ffff88042fcd39c0 0 29141 29139 0x00000000
[925741.260382] ffff8802ffba98d0 0000000000000082 ffff8804157c9700 ffff8802ffba9fd8
[925741.260386] ffff8802ffba9fd8 ffff8802ffba9fd8 ffff880419545c00 ffff8804157c9700
[925741.260389] ffffffffa0532295 ffff8804157c9700 ffff880408e6ff38 ffff880408e6ff40
[925741.260393] Call Trace:
[925741.260409] [<ffffffffa0532295>] ? ocfs2_read_blocks+0x3e5/0x6a0 [ocfs2]
[925741.260421] [<ffffffff81680b89>] schedule+0x29/0x70
[925741.260425] [<ffffffff8168199d>] rwsem_down_failed_common+0xcd/0x170
[925741.260429] [<ffffffff81681a75>] rwsem_down_read_failed+0x15/0x17
[925741.260432] [<ffffffff813365f4>] call_rwsem_down_read_failed+0x14/0x30
[925741.260436] [<ffffffff8167fd64>] ? down_read+0x24/0x2b
[925741.260453] [<ffffffffa0557e71>] ocfs2_start_trans+0xe1/0x1d0 [ocfs2]
[925741.260475] [<ffffffffa052ee07>] ocfs2_write_begin_nolock+0x3e7/0x1d50 [ocfs2]
[925741.260492] [<ffffffffa0532295>] ? ocfs2_read_blocks+0x3e5/0x6a0 [ocfs2]
[925741.260509] [<ffffffffa0550f90>] ? ocfs2_inode_cache_io_lock+0x20/0x20 [ocfs2]
[925741.260527] [<ffffffffa0558250>] ? ocfs2_extend_trans+0x200/0x200 [ocfs2]
[925741.260542] [<ffffffffa053086e>] ocfs2_write_begin+0xfe/0x220 [ocfs2]
[925741.260547] [<ffffffff81122706>] generic_file_buffered_write+0x116/0x280
[925741.260564] [<ffffffffa0550c3a>] ocfs2_file_aio_write+0x82a/0x880 [ocfs2]
[925741.260568] [<ffffffff81181746>] do_sync_write+0xe6/0x120
[925741.260572] [<ffffffff812b2bac>] ? security_file_permission+0x2c/0xb0
[925741.260575] [<ffffffff81181d21>] ? rw_verify_area+0x61/0xf0
[925741.260578] [<ffffffff8118203c>] vfs_write+0xac/0x180
[925741.260580] [<ffffffff8118236a>] sys_write+0x4a/0x90
[925741.260585] [<ffffffff81689d29>] system_call_fastpath+0x16/0x1b

and

[1029000.298598] INFO: task df:6338 blocked for more than 120 seconds.
[1029000.309552] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1029000.331360] df D ffff88042fd139c0 0 6338 1516 0x00000000
[1029000.331365] ffff880087d69b08 0000000000000082 ffff8803c0cb2e00 ffff880087d69fd8
[1029000.331370] ffff880087d69fd8 ffff880087d69fd8 ffff8804195d8000 ffff8803c0cb2e00
[1029000.331373] ffff88042fffda08 7fffffffffffffff ffff880087d69d20 ffff880087d69d28
[1029000.331383] Call Trace:
[1029000.331393] [<ffffffff81680b89>] schedule+0x29/0x70
[1029000.331397] [<ffffffff8167f064>] schedule_timeout+0x2a4/0x320
[1029000.331403] [<ffffffffa049d020>] ? o2dlm_lock_ast_wrapper+0x20/0x20 [ocfs2_stack_o2cb]
[1029000.331407] [<ffffffff8167f42d>] ? mutex_lock+0x1d/0x50
[1029000.331417] [<ffffffff8107e5e1>] ? in_group_p+0x31/0x40
[1029000.331425] [<ffffffff8118d176>] ? generic_permission+0x176/0x260
[1029000.331429] [<ffffffff816809af>] wait_for_common+0xdf/0x190
[1029000.331433] [<ffffffff81087c40>] ? try_to_wake_up+0x2a0/0x2a0
[1029000.331437] [<ffffffff81680b5d>] wait_for_completion+0x1d/0x20
[1029000.331461] [<ffffffffa050bd49>] __ocfs2_cluster_lock.isra.31+0x219/0x840 [ocfs2]
[1029000.331470] [<ffffffff8118e813>] ? lookup_fast+0xd3/0x310
[1029000.331473] [<ffffffff8107e86a>] ? lg_local_unlock+0x1a/0x20
[1029000.331476] [<ffffffff8118dd65>] ? complete_walk+0xa5/0x130
[1029000.331499] [<ffffffffa050d671>] ocfs2_inode_lock_full_nested+0x201/0x4e0 [ocfs2]
[1029000.331527] [<ffffffffa055e5f5>] ocfs2_statfs+0x65/0x320 [ocfs2]
[1029000.331532] [<ffffffff811b1261>] statfs_by_dentry+0xa1/0x140
[1029000.331535] [<ffffffff811b131b>] vfs_statfs+0x1b/0xb0
[1029000.331538] [<ffffffff811b14a7>] user_statfs+0x37/0x50
[1029000.331542] [<ffffffff811b1540>] sys_statfs+0x20/0x40
[1029000.331547] [<ffffffff81689d29>] system_call_fastpath+0x16/0x1b

As you can see, it isn't one specific thing. A largish rsync copy from a remote server and a simple df both triggered it. The stack traces look pretty different as well, which doesn't help.

The setup is two identical servers, mirroring disks using DRBD with a fairly vanilla OCFS2 setup on top. I followed the DRBD user guide for configuration. All packages came from Ubuntu's repositories.

I've done some googling and unfortunately haven't found anything too helpful. I can't tell what could be causing it.

Any help would be greatly appreciated.

Thanks

--
Nick Stallman
Agentpoint Pty Ltd
The Real Estate Web Developers
Sydney, Australia
nick at agentpoint.com
www.agentpoint.com.au | www.zooproperty.com | www.ginga.com.au | www.business2.com.au