netbsd at tango.lu
2017-Nov-28 10:04 UTC
[Ocfs2-users] OCFS2 CRASH Again and Again, this filesystem is COMPLETE GARBAGE
Hello,

The servers have crashed about 20 times since I last wrote to the list. The latest crash, today, came with:

[ 1901.810483] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1901.918314] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1902.026297] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1902.134304] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1902.242303] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1902.350317] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1902.458320] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1902.566318] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1902.674300] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1902.782286] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -107 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1902.868732] o2net: Connection to node webserver1 (num 0) at 10.0.0.3:7777 shutdown, state 7
[ 1904.882872] o2net: Connected to node webserver1 (num 0) at 10.0.0.3:7777
[ 1904.883058] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1904.990594] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1905.098771] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1905.206754] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1905.314710] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1905.422646] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1905.530853] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1905.638652] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1905.746728] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1905.854609] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1905.962636] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1906.070921] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1906.178744] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1906.286737] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1906.394632] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1906.502613] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1906.610862] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1906.718651] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1906.826857] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1906.934580] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1907.042570] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1907.150604] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1907.258684] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1907.366672] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0
[ 1909.714215] o2net: Connection to node webserver2 (num 1) at 10.0.0.4:7777 has been idle for 30.720 secs.
[ 1934.290226] INFO: task php-fpm7.0:823 blocked for more than 120 seconds.
[ 1934.290668] Not tainted 4.14.0OCFS #1
[ 1934.290980] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1934.291482] php-fpm7.0 D 0 823 481 0x00000000
[ 1934.291486] Call Trace:
[ 1934.291523] __schedule+0x3cc/0x850
[ 1934.291526] schedule+0x36/0x80
[ 1934.291532] schedule_timeout+0x1da/0x350
[ 1934.291611] ? ocfs2_permission+0x79/0xe0 [ocfs2]
[ 1934.291614] wait_for_completion+0x121/0x190
[ 1934.291616] ? wait_for_completion+0x121/0x190
[ 1934.291628] ? wake_up_q+0x80/0x80
[ 1934.291651] __ocfs2_cluster_lock.isra.37+0x2d9/0x7b0 [ocfs2]
[ 1934.291674] ocfs2_inode_lock_full_nested+0x2f2/0x8d0 [ocfs2]
[ 1934.291695] ? ocfs2_inode_lock_full_nested+0x2f2/0x8d0 [ocfs2]
[ 1934.291718] ocfs2_inode_revalidate+0x82/0x180 [ocfs2]
[ 1934.291740] ocfs2_getattr+0x3c/0x100 [ocfs2]
[ 1934.291753] vfs_getattr_nosec+0x70/0x80
[ 1934.291755] vfs_statx+0x8d/0xe0
[ 1934.291757] SYSC_newstat+0x3d/0x70
[ 1934.291760] SyS_newstat+0xe/0x10
[ 1934.291762] entry_SYSCALL_64_fastpath+0x1e/0xa9
[ 1934.291765] RIP: 0033:0x7faad233e085
[ 1934.291766] RSP: 002b:00007ffcc8ce5968 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
[ 1934.291768] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007faad233e085
[ 1934.291769] RDX: 00007ffcc8ce5af0 RSI: 00007ffcc8ce5af0 RDI: 00007faab72f39f8
[ 1934.291770] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 1934.291771] R10: fffffffffffffd28 R11: 0000000000000246 R12: 00007faacea00040
[ 1934.291772] R13: 00000000ffffffff R14: 0000000000000200 R15: 00007faab8b7e7a0
[ 1938.386144] o2net: Connection to node webserver1 (num 0) at 10.0.0.3:7777 has been idle for 30.611 secs.

All nodes are now running kernel 4.14 on:

No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 9.2 (stretch)
Release: 9.2
Codename: stretch

It started with server 3 crashing; after it came back, server 2 crashed and then server 1, and they ended up in a crash loop where all the KVMs had to be shut down and started again in the order 1, 2, 3.

I am seriously getting fed up with this crap filesystem.

Mount options:

UUID=<UID> /mnt/webs ocfs2 _netdev,defaults,data=writeback,noatime,nodiratime,commit=300,journal_async_commit 0 0

Sysctl options:

vm.min_free_kbytes=131072
vm.zone_reclaim_mode=1

Please just recommend one parameter that I can try changing to stop the crashes.

Thank you!
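The log above repeats the same DLM send error many times with two different codes. As a reading aid, here is a minimal Python sketch that tallies those lines by error code and target node. The numbers in the log are negated Linux errno values: 107 is ENOTCONN and 92 is ENOPROTOOPT. The function name `summarize`, the `SAMPLE` list, and the regular expression are illustrative assumptions, not part of any OCFS2 tooling; the sample lines are copied from the log in this message.

```python
import re
from collections import Counter

# Linux errno names for the two codes that appear in the log.
ERRNO_NAMES = {92: "ENOPROTOOPT", 107: "ENOTCONN"}

# Matches lines such as:
# (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 ERROR: Error -107
#   when sending message 504 (key 0x91e4e5c6) to node 0
DLM_ERROR = re.compile(
    r"\([^)]*\):(\w+):\d+ ERROR: Error (-\d+) when sending message (\d+) "
    r"\(key (0x[0-9a-f]+)\) to node (\d+)"
)


def summarize(lines):
    """Count DLM send errors per (errno name, target node)."""
    counts = Counter()
    for line in lines:
        match = DLM_ERROR.search(line)
        if match is None:
            continue  # not a DLM send error (o2net notice, call trace, ...)
        code = -int(match.group(2))  # the log prints the negated errno
        node = int(match.group(5))
        counts[(ERRNO_NAMES.get(code, str(code)), node)] += 1
    return counts


# Four lines copied from the log above (hypothetical variable name).
SAMPLE = [
    "[ 1901.810483] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 "
    "ERROR: Error -107 when sending message 504 (key 0x91e4e5c6) to node 0",
    "[ 1902.868732] o2net: Connection to node webserver1 (num 0) at "
    "10.0.0.3:7777 shutdown, state 7",
    "[ 1904.883058] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 "
    "ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0",
    "[ 1904.990594] (php-fpm7.0,822,3):dlm_send_remote_convert_request:420 "
    "ERROR: Error -92 when sending message 504 (key 0x91e4e5c6) to node 0",
]

for (name, node), count in sorted(summarize(SAMPLE).items()):
    print(f"{name} to node {node}: {count}")
```

Feeding it the full `dmesg` output instead of `SAMPLE` would show how the error code changes once the o2net connection to node 0 is re-established.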
Gang He
2017-Nov-28 10:25 UTC
[Ocfs2-users] OCFS2 CRASH Again and Again, this filesystem is COMPLETE GARBAGE
Hello Netbsd,

What was your problem? dlm_send_remote_convert_request failed, or hung_task_timeout?

Thanks
Gang
Changwei Ge
2017-Nov-29 00:24 UTC
[Ocfs2-users] OCFS2 CRASH Again and Again, this filesystem is COMPLETE GARBAGE
Hi,

It seems that something is wrong with the connection between the nodes in your cluster, so no DLM messages can be sent. This may cause a node to be fenced and therefore to crash.

Please check your network, including the switch, Ethernet HBA card, etc.

Thanks,
Changwei
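Following up on the suggestion to check the network between nodes, one quick first step is to see whether each peer's o2net port accepts a TCP connection at all. The sketch below is only an illustration: the helper name `peer_reachable` is made up, and the addresses in the usage note (10.0.0.3 and 10.0.0.4, port 7777) are taken from the log in the original post. A successful connect says nothing about packet loss or latency, and it is an assumption that probing the port has no side effects beyond a possible log notice on the peer, so treat it as a rough diagnostic and not a replacement for checking the switch and NICs.

```python
import socket


def peer_reachable(host, port=7777, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and connects; the socket is
        # closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Usage, run from each node in turn: `peer_reachable("10.0.0.3")` and `peer_reachable("10.0.0.4")`. If a peer is reachable from one node but not from another, that points toward the network path and away from the filesystem itself.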