Piotr Teodorowski
2010-Dec-08 11:02 UTC
[Ocfs2-users] kernel BUG at fs/ocfs2/dlm/dlmthread.c:266
Hi.

We hit a bug in OCFS2 yesterday (see subject). It froze the node es1prap02 and caused some (not all) of the other nodes in the same OCFS2 cluster to reboot. Could you please let us know:

* is the above bug known (or maybe already fixed)?
* is it possible to find out why three other nodes rebooted? It looks like this could be another bug, near the 'quorum test' in our OCFS2 version. Am I right?

Our configuration:

O2CB_HEARTBEAT_THRESHOLD=31
O2CB_IDLE_TIMEOUT_MS=30000
O2CB_KEEPALIVE_DELAY_MS=2000
O2CB_RECONNECT_DELAY_MS=2000

dist         debian lenny
ocfs2-tools  1.4.1-1
kernel       2.6.26-2-amd64

modinfo ocfs2
filename:       /lib/modules/2.6.26-2-amd64/kernel/fs/ocfs2/ocfs2.ko
license:        GPL
author:         Oracle
version:        1.5.0
description:    OCFS2 1.5.0
srcversion:     C692B48692BFC8597E4D7A7
depends:        jbd,ocfs2_stackglue,ocfs2_nodemanager
vermagic:       2.6.26-2-amd64 SMP mod_unload modversions

Logs and cluster configuration are attached.

Thanks

Regards,
Piotr Teodorowski

Attachments:

messages.logs.tgz (application/x-compressed-tar, 53432 bytes)
http://oss.oracle.com/pipermail/ocfs2-users/attachments/20101208/523ddae2/attachment-0002.bin

netconsole.logs.tgz (application/x-compressed-tar, 6767 bytes)
http://oss.oracle.com/pipermail/ocfs2-users/attachments/20101208/523ddae2/attachment-0003.bin

Cluster configuration:

node:
        ip_port = 7777
        ip_address = 172.28.4.48
        number = 0
        name = es1prgw01
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 172.28.4.56
        number = 1
        name = es4prgw01
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 172.28.4.65
        number = 3
        name = es1prap02
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 172.28.4.66
        number = 4
        name = es1prap03
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 172.28.4.80
        number = 5
        name = es4prap01
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 172.28.4.81
        number = 6
        name = es4prap02
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 172.28.4.64
        number = 2
        name = es1prap01
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 172.28.4.78
        number = 7
        name = esiprap01
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 172.28.4.67
        number = 8
        name = es1prap04
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 172.28.4.68
        number = 9
        name = es1prap05
        cluster = ocfs2

cluster:
        node_count = 10
        name = ocfs2
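
When debugging unexpected reboots it can help to first rule out a plain configuration mistake. The following is a minimal sketch, not part of the original post, that parses the cluster configuration quoted above and checks that node_count matches the number of node stanzas and that node numbers, names and IP addresses are unique. The helper names (parse_stanzas, check_cluster_conf) are hypothetical, and the parser only understands the "stanza: key = value" layout shown in this message.

```python
import re
from collections import Counter

# Cluster configuration as posted in this message, one stanza per line
# here for brevity. The parser below does not depend on line layout.
CLUSTER_CONF = """
node: ip_port = 7777 ip_address = 172.28.4.48 number = 0 name = es1prgw01 cluster = ocfs2
node: ip_port = 7777 ip_address = 172.28.4.56 number = 1 name = es4prgw01 cluster = ocfs2
node: ip_port = 7777 ip_address = 172.28.4.65 number = 3 name = es1prap02 cluster = ocfs2
node: ip_port = 7777 ip_address = 172.28.4.66 number = 4 name = es1prap03 cluster = ocfs2
node: ip_port = 7777 ip_address = 172.28.4.80 number = 5 name = es4prap01 cluster = ocfs2
node: ip_port = 7777 ip_address = 172.28.4.81 number = 6 name = es4prap02 cluster = ocfs2
node: ip_port = 7777 ip_address = 172.28.4.64 number = 2 name = es1prap01 cluster = ocfs2
node: ip_port = 7777 ip_address = 172.28.4.78 number = 7 name = esiprap01 cluster = ocfs2
node: ip_port = 7777 ip_address = 172.28.4.67 number = 8 name = es1prap04 cluster = ocfs2
node: ip_port = 7777 ip_address = 172.28.4.68 number = 9 name = es1prap05 cluster = ocfs2
cluster: node_count = 10 name = ocfs2
"""

# Either a stanza header ("node:" / "cluster:") or a "key = value" pair.
TOKEN = re.compile(
    r"(?P<stanza>\b(?:node|cluster)):|(?P<key>\w+)\s*=\s*(?P<value>\S+)"
)


def parse_stanzas(text):
    """Return a list of (stanza_type, {key: value}) in file order."""
    stanzas = []
    for match in TOKEN.finditer(text):
        if match.group("stanza"):
            stanzas.append((match.group("stanza"), {}))
        elif stanzas:  # ignore key/value pairs before the first header
            stanzas[-1][1][match.group("key")] = match.group("value")
    return stanzas


def check_cluster_conf(text):
    """Return (nodes, problems); problems is a list of human-readable strings."""
    stanzas = parse_stanzas(text)
    nodes = [fields for kind, fields in stanzas if kind == "node"]
    clusters = [fields for kind, fields in stanzas if kind == "cluster"]
    problems = []

    if len(clusters) != 1:
        problems.append(f"expected 1 cluster stanza, found {len(clusters)}")
    else:
        declared = int(clusters[0]["node_count"])
        if declared != len(nodes):
            problems.append(
                f"node_count = {declared} but {len(nodes)} node stanzas found"
            )

    # Every node must have a distinct number, name and IP address.
    for field in ("number", "name", "ip_address"):
        counts = Counter(node[field] for node in nodes)
        for value, seen in counts.items():
            if seen > 1:
                problems.append(f"duplicate {field}: {value}")

    return nodes, problems


if __name__ == "__main__":
    nodes, problems = check_cluster_conf(CLUSTER_CONF)
    print(f"{len(nodes)} nodes parsed")
    for problem in problems or ["no inconsistencies found"]:
        print(problem)
```

A check like this only rules out inconsistencies in the file itself. It says nothing about whether every node in the cluster has an identical copy of it, which would have to be compared separately.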
Sunil Mushran
2010-Dec-08 17:05 UTC
[Ocfs2-users] kernel BUG at fs/ocfs2/dlm/dlmthread.c:266
<http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.26.y.git;a=blob;f=fs/ocfs2/dlm/dlmthread.c;h=4060bb328bc8a08c22bbd77c59835d757ebdcda5;hb=refs/tags/v2.6.26.2#l265>

    265         if (dlm_purge_lockres(dlm, lockres))
    266                 BUG();

Known issue.
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=7beaf243787f85a2ef9213ccf13ab4a243283fde

However, I don't know whether we can apply it as is on 2.6.26. That's an old kernel and may require some other patches too.

Fixed in 2.6.36. Backported to 2.6.32, 2.6.34 and 2.6.35.
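
To apply the version information above across a set of nodes, a small check along the following lines may be useful. This is only a sketch: the function names (base_version, fix_status) and the status strings are hypothetical, it compares upstream version numbers only, and it cannot tell whether a distribution kernel carries the patch as a vendor backport. The reply does not say which stable point releases received the backport, so those series are reported as "check the point release" rather than as fixed.

```python
import re

# From the reply above: fixed in 2.6.36, backported to the
# 2.6.32, 2.6.34 and 2.6.35 stable series (point releases not stated).
FIXED_IN = (2, 6, 36)
BACKPORTED_SERIES = {(2, 6, 32), (2, 6, 34), (2, 6, 35)}

FIXED = "fixed upstream"
BACKPORTED = "backported to this series; check the point release"
UNFIXED = "not fixed upstream"


def base_version(release):
    """Extract (major, minor, patch) from a string such as '2.6.26-2-amd64'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing third component (e.g. '3.2') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def fix_status(release):
    """Classify a kernel release against the versions named in the reply."""
    version = base_version(release)
    if version >= FIXED_IN:
        return FIXED
    if version in BACKPORTED_SERIES:
        return BACKPORTED
    return UNFIXED


if __name__ == "__main__":
    for release in ("2.6.26-2-amd64", "2.6.32-5-amd64", "2.6.36"):
        print(f"{release}: {fix_status(release)}")
```

On a running node, the release string could be taken from `platform.release()` in the Python standard library, which returns the same value as `uname -r`.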