Guozhonghua
2014-Feb-21 03:28 UTC
[Ocfs2-users] Hi everyone, is it an issue? The host is blocked as the issue created. Thanks a lot.
Hi everyone,

We are testing the performance of OCFS2 with fio. While the test is running, one host of the OCFS2 cluster blocks for a short time and then restarts.

The test environment has six hosts sharing one iSCSI LUN of about 1 TB. The LUN is formatted with OCFS2 and mounted at /vms/vStore on every host.

All hosts run Ubuntu 12.04 with the kernel upgraded to 3.2.50, and the ocfs2 modules are compiled against kernel 3.2.50.

We run fio on every host. The fio job file is shown below; the file names are different on every host, for example file1...file5 on host1, file6...file10 on host2, and so on.

One example fio job file:

root@cvknode4:~/fios_test4# cat 1024k_10r
[global]
ioengine=libaio
rw=read
bs=1024K
time_based
runtime=180
size=9g
direct=1
iodepth=1

[file1]
filename=/vms/vStor/file41

[file2]
filename=/vms/vStor/file42

[file3]
filename=/vms/vStor/file43

[file4]
filename=/vms/vStor/file44

[file5]
filename=/vms/vStor/file45

We start fio on the hosts one after another. Several minutes later, one host blocks and restarts (it is fenced).

Is this an OCFS2 issue? Is there a patch that fixes it?
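The per-host naming scheme described above could be scripted roughly as follows. This is only a sketch: the helper name `make_job`, its `first_index` parameter, and the idea that each host owns a contiguous block of five file numbers are illustrative assumptions, not details of the actual test setup. The [global] options and the directory are copied from the example job file.

```python
# Options copied from the [global] section of the example job file.
GLOBAL_OPTS = [
    "ioengine=libaio",
    "rw=read",
    "bs=1024K",
    "time_based",
    "runtime=180",
    "size=9g",
    "direct=1",
    "iodepth=1",
]


def make_job(first_index, files_per_host=5, directory="/vms/vStor"):
    """Return the text of a fio job file for one host.

    first_index is the number of the first data file owned by this host
    (hypothetical parameter). The directory default is spelled exactly as
    in the example job file.
    """
    lines = ["[global]"] + GLOBAL_OPTS
    for n in range(files_per_host):
        lines.append("")  # blank line between sections
        lines.append(f"[file{n + 1}]")
        lines.append(f"filename={directory}/file{first_index + n}")
    return "\n".join(lines) + "\n"
```

Calling `make_job(41)` produces a job file with the same sections as the cvknode4 example; other hosts would pass a different `first_index`.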
The syslog is as below: Feb 19 17:50:01 cvknode9 CRON[16143]: (CRON) info (No MTA installed, discarding output) Feb 19 17:50:01 cvknode9 CRON[16147]: (CRON) info (No MTA installed, discarding output) Feb 19 17:50:01 cvknode9 CRON[16146]: (CRON) info (No MTA installed, discarding output) Feb 19 17:50:01 cvknode9 CRON[16144]: (CRON) info (No MTA installed, discarding output) Feb 19 17:50:01 cvknode9 CRON[16141]: (CRON) info (No MTA installed, discarding output) Feb 19 17:50:02 cvknode9 CRON[16134]: (CRON) info (No MTA installed, discarding output) Feb 19 17:50:03 cvknode9 crmadmin: [16194]: ERROR: admin_message_timeout: No messages received in 2 seconds Feb 19 17:50:03 cvknode9 CRON[16140]: (CRON) info (No MTA installed, discarding output) Feb 19 17:51:00 cvknode9 kernel: [ 803.464977] ------------[ cut here ]------------ Feb 19 17:51:00 cvknode9 kernel: [ 803.464991] WARNING: at kernel/watchdog.c:241 watchdog_overflow_callback+0x9a/0xc0() Feb 19 17:51:00 cvknode9 kernel: [ 803.464993] Hardware name: FlexServer B590 Feb 19 17:51:00 cvknode9 kernel: [ 803.464995] Watchdog detected hard LOCKUP on cpu 0 Feb 19 17:51:00 cvknode9 kernel: [ 803.464997] Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables ocfs2(O) quota_tree drbd lru_cache 8021q garp stp vhost_net macvtap macvlan kvm_intel kvm ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp ocfs2_dlmfs(O) ocfs2_stack_o2cb(O) ocfs2_dlm(O) ocfs2_nodemanager(O) ocfs2_stackglue(O) configfs openvswitch_mod(O) nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc psmouse ioatdma dm_multipath serio_raw sb_edac hpilo edac_core dca acpi_power_meter mac_hid lp parport hpsa be2iscsi iscsi_boot_sysfs libiscsi be2net scsi_transport_iscsi Feb 19 17:51:00 cvknode9 kernel: [ 803.465065] Pid: 6029, comm: ocfs2dc Tainted: G O 3.2.50 #1 Feb 19 17:51:00 cvknode9 kernel: [ 803.465067] Call Trace: Feb 19 17:51:00 cvknode9 kernel: [ 803.465069] <NMI> [<ffffffff81066daf>] 
warn_slowpath_common+0x7f/0xc0 Feb 19 17:51:00 cvknode9 kernel: [ 803.465084] [<ffffffff81066ea6>] warn_slowpath_fmt+0x46/0x50 Feb 19 17:51:00 cvknode9 kernel: [ 803.465089] [<ffffffff8101b833>] ? native_sched_clock+0x13/0x80 Feb 19 17:51:00 cvknode9 kernel: [ 803.465093] [<ffffffff810d6b1a>] watchdog_overflow_callback+0x9a/0xc0 Feb 19 17:51:00 cvknode9 kernel: [ 803.465099] [<ffffffff8110eb76>] __perf_event_overflow+0x96/0x1f0 Feb 19 17:51:00 cvknode9 kernel: [ 803.465103] [<ffffffff8110c491>] ? perf_event_update_userpage+0x11/0xc0 Feb 19 17:51:00 cvknode9 kernel: [ 803.465109] [<ffffffff8102468a>] ? x86_perf_event_set_period+0xda/0x150 Feb 19 17:51:00 cvknode9 kernel: [ 803.465113] [<ffffffff8110f534>] perf_event_overflow+0x14/0x20 Feb 19 17:51:00 cvknode9 kernel: [ 803.465118] [<ffffffff81028c93>] intel_pmu_handle_irq+0x163/0x2e0 Feb 19 17:51:00 cvknode9 kernel: [ 803.465130] [<ffffffff81644b01>] perf_event_nmi_handler+0x21/0x30 Feb 19 17:51:00 cvknode9 kernel: [ 803.465134] [<ffffffff816443d1>] do_nmi+0x101/0x350 Feb 19 17:51:00 cvknode9 kernel: [ 803.465138] [<ffffffff81643a30>] nmi+0x20/0x30 Feb 19 17:51:00 cvknode9 kernel: [ 803.465147] [<ffffffff8103db15>] ? __ticket_spin_lock+0x25/0x30 Feb 19 17:51:00 cvknode9 kernel: [ 803.465149] <<EOE>> <IRQ> [<ffffffff81642fee>] _raw_spin_lock+0xe/0x20 Feb 19 17:51:00 cvknode9 kernel: [ 803.465216] [<ffffffffa03e5487>] ocfs2_wake_downconvert_thread+0x27/0x60 [ocfs2] Feb 19 17:51:00 cvknode9 kernel: [ 803.465231] [<ffffffffa03e5554>] __ocfs2_cluster_unlock.isra.32+0x94/0xf0 [ocfs2] Feb 19 17:51:00 cvknode9 kernel: [ 803.465245] [<ffffffffa03e5b2b>] ocfs2_rw_unlock+0x6b/0xe0 [ocfs2] Feb 19 17:51:00 cvknode9 kernel: [ 803.465252] [<ffffffff811aa22f>] ? 
bio_free+0x5f/0x70 Feb 19 17:51:00 cvknode9 kernel: [ 803.465264] [<ffffffffa03cfa2a>] ocfs2_dio_end_io+0x6a/0x110 [ocfs2] Feb 19 17:51:00 cvknode9 kernel: [ 803.465268] [<ffffffff811ad806>] dio_complete+0xe6/0xf0 Feb 19 17:51:00 cvknode9 kernel: [ 803.465271] [<ffffffff811ad87d>] dio_bio_end_aio+0x6d/0xc0 Feb 19 17:51:00 cvknode9 kernel: [ 803.465275] [<ffffffff816432d5>] ? _raw_spin_lock_irq+0x15/0x20 Feb 19 17:51:00 cvknode9 kernel: [ 803.465279] [<ffffffff811a8ecd>] bio_endio+0x1d/0x40 Feb 19 17:51:00 cvknode9 kernel: [ 803.465286] [<ffffffff812ebaf3>] req_bio_endio.isra.45+0xa3/0xe0 Feb 19 17:51:00 cvknode9 kernel: [ 803.465290] [<ffffffff812ec23d>] blk_update_request+0xfd/0x480 Feb 19 17:51:00 cvknode9 kernel: [ 803.465293] [<ffffffff812ec5f1>] blk_update_bidi_request+0x31/0x90 Feb 19 17:51:00 cvknode9 kernel: [ 803.465297] [<ffffffff812ed8ec>] blk_end_bidi_request+0x2c/0x80 Feb 19 17:51:00 cvknode9 kernel: [ 803.465301] [<ffffffff812ed980>] blk_end_request+0x10/0x20 Feb 19 17:51:00 cvknode9 kernel: [ 803.465308] [<ffffffff814229bf>] scsi_io_completion+0xaf/0x630 Feb 19 17:51:00 cvknode9 kernel: [ 803.465316] [<ffffffff81418ebc>] scsi_finish_command+0xcc/0x130 Feb 19 17:51:00 cvknode9 kernel: [ 803.465319] [<ffffffff8142281e>] scsi_softirq_done+0x13e/0x150 Feb 19 17:51:00 cvknode9 kernel: [ 803.465325] [<ffffffff812f38b3>] blk_done_softirq+0x83/0xa0 Feb 19 17:51:00 cvknode9 kernel: [ 803.465331] [<ffffffff8104f835>] ? check_preempt_curr+0x75/0xa0 Feb 19 17:51:00 cvknode9 kernel: [ 803.465336] [<ffffffff8106e438>] __do_softirq+0xa8/0x210 Feb 19 17:51:00 cvknode9 kernel: [ 803.465339] [<ffffffff8104f89d>] ? 
ttwu_do_wakeup+0x3d/0x120 Feb 19 17:51:00 cvknode9 kernel: [ 803.465345] [<ffffffff8164d3ec>] call_softirq+0x1c/0x30 Feb 19 17:51:00 cvknode9 kernel: [ 803.465352] [<ffffffff81016205>] do_softirq+0x65/0xa0 Feb 19 17:51:00 cvknode9 kernel: [ 803.465355] [<ffffffff8106e81e>] irq_exit+0x8e/0xb0 Feb 19 17:51:00 cvknode9 kernel: [ 803.465361] [<ffffffff810313c5>] smp_call_function_single_interrupt+0x35/0x40 Feb 19 17:51:00 cvknode9 kernel: [ 803.465366] [<ffffffff8164ce5e>] call_function_single_interrupt+0x6e/0x80 Feb 19 17:51:00 cvknode9 kernel: [ 803.465368] <EOI> [<ffffffffa01eda72>] ? o2net_send_message_vec+0x142/0x9f0 [ocfs2_nodemanager] Feb 19 17:51:00 cvknode9 kernel: [ 803.465380] [<ffffffff8103dafd>] ? __ticket_spin_lock+0xd/0x30 Feb 19 17:51:00 cvknode9 kernel: [ 803.465384] [<ffffffff81642fee>] _raw_spin_lock+0xe/0x20 Feb 19 17:51:00 cvknode9 kernel: [ 803.465398] [<ffffffffa03e862f>] ocfs2_downconvert_thread+0x1af/0xc50 [ocfs2] Feb 19 17:51:00 cvknode9 kernel: [ 803.465402] [<ffffffff810136e5>] ? __switch_to+0xf5/0x360 Feb 19 17:51:00 cvknode9 kernel: [ 803.465408] [<ffffffff8108a6b0>] ? add_wait_queue+0x60/0x60 Feb 19 17:51:00 cvknode9 kernel: [ 803.465421] [<ffffffffa03e8480>] ? ocfs2_downconvert_lock+0x250/0x250 [ocfs2] Feb 19 17:51:00 cvknode9 kernel: [ 803.465425] [<ffffffff81089c0c>] kthread+0x8c/0xa0 Feb 19 17:51:00 cvknode9 kernel: [ 803.465429] [<ffffffff8164d2f4>] kernel_thread_helper+0x4/0x10 Feb 19 17:51:00 cvknode9 kernel: [ 803.465433] [<ffffffff81089b80>] ? flush_kthread_worker+0xa0/0xa0 Feb 19 17:51:00 cvknode9 kernel: [ 803.465436] [<ffffffff8164d2f0>] ? gs_change+0x13/0x13 Feb 19 17:51:00 cvknode9 kernel: [ 803.465438] ---[ end trace aa7a8184efeebe01 ]--- Feb 19 17:51:15 cvknode9 kernel: [ 819.045428] o2net: Connection to node cvmnode (num 2) at 192.168.3.5:7100 shutdown, state 8 Feb 19 17:51:15 cvknode9 kernel: [ 819.049277] o2net: Connection to node cvmnode (num 2) at 192.168.3.5:7100 has been idle for 30.64 secs, shutting it down. 
Feb 19 17:51:15 cvknode9 kernel: [ 819.049284] o2net_idle_timer 1598: Local and remote node is heartbeating, and try connect Feb 19 17:51:15 cvknode9 kernel: [ 819.368962] o2net: Connection to node cvknode13 (num 6) at 192.168.3.13:7100 has been idle for 30.102 secs, shutting it down. Feb 19 17:51:15 cvknode9 kernel: [ 819.368973] o2net_idle_timer 1598: Local and remote node is heartbeating, and try connect Feb 19 17:51:15 cvknode9 kernel: [ 819.378952] o2net: Connection to node cvknode13 (num 6) at 192.168.3.13:7100 shutdown, state 8 Feb 19 17:51:19 cvknode9 kernel: [ 822.754439] ------------[ cut here ]------------ Feb 19 17:51:19 cvknode9 kernel: [ 822.754451] WARNING: at kernel/watchdog.c:241 watchdog_overflow_callback+0x9a/0xc0() Feb 19 17:51:19 cvknode9 kernel: [ 822.754454] Hardware name: FlexServer B590 Feb 19 17:51:19 cvknode9 kernel: [ 822.754455] Watchdog detected hard LOCKUP on cpu 28 Feb 19 17:51:19 cvknode9 kernel: [ 822.754457] Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables ocfs2(O) quota_tree drbd lru_cache 8021q garp stp vhost_net macvtap macvlan kvm_intel kvm ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp ocfs2_dlmfs(O) ocfs2_stack_o2cb(O) ocfs2_dlm(O) ocfs2_nodemanager(O) ocfs2_stackglue(O) configfs openvswitch_mod(O) nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc psmouse ioatdma dm_multipath serio_raw sb_edac hpilo edac_core dca acpi_power_meter mac_hid lp parport hpsa be2iscsi iscsi_boot_sysfs libiscsi be2net scsi_transport_iscsi Feb 19 17:51:19 cvknode9 kernel: [ 822.754530] Pid: 230, comm: kworker/u:1 Tainted: G W O 3.2.50 #1 Feb 19 17:51:19 cvknode9 kernel: [ 822.754532] Call Trace: Feb 19 17:51:19 cvknode9 kernel: [ 822.754534] <NMI> [<ffffffff81066daf>] warn_slowpath_common+0x7f/0xc0 Feb 19 17:51:19 cvknode9 kernel: [ 822.754546] [<ffffffff81066ea6>] warn_slowpath_fmt+0x46/0x50 Feb 19 17:51:19 cvknode9 kernel: [ 822.754551] [<ffffffff8101b833>] ? 
native_sched_clock+0x13/0x80 Feb 19 17:51:19 cvknode9 kernel: [ 822.754555] [<ffffffff810d6b1a>] watchdog_overflow_callback+0x9a/0xc0 Feb 19 17:51:19 cvknode9 kernel: [ 822.754559] [<ffffffff8110eb76>] __perf_event_overflow+0x96/0x1f0 Feb 19 17:51:19 cvknode9 kernel: [ 822.754563] [<ffffffff8110c491>] ? perf_event_update_userpage+0x11/0xc0 Feb 19 17:51:19 cvknode9 kernel: [ 822.754568] [<ffffffff8102468a>] ? x86_perf_event_set_period+0xda/0x150 Feb 19 17:51:19 cvknode9 kernel: [ 822.754572] [<ffffffff8110f534>] perf_event_overflow+0x14/0x20 Feb 19 17:51:19 cvknode9 kernel: [ 822.754576] [<ffffffff81028c93>] intel_pmu_handle_irq+0x163/0x2e0 Feb 19 17:51:19 cvknode9 kernel: [ 822.754583] [<ffffffff81644b01>] perf_event_nmi_handler+0x21/0x30 Feb 19 17:51:19 cvknode9 kernel: [ 822.754587] [<ffffffff816443d1>] do_nmi+0x101/0x350 Feb 19 17:51:19 cvknode9 kernel: [ 822.754591] [<ffffffff81643a30>] nmi+0x20/0x30 Feb 19 17:51:19 cvknode9 kernel: [ 822.754598] [<ffffffff8103db15>] ? __ticket_spin_lock+0x25/0x30 Feb 19 17:51:19 cvknode9 kernel: [ 822.754599] <<EOE>> [<ffffffff81642fee>] _raw_spin_lock+0xe/0x20 Feb 19 17:51:19 cvknode9 kernel: [ 822.754669] [<ffffffffa03e3034>] ocfs2_schedule_blocked_lock+0x84/0x130 [ocfs2] Feb 19 17:51:19 cvknode9 kernel: [ 822.754684] [<ffffffffa03ea90b>] ocfs2_blocking_ast+0x24b/0x2b0 [ocfs2] Feb 19 17:51:19 cvknode9 kernel: [ 822.754692] [<ffffffffa021ffea>] ? __dlm_lookup_lockres_full+0xba/0x130 [ocfs2_dlm] Feb 19 17:51:19 cvknode9 kernel: [ 822.754696] [<ffffffff81642fee>] ? _raw_spin_lock+0xe/0x20 Feb 19 17:51:19 cvknode9 kernel: [ 822.754700] [<ffffffffa00ba020>] ? o2dlm_lock_ast_wrapper+0x20/0x20 [ocfs2_stack_o2cb] Feb 19 17:51:19 cvknode9 kernel: [ 822.754704] [<ffffffffa00ba034>] o2dlm_blocking_ast_wrapper+0x14/0x20 [ocfs2_stack_o2cb] Feb 19 17:51:19 cvknode9 kernel: [ 822.754711] [<ffffffffa02396eb>] dlm_do_local_bast+0x4b/0xe0 [ocfs2_dlm] Feb 19 17:51:19 cvknode9 kernel: [ 822.754716] [<ffffffffa02201f8>] ? 
dlm_lookup_lockres+0x88/0xa0 [ocfs2_dlm] Feb 19 17:51:19 cvknode9 kernel: [ 822.754722] [<ffffffffa0239f86>] dlm_proxy_ast_handler+0x806/0xa10 [ocfs2_dlm] Feb 19 17:51:19 cvknode9 kernel: [ 822.754728] [<ffffffff81077e5c>] ? mod_timer+0x24c/0x2f0 Feb 19 17:51:19 cvknode9 kernel: [ 822.754734] [<ffffffff810823fe>] ? queue_delayed_work_on+0xbe/0x1a0 Feb 19 17:51:19 cvknode9 kernel: [ 822.754741] [<ffffffffa01eb003>] ? o2net_handler_tree_lookup+0x23/0xc0 [ocfs2_nodemanager] Feb 19 17:51:19 cvknode9 kernel: [ 822.754748] [<ffffffffa01ed036>] o2net_rx_until_empty+0x506/0xe00 [ocfs2_nodemanager] Feb 19 17:51:19 cvknode9 kernel: [ 822.754753] [<ffffffff8104f698>] ? hrtick_update+0x38/0x40 Feb 19 17:51:19 cvknode9 kernel: [ 822.754757] [<ffffffff81056838>] ? dequeue_task_fair+0xb8/0x100 Feb 19 17:51:19 cvknode9 kernel: [ 822.754762] [<ffffffff810136e5>] ? __switch_to+0xf5/0x360 Feb 19 17:51:19 cvknode9 kernel: [ 822.754767] [<ffffffff810843c7>] process_one_work+0x127/0x470 Feb 19 17:51:19 cvknode9 kernel: [ 822.754771] [<ffffffff810854a4>] worker_thread+0x164/0x370 Feb 19 17:51:19 cvknode9 kernel: [ 822.754775] [<ffffffff81085340>] ? manage_workers.isra.31+0x230/0x230 Feb 19 17:51:19 cvknode9 kernel: [ 822.754780] [<ffffffff81089c0c>] kthread+0x8c/0xa0 Feb 19 17:51:19 cvknode9 kernel: [ 822.754785] [<ffffffff8164d2f4>] kernel_thread_helper+0x4/0x10 Feb 19 17:51:19 cvknode9 kernel: [ 822.754789] [<ffffffff81089b80>] ? flush_kthread_worker+0xa0/0xa0 Feb 19 17:51:19 cvknode9 kernel: [ 822.754793] [<ffffffff8164d2f0>] ? 
gs_change+0x13/0x13 Feb 19 17:51:19 cvknode9 kernel: [ 822.754794] ---[ end trace aa7a8184efeebe02 ]--- Feb 19 17:51:45 cvknode9 kernel: [ 849.247579] INFO: rcu_sched detected stalls on CPUs/tasks: { 0 28} (detected by 32, t=15002 jiffies) Feb 19 17:51:45 cvknode9 kernel: [ 849.247598] sending NMI to all CPUs:
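A trace like the one above is easier to scan once it is reduced to the CPU, the task name, and the module frames of each hard-lockup warning. The following is a minimal sketch of that reduction. The function name `summarize_lockups` and the regular expressions are assumptions fitted to this particular log layout; they are not part of any kernel tooling.

```python
import re

LOCKUP_RE = re.compile(r"Watchdog detected hard LOCKUP on cpu (\d+)")
TASK_RE = re.compile(r"Pid: (\d+), comm: (\S+)")
# One stack frame: "[<addr>] [? ]func+0xoff/0xsize [module]".
# A leading "?" marks a frame the kernel considers unreliable.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<guess>\?\s+)?(?P<func>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<module>\w+)\])?"
)


def summarize_lockups(log_text):
    """Return one dict per hard-lockup warning found in log_text."""
    reports = []
    # Each warning sits between a "cut here" and an "end trace" marker.
    for block in re.findall(r"cut here(.*?)end trace", log_text, re.S):
        cpu = LOCKUP_RE.search(block)
        if cpu is None:
            continue  # some other kind of WARNING
        task = TASK_RE.search(block)
        # Frames before <<EOE>> belong to the NMI watchdog itself.
        interrupted = block.split("<<EOE>>", 1)[-1]
        frames = [
            (m.group("func"), m.group("module"))
            for m in FRAME_RE.finditer(interrupted)
            if m.group("guess") is None and m.group("module")
        ]
        reports.append({
            "cpu": int(cpu.group(1)),
            "comm": task.group(2) if task else None,
            "module_frames": frames,
        })
    return reports
```

Feeding the saved syslog text to `summarize_lockups` gives one entry per warning. The first module frame after the NMI stack (ocfs2_wake_downconvert_thread for cpu 0 in the log above) shows where that CPU was spinning when the watchdog fired.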
Marty Sweet
2014-Feb-21 03:47 UTC
[Ocfs2-users] Hi everyone, is it an issue? The host is blocked as the issue created. Thanks a lot.
Hi,

Judging by the call trace, this looks like a problem communicating with the iSCSI target. I run 6 nodes with OCFS2 over Fibre Channel on Ubuntu 12.04 with Linux 3.5 and Linux 3.13 (a VM cluster and a Samba cluster).

Before investigating this any further, I would advise upgrading your kernel to at least 3.8 (which is officially supported by Ubuntu). There have been many improvements to OCFS2 in recent kernels, which have increased stability substantially (at least in our environment).

$ apt-get install linux-image-3.8.0-35-generic

If the problem still persists, does your iSCSI target have monitoring statistics that you could look into? If the network link is becoming saturated, this could be the issue (especially if the heartbeat is running over the same interface).

Could you also let us know which node is fencing, and whether all subsequent nodes receive this stack trace? Does the IO lock up? ( $ ls /vms/vStore )

Marty
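The `ls /vms/vStore` check suggested above can itself hang if the mount is stuck. A minimal sketch of the same check with a timeout follows, assuming the mount point from the original post. The function name `probe_mount` and the 10-second default are hypothetical choices.

```python
import os
import threading


def probe_mount(path, timeout_s=10.0):
    """Try to list `path`; return "ok", "error", or "hung".

    The listing runs in a daemon thread so that a blocked filesystem
    call cannot block the caller for longer than timeout_s seconds.
    """
    outcome = []

    def probe():
        try:
            os.listdir(path)
            outcome.append("ok")
        except OSError:
            outcome.append("error")

    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout_s)
    # No result after the timeout means the listing is still blocked.
    return outcome[0] if outcome else "hung"
```

Running `probe_mount("/vms/vStore")` on each node while fio is active would show whether IO on that node still responds. A "hung" result only means the listing did not return in time; it does not say why.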