thr3ads.net - similar to: "BUG: soft lockup

Displaying 20 results from an estimated 10000 matches similar to: "BUG: soft lockup - CPU#1 stuck for 61s"

2009 Jul 29

Error message while booting system

Hi, When system booting getting error message "modprobe: FATAL: Module ocfs2_stackglue not found" in message. Some nodes reboot without any error message. ------------------------------------------------- Jul 27 10:02:19 alf3 kernel: ip_tables: (C) 2000-2006 Netfilter Core Team Jul 27 10:02:19 alf3 kernel: Netfilter messages via NETLINK v0.30. Jul 27 10:02:19 alf3 kernel:

Network 10 sec timeout setting?

2007 Feb 06

Hello! Hey, didn't a setting for the 10 second network timeout get into the 2.6.20 kernel? If so, how do we set this? I am getting OCFS2 1.3.3 (2201,0):o2net_connect_expired:1547 ERROR: no connection established with node 1 after 10.0 seconds, giving up and returning errors. (2458,0):dlm_request_join:802 ERROR: status = -107 (2458,0):dlm_try_to_join_domain:950 ERROR: status = -107

servers blocked on ocfs2

2010 Dec 09

Hi, we have recently started to use ocfs2 on some RHEL 5.5 servers (ocfs2-1.4.7). Some days ago, two servers sharing an ocfs2 filesystem, and with quite virtual services, stalled, in what seems to be an ocfs2 issue. These are the lines in their messages files: =====node heraclito (0)======================================== /Dec 4 09:15:06 heraclito kernel: o2net: connection to node parmenides

shutdown by o2net_idle_timer causes Xen to hang

2009 Mar 18

Hello, we've had some serious trouble with a two-node Xen-based OCFS2 cluster. In brief: we had two incidents where one node detects an idle timeout and shuts the other node down which causes the other node and the Dom0 to hang. Both times this could only be resolved by rebooting the whole machine using the built-in IPMI card. All machines (including the other DomUs) run Centos 5.2

OCFS2 Node restart

2009 Jul 22

Hi, I have a 6-node cluster with OCFS2 1.4.2 running on a VMware virtual system, RedHat 5.2 (2.6.18-128.1.16.el5) 64bit. Of the 6 nodes, two nodes, alf0 and alf3, reboot automatically. I enabled remote logging for the kernel, and here is the log. I noticed the VM becomes unresponsive and suddenly reboots. I am running the Alfresco (document sharing) application; all nodes are accessing a common share on OCFS.

another fencing question

2010 Jan 14

Hi, periodically one node of my two-node cluster is fenced. Here are the logs: Jan 14 07:01:44 nvr1-rc kernel: o2net: no longer connected to node nvr2-rc.minint.it (num 0) at 1.1.1.6:7777 Jan 14 07:01:44 nvr1-rc kernel: (21534,1):dlm_do_master_request:1334 ERROR: link to 0 went down! Jan 14 07:01:44 nvr1-rc kernel: (4007,4):dlm_send_proxy_ast_msg:458 ERROR: status = -112 Jan 14 07:01:44

question about oracle shared home install

2009 Jun 09

Hi All, Scenario: I'm trying to install 9i rac on a 2 node cluster on OCFS2 OS: Oracle enterprise linux To my understanding, OCFS2 supports shared home installs which to my knowledge is not only can i have datafile and control files but also clustermanager files and binaries (pretty much everything: no files or executables need to kept local to any nodes). I have one single shared file for

Reg: ocfs2 two node cluster crashed, node2 crashed, when I rebooted node1 for maintenance.

2010 Oct 23

Hi All, We have ocfs2 node cluster with oracle 11G RAC running, The node2 got crashed automatically, when i rebooted node one for maintenance. please check the log from node2 , before its got crashed. Oct 23 15:42:25 node2 kernel: ocfs2_dlm: Nodes in domain ("029C02C993E44E90879922E268FB161A"): 2 Oct 23 15:42:29 node2 kernel: ocfs2_dlm: Node 1 leaves domain

Troubles with two node

2007 Nov 29

Hi all, I'm running OCFS2 on two systems with OpenSUSE 10.2 connected over fibre channel to a shared storage (HP MSA1500 + HP PROLIANT MSA20). The cluster has two nodes (web-ha1 and web-ha2); sometimes (1 or 2 times a month) OCFS2 stops working on both systems. On the first node I'm getting no error in log files and after a forced shutdown of the first node on the second I can see

dlm timeouts and following errors -112

2007 Feb 26

Hi list, I am experimenting with ocfs2 (rpm package: 1.2.2-0.2), using linux-ha 2.0.8 (all running on a SLES 10 x86-64, rpm packages from linux-ha.org) for the heartbeat. The three nodes are connected on a gigabit switch. From time to time I have problems to unmount a drive, and I have to reboot the whole system to fix the problem. When these lockups occur, I see these messages in

Strange dmesg messages

2009 Feb 04

Hi list, Something went wrong this morning and we had a node ( #0 ) reboot. Something blocked the NFS access from both nodes; one rebooted, and on the other we restarted nfsd and it brought it back. Looking at node #0 - the one that rebooted - the logs all seem normal, but looking at the other node's dmesg we saw these messages: First the o2net detected that node #0 was dead: (It

iscsi connection drop, comes back in seconds, then deadlock in cluster

2009 Nov 06

Greetings ocfs2 folks, A client is experiencing some random deadlock issues within a cluster, wondering if anyone can point us in the right direction. The iSCSI connection seemed to have dropped on one node briefly, ultimately several hours later landing us in a complete deadlock scenario where multiple nodes (Node 7 and Node 8) had to be panic'd (by hand - they didn't ever panic on

Node fence on RHEL4 machine running 1.2.8-2

2008 Jul 14

Hello, We have a four-node RHEL4 RAC cluster running OCFS2 version 1.2.8-2 and the 2.6.9-67.0.4hugemem kernel. The cluster has been really stable since we upgraded to 1.2.8-2 early this year, but this morning, one of the nodes fenced and rebooted itself, and I wonder if anyone could glance at the below remote syslogs and offer an opinion as to why. First, here's the output of

OCF2 and LVM

2007 Oct 08

Does anybody know if there is a certified procedure to back up a RAC DB 10.2.0.3 based on OCFS2, via split mirror or snapshot technology? Using Linux LVM and OCFS2, does anybody know if it is possible to dynamically extend an OCFS filesystem, once the underlying LVM Volume has been extended? Thanks in advance Riccardo Paganini

ocfs2 - Kernel panic on many write/read from both

2011 Dec 20

Sorry i don`t copy everything: TEST-MAIL1# echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc debugfs.ocfs2 1.6.4 5239722 26198604 246266859 TEST-MAIL1# echo "ls //orphan_dir:0001"|debugfs.ocfs2 /dev/dm-0|wc debugfs.ocfs2 1.6.4 6074335 30371669 285493670 TEST-MAIL2 ~ # echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc debugfs.ocfs2 1.6.4 5239722 26198604

One node hangs up issue requiring good idea, thanks

2014 Sep 26

Hi, all, As we use OCFS2, the network is not good. When the converting request message can't be sent to the other node, one node hangs, still waiting for the dlm. CAS2/logdir/var/log/syslog.1-6778-Sep 16 20:57:16 CAS2 kernel: [516366.623623] o2net: Connection to node CAS1 (num 1) at 10.172.254.1:7100 has been idle for 30.87 secs, shutting it down.

o2net patch that avoids socket disconnect/reconnect

2009 Nov 20

This fix modifies o2net layer behavior which seems to trigger some DLM race issues during umount/evictions that needs to be fixed as well. I am working on the dlm issues but meanwhile please review this patch. Thanks, --Srini

loss of connection

2010 Dec 15

My log says suddenly: Dec 14 02:35:16 hp1 kernel: [1492482.232822] o2net: no longer connected to node hp2 (num 1) at 192.168.1.2:7777 Dec 14 02:35:18 hp1 kernel: [1492483.960150] BUG: soft lockup - CPU#1 stuck for 61s! [kvm:32398] I have no idea what happens here and why - but the result are a lot of problems with virtual machines. Viele Grüße Andreas Rittershofer -- Hier könnte keine

ERROR: -91 after Kernel Upgrade

2011 May 10

Hey guys, I have an OCFS2 cluster mounted on 4 xen servers (gentoo). Today I upgraded the xen-kernel for tests on one server (server2) from 2.6.34-xen to 2.6.38-xen-r1. After reboot the server couldn't mount the ocfs2 device anymore. ocfs2-tools version: sys-fs/ocfs2-tools-1.4.3 Modules are loaded and /config type configfs and /dlm type ocfs2_dlmfs are mounted. server2 ~ # mount

Self-fencing issues (RHEL4)

2006 Apr 18

Hi. I'm running RHEL4 for my test system, Adaptec Firewire controllers, Maxtor One Touch III shared disk (see the details below), 100Mb/s dedicated interconnect. It panics with no load about every 20 minutes (error message from netconsole attached). Any clues? Yegor --- [root@rac1 ~]# cat /proc/fs/ocfs2/version OCFS2 1.2.0 Tue Mar 7 15:51:20 PST 2006 (build
