thr3ads.net - similar to: "RHEL 4 U2 / OCFS 1.2.1 weekly crash?"

Displaying 20 results from an estimated 6000 matches similar to: "RHEL 4 U2 / OCFS 1.2.1 weekly crash?"

Newbie questions -- is OCFS2 what I even want?

2006 Nov 03

Dear Sirs and Madams, I run a small visual effects production company, Hammerhead Productions. We'd like to have an easily extensible, inexpensive, relatively high-performance storage network using open-source components. I was hoping that OCFS2 would be that system. I have a half-dozen 2 TB fileservers I'd like the rest of the network to see as a single 12 TB disk, with the aggregate

Self-fencing issues (RHEL4)

2006 Apr 18

Hi. I'm running RHEL4 for my test system, Adaptec Firewire controllers, Maxtor One Touch III shared disk (see the details below), 100Mb/s dedicated interconnect. It panics under no load about every 20 minutes (error message from netconsole attached). Any clues? Yegor --- [root@rac1 ~]# cat /proc/fs/ocfs2/version OCFS2 1.2.0 Tue Mar 7 15:51:20 PST 2006 (build

Getting Closer (was: Fencing options)

2010 Jan 18

One more follow-on. The combination of kernel.panic=60 and kernel.printk=7 4 1 7 seems to have netted the culprit: E01-netconsole.log:Jan 18 09:45:10 E01 (10,0):o2hb_write_timeout:137 ERROR: Heartbeat write timeout to device dm-12 after 60000 milliseconds E01-netconsole.log:Jan 18 09:45:10 E01 (10,0):o2hb_stop_all_regions:1517 ERROR: stopping heartbeat on all active regions.

Private Interconnect and self fencing

2006 Jul 28

I have an OCFS2 filesystem on a Coraid AoE device. It mounts fine, but under heavy I/O the server self-fences, claiming a write timeout: (16,2):o2hb_write_timeout:164 ERROR: Heartbeat write timeout to device etherd/e0.1p1 after 12000 milliseconds (16,2):o2hb_stop_all_regions:1789 ERROR: stopping heartbeat on all active regions. Kernel panic - not syncing: ocfs2 is very sorry to be fencing this

6 node cluster with unexplained reboots

2007 Jul 29

We just installed a new cluster with 6 HP DL380g5, dual single-port Qlogic 24xx HBAs connected via two HP 4/16 Storageworks switches to a 3Par S400. We are using the 3Par recommended config for the Qlogic driver and device-mapper-multipath, giving us 4 paths to the SAN. We do see some SCSI errors where DM-MP is failing a path after getting a 0x2000 error from the SAN controller, but the path gets put

cluster with 2 nodes - heartbeat problem fencing

2008 Mar 05

Hi all, this is my first time on this mailing list. I have a problem with OCFS2 on Debian etch 4.0. When a node goes down or freezes without unmounting the OCFS2 partition, I'd like the heartbeat not to fence the server that is still working (kernel panic). I'd like to disable either the heartbeat or the fencing, so that we can keep working with only 1 node. Thanks

Unexplained reboots in DRBD82 + OCFS2 setup

2009 Jun 24

We're trying to set up a dual-primary DRBD environment, with a shared disk with either OCFS2 or GFS. The environment is CentOS 5.3 with DRBD82 (but also tried with DRBD83 from testing). Setting up a single-primary disk and running bonnie++ on it works. Setting up a dual-primary disk, only mounting it on one node (ext3) and running bonnie++ works. When setting up ocfs2 on the /dev/drbd0

Cannot set heartbeat dead threshold

2009 Nov 13

Hi I have: SLES 10 SP2 (2.6.16.60-0.21-smp) ocfs2-tools-1.4.0-0.3 ocfs2console-1.4.0-0.3 and I can't change "heartbeat dead threshold" value. Content of /etc/sysconfig/o2cb: # O2CB_ENABLED: 'true' means to load the driver on boot. O2CB_ENABLED=true # O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start. O2CB_BOOTCLUSTER=ocfs2 # O2CB_HEARTBEAT_THRESHOLD:

O2CB_HEARTBEAT_THRESHOLD won't take changes

2010 May 31

Hello All, I have multiple OCFS2 clusters on SLES10 SP2 running Xen. We needed to increase the O2CB_HEARTBEAT_THRESHOLD from 31 up to 61 and did so successfully on 2 of our 3 clusters. However on one of the three clusters we are not able to change the value. The /etc/sysconfig/o2cb file contains 61 as the threshold after reconfiguring via /etc/init.d/o2cb configure, we reconfigure all 3 nodes at

Fencing OCFS

2009 Oct 20

Hi people, I installed OCFS in my virtual machine, with CentOS 5.3 and Xen. But when I turn off machine1, OCFS starts fencing machine2. I read the docs on oracle.com, but could not solve the problem. Can someone help? My conf and my package version. cluster.conf node: ip_port = 7777 ip_address = 192.168.1.35 number = 0 name = x1 cluster

VM node won't talk to host

2008 Aug 21

I am trying to mount the same partition from a KVM Ubuntu 8.04.1 virtual machine and on an Ubuntu 8.04.1 host server. I am able to mount the partition just fine on two Ubuntu host servers; they both talk to each other. The logs on both servers show the other machine mounting and unmounting the drive. However, when I mount the drive in the KVM VM I get no communication to the host

mount.ocfs2: Invalid argument while mounting /dev/mapper/xenconfig_part1 on /etc/xen/vm/. Check 'dmesg' for more information on this error.

2011 Jul 14

Hello, this is my scenario: 1) I've created a Pacemaker cluster with the following ocfs packages on openSUSE 11.3 64bit: ocfs2console-1.8.0-2.1.x86_64 ocfs2-tools-o2cb-1.8.0-2.1.x86_64 ocfs2-tools-1.8.0-2.1.x86_64 2) I've configured the cluster as usual: <resources> <clone id="dlm-clone"> <meta_attributes id="dlm-clone-meta_attributes">

Troubles with two node

2007 Nov 29

Hi all, I'm running OCFS2 on two systems with openSUSE 10.2 connected over Fibre Channel to shared storage (HP MSA1500 + HP PROLIANT MSA20). The cluster has two nodes (web-ha1 and web-ha2). Sometimes (once or twice a month) OCFS2 stops working on both systems. On the first node I'm getting no error in the log files, and after a forced shutdown of the first node, on the second I can see

O2CB global heartbeat - hopefully final drop!

2010 Oct 08

All, This is hopefully the final drop of the patches for adding global heartbeat to the o2cb stack. The diff from the previous set is here: http://oss.oracle.com/~smushran/global-hb-diff-2010-10-07 Implemented most of the suggestions provided by Joel and Wengang. The most important one was to activate the feature only at the end. Also, got a mostly clean run with checkpatch.pl. Sunil

ocfs2 fencing problem

2008 Jul 01

Hi, Sunil or Tao, I have a 4-node OCFS2 cluster running OCFS2 1.2.8 on SuSE 9 SP4. When I tried to do failover testing (shutting down one node), the whole cluster hung (I cannot even log in to any server in the cluster). I have to bring all of them back up before I am able to use the system. What kind of behavior is this? Is it OCFS2 fencing? Below is my configuration. aopcer13:~ #

2 Node cluster crashing

2006 Jul 10

Hi, We have a two node cluster running SLES 9 SP2 connecting directly to an EMC CX300 for storage. We are using OCFS(OCFS2 DLM 0.99.15-SLES) for the voting disk etc, and ASM for data files. The system has been running until last Friday when the whole cluster went down with the following error messages in the /var/log/messages files : rac1: Jul 7 14:56:23 rac1 kernel:

raid5 crash

2005 Jul 07

Hi, after we switched our servers from centos-3 to centos-4 (aka rhel-4), one of our servers crashes about once a week without any oops. This happens with both the normal kernel-2.6.9-11.EL and kernel-2.6.9-11.106.unsupported. Even after we changed the motherboard, the RAID controller and the cables, we still got it. Finally we started netdump, and yesterday we got a crash log and a

Another node is heartbeating in our slot! errors with LUN removal/addition

2008 Oct 22

Greetings, Last night I manually unpresented and deleted a LUN (a SAN snapshot) that was presented to one node in a four node RAC environment running OCFS2 v1.4.1-1. The system then rebooted with the following error: Oct 21 16:45:34 ausracdb03 kernel: (27,1):o2hb_write_timeout:166 ERROR: Heartbeat write timeout to device dm-24 after 120000 milliseconds Oct 21 16:45:34 ausracdb03 kernel:

Unable to stop cluster as heartbeat region still active

2011 Oct 18

Hi, I have a 2-node ocfs2 cluster running UEK 2.6.32-100.0.19.el5, ocfs2console-1.6.3-2.el5, ocfs2-tools-1.6.3-2.el5. My problem is that every time I try to run /etc/init.d/o2cb stop it fails with this error: Stopping O2CB cluster CLUSTER: Failed Unable to stop cluster as heartbeat region still active There is no active mount point. I tried to manually stop the heartbeat with
