similar to: another fencing question

Displaying 20 results from an estimated 200 matches similar to: "another fencing question"

2009 May 12
2
add error check for ocfs2_read_locked_inode() call
After upgrading from 2.6.28.10 to 2.6.29.3 I've seen the following new errors in the kernel log: May 12 14:46:41 falcon-cl5 May 12 14:46:41 falcon-cl5 (6757,7):ocfs2_read_locked_inode:466 ERROR: status = -22 Only one node has volumes mounted in the cluster: /dev/sde on /home/apache/users/D1 type ocfs2 (rw,_netdev,noatime,heartbeat=local) /dev/sdd on /home/apache/users/D2 type ocfs2
2010 Dec 09
2
servers blocked on ocfs2
Hi, we have recently started to use ocfs2 on some RHEL 5.5 servers (ocfs2-1.4.7). Some days ago, two servers sharing an ocfs2 filesystem, and running quite a few virtual services, stalled in what seems to be an ocfs2 issue. These are the lines in their messages files: =====node heraclito (0)======================================== Dec 4 09:15:06 heraclito kernel: o2net: connection to node parmenides
2010 Jun 19
3
[PATCH 1/1] ocfs2 fix o2dlm dlm run purgelist
There are two problems in dlm_run_purgelist. 1. If a lockres is found to be in use, dlm_run_purgelist keeps trying to purge the same lockres instead of moving on to the next one. 2. When a lockres is found unused, dlm_run_purgelist releases the lockres spinlock before setting DLM_LOCK_RES_DROPPING_REF and calling dlm_purge_lockres. The spinlock is reacquired, but in this window the lockres can get reused. This
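To make the window described in this patch summary concrete, here is a minimal userspace sketch, assuming a pthread mutex stands in for the kernel spinlock; fake_lockres, purge_buggy and purge_fixed are illustrative names, not the real fs/ocfs2/dlm code.

/* Minimal userspace sketch of the race window described above.
 * A pthread mutex stands in for the kernel spinlock; struct and
 * function names are illustrative, not the real fs/ocfs2/dlm code. */
#include <pthread.h>
#include <stdio.h>

#define RES_DROPPING_REF 0x1   /* stand-in for DLM_LOCK_RES_DROPPING_REF */

struct fake_lockres {
    pthread_mutex_t lock;      /* stand-in for res->spinlock */
    unsigned int flags;
};

/* Buggy ordering described in the report: the lock is dropped before the
 * DROPPING_REF flag is set, so another thread can reuse the lockres in
 * the window between the unlock and the flag update. */
static void purge_buggy(struct fake_lockres *res)
{
    pthread_mutex_lock(&res->lock);
    /* ... lockres found unused ... */
    pthread_mutex_unlock(&res->lock);
    /* <-- window: a concurrent user can take res->lock and reuse res here */
    res->flags |= RES_DROPPING_REF;
}

/* Fixed ordering: mark the lockres while the lock is still held, so any
 * concurrent user that takes the lock afterwards sees DROPPING_REF and
 * backs off; the purge loop then moves on to the next lockres. */
static void purge_fixed(struct fake_lockres *res)
{
    pthread_mutex_lock(&res->lock);
    res->flags |= RES_DROPPING_REF;
    pthread_mutex_unlock(&res->lock);
}

int main(void)
{
    struct fake_lockres a = { PTHREAD_MUTEX_INITIALIZER, 0 };
    struct fake_lockres b = { PTHREAD_MUTEX_INITIALIZER, 0 };

    purge_buggy(&a);
    purge_fixed(&b);
    printf("flags: buggy=%#x fixed=%#x\n", a.flags, b.flags);
    return 0;
}

Setting the flag while the lock is still held is what closes the window: any later user of the lockres must take the same lock first and can therefore observe the flag before reusing the resource.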
2007 Feb 26
1
dlm timeouts and following errors -112
Hi list, I am experimenting with ocfs2 (rpm package: 1.2.2-0.2), using linux-ha 2.0.8 (all running on a SLES 10 x86-64, rpm packages from linux-ha.org) for the heartbeat. The three nodes are connected on a gigabit switch. From time to time I have problems unmounting a drive, and I have to reboot the whole system to fix the problem. When these lockups occur, I see these messages in
2007 Nov 29
1
Trouble with two nodes
Hi all, I'm running OCFS2 on two systems with OpenSUSE 10.2, connected over fibre channel to a shared storage (HP MSA1500 + HP PROLIANT MSA20). The cluster has two nodes (web-ha1 and web-ha2); sometimes (1 or 2 times a month) OCFS2 stops working on both systems. On the first node I get no errors in the log files, and after a forced shutdown of the first node, on the second I can see
2007 Mar 08
4
ocfs2 cluster becomes unresponsive
We are running OCFS2 on SLES9 machines using a FC SAN. Without warning both nodes will become unresponsive. We cannot access either machine via ssh or terminal (it hangs after typing in the username). However, the machines still respond to pings. This continues until one node is rebooted, at which time the second node resumes normal operations. I am not entirely sure that this is an OCFS2 problem at all
2011 Dec 20
8
ocfs2 - Kernel panic on many write/read from both
Sorry, I didn't copy everything: TEST-MAIL1# echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc debugfs.ocfs2 1.6.4 5239722 26198604 246266859 TEST-MAIL1# echo "ls //orphan_dir:0001"|debugfs.ocfs2 /dev/dm-0|wc debugfs.ocfs2 1.6.4 6074335 30371669 285493670 TEST-MAIL2 ~ # echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc debugfs.ocfs2 1.6.4 5239722 26198604
2009 Mar 18
2
shutdown by o2net_idle_timer causes Xen to hang
Hello, we've had some serious trouble with a two-node Xen-based OCFS2 cluster. In brief: we had two incidents where one node detects an idle timeout and shuts the other node down which causes the other node and the Dom0 to hang. Both times this could only be resolved by rebooting the whole machine using the built-in IPMI card. All machines (including the other DomUs) run Centos 5.2
2009 Feb 04
1
Strange dmesg messages
Hi list, something went wrong this morning and we had a node (#0) reboot. Something blocked NFS access from both nodes; one rebooted, and on the other we restarted nfsd, which brought it back. Looking at the logs of node #0 - the one that rebooted - everything seems normal, but looking at the other node's dmesg we saw these messages: First, o2net detected that node #0 was dead: (It
2009 Jul 29
3
Error message while booting system
Hi, when the system is booting we get the error message "modprobe: FATAL: Module ocfs2_stackglue not found" in the messages log. Some nodes reboot without any error message. ------------------------------------------------- Jul 27 10:02:19 alf3 kernel: ip_tables: (C) 2000-2006 Netfilter Core Team Jul 27 10:02:19 alf3 kernel: Netfilter messages via NETLINK v0.30. Jul 27 10:02:19 alf3 kernel:
2007 Oct 08
2
OCFS2 and LVM
Does anybody know if there is a certified procedure to back up a RAC DB 10.2.0.3 based on OCFS2, via split-mirror or snapshot technology? Using Linux LVM and OCFS2, does anybody know if it is possible to dynamically extend an OCFS2 filesystem once the underlying LVM volume has been extended? Thanks in advance, Riccardo Paganini
2009 Nov 15
2
Trying to multiboot bartpe & puppy linux on a usb flash with syslinux
Hey list, I have an older 2005 Uniwill 259ia3 with a Phoenix bios. Flash booting is limited to keys 512 MB or less! Here's my setup recipe: used the HP utility v. 2.0.6 giving me this geometry ---------------- fdisk -l ----------- Disk /dev/sdb: 493 MB, 493879296 bytes 255 heads, 63 sectors/track, 60 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start
2011 Mar 04
1
node eviction
Hello... I wonder if someone has had a similar problem to this... a node gets evicted almost on a weekly basis and I have not found the root cause yet.... Mar 2 10:20:57 xirisoas3 kernel: ocfs2_dlm: Node 1 joins domain 129859624F7042EAB9829B18CA65FC88 Mar 2 10:20:57 xirisoas3 kernel: ocfs2_dlm: Nodes in domain ("129859624F7042EAB9829B18CA65FC88"): 1 2 3 4 Mar 3 16:18:02 xirisoas3 kernel:
2007 Feb 06
2
Network 10 sec timeout setting?
Hello! Hey, didn't a setting for the 10-second network timeout get into the 2.6.20 kernel? If so, how do we set it? I am getting OCFS2 1.3.3 (2201,0):o2net_connect_expired:1547 ERROR: no connection established with node 1 after 10.0 seconds, giving up and returning errors. (2458,0):dlm_request_join:802 ERROR: status = -107 (2458,0):dlm_try_to_join_domain:950 ERROR: status = -107
2014 Sep 11
1
Possible deadlock due to wrong locking order, patch review requested, thanks
As we test the ocfs2 cluster, the cluster sometimes hangs. I got some information about the deadlock that causes the hang: the lock on the sys dir / is held and the node does not release it, which causes the cluster to hang. root@cvknode-21:~# ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN | grep D PID STAT COMMAND WIDE-WCHAN-COLUMN 7489 D jbd2/sdh-621
2014 Sep 11
1
Possible deadlock due to wrong locking order, patch review requested, thanks
As we test the ocfs2 cluster, the cluster sometimes hangs. I got some information about the deadlock that causes the hang: the lock on the sys dir / is held and the node does not release it, which causes the cluster to hang. root@cvknode-21:~# ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN | grep D PID STAT COMMAND WIDE-WCHAN-COLUMN 7489 D jbd2/sdh-621
2023 Jun 13
1
[BUG] ocfs2/dlm: possible data races in dlm_drop_lockres_ref_done() and dlm_get_lock_resource()
Hello, Our static analysis tool finds some possible data races in the OCFS2 file system in Linux 6.4.0-rc6. In most calling contexts, variables such as res->lockname.name and res->owner are accessed while holding the lock res->spinlock. Here is an example: lockres_seq_start() --> Line 539 in dlmdebug.c spin_lock(&res->spinlock); --> Line 574 in dlmdebug.c (Lock
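To make the locking convention in this report concrete, here is a minimal userspace sketch, assuming pthread mutexes in place of the kernel spinlock; fake_lock_resource, read_owner_locked and read_owner_unlocked are hypothetical names, not the actual dlm_lock_resource API.

/* Minimal userspace sketch of the locking convention the report refers to.
 * A pthread mutex stands in for res->spinlock; the struct is illustrative,
 * not the real dlm_lock_resource. */
#include <pthread.h>
#include <stdio.h>

struct fake_lock_resource {
    pthread_mutex_t spinlock;  /* stand-in for res->spinlock */
    unsigned char owner;       /* stand-in for res->owner */
};

/* Convention used by readers such as lockres_seq_start(): take the
 * per-resource lock and snapshot the field while holding it. */
static unsigned char read_owner_locked(struct fake_lock_resource *res)
{
    unsigned char owner;

    pthread_mutex_lock(&res->spinlock);
    owner = res->owner;
    pthread_mutex_unlock(&res->spinlock);
    return owner;
}

/* The pattern a data-race checker flags: reading the field without the
 * lock, which can race with a writer that updates it under the lock. */
static unsigned char read_owner_unlocked(struct fake_lock_resource *res)
{
    return res->owner;
}

int main(void)
{
    struct fake_lock_resource res = { PTHREAD_MUTEX_INITIALIZER, 3 };

    printf("locked=%u unlocked=%u\n",
           read_owner_locked(&res), read_owner_unlocked(&res));
    return 0;
}

The question raised in the thread is whether the unlocked accesses in dlm_drop_lockres_ref_done() and dlm_get_lock_resource() follow this convention or constitute real races.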
2009 Jan 14
15
Backport patches to ocfs2 1.4 tree from mainline
Found 15 patches (out of 162) that appeared relevant to ocfs2 1.4. Please review. Sunil
2023 Jun 16
1
[BUG] ocfs2/dlm: possible data races in dlm_drop_lockres_ref_done() and dlm_get_lock_resource()
Hi, On 6/13/23 4:23 PM, Tuo Li wrote: > Hello, > > Our static analysis tool finds some possible data races in the OCFS2 file > system in Linux 6.4.0-rc6. > > In most calling contexts, the variables such as res->lockname.name and > res->owner are accessed with holding the lock res->spinlock. Here is an > example: > > lockres_seq_start() --> Line 539
2010 Apr 05
1
Kernel Panic, Server not coming back up
I have a relatively new test environment setup that is a little different from your typical scenario. This is my first time using OCFS2, but I believe it should work the way I have it set up. All of this is set up on VMware virtual hosts. I have two front-end web servers and one backend administrative server. They all share 2 virtual hard drives within VMware (independent, persistent, &