similar to: heartbeating in the wrong slot

Displaying 20 results from an estimated 8000 matches similar to: "heartbeating in the wrong slot"

2006 May 26
1
Another node is heartbeating in our slot!
All, We are having some problems getting OCFS2 to run, we are using kernel 2.6.15 with OCFS2 1.2.1. Compiling the OCFS2 sources went fine and all modules load perfectly. However, we can only mount the OCFS2 volume on one machine at a time, when we try to mount the volume on the 2 other machines we get an error stating that another node is heartbeating in our slot. When we mount the volume
2010 Oct 20
1
OCFS2 + iscsi: another node is heartbeating in our slot (over scst)
Hi, I'm building a cluster of two nodes with a separate shared storage server. The storage server has a volume with an ocfs2 filesystem, which it shares via an iSCSI target. When a node connects to the target, I can mount the volume locally on the node and use it. Unfortunately, on the storage server ocfs2 logged this to dmesg: Oct 19 22:21:02 storage kernel: [ 1510.424144]
2008 Mar 05
0
ocfs2 and another node is heartbeating in our slot
Hello, I have a drbd8+ocfs2 cluster. Mounting the ocfs2 partition on node1 works, but when I mount the partition on node 2 I get this in /var/log/messages: -Mar 5 18:10:04 suse4 kernel: (2857,0):o2hb_do_disk_heartbeat:665 ERROR: Device "drbd1": another node is heartbeating in our slot! -Mar 5 18:10:04 suse4 kernel: WARNING: at include/asm/dma-mapping.h:44 dma_map_sg() -Mar 5
2008 Sep 18
0
Ocfs2-users Digest, Vol 57, Issue 14
I think I might have misunderstood where it is failing. Has this file been added to the DB on the web site, or does it fail when you try to configure this? Carle Simmonds Infrastructure Consultant Technology Services Experian UK Ltd __________________________________________________ Tel: +44 (0)115 941 0888 (main switchboard) Mobile: +44 (0)7813 854834 E-Mail: carle.simmonds at uk.experian.com
2008 Sep 18
2
o2hb_do_disk_heartbeat:982:ERROR
Hi everyone; I have a problem on my 10-node cluster with ocfs2 1.2.9; the OS is RHEL 4.7 AS. 9 nodes can start the o2cb service and mount the SAN disks on startup, but one node cannot. My cluster configuration is: node: ip_port = 7777 ip_address = 192.168.5.1 number = 0 name = fa01 cluster = ocfs2 node: ip_port =
2007 Mar 16
2
re: o2hb_do_disk_heartbeat:963 ERROR: Device "sdb1" another node is heartbeating in our slot!
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Folks, I'm trying to wrap my head around something that happened in our environment. Basically, we noticed the error in /var/log/messages with no other errors. "Mar 16 13:38:02 dbo3 kernel: (3712,3):o2hb_do_disk_heartbeat:963 ERROR: Device "sdb1": another node is heartbeating in our slot!" Usually there are a
2008 Oct 22
2
Another node is heartbeating in our slot! errors with LUN removal/addition
Greetings, Last night I manually unpresented and deleted a LUN (a SAN snapshot) that was presented to one node in a four node RAC environment running OCFS2 v1.4.1-1. The system then rebooted with the following error: Oct 21 16:45:34 ausracdb03 kernel: (27,1):o2hb_write_timeout:166 ERROR: Heartbeat write timeout to device dm-24 after 120000 milliseconds Oct 21 16:45:34 ausracdb03 kernel:
2011 Mar 03
1
OCFS2 1.4 + DRBD + iSCSI problem with DLM
An HTML attachment was scrubbed... URL: http://oss.oracle.com/pipermail/ocfs2-users/attachments/20110303/0fbefee6/attachment.html
2007 Aug 07
0
Quorum and Fencing with user mode heartbeat
Hi all, I read the FAQ, especially questions 75-84 about Quorum and Fencing. I want to use OCFS2 with Heartbeat V2 with heartbeat_mode 'user'. What I missed in the FAQ is an explanation of what role in the whole OCFS system is taken by HAv2 (or other Cluster software) when using heartbeat_mode 'user'. 1) When is disk heartbeating started? (Mount of device?) 2) When is
2006 Jan 09
0
[PATCH 01/11] ocfs2: event-driven quorum
This patch separates o2net and o2quo from knowing about one another as much as possible. This is the first in a series of patches that will allow userspace cluster interaction. Quorum is separated out first, and will ultimately only be associated with the disk heartbeat as a separate module. To do so, this patch performs the following changes: * o2hb_notify() is added to handle injection of
2007 Aug 09
0
About quorum and fencing
Hi, In the ocfs2 FAQ, it is written: "A node has quorum when: * it sees an odd number of heartbeating nodes and has network connectivity to more than half of them. OR, * it sees an even number of heartbeating nodes and has network connectivity to at least half of them *and* has connectivity to the heartbeating node with the lowest node
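The quorum rule quoted in the excerpt above can be sketched as a small predicate. This is only an illustration of the rule as quoted, not the kernel's o2quo code. The excerpt is cut off after "lowest node", so the sketch assumes the tie-breaker is the lowest node number among the heartbeating nodes. The function name `has_quorum`, its parameters, and the choice to count the local node as reachable from itself are all hypothetical.

```python
def has_quorum(heartbeating: set, reachable: set) -> bool:
    """Illustrative check of the quorum rule quoted from the FAQ.

    heartbeating: node numbers this node sees heartbeating on disk.
    reachable:    node numbers this node can reach over the network
                  (assumption: the local node counts as reachable).
    """
    seen = len(heartbeating)
    if seen == 0:
        # Nothing is heartbeating, so there is nothing to form a quorum with.
        return False

    # Only heartbeating nodes count towards connectivity.
    connected = len(heartbeating & reachable)

    if seen % 2 == 1:
        # Odd count: need connectivity to more than half of the nodes.
        return 2 * connected > seen

    # Even count: need at least half, plus the tie-breaker node.
    # Assumption: "lowest node" in the truncated excerpt means the
    # heartbeating node with the lowest node number.
    return 2 * connected >= seen and min(heartbeating) in reachable
```

For example, in a two-node cluster split down the middle, `has_quorum({0, 1}, {0})` is true on node 0 while `has_quorum({0, 1}, {1})` is false on node 1, so under this reading only the side holding the lowest-numbered node keeps quorum.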
2006 Apr 14
1
[RFC: 2.6 patch] fs/ocfs2/: remove unused exports
This patch removes the following unused EXPORT_SYMBOL_GPL's: - cluster/heartbeat.c: o2hb_check_node_heartbeating_from_callback - cluster/heartbeat.c: o2hb_stop_all_regions - cluster/nodemanager.c: o2nm_get_node_by_num - cluster/nodemanager.c: o2nm_configured_node_map - cluster/nodemanager.c: o2nm_get_node_by_ip - cluster/nodemanager.c: o2nm_node_put - cluster/nodemanager.c: o2nm_node_get -
2008 Feb 04
0
[PATCH] o2net: Reconnect after idle time out.
Currently, o2net connects to a node on hb_up and disconnects on hb_down and net timeout. It disconnects on net timeout is ok, but it should attempt to reconnect back. This is because sometimes nodes get overloaded enough that the network connection breaks but the disk hb does not. And if we get into that situation, we either fence (unnecessarily) or wait for its disk hb to die (and sometimes hang
2010 Aug 22
1
ocfs2 crash on intensive disk write
Hi, I'm getting system (and eventually cluster) crashes on intensive disk writes in ubuntu server 10.04 with my OCFS2 file system. I have an iSER (infiniband) backed shared disk array with OCFS2 on it. There are 6 nodes in the cluster, and the heartbeat interface is over a regular 1GigE connection. Originally, the problem presented itself while I was doing performance testing and
2006 Nov 03
2
Newbie questions -- is OCFS2 what I even want?
Dear Sirs and Madams, I run a small visual effects production company, Hammerhead Productions. We'd like to have an easily extensible inexpensive relatively high-performance storage network using open-source components. I was hoping that OCFS2 would be that system. I have a half-dozen 2 TB fileservers I'd like the rest of the network to see as a single 12 TB disk, with the aggregate
2008 Feb 13
2
[PATCH] o2net: Reconnect after idle time out.V2
Modification from V1 to V2: 1. Use atomic ops instead of spin_lock in timer. 2. Add some comments when querying connect_expired work. These comments are copied from Zach's mail.;) Currently, o2net connects to a node on hb_up and disconnects on hb_down and net timeout. It disconnects on net timeout is ok, but it should attempt to reconnect back. This is because sometimes nodes get
2006 Sep 06
1
heartbeat.
Hello OCFS2 team, I'm currently looking at the OCFS2 code in linux-2.6.17.11, and I wonder why OCFS2 performs its heartbeat on a disk region rather than over the network, as many clustered service stacks do. What is the requirement for a disk heartbeat? Is there any way to tune this behaviour and change it into a network heartbeat? -- Mathieu
2009 Nov 13
1
Cannot set heartbeat dead threshold
Hi I have: SLES 10 SP2 (2.6.16.60-0.21-smp) ocfs2-tools-1.4.0-0.3 ocfs2console-1.4.0-0.3 and I can't change "heartbeat dead threshold" value. Content of /etc/sysconfig/o2cb: # O2CB_ENABLED: 'true' means to load the driver on boot. O2CB_ENABLED=true # O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start. O2CB_BOOTCLUSTER=ocfs2 # O2CB_HEARTBEAT_THRESHOLD:
2006 Feb 21
0
[PATCH 14/14] ocfs2: include disk heartbeat in ocfs2_nodemanager to avoid userspace changes
This patch removes disk heartbeat's modularity which makes it the default. Without this patch, userspace changes are required. This patch is not intended for permanent application, just to make it easier for users not interested in testing the userspace clustering implementation to use ocfs2. In order to switch to user clustering, use "o2cb offline" to shut down the cluster,
2023 Jun 27
0
[PATCH] fs: ocfs: fix potential deadlock on &qs->qs_lock
As &qs->qs_lock is also acquired by the timer o2net_idle_timer(), which executes in softirq context, code executing in process context should disable irqs before acquiring the lock; otherwise a deadlock could happen if the process context holds the lock and is then preempted by the timer. Possible deadlock scenario: o2quo_make_decision (workqueue) -> spin_lock(&qs->qs_lock);