Displaying 20 results from an estimated 8000 matches similar to: "heartbeating in the wrong slot"
2006 May 26
1
Another node is heartbeating in our slot!
All,
We are having some problems getting OCFS2 to run; we are using kernel
2.6.15 with OCFS2 1.2.1. Compiling the OCFS2 sources went fine and all
modules load perfectly.
However, we can only mount the OCFS2 volume on one machine at a time:
when we try to mount the volume on the two other machines we get an error
stating that another node is heartbeating in our slot. When we mount the
volume
2010 Oct 20
1
OCFS2 + iscsi: another node is heartbeating in our slot (over scst)
Hi,
I'm building a cluster containing two nodes and a separate common storage
server.
On the storage server I have a volume with an ocfs2 fs, and I am sharing
this volume via an iSCSI target.
When a node is connected to the target I can mount the volume locally on
the node and use it.
Unfortunately, on the storage server ocfs2 logged this to dmesg:
Oct 19 22:21:02 storage kernel: [ 1510.424144]
2008 Mar 05
0
ocfs2 and another node is heartbeating in our slot
Hello,
I have one cluster drbd8+ocfs2.
If I mount the ocfs2 partition on node1 it works, but when I mount the
partition on node2 I see this in /var/log/messages:
-Mar 5 18:10:04 suse4 kernel: (2857,0):o2hb_do_disk_heartbeat:665 ERROR:
Device "drbd1": another node is heartbeating in our slot!
-Mar 5 18:10:04 suse4 kernel: WARNING: at include/asm/dma-mapping.h:44
dma_map_sg()
-Mar 5
2008 Sep 18
0
Ocfs2-users Digest, Vol 57, Issue 14
I think I might have misunderstood where it is failing; has this file
been added to the DB on the web site, or does it fail when you try to
configure this?
Carle Simmonds
Infrastructure Consultant
Technology Services
Experian UK Ltd
__________________________________________________
Tel: +44 (0)115 941 0888 (main switchboard)
Mobile: +44 (0)7813 854834
E-Mail: carle.simmonds at uk.experian.com
2008 Sep 18
2
o2hb_do_disk_heartbeat:982:ERROR
Hi everyone;
I have a problem on my 10-node cluster with ocfs2 1.2.9; the OS is RHEL 4.7 AS.
Nine nodes can start the o2cb service and mount the SAN disks on startup, however one node cannot. My cluster configuration is:
node:
        ip_port = 7777
        ip_address = 192.168.5.1
        number = 0
        name = fa01
        cluster = ocfs2
node:
        ip_port =
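For reference, a complete two-node /etc/ocfs2/cluster.conf usually looks
roughly like the stanzas below (hostnames, addresses and the cluster name
here are made up for illustration and are not taken from the poster's
system). "Another node is heartbeating in our slot" commonly points to the
nodes disagreeing about this file: it has to be identical on every node,
and every node needs a unique number.

node:
        ip_port = 7777
        ip_address = 192.168.1.10
        number = 0
        name = nodeA
        cluster = mycluster

node:
        ip_port = 7777
        ip_address = 192.168.1.11
        number = 1
        name = nodeB
        cluster = mycluster

cluster:
        node_count = 2
        name = mycluster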
2007 Mar 16
2
re: o2hb_do_disk_heartbeat:963 ERROR: Device "sdb1" another node is heartbeating in our slot!
Folks,
I'm trying to wrap my head around something that happened in our environment.
Basically, we noticed the error in /var/log/messages with no other errors.
"Mar 16 13:38:02 dbo3 kernel: (3712,3):o2hb_do_disk_heartbeat:963 ERROR: Device "sdb1": another node is
heartbeating in our slot!"
Usually there are a
2008 Oct 22
2
Another node is heartbeating in our slot! errors with LUN removal/addition
Greetings,
Last night I manually unpresented and deleted a LUN (a SAN snapshot)
that was presented to one node in a four node RAC environment running
OCFS2 v1.4.1-1. The system then rebooted with the following error:
Oct 21 16:45:34 ausracdb03 kernel: (27,1):o2hb_write_timeout:166 ERROR:
Heartbeat write timeout to device dm-24 after 120000 milliseconds
Oct 21 16:45:34 ausracdb03 kernel:
2011 Mar 03
1
OCFS2 1.4 + DRBD + iSCSI problem with DLM
An HTML attachment was scrubbed...
URL: http://oss.oracle.com/pipermail/ocfs2-users/attachments/20110303/0fbefee6/attachment.html
2007 Aug 07
0
Quorum and Fencing with user mode heartbeat
Hi all,
I read the FAQ, especially the questions 75-84 about Quorum and Fencing.
I want to use OCFS2 with Heartbeat V2 with heartbeat_mode 'user'.
What I missed in the FAQ is an explanation of what role HAv2 (or other
cluster software) plays in the whole OCFS system
when using heartbeat_mode 'user'.
1) When is disk heartbeating started? (Mount of device?)
2) When is
2006 Jan 09
0
[PATCH 01/11] ocfs2: event-driven quorum
This patch separates o2net and o2quo from knowing about one another as much
as possible. This is the first in a series of patches that will allow
userspace cluster interaction. Quorum is separated out first, and will
ultimately only be associated with the disk heartbeat as a separate module.
To do so, this patch performs the following changes:
* o2hb_notify() is added to handle injection of
2007 Aug 09
0
About quorum and fencing
Hi,
In the ocfs2 FAQ, it is written:
"A node has quorum when:
* it sees an odd number of heartbeating nodes and has network
connectivity to more than half of them.
OR,
* it sees an even number of heartbeating nodes and has network
connectivity to at least half of them *and* has connectivity to
the heartbeating node with the lowest node
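Read literally, the FAQ's rule can be expressed as a small predicate over
the set of heartbeating nodes. The sketch below only illustrates the rule
as quoted, not the kernel's o2quo code; the function name and the hb/conn
arrays are invented for the example.

/* Sketch of the quorum rule as quoted from the FAQ (not o2quo).
 * hb[i]   - node i is disk heartbeating.
 * conn[i] - this node has a working network connection to node i
 *           (by convention, treat our own entry as true). */
static int has_quorum(const int hb[], const int conn[], int max_nodes)
{
        int heartbeating = 0, connected = 0, lowest = -1;
        int i;

        for (i = 0; i < max_nodes; i++) {
                if (!hb[i])
                        continue;
                heartbeating++;
                if (lowest < 0)
                        lowest = i;     /* lowest-numbered heartbeating node */
                if (conn[i])
                        connected++;
        }

        if (heartbeating % 2)           /* odd group: strict majority */
                return connected > heartbeating / 2;

        /* even group: at least half, plus the lowest-numbered node
         * as the tie-breaker */
        return connected >= heartbeating / 2 && lowest >= 0 && conn[lowest];
}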
2006 Apr 14
1
[RFC: 2.6 patch] fs/ocfs2/: remove unused exports
This patch removes the following unused EXPORT_SYMBOL_GPL's:
- cluster/heartbeat.c: o2hb_check_node_heartbeating_from_callback
- cluster/heartbeat.c: o2hb_stop_all_regions
- cluster/nodemanager.c: o2nm_get_node_by_num
- cluster/nodemanager.c: o2nm_configured_node_map
- cluster/nodemanager.c: o2nm_get_node_by_ip
- cluster/nodemanager.c: o2nm_node_put
- cluster/nodemanager.c: o2nm_node_get
-
2008 Feb 04
0
[PATCH] o2net: Reconnect after idle time out.
Currently, o2net connects to a node on hb_up and disconnects on
hb_down and net timeout.
Disconnecting on net timeout is OK, but it should then attempt to
reconnect. This is because sometimes nodes get overloaded
enough that the network connection breaks but the disk hb does not.
And if we get into that situation, we either fence (unnecessarily)
or wait for its disk hb to die (and sometimes hang
2010 Aug 22
1
ocfs2 crash on intensive disk write
Hi,
I'm getting system (and eventually cluster) crashes on intensive disk
writes in ubuntu server 10.04 with my OCFS2 file system.
I have an iSER (infiniband) backed shared disk array with OCFS2 on it.
There are 6 nodes in the cluster, and the heartbeat interface is over a
regular 1GigE connection. Originally, the problem presented itself while
I was doing performance testing and
2006 Nov 03
2
Newbie questions -- is OCFS2 what I even want?
Dear Sirs and Madams,
I run a small visual effects production company, Hammerhead Productions.
We'd like to have an easily extensible, inexpensive, relatively
high-performance storage network using open-source components. I was
hoping that OCFS2
would be that system.
I have a half-dozen 2 TB fileservers I'd like the rest of the network to see
as a single 12 TB disk, with the aggregate
2008 Feb 13
2
[PATCH] o2net: Reconnect after idle time out.V2
Modification from V1 to V2:
1. Use atomic ops instead of spin_lock in timer.
2. Add some comments when querying connect_expired work.
These comments are copied from Zach's mail. ;)
Currently, o2net connects to a node on hb_up and disconnects on
hb_down and net timeout.
Disconnecting on net timeout is OK, but it should then attempt to
reconnect. This is because sometimes nodes get
2006 Sep 06
1
heartbeat.
Hello OCFS2 team,
I'm currently looking at the OCFS2 code in linux-2.6.17.11, and I wonder
why OCFS2 performs its heartbeat on a disk region rather than on the
network like many clustered service stacks do. What is the requirement
for a disk heartbeat? Is there any way to tune this behaviour and change
it into a network heartbeat?
--
Mathieu
2009 Nov 13
1
Cannot set heartbeat dead threshold
Hi
I have:
SLES 10 SP2 (2.6.16.60-0.21-smp)
ocfs2-tools-1.4.0-0.3
ocfs2console-1.4.0-0.3
and I can't change the "heartbeat dead threshold" value.
Content of /etc/sysconfig/o2cb:
# O2CB_ENABLED: 'true' means to load the driver on boot.
O2CB_ENABLED=true
# O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start.
O2CB_BOOTCLUSTER=ocfs2
# O2CB_HEARTBEAT_THRESHOLD:
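For illustration only (not taken from the poster's file): on setups of this
vintage the threshold is set in /etc/sysconfig/o2cb as shown below, and the
effective disk heartbeat timeout works out to roughly
(O2CB_HEARTBEAT_THRESHOLD - 1) * 2 seconds, so the default 31 is about 60
seconds and 61 is about 120 seconds. The o2cb cluster typically has to be
taken offline and brought back up on every node for a new value to take
effect.

# O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
O2CB_HEARTBEAT_THRESHOLD=61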
2006 Feb 21
0
[PATCH 14/14] ocfs2: include disk heartbeat in ocfs2_nodemanager to avoid userspace changes
This patch removes disk heartbeat's modularity, which makes it the default.
Without this patch, userspace changes are required.
This patch is not intended for permanent application, just to make it easier
for users not interested in testing the userspace clustering implementation
to use ocfs2.
In order to switch to user clustering, use "o2cb offline" to shut down the
cluster,
2023 Jun 27
0
[PATCH] fs: ocfs: fix potential deadlock on &qs->qs_lock
As &qs->qs_lock is also acquired by the timer o2net_idle_timer(),
which executes in softirq context, code executing in process
context should disable irqs before acquiring the lock; otherwise a
deadlock could happen if the process context holds the lock and is
then preempted by the timer.
Possible deadlock scenario:
Possible deadlock scenario:
o2quo_make_decision (workqueue)
-> spin_lock(&qs->qs_lock);
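The quoted scenario is cut off, but the pattern the patch describes is the
standard one for a lock shared with timer/softirq context. The sketch below
is an illustrative example of that pattern, not the actual ocfs2 change;
the lock and function names are made up.

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(example_qs_lock);   /* stand-in for qs->qs_lock */

/* Process-context path (e.g. a workqueue function): take the lock with
 * the irq-saving variant so the timer cannot fire on this CPU and spin
 * on the same lock while we hold it.  (spin_lock_bh() would also keep
 * softirq-context users out.) */
static void example_process_context_path(void)
{
        unsigned long flags;

        spin_lock_irqsave(&example_qs_lock, flags);
        /* ... update the shared quorum state ... */
        spin_unlock_irqrestore(&example_qs_lock, flags);
}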