Displaying 20 results from an estimated 8000 matches similar to: "heartbeating in the wrong slot"
2006 May 26
1
Another node is heartbeating in our slot!
All,
We are having some problems getting OCFS2 to run; we are using kernel
2.6.15 with OCFS2 1.2.1. Compiling the OCFS2 sources went fine and all
modules load perfectly.
However, we can only mount the OCFS2 volume on one machine at a time:
when we try to mount the volume on the two other machines we get an error
stating that another node is heartbeating in our slot. When we mount the
volume
2010 Oct 20
1
OCFS2 + iscsi: another node is heartbeating in our slot (over scst)
Hi,
I'm building a cluster containing two nodes and a separate common storage
server.
On the storage server I have a volume with an ocfs2 fs, and I am sharing
this volume via an iSCSI target.
When a node is connected to the target I can mount the volume locally on
the node and use it.
Unfortunately, on the storage server ocfs2 logged this to dmesg:
Oct 19 22:21:02 storage kernel: [ 1510.424144]
2008 Mar 05
0
ocfs2 and another node is heartbeating in our slot
Hello,
I have one cluster drbd8+ocfs2.
If I mount the ocfs2 partition on node1 it works, but when I mount the
partition on node2 I see this in /var/log/messages:
-Mar 5 18:10:04 suse4 kernel: (2857,0):o2hb_do_disk_heartbeat:665 ERROR:
Device "drbd1": another node is heartbeating in our slot!
-Mar 5 18:10:04 suse4 kernel: WARNING: at include/asm/dma-mapping.h:44
dma_map_sg()
-Mar 5
2008 Sep 18
0
Ocfs2-users Digest, Vol 57, Issue 14
I think I might have misunderstood where it is failing; has this file
been added to the DB on the web site, or does it fail when you try to
configure this?
Carle Simmonds
Infrastructure Consultant
Technology Services
Experian UK Ltd
__________________________________________________
Tel: +44 (0)115 941 0888 (main switchboard)
Mobile: +44 (0)7813 854834
E-Mail: carle.simmonds at uk.experian.com
2008 Sep 18
2
o2hb_do_disk_heartbeat:982:ERROR
Hi everyone;
I have a problem on my 10-node cluster with ocfs2 1.2.9; the OS is RHEL 4.7 AS.
Nine nodes can start the o2cb service and mount the SAN disks on startup, however one node cannot. My cluster configuration is:
node:
        ip_port = 7777
        ip_address = 192.168.5.1
        number = 0
        name = fa01
        cluster = ocfs2
node:
        ip_port =
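For reference, a complete two-node /etc/ocfs2/cluster.conf usually looks
roughly like the stanzas below (hostnames, addresses and the cluster name
here are made up for illustration and are not taken from the poster's
system). "Another node is heartbeating in our slot" commonly points to the
nodes disagreeing about this file: it has to be identical on every node,
and every node needs a unique number.

node:
        ip_port = 7777
        ip_address = 192.168.1.10
        number = 0
        name = nodeA
        cluster = mycluster

node:
        ip_port = 7777
        ip_address = 192.168.1.11
        number = 1
        name = nodeB
        cluster = mycluster

cluster:
        node_count = 2
        name = mycluster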
2007 Mar 16
2
re: o2hb_do_disk_heartbeat:963 ERROR: Device "sdb1" another node is heartbeating in our slot!
Folks,
I'm trying to wrap my head around something that happened in our environment.
Basically, we noticed the error in /var/log/messages with no other errors.
"Mar 16 13:38:02 dbo3 kernel: (3712,3):o2hb_do_disk_heartbeat:963 ERROR: Device "sdb1": another node is
heartbeating in our slot!"
Usually there are a
2008 Oct 22
2
Another node is heartbeating in our slot! errors with LUN removal/addition
Greetings,
Last night I manually unpresented and deleted a LUN (a SAN snapshot)
that was presented to one node in a four node RAC environment running
OCFS2 v1.4.1-1. The system then rebooted with the following error:
Oct 21 16:45:34 ausracdb03 kernel: (27,1):o2hb_write_timeout:166 ERROR:
Heartbeat write timeout to device dm-24 after 120000 milliseconds
Oct 21 16:45:34 ausracdb03 kernel:
2011 Mar 03
1
OCFS2 1.4 + DRBD + iSCSI problem with DLM
An HTML attachment was scrubbed...
URL: http://oss.oracle.com/pipermail/ocfs2-users/attachments/20110303/0fbefee6/attachment.html
2007 Aug 07
0
Quorum and Fencing with user mode heartbeat
Hi all,
I read the FAQ, especially the questions 75-84 about Quorum and Fencing.
I want to use OCFS2 with Heartbeat V2 with heartbeat_mode 'user'.
What I missed in the FAQ is an explanation of what role HAv2 (or other
cluster software) plays in the whole OCFS system
when using heartbeat_mode 'user'.
1) When is disk heartbeating started? (Mount of device?)
2) When is
2006 Jan 09
0
[PATCH 01/11] ocfs2: event-driven quorum
This patch separates o2net and o2quo from knowing about one another as much
as possible. This is the first in a series of patches that will allow
userspace cluster interaction. Quorum is separated out first, and will
ultimately only be associated with the disk heartbeat as a separate module.
To do so, this patch performs the following changes:
* o2hb_notify() is added to handle injection of
2007 Aug 09
0
About quorum and fencing
Hi,
In the ocfs2 FAQ, it is written:
"A node has quorum when:
* it sees an odd number of heartbeating nodes and has network
connectivity to more than half of them.
OR,
* it sees an even number of heartbeating nodes and has network
connectivity to at least half of them *and* has connectivity to
the heartbeating node with the lowest node
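Read literally, the FAQ's rule can be expressed as a small predicate over
the set of heartbeating nodes. The sketch below only illustrates the rule
as quoted, not the kernel's o2quo code; the function name and the hb/conn
arrays are invented for the example.

/* Sketch of the quorum rule as quoted from the FAQ (not o2quo).
 * hb[i]   - node i is disk heartbeating.
 * conn[i] - this node has a working network connection to node i
 *           (by convention, treat our own entry as true). */
static int has_quorum(const int hb[], const int conn[], int max_nodes)
{
        int heartbeating = 0, connected = 0, lowest = -1;
        int i;

        for (i = 0; i < max_nodes; i++) {
                if (!hb[i])
                        continue;
                heartbeating++;
                if (lowest < 0)
                        lowest = i;     /* lowest-numbered heartbeating node */
                if (conn[i])
                        connected++;
        }

        if (heartbeating % 2)           /* odd group: strict majority */
                return connected > heartbeating / 2;

        /* even group: at least half, plus the lowest-numbered node
         * as the tie-breaker */
        return connected >= heartbeating / 2 && lowest >= 0 && conn[lowest];
}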
2006 Apr 14
1
[RFC: 2.6 patch] fs/ocfs2/: remove unused exports
This patch removes the following unused EXPORT_SYMBOL_GPL's:
- cluster/heartbeat.c: o2hb_check_node_heartbeating_from_callback
- cluster/heartbeat.c: o2hb_stop_all_regions
- cluster/nodemanager.c: o2nm_get_node_by_num
- cluster/nodemanager.c: o2nm_configured_node_map
- cluster/nodemanager.c: o2nm_get_node_by_ip
- cluster/nodemanager.c: o2nm_node_put
- cluster/nodemanager.c: o2nm_node_get
-
2008 Feb 04
0
[PATCH] o2net: Reconnect after idle time out.
Currently, o2net connects to a node on hb_up and disconnects on
hb_down and net timeout.
Disconnecting on net timeout is OK, but it should then attempt to
reconnect. This is because sometimes nodes get overloaded
enough that the network connection breaks but the disk hb does not.
And if we get into that situation, we either fence (unnecessarily)
or wait for its disk hb to die (and sometimes hang
2010 Aug 22
1
ocfs2 crash on intensive disk write
Hi,
I'm getting system (and eventually cluster) crashes on intensive disk
writes in ubuntu server 10.04 with my OCFS2 file system.
I have an iSER (infiniband) backed shared disk array with OCFS2 on it.
There are 6 nodes in the cluster, and the heartbeat interface is over a
regular 1GigE connection. Originally, the problem presented itself while
I was doing performance testing and
2006 Nov 03
2
Newbie questions -- is OCFS2 what I even want?
Dear Sirs and Madams,
I run a small visual effects production company, Hammerhead Productions.
We'd like to have an easily extensible, inexpensive, relatively
high-performance storage network using open-source components. I was
hoping that OCFS2
would be that system.
I have a half-dozen 2 TB fileservers I'd like the rest of the network to see
as a single 12 TB disk, with the aggregate
2008 Feb 13
2
[PATCH] o2net: Reconnect after idle time out.V2
Modification from V1 to V2:
1. Use atomic ops instead of spin_lock in timer.
2. Add some comments when querying connect_expired work.
These comments are copied from Zach's mail. ;)
Currently, o2net connects to a node on hb_up and disconnects on
hb_down and net timeout.
Disconnecting on net timeout is OK, but it should then attempt to
reconnect. This is because sometimes nodes get
2006 Sep 06
1
heartbeat.
Hello OCFS2 team,
I'm currently looking at the OCFS2 code in linux-2.6.17.11, and I wonder
why OCFS2 performs its heartbeat on a disk region rather than on the
network like many clustered service stacks do. What is the requirement
for a disk heartbeat? Is there any way to tune this behaviour and change
it into a network heartbeat?
--
Mathieu
2009 Nov 13
1
Cannot set heartbeat dead threshold
Hi
I have:
SLES 10 SP2 (2.6.16.60-0.21-smp)
ocfs2-tools-1.4.0-0.3
ocfs2console-1.4.0-0.3
and I can't change the "heartbeat dead threshold" value.
Content of /etc/sysconfig/o2cb:
# O2CB_ENABLED: 'true' means to load the driver on boot.
O2CB_ENABLED=true
# O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start.
O2CB_BOOTCLUSTER=ocfs2
# O2CB_HEARTBEAT_THRESHOLD:
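For illustration only (not taken from the poster's file): on setups of this
vintage the threshold is set in /etc/sysconfig/o2cb as shown below, and the
effective disk heartbeat timeout works out to roughly
(O2CB_HEARTBEAT_THRESHOLD - 1) * 2 seconds, so the default 31 is about 60
seconds and 61 is about 120 seconds. The o2cb cluster typically has to be
taken offline and brought back up on every node for a new value to take
effect.

# O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
O2CB_HEARTBEAT_THRESHOLD=61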
2006 Feb 21
0
[PATCH 14/14] ocfs2: include disk heartbeat in ocfs2_nodemanager to avoid userspace changes
This patch removes disk heartbeat's modularity, which makes it the default.
Without this patch, userspace changes are required.
This patch is not intended for permanent application, just to make it easier
for users not interested in testing the userspace clustering implementation
to use ocfs2.
In order to switch to user clustering, use "o2cb offline" to shut down the
cluster,
2023 Jun 27
0
[PATCH] fs: ocfs: fix potential deadlock on &qs->qs_lock
As &qs->qs_lock is also acquired by the timer o2net_idle_timer(),
which executes in softirq context, code executing in process
context should disable irqs before acquiring the lock; otherwise a
deadlock could happen if the process context holds the lock and is
then preempted by the timer.
Possible deadlock scenario:
Possible deadlock scenario:
o2quo_make_decision (workqueue)
-> spin_lock(&qs->qs_lock);
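The quoted scenario is cut off, but the pattern the patch describes is the
standard one for a lock shared with timer/softirq context. The sketch below
is an illustrative example of that pattern, not the actual ocfs2 change;
the lock and function names are made up.

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(example_qs_lock);   /* stand-in for qs->qs_lock */

/* Process-context path (e.g. a workqueue function): take the lock with
 * the irq-saving variant so the timer cannot fire on this CPU and spin
 * on the same lock while we hold it.  (spin_lock_bh() would also keep
 * softirq-context users out.) */
static void example_process_context_path(void)
{
        unsigned long flags;

        spin_lock_irqsave(&example_qs_lock, flags);
        /* ... update the shared quorum state ... */
        spin_unlock_irqrestore(&example_qs_lock, flags);
}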