Displaying 20 results from an estimated 8000 matches similar to: "About quorum and fencing"
2007 Aug 07 (0 replies) - Quorum and Fencing with user mode heartbeat
Hi all,
I read the FAQ, especially the questions 75-84 about Quorum and Fencing.
I want to use OCFS2 with Heartbeat V2 with heartbeat_mode 'user'.
What I missed in the FAQ is an explanation of what role HAv2 (or other
cluster software) plays in the overall OCFS2 setup when heartbeat_mode 'user' is used.
1) When is disk heartbeating started? (Mount of device?)
2) When is
2006 Jan 09 (0 replies) - [PATCH 01/11] ocfs2: event-driven quorum
This patch separates o2net and o2quo from knowing about one another as much
as possible. This is the first in a series of patches that will allow
userspace cluster interaction. Quorum is separated out first, and will
ultimately only be associated with the disk heartbeat as a separate module.
To do so, this patch performs the following changes:
* o2hb_notify() is added to handle injection of
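The event-driven decoupling described here can be illustrated with the stock Linux notifier-chain API. This is only a sketch of the shape, not the patch itself; the chain and helper names are hypothetical, and the real series may wire things up differently (for example through the new o2hb_notify()).

#include <linux/notifier.h>

/* Hypothetical event chain: the disk heartbeat publishes node events
 * without knowing who consumes them, so o2quo no longer needs to be
 * called from o2net directly. */
static ATOMIC_NOTIFIER_HEAD(o2hb_node_events);

int o2hb_register_node_listener(struct notifier_block *nb)
{
        return atomic_notifier_chain_register(&o2hb_node_events, nb);
}

/* Called from the heartbeat code when a node comes up or goes down. */
void o2hb_publish_node_event(unsigned long event, int node_num)
{
        atomic_notifier_call_chain(&o2hb_node_events, event,
                                   (void *)(long)node_num);
}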
2023 Jun 27 (0 replies) - [PATCH] fs: ocfs: fix potential deadlock on &qs->qs_lock
As &qs->qs_lock is also acquired by the timer o2net_idle_timer()
which executes under softirq context, code executing under process
context should disable irq before acquiring the lock, otherwise
a deadlock could occur if the process context holds the lock and is then
preempted by the timer.
Possible deadlock scenario:
o2quo_make_decision (workqueue)
-> spin_lock(&qs->qs_lock);
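The fix implied by this description is the usual pattern for a lock shared with softirq context. A minimal sketch, with o2quo_state reduced to a stand-in rather than copied from quorum.c:

#include <linux/spinlock.h>
#include <linux/workqueue.h>

struct o2quo_state {                    /* reduced stand-in for quorum.c's struct */
        spinlock_t qs_lock;
        /* ... */
};
static struct o2quo_state o2quo_state;

/* Process-context path: taking the lock with interrupts disabled keeps
 * the softirq timer (o2net_idle_timer) from firing on this CPU and
 * spinning on a lock we already hold. */
static void o2quo_make_decision(struct work_struct *work)
{
        unsigned long flags;
        struct o2quo_state *qs = &o2quo_state;

        spin_lock_irqsave(&qs->qs_lock, flags);   /* was: spin_lock() */
        /* ... quorum decision logic ... */
        spin_unlock_irqrestore(&qs->qs_lock, flags);
}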
2010 Dec 09 (1 reply) - [PATCH] Call userspace script when self-fencing
Hi,
According to the comment in fs/ocfs2/cluster/quorum.c:70 about the
self-fencing operation:
/* It should instead flip the file
 * system RO and call some userspace script. */
So I tried to add it (but I didn't find a way to flip the fs to RO).
Here is a proposal for this functionality, based on ocfs2-1.4.7.
This patch adds an entry 'fence_cmd' in /sys to specify an
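The proposed mechanism can be sketched with the kernel's usermode-helper API; the helper path and function name below are illustrative assumptions, not the patch as posted:

#include <linux/kmod.h>

/* Value the proposed 'fence_cmd' entry would carry; this default path
 * is made up for the example. */
static char fence_cmd[256] = "/sbin/ocfs2-fence-helper";

static void o2quo_run_fence_cmd(void)
{
        char *argv[] = { fence_cmd, NULL };
        char *envp[] = { "HOME=/",
                         "PATH=/sbin:/usr/sbin:/bin:/usr/bin", NULL };

        /* UMH_WAIT_PROC: wait for the helper to exit so we know it ran
         * before any further fencing action is taken. */
        call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
}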
2018 Feb 26 (0 replies) - Quorum in distributed-replicate volume
Hi Dave,
On Mon, Feb 26, 2018 at 4:45 PM, Dave Sherohman <dave at sherohman.org> wrote:
> I've configured 6 bricks as distributed-replicated with replica 2,
> expecting that all active bricks would be usable so long as a quorum of
> at least 4 live bricks is maintained.
>
Client quorum is configured per replica subvolume, not for the
entire volume.
Since you have a
2018 Feb 27 (0 replies) - Quorum in distributed-replicate volume
On Mon, Feb 26, 2018 at 6:14 PM, Dave Sherohman <dave at sherohman.org> wrote:
> On Mon, Feb 26, 2018 at 05:45:27PM +0530, Karthik Subrahmanya wrote:
> > > "In a replica 2 volume... If we set the client-quorum option to
> > > auto, then the first brick must always be up, irrespective of the
> > > status of the second brick. If only the second brick is up,
2018 Feb 26 (2 replies) - Quorum in distributed-replicate volume
On Mon, Feb 26, 2018 at 05:45:27PM +0530, Karthik Subrahmanya wrote:
> > "In a replica 2 volume... If we set the client-quorum option to
> > auto, then the first brick must always be up, irrespective of the
> > status of the second brick. If only the second brick is up, the
> > subvolume becomes read-only."
> >
> By default client-quorum is
2017 Oct 09 (0 replies) - [Gluster-devel] AFR: Fail lookups when quorum not met
On 09/22/2017 07:27 PM, Niels de Vos wrote:
> On Fri, Sep 22, 2017 at 12:27:46PM +0530, Ravishankar N wrote:
>> Hello,
>>
>> In AFR we currently allow look-ups to pass through without taking into
>> account whether the lookup is served from the good or bad brick. We always
>> serve from the good brick whenever possible, but if there is none, we just
>> serve
2018 Feb 27 (0 replies) - Quorum in distributed-replicate volume
On Tue, Feb 27, 2018 at 1:40 PM, Dave Sherohman <dave at sherohman.org> wrote:
> On Tue, Feb 27, 2018 at 12:00:29PM +0530, Karthik Subrahmanya wrote:
> > I will try to explain how you can end up in split-brain even with cluster
> > wide quorum:
>
> Yep, the explanation made sense. I hadn't considered the possibility of
> alternating outages. Thanks!
>
>
2012 May 24 (0 replies) - Is it possible to use quorum for CTDB to prevent split-brain and removing lockfile in the cluster file system
Hello list,
We know that CTDB uses a lockfile in the cluster file system to prevent
split-brain.
This design works well as long as all nodes in the cluster can mount the
cluster file system (e.g. GPFS/GFS/GlusterFS), and CTDB works happily under
this assumption.
However, when split-brain happens, the disconnected private network usually
violates this assumption.
For example, we have four nodes (A, B,
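For reference, the majority rule such a quorum scheme would apply is simple; the sketch below is illustrative only, not CTDB code.

#include <stdbool.h>

/* Strict majority: more than half of the configured nodes must be
 * reachable for this partition to keep serving. */
static bool have_quorum(unsigned int reachable_nodes, unsigned int total_nodes)
{
        return reachable_nodes > total_nodes / 2;
}
/* With 4 nodes, have_quorum(2, 4) is false on both sides of a 2+2 split,
 * so neither half keeps writing and split-brain is avoided. */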
2009 Nov 17 (1 reply) - [PATCH 1/1] ocfs2/cluster: Make fence method configurable
By default, o2cb fences the box by calling emergency_restart(). While this
scheme works well in production, it comes in the way during testing as it
does not let the tester take stack/core dumps for analysis.
This patch allows the user to dynamically change the fence method to panic() by:
# echo "panic" > /sys/kernel/config/cluster/<clustername>/fence_method
Signed-off-by: Sunil
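A rough sketch of the behaviour the patch describes, with identifier names assumed rather than copied from the diff: the configured method simply selects which reset primitive the self-fencing path calls.

#include <linux/kernel.h>
#include <linux/reboot.h>

enum o2nm_fence_method {
        O2NM_FENCE_RESET,       /* default: silent reboot via emergency_restart() */
        O2NM_FENCE_PANIC,       /* selected by writing "panic" to fence_method */
};

static enum o2nm_fence_method fence_method = O2NM_FENCE_RESET;

static void o2quo_fence_self(void)
{
        if (fence_method == O2NM_FENCE_PANIC)
                panic("ocfs2: this node must be fenced");
        else
                emergency_restart();    /* original production behaviour */
}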
2006 Apr 18 (1 reply) - Self-fencing issues (RHEL4)
Hi.
I'm running RHEL4 for my test system, Adaptec Firewire controllers,
Maxtor One Touch III shared disk (see the details below),
100Mb/s dedicated interconnect. It panics under no load about every
20 minutes (error message from netconsole attached).
Any clues?
Yegor
---
[root@rac1 ~]# cat /proc/fs/ocfs2/version
OCFS2 1.2.0 Tue Mar 7 15:51:20 PST 2006 (build
2006 Oct 13 (1 reply) - Cluster Quorum Question/Problem
Greetings all,
I am in need of professional insight. I have a 2 node cluster running
CentOS, mysql, apache, etc. I have on each system a fiber HBA connected to
a fiber SAN. Each system shows the devices sdb and sdc for each of the
connections on the HBA. I have sdc1 mounted on both machines as /quorum.
When I write to /quorum from one of the nodes, the file doesn't show up
on the
2005 Apr 17 (2 replies) - Quorum error
Had a problem starting Oracle after expanding an EMC Metalun. We get the
following errors:
>WARNING: OemInit2: Opened file(/oradata/dbf/quorum.dbf 8), tid =
main:1024 file = oem.c, line = 491 {Sun Apr 17 10:33:41 2005 }
>ERROR: ReadOthersDskInfo(): ReadFile(/oradata/dbf/quorum.dbf)
failed(5) - (0) bytes read, tid = main:1024 file = oem.c, line = 1396
{Sun Apr 17 10:33:41 2005 }
2018 Feb 26 (2 replies) - Quorum in distributed-replicate volume
I've configured 6 bricks as distributed-replicated with replica 2,
expecting that all active bricks would be usable so long as a quorum of
at least 4 live bricks is maintained.
However, I have just found
http://docs.gluster.org/en/latest/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/
Which states that "In a replica 2 volume... If we set the client-quorum
2014 Mar 06 (1 reply) - Clarification on cluster quorum
Hi,
I'm looking for an option to add an arbiter node to the gluster
cluster, but the leads I've been following seem to lead to
inconclusive results.
The scenario is a 2-node replicated cluster. What I want to do is
introduce a fake host/arbiter node which would turn the cluster into a
3-node one, meaning we can meet the over-50% condition for writes
(i.e. 2 nodes can write, 1 cannot).
2008 Mar 05 (3 replies) - cluster with 2 nodes - heartbeat problem fencing
Hi all, this is my first time on this mailing list.
I have a problem with OCFS2 on Debian Etch 4.0.
When a node goes down or freezes without unmounting the OCFS2 partition,
I'd like the heartbeat not to fence (kernel panic) the server that is still working.
I'd like to disable either heartbeat or fencing, so we can keep working with only 1
node.
Thanks
2017 Sep 22 (2 replies) - AFR: Fail lookups when quorum not met
Hello,
In AFR we currently allow look-ups to pass through without taking into
account whether the lookup is served from the good or bad brick. We
always serve from the good brick whenever possible, but if there is
none, we just serve the lookup from one of the bricks that we got a
positive reply from.
We found a bug [1] due to this behavior where the iatt values returned
in the lookup call
2006 Aug 14 (1 reply) - 2 node cluster, Heartbeat2, self-fencing
Hello everyone.
I am currently working on setting up new servers for my employer. Basically we
want two servers, both running several VEs (virtual environments,
OpenVZ) which can dynamically take over each other's jobs if necessary. Some
services will run concurrently on both servers like apache2 (load balancing),
so those need concurrent access to specific data.
We had a close look at