similar to: Node crashed after remove a path

Displaying 20 results from an estimated 100 matches similar to: "Node crashed after remove a path"

2007 Mar 08
4
ocfs2 cluster becomes unresponsive
We are running OCFS2 on SLES9 machines using a FC SAN. Without warning, both nodes become unresponsive. We cannot access either machine via ssh or terminal (it hangs after typing in the username). However, the machines still respond to pings. This continues until one node is rebooted, at which time the second node resumes normal operations. I am not entirely sure that this is an OCFS2 problem at all
2008 Jul 14
1
Node fence on RHEL4 machine running 1.2.8-2
Hello, We have a four-node RHEL4 RAC cluster running OCFS2 version 1.2.8-2 and the 2.6.9-67.0.4hugemem kernel. The cluster has been really stable since we upgraded to 1.2.8-2 early this year, but this morning, one of the nodes fenced and rebooted itself, and I wonder if anyone could glance at the below remote syslogs and offer an opinion as to why. First, here's the output of
2009 Nov 06
0
iscsi connection drop, comes back in seconds, then deadlock in cluster
Greetings ocfs2 folks, A client is experiencing some random deadlock issues within a cluster, wondering if anyone can point us in the right direction. The iSCSI connection seemed to have dropped on one node briefly, ultimately several hours later landing us in a complete deadlock scenario where multiple nodes (Node 7 and Node 8) had to be panic'd (by hand - they didn't ever panic on
2010 Dec 09
2
servers blocked on ocfs2
Hi, we have recently started to use ocfs2 on some RHEL 5.5 servers (ocfs2-1.4.7). Some days ago, two servers sharing an ocfs2 filesystem, and with quite virtual services, stalled in what seems to be an ocfs2 issue. These are the lines in their messages files: =====node heraclito (0)======================================== /Dec 4 09:15:06 heraclito kernel: o2net: connection to node parmenides
2008 Jan 23
1
OCFS2 DLM problems
Hello everyone, once again. We are running into a problem which has now shown up 2 times, possibly 3 (once the systems looked different). The environment is 6 HP DL360/380 g5 servers with eth0 being the public interface, eth1 and bond0 (eth2 and eth3) used for clusterware and bond0 also used for OCFS2. The bond0 interface is in active/passive mode. There are no network error counters showing and
2023 Jun 27
0
[PATCH] fs: ocfs: fix potential deadlock on &qs->qs_lock
As &qs->qs_lock is also acquired by the timer o2net_idle_timer(), which executes in softirq context, code executing in process context should disable IRQs before acquiring the lock; otherwise a deadlock could happen if the process context holds the lock and is then preempted by the timer. Possible deadlock scenario: o2quo_make_decision (workqueue) -> spin_lock(&qs->qs_lock);
2010 Nov 13
1
Nat Issue - I think
Hi, I'm using qualify= on my Asterisk server that provides outgoing PSTN calls to a few companies. I've got one client in particular that has their own Asterisk server, which is connected to my server. This client seems to be having a NAT issue. It's not a connectivity issue, as I've tried constant pings and the line is up constantly. I'm getting the following: [2010-11-13
2015 Mar 30
0
WaitForSilence NEVER detects silence,,Post
I have a call server that runs on a few custom AGI scripts initiating calls and then managing the calls. I'm getting stuck on the detecting silence functions. I wanted to use the silence detecting as a quick method of substituting Answering Machine Detection. However, whenever WaitForSilence is supposed to be detecting silence, it always just ends the interval whether or not there is
2015 Mar 30
0
WaitForSilence NEVER detects silence
I have a call server that runs on a few custom AGI scripts initiating calls and then managing the calls. I'm getting stuck on the detecting silence functions. I wanted to use the silence detecting as a quick method of substituting Answering Machine Detection. However, whenever WaitForSilence is supposed to be detecting silence, it always just ends the interval whether or not there is
2010 Jan 14
1
another fencing question
Hi, periodically one of the nodes in my two-node cluster is fenced; here are the logs: Jan 14 07:01:44 nvr1-rc kernel: o2net: no longer connected to node nvr2-rc.minint.it (num 0) at 1.1.1.6:7777 Jan 14 07:01:44 nvr1-rc kernel: (21534,1):dlm_do_master_request:1334 ERROR: link to 0 went down! Jan 14 07:01:44 nvr1-rc kernel: (4007,4):dlm_send_proxy_ast_msg:458 ERROR: status = -112 Jan 14 07:01:44
2006 Jan 09
0
[PATCH 01/11] ocfs2: event-driven quorum
This patch separates o2net and o2quo from knowing about one another as much as possible. This is the first in a series of patches that will allow userspace cluster interaction. Quorum is separated out first, and will ultimately only be associated with the disk heartbeat as a separate module. To do so, this patch performs the following changes: * o2hb_notify() is added to handle injection of
2017 Feb 21
2
no connectivity to some hosts behind tinc for the first few seconds
I have the following tinc setup: client -- tinc DC1 -- tinc DC2 -- 10.1.2.0/24 subnet It generally works well, however, there is one issue I'm not able to solve: *sometimes*, connectivity to *some* destinations does not work for the first few seconds. To demonstrate: $ mongo mongo.example.com:27017 MongoDB shell version: 3.2.12 connecting to: mongo.example.com:27017/test
2011 Mar 12
1
SASL abort causes 5s delay, triggered by UW libc-client
Since upgrading to Debian squeeze, the web mail system (Imp4/Horde3) suffers delays every time a new IMAP connection is needed. Tracing the authentication conversation, we find: 08:45:55.270609: 00000000 AUTHENTICATE GSSAPI 08:45:55.271277: + 08:45:55.271761: * 08:45:55.271782: 00000000 BAD Authentication aborted by client. 08:45:55.271815: 00000001 AUTHENTICATE PLAIN 08:46:00.271008: + and the
2013 Jul 02
1
Queue questions - Asterisk 11
Hi all, I have two questions about queues. The member is a phone like SIP/myphone, and it is the only member in the queue. First, DIALSTATUS doesn't return any status. How can I know whether a call in the queue has been answered or the caller just hung up? Second, how should I deal with timeouts? I see strange behavior. If I put timeout=60 in queue.conf and I call the queue passing also 60 as timeout value,
2020 Jun 30
2
CTDB RecLockLatencyMs vs RecoverInterval
Hi I have a question regarding CTDB RecLockLatencyMs tunable parameter. Is there any relationship between the RecLockLatencyMs property and the RecoverInterval property? Does one need to be larger than the other? Or if RecLockLatencyMs were increased to 5000ms, should some other setting be changed in proportion? We're using a geo-distributed etcd cluster for the CTDB recovery lock and I
2015 Apr 21
2
[BUG] imap-login segfault when running nmap -sV
Hi, I've noticed that nmap crashes my imap-login (also pop3-login) and narrowed it down to `nmap -sV -p 993 $host`. I've noticed that if I remove "ssl_protocols = !SSLv2 !SSLv3" from my config or enable SSLv3 rather than disabling it the segfault disappears. I'm running on Arch Linux with dovecot 2.2.16-1 and openssl 1.0.2.a-1. I've also attached a network capture, but
2020 Jun 30
0
CTDB RecLockLatencyMs vs RecoverInterval
Hi Bob, On Tue, 30 Jun 2020 17:00:11 -0400, Robert Buck via samba <samba at lists.samba.org> wrote: > I have a question regarding CTDB RecLockLatencyMs tunable parameter. Is > there any relationship between the RecLockLatencyMs property and > the RecoverInterval property? Does one need to be larger than the other? Or > if RecLockLatencyMs were increased to 5000ms, should some
2020 Jul 01
1
CTDB RecLockLatencyMs vs RecoverInterval
Thank you, Martin. Yes, we happen to be using Samba and CTDB v4.10.7, on Ubuntu. *Would these happen to include the defect?* *In your opinion, will 4s be an issue?* We happen to be running this on top of a geo-distributed etcd cluster, and in this particular case there was about 4200 miles between the two data centers. We're running a distributed NFS file system over a total of three data
2023 Apr 17
1
RTP address learning and timing problem
Hi Joshua, Thank you for that. From the code it kind of looks like STRICT_RTP_LEARN_TIMEOUT is a minimum, not a maximum: if (!ast_sockaddr_isnull(&rtp->strict_rtp_address) && STRICT_RTP_LEARN_TIMEOUT < ast_tvdiff_ms(ast_tvnow(), rtp->rtp_source_learn.start)) { ast_verb(4, "%p -- Strict RTP learning complete - Locking on source address %s\n", Our call shows: #