thr3ads.net - similar to: "Node crashed after remove a path"

Displaying 20 results from an estimated 100 matches similar to: "Node crashed after remove a path"

2007 Mar 08

ocfs2 cluster becomes unresponsive

We are running OCFS2 on SLES9 machines using an FC SAN. Without warning, both nodes become unresponsive. We cannot access either machine via ssh or terminal (the login hangs after typing the username), although the machines still respond to pings. This continues until one node is rebooted, at which point the second node resumes normal operation. I am not entirely sure that this is an OCFS2 problem at all

2008 Jul 14

Node fence on RHEL4 machine running 1.2.8-2

Hello, We have a four-node RHEL4 RAC cluster running OCFS2 version 1.2.8-2 and the 2.6.9-67.0.4hugemem kernel. The cluster has been really stable since we upgraded to 1.2.8-2 early this year, but this morning, one of the nodes fenced and rebooted itself, and I wonder if anyone could glance at the below remote syslogs and offer an opinion as to why. First, here's the output of

2009 Nov 06

iscsi connection drop, comes back in seconds, then deadlock in cluster

Greetings ocfs2 folks, A client is experiencing some random deadlock issues within a cluster, wondering if anyone can point us in the right direction. The iSCSI connection seemed to have dropped on one node briefly, ultimately several hours later landing us in a complete deadlock scenario where multiple nodes (Node 7 and Node 8) had to be panic'd (by hand - they didn't ever panic on

2010 Dec 09

servers blocked on ocfs2

Hi, we have recently started to use ocfs2 on some RHEL 5.5 servers (ocfs2-1.4.7). Some days ago, two servers sharing an ocfs2 filesystem, and running quite a few virtual services, stalled in what seems to be an ocfs2 issue. These are the lines in their messages files: =====node heraclito (0)======================================== /Dec 4 09:15:06 heraclito kernel: o2net: connection to node parmenides

2008 Jan 23

OCFS2 DLM problems

Hello everyone, once again. We are running into a problem that has now shown up 2 times, possibly 3 (once the systems looked different). The environment is 6 HP DL360/380 G5 servers with eth0 being the public interface, eth1 and bond0 (eth2 and eth3) used for clusterware, and bond0 also used for OCFS2. The bond0 interface is in active/passive mode. There are no network error counters showing and

2023 Jun 27

[PATCH] fs: ocfs: fix potential deadlock on &qs->qs_lock

As &qs->qs_lock is also acquired by the timer o2net_idle_timer(), which executes under softirq context, code executing under process context should disable irq before acquiring the lock; otherwise a deadlock could happen if the process context holds the lock and is then preempted by the timer. Possible deadlock scenario: o2quo_make_decision (workqueue) -> spin_lock(&qs->qs_lock);

2023 Jun 27

[PATCH] fs: ocfs: fix potential deadlock on &qs->qs_lock

2010 Nov 13

Nat Issue - I think

Hi, I'm using qualify= on my Asterisk server that provides outgoing PSTN calls to a few companies. I've got one client in particular that has their own Asterisk server, which is connected to my server. This client seems to be having a NAT issue. It's not a connectivity issue, as I've tried constant pings and the line is up constantly. I'm getting the following: [2010-11-13

2015 Mar 30

WaitForSilence NEVER detects silence,,Post

I have a call server that runs on a few custom AGI scripts initiating calls and then managing the calls. I'm getting stuck on the detecting silence functions. I wanted to use the silence detecting as a quick method of substituting Answering Machine Detection. However, whenever WaitForSilence is supposed to be detecting silence, it always just ends the interval whether or not there is

2015 Mar 30

WaitForSilence NEVER detects silence

2010 Jan 14

another fencing question

Hi, periodically one node of my two-node cluster is fenced. Here are the logs: Jan 14 07:01:44 nvr1-rc kernel: o2net: no longer connected to node nvr2-rc.minint.it (num 0) at 1.1.1.6:7777 Jan 14 07:01:44 nvr1-rc kernel: (21534,1):dlm_do_master_request:1334 ERROR: link to 0 went down! Jan 14 07:01:44 nvr1-rc kernel: (4007,4):dlm_send_proxy_ast_msg:458 ERROR: status = -112 Jan 14 07:01:44

2006 Jan 09

[PATCH 01/11] ocfs2: event-driven quorum

This patch separates o2net and o2quo from knowing about one another as much as possible. This is the first in a series of patches that will allow userspace cluster interaction. Quorum is separated out first, and will ultimately only be associated with the disk heartbeat as a separate module. To do so, this patch performs the following changes: * o2hb_notify() is added to handle injection of

2017 Feb 21

no connectivity to some hosts behind tinc for the first few seconds

I have the following tinc setup: client -- tinc DC1 -- tinc DC2 -- 10.1.2.0/24 subnet It generally works well, however, there is one issue I'm not able to solve: *sometimes*, connectivity to *some* destinations does not work for the first few seconds. To demonstrate: $ mongo mongo.example.com:27017 MongoDB shell version: 3.2.12 connecting to: mongo.example.com:27017/test

2011 Mar 12

SASL abort causes 5s delay, triggered by UW libc-client

Since upgrading to Debian squeeze, the web mail system (Imp4/Horde3) suffers delays every time a new IMAP connection is needed. Tracing the authentication conversation, we find: 08:45:55.270609: 00000000 AUTHENTICATE GSSAPI 08:45:55.271277: + 08:45:55.271761: * 08:45:55.271782: 00000000 BAD Authentication aborted by client. 08:45:55.271815: 00000001 AUTHENTICATE PLAIN 08:46:00.271008: + and the

2013 Jul 02

Queue questions - Asterisk 11

Hi all, I have two questions about queues. The member is a phone like SIP/myphone, and there is only one member in the queue. First, DIALSTATUS doesn't return any status. How can I know whether a call in the queue has been answered or whether the caller just hung up? Second, how should I deal with timeouts? I see strange behavior. If I put timeout=60 in queue.conf and I call the queue passing also 60 as timeout value,

2020 Jun 30

CTDB RecLockLatencyMs vs RecoverInterval

Hi I have a question regarding CTDB RecLockLatencyMs tunable parameter. Is there any relationship between the RecLockLatencyMs property and the RecoverInterval property? Does one need to be larger than the other? Or if RecLockLatencyMs were increased to 5000ms, should some other setting be changed in proportion? We're using a geo-distributed etcd cluster for the CTDB recovery lock and I

2015 Apr 21

[BUG] imap-login segfault when running nmap -sV

Hi, I've noticed that nmap crashes my imap-login (also pop3-login) and narrowed it down to `nmap -sV -p 993 $host`. I've noticed that if I remove "ssl_protocols = !SSLv2 !SSLv3" from my config or enable SSLv3 rather than disabling it the segfault disappears. I'm running on Arch Linux with dovecot 2.2.16-1 and openssl 1.0.2.a-1. I've also attached a network capture, but

2020 Jun 30

CTDB RecLockLatencyMs vs RecoverInterval

Hi Bob, On Tue, 30 Jun 2020 17:00:11 -0400, Robert Buck via samba <samba at lists.samba.org> wrote: > I have a question regarding CTDB RecLockLatencyMs tunable parameter. Is > there any relationship between the RecLockLatencyMs property and > the RecoverInterval property? Does one need to be larger than the other? Or > if RecLockLatencyMs were increased to 5000ms, should some

2020 Jul 01

CTDB RecLockLatencyMs vs RecoverInterval

Thank you, Martin. Yes, we happen to be using Samba and CTDB v4.10.7, on Ubuntu. *Would these happen to include the defect?* *In your opinion, will 4s be an issue?* We happen to be running this on top of a geo-distributed etcd cluster, and in this particular case there was about 4200 miles between the two data centers. We're running a distributed NFS file system over a total of three data

2023 Apr 17

RTP address learning and timing problem

Hi Joshua, Thank you for that. From the code it kind of looks like STRICT_RTP_LEARN_TIMEOUT is a minimum, not a maximum: if (!ast_sockaddr_isnull(&rtp->strict_rtp_address) && STRICT_RTP_LEARN_TIMEOUT < ast_tvdiff_ms(ast_tvnow(), rtp->rtp_source_learn.start)) { ast_verb(4, "%p -- Strict RTP learning complete - Locking on source address %s\n", Our call shows: #
