similar to: iscsi connection drop, comes back in seconds, then deadlock in cluster

Displaying 20 results from an estimated 300 matches similar to: "iscsi connection drop, comes back in seconds, then deadlock in cluster"

2009 Jul 22
2
OCFS2 Node restart
Hi, I have 6 nodes cluster with OCFS2 1.4.2 running on vmware virtual system RedHat 5.2 (2.6.18-128.1.16.el5) 64bit. Out of 6 nodes two nodes alf0 and alf3 reboot automatically, I enabled remote logging for kernel, and here is log. I noticed VM become non-response and suddenly reboots. I am running Alfresco (documents sharing) application all nodes are accessing common share on OCFS.
2008 Feb 04
0
[PATCH] o2net: Reconnect after idle time out.
Currently, o2net connects to a node on hb_up and disconnects on hb_down and net timeout. It disconnects on net timeout is ok, but it should attempt to reconnect back. This is because sometimes nodes get overloaded enough that the network connection breaks but the disk hb does not. And if we get into that situation, we either fence (unnecessarily) or wait for its disk hb to die (and sometimes hang
2010 Oct 23
1
Reg: ocfs2 two node cluster crashed, node2 crashed, when I rebooted node1 for maintenance.
Hi All, We have ocfs2 node cluster with oracle 11G RAC running, The node2 got crashed automatically, when i rebooted node one for maintenance. please check the log from node2 , before its got crashed. Oct 23 15:42:25 node2 kernel: ocfs2_dlm: Nodes in domain ("029C02C993E44E90879922E268FB161A"): 2 Oct 23 15:42:29 node2 kernel: ocfs2_dlm: Node 1 leaves domain
2008 Feb 13
2
[PATCH] o2net: Reconnect after idle time out.V2
Modification from V1 to V2: 1. Use atomic ops instead of spin_lock in timer. 2. Add some comments when querying connect_expired work. These comments are copied form Zach's mail.;) Currently, o2net connects to a node on hb_up and disconnects on hb_down and net timeout. It disconnects on net timeout is ok, but it should attempt to reconnect back. This is because sometimes nodes get
2008 Apr 29
0
Problems with Open-iSCSI and Infortrend A16E-G2130-4
Dear Sirs, I'm getting this error when trying to connect to an Infortrend A16E-G2130-4 storage via iSCSI. > Apr 29 10:24:40 vz-10 kernel: scsi1 : iSCSI Initiator over TCP/IP > Apr 29 10:24:41 vz-10 kernel: Vendor: IFT Model: A16E-G2130-4 Rev: 361F > Apr 29 10:24:41 vz-10 kernel: Type: Direct-Access ANSI SCSI revision: 04 > Apr
2009 Jul 29
3
Error message whil booting system
Hi, When system booting getting error message "modprobe: FATAL: Module ocfs2_stackglue not found" in message. Some nodes reboot without any error message. ------------------------------------------------- Jul 27 10:02:19 alf3 kernel: ip_tables: (C) 2000-2006 Netfilter Core Team Jul 27 10:02:19 alf3 kernel: Netfilter messages via NETLINK v0.30. Jul 27 10:02:19 alf3 kernel:
2007 Feb 26
1
dlm timeouts and following errors -112
Hi list, I am experimenting with ocfs2 (rpm package: 1.2.2-0.2), using linux-ha 2.0.8 (all running on a SLES 10 x86-64, rpm packages from linux-ha.org) for the heartbeat. The three nodes are connected on a gigabit switch. From time to time I have problems to unmount a drive, and I have to reboot the whole system to fix the problem. When these lockups occur, I see these messages in
2009 Apr 20
2
BUG: soft lockup - CPU#1 stuck for 61s
Hi, I have a cluster with 5 nodes hosting web application. All web servers save log info into shared access.log file. There is awstats log analyzer on the first node. Sometimes this node fails with the following messages (captured on another server) Apr 20 17:31:16 um-be-2 [145813.022112] o2net: connection to node um-fe-1 (num 1) at 192.168.10.10:7777 has been idle for 30.0 seconds, shutting it
2009 Mar 18
2
shutdown by o2net_idle_timer causes Xen to hang
Hello, we've had some serious trouble with a two-node Xen-based OCFS2 cluster. In brief: we had two incidents where one node detects an idle timeout and shuts the other node down which causes the other node and the Dom0 to hang. Both times this could only be resolved by rebooting the whole machine using the built-in IPMI card. All machines (including the other DomUs) run Centos 5.2
2009 Feb 04
1
Strange dmesg messages
Hi list, Something went wrong this morning and we have a node ( #0 ) reboot. Something blocked the NFS access from both nodes, one rebooted and the another we restarted the nfsd and it brought him back. Looking at node #0 - the one that rebooted - logs everything seems normal, but looking at the othere node dmesg's we saw this messages: First the o2net detected that node #0 was dead: (It
2007 Feb 26
5
Multiple uplinks, ssh connections hang
Folks, Ive got two ISP connections that I am using with: --- ip route add 192.168.200.0/24 dev eth2 src 192.168.200.11 table connection1 ip route add default via 192.168.200.1 table connection1 ip route add x.175.244.0/24 dev eth1 src x.175.244.2 table connection2 ip route add default via x.175.244.1 table connection2 ip rule add from 192.168.200.11 table connection1 ip rule add from x.175.244.2
2010 Dec 09
2
servers blocked on ocfs2
Hi, we have recently started to use ocfs2 on some RHEL 5.5 servers (ocfs2-1.4.7) Some days ago, two servers sharing an ocfs2 filesystem, and with quite virtual services, stalled, in what it seems on ocfs2 issue. This are the lines in their messages files: =====node heraclito (0)======================================== /Dec 4 09:15:06 heraclito kernel: o2net: connection to node parmenides
2014 Sep 26
2
One node hangs up issue requiring goog idea, thanks
Hi, all, As we use OCFS2, the network is not good. When the convert request message can't be sent to the other node, one node hangs, still waiting for the dlm. CAS2/logdir/var/log/syslog.1-6778-Sep 16 20:57:16 CAS2 kernel: [516366.623623] o2net: Connection to node CAS1 (num 1) at 10.172.254.1:7100 has been idle for 30.87 secs, shutting it down.
2013 Apr 28
2
Is it one issue. Do you have some good ideas, thanks a lot.
Hi, everyone I have some questions with the OCFS2 when using it as vm-store. With Ubuntu 1204, kernel version is 3.2.40, and ocfs2-tools version is 1.6.4. As the network configure change, there are some issues as the log below. Why is there the information of "Node 255 (he) is the Recovery Master for the dead node 255" in the syslog? Why the host ZHJD-VM6 is blocked until it reboot
2012 Dec 10
0
what might cause iSCSI connection 1:0 error ISCSI_ERR_CONN_FAILED
Hi, I have a CentOS 6.x server which accessed two different iSCSI storages for a long time without any trouble. The storage connection is done by a separate NIC and VLAN. The LAN access is on another NIC. This weekend something broke and I don't have any clue what might be the problem or what caused it. The storages were mounted RO. In /var/log/messages there are a lot of messages; so
2011 May 10
3
ERROR: -91 after Kernel Upgrade
Hey guys, I have an OCFS2 cluster mounted on 4 Xen servers (Gentoo). Today I upgraded the xen-kernel for tests on one server (server2) from 2.6.34-xen to 2.6.38-xen-r1. After reboot the server couldn't mount the ocfs2 device anymore. ocfs2-tools version: sys-fs/ocfs2-tools-1.4.3 Modules are loaded and /config type configfs and /dlm type ocfs2_dlmfs are mounted. server2 ~ # mount
2006 Jan 09
0
[PATCH 01/11] ocfs2: event-driven quorum
This patch separates o2net and o2quo from knowing about one another as much as possible. This is the first in a series of patches that will allow userspace cluster interaction. Quorum is separated out first, and will ultimately only be associated with the disk heartbeat as a separate module. To do so, this patch performs the following changes: * o2hb_notify() is added to handle injection of
2007 Aug 22
1
mount.ocfs2: Value too large ...
Hallo, I have two servers and both are connected to external array, each by own SAS connection. I need these servers to work simultaneously with data on array and I think that ocfs2 is suitable for this purpose. One server is P4 Xeon (Gentoo linux, i386, 2.6.22-r2) and second is Opteron (Gentoo linux, x86_64, 2.6.22-r2). Servers are connected by ethernet, adapters are both Intel
2002 Apr 17
1
concat
i have a function that returns a list containing a variety of variable types i am trying to run the function multiple times and return the output into a variable with a semi-consistent naming pattern i.e., for ten trials i want to return the list into variables trial1,trial2,...trial10 is there a generic way to get this to happen i have a similar process that does the same thing to an external