Displaying 20 results from an estimated 300 matches similar to: "servers blocked on ocfs2"
2007 Mar 08
4
ocfs2 cluster becomes unresponsive
We are running OCFS2 on SLES9 machines using a FC SAN. Without warning, both nodes become unresponsive: we cannot access either machine via ssh or a terminal (it hangs after typing in the username). However, the machines still respond to pings. This continues until one node is rebooted, at which point the second node resumes normal operation.
I am not entirely sure that this is an OCFS2 problem at all
2010 Jan 14
1
another fencing question
Hi,
periodically one of the nodes in my two-node cluster gets fenced; here are the logs:
Jan 14 07:01:44 nvr1-rc kernel: o2net: no longer connected to node nvr2-rc.minint.it (num 0) at 1.1.1.6:7777
Jan 14 07:01:44 nvr1-rc kernel: (21534,1):dlm_do_master_request:1334 ERROR:
link to 0 went down!
Jan 14 07:01:44 nvr1-rc kernel: (4007,4):dlm_send_proxy_ast_msg:458 ERROR:
status = -112
Jan 14 07:01:44
2007 Nov 29
1
Troubles with two node
Hi all,
I'm running OCFS2 on two systems with OpenSUSE 10.2, connected over fibre
channel to shared storage (HP MSA1500 + HP ProLiant MSA20).
The cluster has two nodes (web-ha1 and web-ha2); sometimes (once or twice
a month) OCFS2 stops working on both systems. On the first node I get
no errors in the log files, and after a forced shutdown of the first
node, on the second I can see
2009 Mar 18
2
shutdown by o2net_idle_timer causes Xen to hang
Hello,
we've had some serious trouble with a two-node Xen-based OCFS2
cluster. In brief: we had two incidents where one node detected an idle
timeout and shut the other node down, which caused that node and
its Dom0 to hang. Both times this could only be resolved by rebooting
the whole machine using the built-in IPMI card.
All machines (including the other DomUs) run CentOS 5.2
2008 Jan 23
1
OCFS2 DLM problems
Hello everyone, once again.
We are running into a problem which has now shown up twice, possibly three
times (once the symptoms looked different).
The environment is 6 HP DL360/380 G5 servers with eth0 being the public
interface, eth1 and bond0 (eth2 and eth3) used for clusterware, and bond0
also used for OCFS2. The bond0 interface is in active/passive mode.
There are no network error counters showing, and
2013 Apr 28
2
Is this one issue? Do you have some good ideas? Thanks a lot.
Hi, everyone
I have some questions about OCFS2 when using it as a VM store.
We are on Ubuntu 12.04, kernel version 3.2.40, and ocfs2-tools version 1.6.4.
After a network configuration change, there are some issues, as shown in the log below.
Why does the message "Node 255 (he) is the Recovery Master for the dead node 255" appear in the syslog?
Why is the host ZHJD-VM6 blocked until it reboots
2008 Jul 14
1
Node fence on RHEL4 machine running 1.2.8-2
Hello,
We have a four-node RHEL4 RAC cluster running OCFS2 version 1.2.8-2 and
the 2.6.9-67.0.4hugemem kernel. The cluster has been really stable since
we upgraded to 1.2.8-2 early this year, but this morning, one of the
nodes fenced and rebooted itself, and I wonder if anyone could glance at
the below remote syslogs and offer an opinion as to why.
First, here's the output of
2011 Dec 20
8
ocfs2 - Kernel panic on many write/read from both
Sorry, I didn't copy everything:
TEST-MAIL1# echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc
debugfs.ocfs2 1.6.4
5239722 26198604 246266859
TEST-MAIL1# echo "ls //orphan_dir:0001"|debugfs.ocfs2 /dev/dm-0|wc
debugfs.ocfs2 1.6.4
6074335 30371669 285493670
TEST-MAIL2 ~ # echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc
debugfs.ocfs2 1.6.4
5239722 26198604
2007 Oct 08
2
OCFS2 and LVM
Does anybody know if there is a certified procedure to back up a
RAC DB 10.2.0.3 based on OCFS2 via split-mirror or snapshot technology?
Using Linux LVM and OCFS2, does anybody know if it is possible to
dynamically extend an OCFS2 filesystem once the underlying LVM volume
has been extended? (A sketch follows this excerpt.)
Thanks in advance
Riccardo Paganini
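For the second question above, a minimal sketch of growing an OCFS2 filesystem after extending its LVM volume, assuming the volume group has free extents and that the tunefs.ocfs2 shipped with your ocfs2-tools supports resizing (check tunefs.ocfs2(8) for whether this must be done with the volume mounted or unmounted on every node); the vg_rac/lv_ocfs2 names are placeholders:
# grow the logical volume by 10G
lvextend -L +10G /dev/vg_rac/lv_ocfs2
# grow the OCFS2 filesystem; with no explicit block count, -S (--volume-size)
# grows it to the current size of the underlying device
tunefs.ocfs2 -S /dev/vg_rac/lv_ocfs2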
2009 May 12
2
add error check for ocfs2_read_locked_inode() call
After upgrading from 2.6.28.10 to 2.6.29.3 I saw the following new errors
in the kernel log:
May 12 14:46:41 falcon-cl5
May 12 14:46:41 falcon-cl5 (6757,7):ocfs2_read_locked_inode:466 ERROR:
status = -22
Only one node in the cluster has the volumes mounted:
/dev/sde on /home/apache/users/D1 type ocfs2
(rw,_netdev,noatime,heartbeat=local)
/dev/sdd on /home/apache/users/D2 type ocfs2
2009 Jul 29
3
Error message while booting system
Hi,
While the system is booting we get the error message "modprobe: FATAL: Module
ocfs2_stackglue not found" in the messages log. Some nodes reboot without any
error message. (A diagnostic sketch follows this excerpt.)
-------------------------------------------------
Jul 27 10:02:19 alf3 kernel: ip_tables: (C) 2000-2006 Netfilter Core Team
Jul 27 10:02:19 alf3 kernel: Netfilter messages via NETLINK v0.30.
Jul 27 10:02:19 alf3 kernel:
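Not part of the original message, but a quick hedged check of whether ocfs2_stackglue actually exists for the kernel that is booted (package and module layout vary by distribution):
# which kernel is actually running?
uname -r
# is the stack-glue module present in that kernel's module tree?
find /lib/modules/$(uname -r) -name 'ocfs2_stackglue*'
# try loading it by hand; -v shows what modprobe is doing
modprobe -v ocfs2_stackglue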
2007 Feb 06
2
Network 10 sec timeout setting?
Hello!
Hey, didn't a setting for the 10-second network timeout get into the
2.6.20 kernel?
If so, how do we set it? (A sketch follows this excerpt.)
I am getting
OCFS2 1.3.3
(2201,0):o2net_connect_expired:1547 ERROR: no connection established
with node 1 after 10.0 seconds, giving up and returning errors.
(2458,0):dlm_request_join:802 ERROR: status = -107
(2458,0):dlm_try_to_join_domain:950 ERROR: status = -107
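A hedged sketch of how the o2net idle timeout is usually adjusted where the kernel exposes it as a tunable (exact support depends on the kernel and ocfs2-tools versions; the values below are examples and "mycluster" is a placeholder for the cluster name):
# interactive; writes the timeouts to /etc/sysconfig/o2cb (or /etc/default/o2cb)
service o2cb configure
# the variables it manages, in milliseconds, look like:
# O2CB_IDLE_TIMEOUT_MS=30000
# O2CB_KEEPALIVE_DELAY_MS=2000
# O2CB_RECONNECT_DELAY_MS=2000
# on a running cluster the current value is visible through configfs
cat /sys/kernel/config/cluster/mycluster/idle_timeout_ms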
2011 Mar 04
1
node eviction
Hello... I wonder if someone has had a similar problem: a node gets evicted almost on a weekly basis and I have not found the root cause yet.
Mar 2 10:20:57 xirisoas3 kernel: ocfs2_dlm: Node 1 joins domain 129859624F7042EAB9829B18CA65FC88
Mar 2 10:20:57 xirisoas3 kernel: ocfs2_dlm: Nodes in domain ("129859624F7042EAB9829B18CA65FC88"): 1 2 3 4
Mar 3 16:18:02 xirisoas3 kernel:
2009 Feb 04
1
Strange dmesg messages
Hi list,
Something went wrong this morning and we had a node (#0) reboot.
Something blocked NFS access from both nodes; one rebooted, and on the
other we restarted nfsd, which brought it back.
Looking at the logs of node #0 - the one that rebooted - everything seems
normal, but looking at the other node's dmesg we saw these messages:
First, o2net detected that node #0 was dead: (It
2010 Aug 26
1
[PATCH 2/5] ocfs2/dlm: add lockres as parameter to dlm_new_lock()
Whether the dlm_lock needs to access the lvb depends on the dlm_lock_resource it belongs to. So a new parameter, "struct dlm_lock_resource *res", is added to dlm_new_lock() so that we know whether we need to allocate an lvb for the dlm_lock. We also have to make the lockres available when calling dlm_new_lock().
Signed-off-by: Wengang Wang <wen.gang.wang at oracle.com>
---
2009 Nov 06
0
iscsi connection drop, comes back in seconds, then deadlock in cluster
Greetings ocfs2 folks,
A client is experiencing some random deadlock issues within a cluster,
wondering if anyone can point us in the right direction. The iSCSI
connection seemed to have dropped on one node briefly, ultimately
several hours later landing us in a complete deadlock scenario where
multiple nodes (Node 7 and Node 8) had to be panic'd (by hand - they
didn't ever panic on
2010 Apr 05
1
Kernel Panic, Server not coming back up
I have a relatively new test environment that is a little different
from your typical scenario. This is my first time using OCFS2, but I
believe it should work the way I have it set up.
All of this runs on VMware virtual hosts. I have two front-end web
servers and one backend administrative server. They all share 2 virtual
hard drives within VMware (independent, persistent, &
2007 Dec 03
2
errors ocfs2 with Ubuntu/Dapper/amd64
I keep encountering the following error on Ubuntu Dapper 6.06 LTS amd64.
It seems to happen from time to time. Does anyone know what this is
and whether there is a way to fix it?
Nov 29 01:19:14 <hostname> kernel: [221867.166529]
(11588,0):ocfs2_lock_create:818 ERROR: Dlm error "DLM_IVLOCKID"
while calling dlmlock on resource M000000000000000593c7cc33386710:
bad
2014 Mar 22
1
Issues to manage RAM on openvz guests
Hi all,
I'm playing with virsh, and I have managed some KVM nodes easily.
But I have issues sizing the RAM of OpenVZ guests.
By default virsh sets the memory to 256M on each guest instead of the value
I specify in the XML.
I have to modify the variable PHYSPAGES by hand in each
/etc/vz/conf/id_of_my_vz.conf
Please see the XML dump and the steps that I did below. (A workaround sketch
also follows this excerpt.)
Cheers, Aurelien
Log
----
1)
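Not from the original thread, but a hedged workaround sketch: set the container memory directly with the OpenVZ tools (CTID 101 and the 1 GiB value are placeholders; PHYSPAGES is counted in 4 KiB pages):
# 1 GiB = 262144 pages of 4 KiB; --save persists it to /etc/vz/conf/101.conf
vzctl set 101 --physpages 262144 --save
# verify what the container actually sees
vzctl exec 101 head -n 2 /proc/meminfo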