
Displaying 20 results from an estimated 10000 matches similar to: "6 node cluster with unexplained reboots"

2008 Jan 23
1
OCFS2 DLM problems
Hello everyone, once again. We are running into a problem which has now shown up 2 times, possibly 3 (once the symptoms looked different). The environment is 6 HP DL360/380 G5 servers with eth0 being the public interface, and eth1 and bond0 (eth2 and eth3) used for Clusterware, with bond0 also used for OCFS2. The bond0 interface is in active/passive mode. There are no network error counters showing and
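For context, a minimal sketch of the active/passive (active-backup) bonding setup this post describes, using EL4/EL5-style config files; the interface names and addresses are hypothetical:

    # /etc/modprobe.conf -- bonding driver in active-backup mode (mode=1)
    alias bond0 bonding
    options bond0 mode=active-backup miimon=100

    # /etc/sysconfig/network-scripts/ifcfg-bond0 -- private interconnect address
    DEVICE=bond0
    IPADDR=192.168.201.1
    NETMASK=255.255.255.0
    BOOTPROTO=none
    ONBOOT=yes

    # /etc/sysconfig/network-scripts/ifcfg-eth2 -- enslave eth2 (eth3 is identical)
    DEVICE=eth2
    MASTER=bond0
    SLAVE=yes
    BOOTPROTO=none
    ONBOOT=yes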
2010 May 21
2
fsck.ocfs2 using huge amount of memory?
We are setting up 2 new EL5 U4 machines to replace our current database servers running our demo environment. We use 3Par SANs and their snap clone options. The current production system we snap clone from is EL4 U5 with ocfs2 1.2.9; the new servers have ocfs2 1.4.3 installed. Part of the refresh process is to run fsck.ocfs2 on the volume to recover, but right now as I am trying to run it on our
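As a point of reference, a read-only dry run is a common way to scope a check like this before committing to repairs; a minimal sketch, with a hypothetical device path:

    # Read-only pass: -f forces a full check, -n answers no to all repairs
    fsck.ocfs2 -fn /dev/mapper/demo_vol

    # If the dry run looks sane, run the repairing pass
    fsck.ocfs2 -fy /dev/mapper/demo_vol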
2008 Feb 25
2
OCFS2 and Cloning
I am currently working on cloning our production OCFS2 volumes to our test environment on a regular basis. For the database (Oracle 10g R2 RAC) we put it into backup mode, then execute a SnapClone on our 3Par SAN. Then we use RemoteCopy and SnapClone to copy to our development 3Par SAN. To recover the OCFS2 volume I go through the following steps: stop the database, umount /export/<volume name>, log
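A minimal sketch of a clone-recovery sequence of this shape, assuming a tools version whose tunefs.ocfs2 supports UUID reset (-U); all paths are hypothetical:

    # Quiesce and detach the volume being refreshed
    umount /export/demo_vol

    # Give the presented SnapClone a fresh UUID so the cluster never sees
    # two volumes with the same identity
    tunefs.ocfs2 -U /dev/mapper/clone_vol

    # Mount the clone in place of the old volume
    mount -t ocfs2 /dev/mapper/clone_vol /export/demo_vol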
2011 Mar 11
3
What could cause slow down betwen OCFS2 1.2.9 and 1.4.4
We upgraded our production database cluster (6-node) from EL4 Update 5 to EL5 Update 5, including upgrading OCFS2 from 1.2.9 to 1.4.4. We are now noticing a slowdown of batch jobs in Oracle, while hot backup runs faster. One thing we saw is that the journal mode changed from writeback to ordered, since we don't specify a journal mode during mount. Oracle sees this as a slowdown based on higher I/O latency,
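The journal mode can be pinned explicitly rather than left to the release default; a minimal sketch with a hypothetical device and mount point:

    # Request writeback journaling instead of the ordered default
    mount -t ocfs2 -o data=writeback /dev/mapper/db_vol /u02

    # Or persist it in /etc/fstab:
    # /dev/mapper/db_vol  /u02  ocfs2  _netdev,data=writeback  0 0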
2008 Feb 17
2
Anyone have an idea how to find file i/o throughput?
We have a remote Oracle 10g R2 standby running on OCFS2. Initially, when we started the standby, read I/O was < 5 MB/sec on average. Since then it has grown to over 40 MB/sec (longer average; it peaks much higher). Here is a graph showing this: http://www.alameda.net/~ulf/dbphx01.png We also have a local standby running (on EXT3) which is not showing the same symptom. I am trying to find where all
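A minimal sketch of how one might chase throughput like this down to a device or process; the Oracle process name is hypothetical, and /proc/<pid>/io needs a kernel with task I/O accounting:

    # Per-device throughput and latency, refreshed every 5 seconds
    iostat -xk 5

    # Cumulative read/write bytes for one process, where the kernel supports it
    cat /proc/$(pgrep -f ora_mrp0 | head -1)/io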
2008 Oct 22
2
Another node is heartbeating in our slot! errors with LUN removal/addition
Greetings, Last night I manually unpresented and deleted a LUN (a SAN snapshot) that was presented to one node in a four node RAC environment running OCFS2 v1.4.1-1. The system then rebooted with the following error: Oct 21 16:45:34 ausracdb03 kernel: (27,1):o2hb_write_timeout:166 ERROR: Heartbeat write timeout to device dm-24 after 120000 milliseconds Oct 21 16:45:34 ausracdb03 kernel:
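For reference, the fencing window is governed by the O2CB heartbeat dead threshold, counted in 2-second iterations; a minimal sketch of inspecting and changing it:

    # Show cluster state, including the current heartbeat dead threshold
    /etc/init.d/o2cb status

    # Reconfigure interactively; e.g. a threshold of 61 iterations gives
    # roughly 120 seconds before a node fences itself
    /etc/init.d/o2cb configure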
2009 Jan 15
2
[PATCH] ocfs2: return f_fsid info in ocfs2_statfs()
Currently the f_fsid of struct kstatfs returned from ocfs2_statfs() is undefined (at least it should be filled with 0). Since in some conditions the f_fsid value might be used as an (f_fsid, ino) pair to uniquely identify a file, ocfs2 should return a defined, unique f_fsid value from ocfs2_statfs(). This patch uses uuid_hash as a unique ID to initialize the f_fsid value; the 32-bit width is enough for ocfs2
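The field in question is visible from userspace; a quick hedged check, with a hypothetical mount point:

    # Print statfs(2) fields for the filesystem; the "ID:" line is f_fsid
    stat -f /ocfs_1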
2010 May 30
4
OCFS2 performance - disk random access time problem
Hello. I plan to use OCFS2 + DRBD for an email server. Problem: I use "seeker" for testing: http://www.linuxinsight.com/how_fast_is_your_disk.html And I get this: Results: 65 seeks/second, 15.23 ms random access time. Then I do an rm of many files - it falls to 10 seeks/second and performance is terrible. What can I do to increase it? What's wrong? A lot of info is below. What we have: Debian
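A crude stand-in for seeker that needs only dd; a minimal sketch assuming a bash shell and a coreutils dd with O_DIRECT support, with a hypothetical device:

    # Time 100 random 4k direct reads; more wall time = worse seek latency
    time for i in $(seq 1 100); do
        dd if=/dev/drbd0 of=/dev/null bs=4k count=1 \
           skip=$((RANDOM * RANDOM % 10000000)) iflag=direct 2>/dev/null
    done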
2009 Jan 16
2
[PATCH] ocfs2: return f_fsid info in ocfs2_statfs(), v4
Currently the f_fsid of struct kstatfs returned from ocfs2_statfs() is undefined (the vfs layer fills in 0 as the default). Since in some conditions the f_fsid value might be used as an (f_fsid, ino) pair to uniquely identify a file, ocfs2 should return a defined, unique f_fsid value from ocfs2_statfs(). Because uuid_str is identical no matter whether the machine is big- or little-endian, it's also endian-consistent to use
2010 Aug 12
3
[PATCH 1/2] ocfs2: Fix metaecc error messages
Like the tools, the checksum validation function now prints the values in hex. Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com> --- fs/ocfs2/blockcheck.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/ocfs2/blockcheck.c b/fs/ocfs2/blockcheck.c index ec6d123..c7ee03c 100644 --- a/fs/ocfs2/blockcheck.c +++ b/fs/ocfs2/blockcheck.c @@ -439,7 +439,7 @@ int
2008 Jul 21
5
OCFS processes active after a umount [SEC=UNOFFICIAL]
Hello, I have two OCFS file systems mounted at /ocfs_1 and /ocfs_2. I have unmounted both OCFS file systems and was then trying to offline and unload OCFS. The offline command failed with - # ./o2cb offline Stopping O2CB cluster ocfs2: Failed Unable to stop cluster as heartbeat region still active Looking at the processes on this box shows a number of OCFS processes are still active -
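Those leftover processes usually belong to a heartbeat region that is still referenced; a minimal sketch of checking and clearing one, with a hypothetical device:

    # List ocfs2 devices and their UUIDs/labels
    mounted.ocfs2 -d

    # Show the reference count on a region; nonzero keeps o2cb offline failing
    ocfs2_hb_ctl -I -d /dev/sdb1

    # Stop heartbeat on the stale region, then retry the offline
    ocfs2_hb_ctl -K -d /dev/sdb1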
2009 Jan 14
15
Backport patches to ocfs2 1.4 tree from mainline
Found 15 patches (out of 162) that appeared relevant to ocfs2 1.4. Please review. Sunil
2010 Oct 08
23
O2CB global heartbeat - hopefully final drop!
All, This is hopefully the final drop of the patches for adding global heartbeat to the o2cb stack. The diff from the previous set is here: http://oss.oracle.com/~smushran/global-hb-diff-2010-10-07 Implemented most of the suggestions provided by Joel and Wengang. The most important one was to activate the feature only at the end. Also got a mostly clean run with checkpatch.pl. Sunil
2009 Apr 17
26
OCFS2 1.4: Patches backported from mainline
Please review the list of patches being applied to the ocfs2 1.4 tree. All patches list the mainline commit hash. Thanks Sunil
2008 Sep 25
1
ocfs2 filesystem seems out of sync
Hi there, I recently installed an OCFS2 filesystem on our FC-SAN. Everything seemed to work fine and I could read & write the filesystem from both servers that are mounting it. After a while, though, writes coming from one node do not appear on the other node and vice versa. I am not sure what's causing this, and I am not very experienced at debugging filesystems. If anybody has any
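One classic cause of this symptom is a node writing without live cluster membership; a quick hedged sanity check to run on each server:

    # Is the O2CB stack actually online on this node?
    /etc/init.d/o2cb status

    # Full detect: which nodes does each ocfs2 volume think has it mounted?
    mounted.ocfs2 -f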
2007 Jan 23
1
ocfs2 kernel bug in Fedora Core 4 update kernel
OS: Fedora Core release 4 (Stentz) KERNEL: Linux rack1.ape 2.6.17-1.2142_FC4smp #1 SMP Tue Jul 11 22:57:02 EDT 2006 i686 i686 i386 GNU/Linux CLUSTER: 11 Linux kernels, mixed environment FC4, FC5, FC6 SAN: FC Infortrend storage, QLogic 16-port FC switch, FC adapter LSI FC929X (21224,1):ocfs2_truncate_file:242 ERROR: bug expression: le64_to_cpu(fe->i_size) != i_size_read(inode)
2008 Apr 02
10
[PATCH 0/62] Ocfs2 updates for 2.6.26-rc1
The following series of patches comprises the bulk of our outstanding changes for Ocfs2. Aside from the usual set of cleanups and fixes that were inappropriate for 2.6.25, there are a few highlights: The '/sys/o2cb' directory has been moved to '/sys/fs/o2cb'. The new location meshes better with modern sysfs layout. A symbolic link has been placed in the old location so as to
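A quick hedged check of the relocation described above, on a kernel that carries it:

    # The old path should remain as a compatibility symlink to the new one
    ls -ld /sys/fs/o2cb /sys/o2cb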
2011 Oct 18
12
Unable to stop cluster as heartbeat region still active
Hi, I have a 2-node ocfs2 cluster running UEK 2.6.32-100.0.19.el5, ocfs2console-1.6.3-2.el5, ocfs2-tools-1.6.3-2.el5. My problem is that every time I try to run /etc/init.d/o2cb stop, it fails with this error: Stopping O2CB cluster CLUSTER: Failed Unable to stop cluster as heartbeat region still active There is no active mount point. I tried to manually stop the heartbeat with
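A minimal sketch of hunting down the stale region by hand; the cluster name is hypothetical and the UUID placeholder must come from the listing:

    # Which heartbeat regions does configfs still hold?
    ls /sys/kernel/config/cluster/ocfs2cluster/heartbeat/

    # Stop the leftover region by UUID, then retry the stop
    ocfs2_hb_ctl -K -u <region uuid>
    /etc/init.d/o2cb stop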
2005 Feb 11
3
OCFS file system used as archived redo destination is corrupted
We started using an OCFS file system about 4 months ago as the shared archived redo destination for the 4-node RAC instances (HP DL380, MSA1000, RH AS 2.1). Last night we started seeing some weird behavior, and my guess is the inode directory in the file system is getting corrupted. I've always had a bad feeling about OCFS not being very robust at handling constant file creation and deletion
2007 Jul 07
2
Adding new nodes to OCFS2?
I looked around and found an older post which seems not applicable anymore. I have a cluster of 2 nodes right now, which has 3 OCFS2 file systems. All the file systems were formatted with 4 node slots. I added the two new nodes (by hand, by ocfs2console and o2cb_ctl), so my /etc/ocfs2/cluster.conf looks right: node: ip_port = 7777 ip_address = 192.168.201.1 number = 0
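For the record, the usual way to push a new node into a live cluster is o2cb_ctl on every existing member; a minimal sketch with a hypothetical node name and address:

    # -i also installs the node into the running cluster, not just cluster.conf
    o2cb_ctl -C -i -n rac5 -t node -a number=4 \
        -a ip_address=192.168.201.5 -a ip_port=7777 -a cluster=ocfs2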