Hi all!

I have a four-node OCFS2 cluster running RHEL 5.7 (x86_64) on HP ProLiant DL360 G5 servers with 6 GB RAM each: two web servers running Apache and two MySQL servers. I've shut down OCFS2 on the DB servers since it was never really used there, so it's only live on the web servers. The OCFS2 volume is a 14-disk RAID 10 connected via dedicated QLogic iSCSI NICs through a Gigabit switch. The servers are connected to the same switch on a separate VLAN for the heartbeat, and they also have dedicated Gbps NICs for frontend and backend traffic.

The problem is that for the last couple of weeks write performance has intermittently slowed to a crawl, and right now the system is pretty much unusable. httpd has its DocumentRoot on /san and these processes often go into D state. "iostat -dmx 1" often reports ~99 %util, and writing anything hangs disturbingly often. My biggest problem right now is that I've got very little to work with (that is, no juicy kernel panics and such), and the only OCFS2-related stuff in the logs looks like this:

---CUT---
Nov 14 14:13:49 web02 kernel: INFO: task httpd:3959 blocked for more than 120 seconds.
Nov 14 14:13:49 web02 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 14 14:13:49 web02 kernel: httpd         D ffff810001004420     0  3959   3149  3962  3954 (NOTLB)
Nov 14 14:13:49 web02 kernel: ffff81018c407ea8 0000000000000082 ffff81019d600140 ffffffff8875b54b
Nov 14 14:13:49 web02 kernel: ffff810184210000 000000000000000a ffff81018eb957a0 ffffffff80314b60
Nov 14 14:13:49 web02 kernel: 000000b90056ee85 000000000018d3ad ffff81018eb95988 0000000000000003
Nov 14 14:13:49 web02 kernel: Call Trace:
Nov 14 14:13:49 web02 kernel: [<ffffffff8875b54b>] :ocfs2:ocfs2_inode_lock_full+0x5fb/0xfe2
Nov 14 14:13:49 web02 kernel: [<ffffffff8001b035>] cp_new_stat+0xe5/0xfd
Nov 14 14:13:49 web02 kernel: [<ffffffff80063c53>] __mutex_lock_slowpath+0x60/0x9b
Nov 14 14:13:49 web02 kernel: [<ffffffff80063c9d>] .text.lock.mutex+0xf/0x14
Nov 14 14:13:49 web02 kernel: [<ffffffff80013f1b>] generic_file_llseek+0x2a/0x8b
Nov 14 14:13:49 web02 kernel: [<ffffffff80025788>] sys_lseek+0x40/0x60
Nov 14 14:13:49 web02 kernel: [<ffffffff8005d116>] system_call+0x7e/0x83
---CUT---

Also, I see these kinds of errors regarding eth4 (the heartbeat network):

# grep 'e1000e.*Detected' /var/log/messages
Nov 13 13:25:00 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 14 09:10:55 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 14 21:54:49 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 14 21:58:35 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 15 01:42:33 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 15 13:34:01 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 15 20:55:24 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 16 02:42:55 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:

However, they happen relatively seldom, and since I've tried flood-pinging hosts over that VLAN without ever seeing timeouts or packet loss, I really don't think that's the issue.
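Even so, I'm planning to keep an eye on the NIC's own counters around those hang messages. A rough sketch of what I have in mind (the counter name patterns are guesses based on typical `ethtool -S` output from e1000e, so adjust to taste):

```shell
# Sample eth4's driver statistics a few times so a "Detected Hardware
# Unit Hang" can be matched against tx timeout/error counters afterwards.
# The interface name (eth4) and counter names are assumptions -- check
# your own `ethtool -S eth4` output for what the driver actually exports.
for i in 1 2 3; do
    date >> /tmp/eth4-stats.log
    ethtool -S eth4 2>/dev/null | grep -E 'tx_timeout|aborted|errors' >> /tmp/eth4-stats.log
    sleep 5   # use 60 s or more for a real capture
done
```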
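I'm also thinking of capturing the blocked tasks' stacks on demand rather than waiting for the 120-second hung_task watchdog to fire. Something like this (requires root; sysrq 't' dumps all task stacks on this kernel, and on newer kernels 'w' limits the dump to blocked tasks):

```shell
# Dump all task stacks to the kernel ring buffer on demand -- the
# D-state httpd stacks will show what they're actually blocked on.
# Requires root; harmless but noisy on a production box.
(echo t > /proc/sysrq-trigger) 2>/dev/null
dmesg | tail -n 300 > /tmp/blocked-tasks.txt

# Which processes are in D state right now, and what they're sleeping in
ps -eo pid,stat,wchan:30,comm | awk '$2 ~ /^D/'
```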
I managed to catch an strace of a dd run against the volume; it hung on the 8th 1 MB write for approximately 1 minute and 30 seconds. Things are looking surprisingly cheerful right now, as only maybe 1 out of 10 tries hangs like this.

---CUT---
# strace dd if=/dev/zero of=/san/testfile bs=1M count=10
...
open("/san/testfile", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 1
...
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576 <APPROX 1.5 MIN DELAY>) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
close(0)                                = 0
close(1)                                = 0
...
write(2, "10+0 records in\n10+0 records out"..., 3310+0 records in
10+0 records out
) = 33
write(2, "10485760 bytes (10 MB) copied", 2910485760 bytes (10 MB) copied) = 29
write(2, ", 86.8347 seconds, 121 kB/s\n", 28, 86.8347 seconds, 121 kB/s
) = 28
...
---CUT---

I've made sure the disk enclosure is OK: it's running the latest firmware and reporting no errors, and there are no other NIC-related problems besides the eth4 one mentioned above. I realize there's not much pointing towards OCFS2 having problems, but I cannot for the life of me comprehend the nature of this problem. If nothing else, I'd appreciate any help in determining that OCFS2 can be eliminated from the equation! I've added some more information below. Please let me know if you need any additional information!

Thanks in advance,

--
Andy Herrero

# uname -a
Linux web01 2.6.18-274.7.1.el5 #1 SMP Mon Oct 17 11:57:14 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux

# rpm -qa | grep ocfs
ocfs2-tools-1.4.4-1.el5
ocfs2-2.6.18-238.9.1.el5-1.4.7-1.el5
ocfs2-2.6.18-274.el5-1.4.7-1.el5
ocfs2-2.6.18-274.7.1.el5-1.4.7-1.el5

# cat /etc/ocfs2/cluster.conf
cluster:
        name = ocfs2
        node_count = 4

node:
        name = web01
        cluster = ocfs2
        number = 0
        ip_address = 10.60.2.11
        ip_port = 7777

node:
        name = web02
        cluster = ocfs2
        number = 1
        ip_address = 10.60.2.12
        ip_port = 7777

node:
        name = db01
        cluster = ocfs2
        number = 2
        ip_address = 10.60.2.13
        ip_port = 7777

node:
        name = db02
        cluster = ocfs2
        number = 3
        ip_address = 10.60.2.14
        ip_port = 7777

# cat /etc/sysconfig/o2cb
#
# This is a configuration file for automatic startup of the O2CB
# driver.
# It is generated by running /etc/init.d/o2cb configure.
# On Debian based systems the preferred method is running
# 'dpkg-reconfigure ocfs2-tools'.
#

# O2CB_ENABLED: 'true' means to load the driver on boot.
O2CB_ENABLED=true

# O2CB_STACK: The name of the cluster stack backing O2CB.
O2CB_STACK=o2cb

# O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start.
O2CB_BOOTCLUSTER=ocfs2

# O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
O2CB_HEARTBEAT_THRESHOLD=61

# O2CB_IDLE_TIMEOUT_MS: Time in ms before a network connection is considered dead.
O2CB_IDLE_TIMEOUT_MS

# O2CB_KEEPALIVE_DELAY_MS: Max time in ms before a keepalive packet is sent
O2CB_KEEPALIVE_DELAY_MS

# O2CB_RECONNECT_DELAY_MS: Min time in ms between connection attempts
O2CB_RECONNECT_DELAY_MS

# mounted.ocfs2 -f
Device     FS     Nodes
/dev/sda1  ocfs2  web01, web02

# df -h /dev/sda1
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             6.4T  5.4T 1001G  85% /san

# df -i /dev/sda1
Filesystem                Inodes      IUsed      IFree IUse% Mounted on
/dev/sda1             1708654583 1446362212  262292371   85% /san

# debugfs.ocfs2 -R "ls -l //" /dev/sda1
        6  drwxr-xr-x  7  0  0           4096  17-Dec-2008 16:17  .
        6  drwxr-xr-x  7  0  0           4096  17-Dec-2008 16:17  ..
        7  -rw-r--r--  1  0  0              0  17-Dec-2008 16:17  bad_blocks
        8  -rw-r--r--  1  0  0         831488  17-Dec-2008 16:17  global_inode_alloc
        9  -rw-r--r--  1  0  0           4096  17-Dec-2008 16:47  slot_map
       10  -rw-r--r--  1  0  0        1048576  17-Dec-2008 16:17  heartbeat
       11  -rw-r--r--  1  0  0  6998649171968  17-Dec-2008 16:17  global_bitmap
       12  drwxr-xr-x  2  0  0         221184  16-Nov-2011 11:08  orphan_dir:0000
       13  drwxr-xr-x  2  0  0          32768  16-Nov-2011 12:01  orphan_dir:0001
       14  drwxr-xr-x  2  0  0          45056  14-Nov-2011 13:45  orphan_dir:0002
       15  drwxr-xr-x  2  0  0          73728  14-Nov-2011 17:07  orphan_dir:0003
       16  drwxr-xr-x  2  0  0           4096  17-Dec-2008 16:17  orphan_dir:0004
       17  -rw-r--r--  1  0  0      310378496  17-Dec-2008 16:17  extent_alloc:0000
       18  -rw-r--r--  1  0  0       62914560  17-Dec-2008 16:17  extent_alloc:0001
       19  -rw-r--r--  1  0  0      243269632  17-Dec-2008 16:17  extent_alloc:0002
       20  -rw-r--r--  1  0  0      146800640  17-Dec-2008 16:17  extent_alloc:0003
       21  -rw-r--r--  1  0  0              0  17-Dec-2008 16:17  extent_alloc:0004
       22  -rw-r--r--  1  0  0     1044381696  17-Dec-2008 16:17  inode_alloc:0000
       23  -rw-r--r--  1  0  0      792723456  17-Dec-2008 16:17  inode_alloc:0001
       24  -rw-r--r--  1  0  0     1132462080  17-Dec-2008 16:17  inode_alloc:0002
       25  -rw-r--r--  1  0  0     1040187392  17-Dec-2008 16:17  inode_alloc:0003
       26  -rw-r--r--  1  0  0              0  17-Dec-2008 16:17  inode_alloc:0004
       27  -rw-r--r--  1  0  0      268435456  17-Dec-2008 16:40  journal:0000
       28  -rw-r--r--  1  0  0      268435456  17-Dec-2008 16:40  journal:0001
       29  -rw-r--r--  1  0  0      268435456  17-Dec-2008 16:40  journal:0002
       30  -rw-r--r--  1  0  0      268435456  17-Dec-2008 16:40  journal:0003
       31  -rw-r--r--  1  0  0      268435456  17-Dec-2008 16:40  journal:0004
       32  -rw-r--r--  1  0  0              0  17-Dec-2008 16:17  local_alloc:0000
       33  -rw-r--r--  1  0  0              0  17-Dec-2008 16:17  local_alloc:0001
       34  -rw-r--r--  1  0  0              0  17-Dec-2008 16:17  local_alloc:0002
       35  -rw-r--r--  1  0  0              0  17-Dec-2008 16:17  local_alloc:0003
       36  -rw-r--r--  1  0  0              0  17-Dec-2008 16:17  local_alloc:0004
       37  -rw-r--r--  1  0  0              0  17-Dec-2008 16:17  truncate_log:0000
       38  -rw-r--r--  1  0  0              0  17-Dec-2008 16:17  truncate_log:0001
       39  -rw-r--r--  1  0  0              0  17-Dec-2008 16:17  truncate_log:0002
       40  -rw-r--r--  1  0  0              0  17-Dec-2008 16:17  truncate_log:0003
       41  -rw-r--r--  1  0  0              0  17-Dec-2008 16:17  truncate_log:0004

# hdparm -tT /dev/sda1

/dev/sda1:
 Timing cached reads:   22396 MB in  2.00 seconds = 11225.69 MB/sec
 Timing buffered disk reads:  238 MB in  3.00 seconds =  79.32 MB/sec
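To quantify how often the stall hits instead of catching it by luck with strace, I'll probably run something like this (same test file as in the strace above; conv=fsync is my addition so the timing includes the final flush):

```shell
# Repeat the 10 MB write and log wall-clock time per run, so the ~90 s
# stalls can be counted. TARGET is the same test file as in the strace;
# adjust bs/count to taste.
TARGET=/san/testfile
for i in $(seq 1 20); do
    start=$(date +%s)
    dd if=/dev/zero of="$TARGET" bs=1M count=10 conv=fsync 2>/dev/null
    end=$(date +%s)
    echo "run $i: $((end - start)) s" >> /tmp/dd-timings.log
done
```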
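And to help rule OCFS2 in or out, next time a write hangs I'll try dumping the filesystem lock state on both nodes and comparing (fs_locks is a debugfs.ocfs2 command in ocfs2-tools 1.4; /dev/sda1 is the OCFS2 device from the mounted.ocfs2 output above):

```shell
# Dump DLM/filesystem lock state while the hang is in progress; a lock
# stuck in a busy state here would point at cluster locking rather than
# the storage path. Run on both web01 and web02 and diff the results.
debugfs.ocfs2 -R "fs_locks" /dev/sda1 > /tmp/fs_locks.txt 2>&1 || true
wc -l /tmp/fs_locks.txt
```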