thr3ads.net - similar to: "One node hangs up issue requiring goog idea, thanks"

Displaying 18 results from an estimated 18 matches similar to: "One node hangs up issue requiring goog idea, thanks"

2008 Jul 24

sparc quota bug

Hi, I am getting the following error: dovecot: Jul 23 18:04:44 Error: child 7600 (imap) killed with signal 10 How to reproduce: Try to delete or move a mail to another folder. The mail actually gets copied to the other folder but the original isn't removed (when using webmail). If using thunderbird this ends up in an infinite loop creating new mails in the destination folder (without

2009 Jan 22

expire plugin error

Hi, I am using the db backend for the expire plugin and found the following error message in the logfile: dovecot: Jan 22 10:10:55 Error: dict: secondary db: unable to allocate space from the buffer cache dovecot: Jan 22 10:10:55 Error: dict: Failed to initialize dictionary 'expire' dovecot: Jan 22 10:10:55 Error: IMAP(xxxxxxxxxx): read(/var/dovecot/dict-server) failed: Remote

2011 Dec 20

ocfs2 - Kernel panic on many write/read from both

Sorry, I didn't copy everything: TEST-MAIL1# echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc debugfs.ocfs2 1.6.4 5239722 26198604 246266859 TEST-MAIL1# echo "ls //orphan_dir:0001"|debugfs.ocfs2 /dev/dm-0|wc debugfs.ocfs2 1.6.4 6074335 30371669 285493670 TEST-MAIL2 ~ # echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc debugfs.ocfs2 1.6.4 5239722 26198604

2009 Feb 26

Possible locking problem in DLM code

Hi, with lockdep enabled, I get the warning below. I've verified it and it seems to be really correct - we acquire dlm->ast_lock and lockres->spinlock in different orders in dlm_queue_bast() and dlm_lockres_release_asm(). I'm not sure whether this can really lead to a deadlock and / or how to fix this... Honza ======================================================= [

2004 Mar 22

Trust relationship failed...

I'm hoping someone can give me a hand in figuring out this problem, I have seen several other similar problems in searching, but nothing that exactly matches what I am seeing here. I have recently migrated a client from a Samba server running 2.2.7 (Redhat 9) to 3.0.2 (Fedora 1). The samba installation is running as a PDC for 5 Win XP workstations and was working perfectly prior to the

2008 Jan 23

OCFS2 DLM problems

Hello everyone, once again. We are running into a problem, which has now shown up 2 times, possibly 3 (once the systems looked different). The environment is 6 HP DL360/380 g5 servers with eth0 being the public interface, eth1 and bond0 (eth2 and eth3) used for clusterware and bond0 also used for OCFS2. The bond0 interface is in active/passive mode. There are no network error counters showing and

2009 Feb 03

[PATCH 1/4] ocfs2/dlm: Retract fix for race between purge and migrate

Mainline commit d4f7e650e55af6b235871126f747da88600e8040 attempts to delay the dlm_thread from sending the drop ref message if the lockres is being migrated. The problem is that we make the dlm_thread wait for the migration to complete. This causes a deadlock as dlm_thread also participates in the lockres migration process. A better fix for the original oss bugzilla#1012 is in testing.

2010 Jan 14

another fencing question

Hi, periodically one node of my two-node cluster is fenced. Here are the logs: Jan 14 07:01:44 nvr1-rc kernel: o2net: no longer connected to node nvr2-rc.minint.it (num 0) at 1.1.1.6:7777 Jan 14 07:01:44 nvr1-rc kernel: (21534,1):dlm_do_master_request:1334 ERROR: link to 0 went down! Jan 14 07:01:44 nvr1-rc kernel: (4007,4):dlm_send_proxy_ast_msg:458 ERROR: status = -112 Jan 14 07:01:44

2009 Jun 09

question about oracle shared home install

Hi All, Scenario: I'm trying to install 9i RAC on a 2-node cluster on OCFS2. OS: Oracle Enterprise Linux. To my understanding, OCFS2 supports shared home installs, which to my knowledge means I can keep not only datafiles and control files on it but also cluster manager files and binaries (pretty much everything: no files or executables need to be kept local to any node). I have one single shared file for

2009 Nov 06

iscsi connection drop, comes back in seconds, then deadlock in cluster

Greetings ocfs2 folks, A client is experiencing some random deadlock issues within a cluster, wondering if anyone can point us in the right direction. The iSCSI connection seemed to have dropped on one node briefly, ultimately several hours later landing us in a complete deadlock scenario where multiple nodes (Node 7 and Node 8) had to be panic'd (by hand - they didn't ever panic on

2010 Dec 09

servers blocked on ocfs2

Hi, we have recently started to use ocfs2 on some RHEL 5.5 servers (ocfs2-1.4.7). Some days ago, two servers sharing an ocfs2 filesystem, and with quite virtual services, stalled, in what seems to be an ocfs2 issue. These are the lines in their messages files: =====node heraclito (0)======================================== /Dec 4 09:15:06 heraclito kernel: o2net: connection to node parmenides

2009 Aug 03

ocfs2 configuration/performance questions...

Hi all, I'm trying to determine the performance implications of various configurations for ocfs2. (I'm new to ocfs2, but have read through all the docs for both 1.2 and 1.4, so please be gentle :) This would be a 1.4 installation. I searched through www.mail-archive.com and didn't see anything appropriate. If there are other links, please let me know. In my case, I want to

2009 May 12

add error check for ocfs2_read_locked_inode() call

After upgrading from 2.6.28.10 to 2.6.29.3 I saw the following new errors in the kernel log: May 12 14:46:41 falcon-cl5 May 12 14:46:41 falcon-cl5 (6757,7):ocfs2_read_locked_inode:466 ERROR: status = -22 Only one node has volumes mounted in the cluster: /dev/sde on /home/apache/users/D1 type ocfs2 (rw,_netdev,noatime,heartbeat=local) /dev/sdd on /home/apache/users/D2 type ocfs2

2006 Apr 18

Self-fencing issues (RHEL4)

Hi. I'm running RHEL4 for my test system, Adaptec Firewire controllers, Maxtor One Touch III shared disk (see the details below), 100Mb/s dedicated interconnect. It panics with no load about every 20 minutes (error message from netconsole attached). Any clues? Yegor --- [root@rac1 ~]# cat /proc/fs/ocfs2/version OCFS2 1.2.0 Tue Mar 7 15:51:20 PST 2006 (build

2007 Mar 08

ocfs2 cluster becomes unresponsive

We are running OCFS2 on SLES9 machines using a FC SAN. Without warning, both nodes will become unresponsive. We cannot access either machine via ssh or terminal (it hangs after typing in the username). However, the machines still respond to pings. This continues until one node is rebooted, at which time the second node resumes normal operations. I am not entirely sure that this is an OCFS2 problem at all

2009 Jun 08

[SUGGESSTION 1/1] OCFS2: automatic dlm hash table size

Background: ocfs2 dlm uses a hash table to store dlm_lock_resource objects, and the frequently used lookup is performed on this hash table. Problem: when an ocfs2 volume holds a huge number of inodes (and thus a huge number of dlm_lock_resource objects), lookup performance becomes a problem. The lookup holds a spin_lock, which could put all other CPUs into the state of acquiring the spinlock. if

2009 Apr 17

OCFS2 1.4: Patches backported from mainline

Please review the list of patches being applied to the ocfs2 1.4 tree. All patches list the mainline commit hash. Thanks Sunil

2010 Jun 03

Tracking down hangs

We're using a storage solution involving two SunFire X4500 servers using DRBD to replicate a 15TB partition across the network with ocfs2 on top. We're sharing the partition from one server over NFS and the other is mounted read-only at present. The DRBD backing store is software RAID 60 on 40 disks. We've been seeing periodic issues whereby our NFS clients (Debian Lenny) are very
