I have seen a problem on a two-node system where some (but not all) of the
ocfs2 volumes on node 2 became unmounted. Correcting the problem only required
issuing the mount command, but I am curious if anyone has an explanation of
what may have happened. One thing I should mention is that the hostname of
node 1 was changed about a week ago. At that time corrections were made to the
cluster.conf and hosts files of both nodes, and both machines were rebooted.

Any ideas?

System Info:
-------------

SuSE SLES10 SP1  2.6.16.46-0.12-smp

/proc/fs/ocfs2/version
OCFS2 1.2.5-SLES-r2997 Tue Mar 27 16:33:19 EDT 2007 (build sles)

rpm -qa | grep ocfs2
ocfs2-tools-1.2.3-0.7
ocfs2console-1.2.3-0.7

/etc/sysconfig/o2cb
#
# This is a configuration file for automatic startup of the O2CB
# driver. It is generated by running /etc/init.d/o2cb configure.
# Please use that method to modify this file
#
# O2CB_ENABELED: 'true' means to load the driver on boot.
O2CB_ENABLED=true

# O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start.
O2CB_BOOTCLUSTER=ocfs2

# O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
#O2CB_HEARTBEAT_THRESHOLD=240    bti changed, 240 was default
O2CB_HEARTBEAT_THRESHOLD=90

# O2CB_HEARTBEAT_MODE: Whether to use the native "kernel" or the "user"
# driven heartbeat (for example, for integration with heartbeat 2.0.x)
O2CB_HEARTBEAT_MODE="kernel"

# O2CB_IDLE_TIMEOUT_MS: The time frame in which all cluster memberships
# for a given ocfs2 filesystem must be configured on all nodes.
O2CB_IDLE_TIMEOUT_MS=120000

# O2CB_RECONNECT_DELAY_MS: How long to wait for the other nodes to recognise us
O2CB_RECONNECT_DELAY_MS=2000

# O2CB_KEEPALIVE_DELAY_MS: How often to remind our peers that we are alive
O2CB_KEEPALIVE_DELAY_MS=5000


Messages from CN2: (this is the node that had the unmounted volumes)

Sep 25 05:45:01 CN2 run-crons[22005]: time.cron returned 1
Sep 25 06:41:58 CN2 syslog-ng[3969]: STATS: dropped 0
Sep 25 06:45:01 CN2 run-crons[23956]: time.cron returned 1
Sep 25 07:41:58 CN2 syslog-ng[3969]: STATS: dropped 0
Sep 25 07:45:01 CN2 run-crons[25893]: time.cron returned 1
Sep 25 08:41:59 CN2 syslog-ng[3969]: STATS: dropped 0
Sep 25 08:45:01 CN2 run-crons[27833]: time.cron returned 1
Sep 25 09:42:00 CN2 syslog-ng[3969]: STATS: dropped 0
Sep 25 09:45:01 CN2 run-crons[29770]: time.cron returned 1
Sep 25 09:58:13 CN2 zmd: NetworkManagerModule (WARN): Failed to connect to NetworkManager
Sep 25 09:58:19 CN2 zmd: Daemon (WARN): Not starting remote web server
Sep 25 10:40:33 CN2 sshd[31688]: Accepted publickey for root from 192.168.100.3 port 48567 ssh2
Sep 25 10:42:00 CN2 syslog-ng[3969]: STATS: dropped 0
Sep 25 10:43:25 CN2 sshd[31798]: Accepted publickey for root from 192.168.100.3 port 48569 ssh2
Sep 25 10:43:54 CN2 kernel: ocfs2_dlm: Nodes in domain ("967C5C3174B341A399FAF031B4D544FE"): 0 1
Sep 25 10:43:54 CN2 kernel: (31849,1):ocfs2_find_slot:261 slot 1 is already allocated to this node!
Sep 25 10:43:54 CN2 kernel: (31849,1):ocfs2_check_volume:1654 File system was not unmounted cleanly, recovering volume.
Sep 25 10:43:54 CN2 kernel: kjournald starting. Commit interval 5 seconds
Sep 25 10:43:54 CN2 kernel: ocfs2: Mounting device (253,5) on (node 1, slot 1)
Sep 25 10:43:58 CN2 kernel: ocfs2_dlm: Nodes in domain ("655FEE13B3604FCE8E780BA2F525EB6A"): 0 1
Sep 25 10:43:58 CN2 kernel: (31860,1):ocfs2_find_slot:261 slot 1 is already allocated to this node!
Sep 25 10:43:58 CN2 kernel: (31860,1):ocfs2_check_volume:1654 File system was not unmounted cleanly, recovering volume.
Sep 25 10:43:59 CN2 kernel: kjournald starting. Commit interval 5 seconds
Sep 25 10:43:59 CN2 kernel: ocfs2: Mounting device (253,9) on (node 1, slot 1)
Sep 25 10:45:01 CN2 run-crons[31894]: time.cron returned 1


Messages from CN1: (this is the node whose hostname was changed, from CN1 to bustech-bu)

Sep 25 05:45:10 bustech-bu su: (to nobody) root on none
Sep 25 06:30:01 bustech-bu run-crons[31122]: time.cron returned 1
Sep 25 06:42:38 bustech-bu syslog-ng[3901]: STATS: dropped 0
Sep 25 07:30:01 bustech-bu run-crons[597]: time.cron returned 1
Sep 25 07:42:38 bustech-bu syslog-ng[3901]: STATS: dropped 0
Sep 25 07:55:00 bustech-bu kernel: (6823,3):ocfs2_broadcast_vote:725 ERROR: status = -92
Sep 25 07:55:00 bustech-bu kernel: (6823,3):ocfs2_do_request_vote:798 ERROR: status = -92
Sep 25 07:55:00 bustech-bu kernel: (6823,3):ocfs2_rename:1099 ERROR: status = -92
The above three messages continue over and over again.
Sep 25 07:55:10 bustech-bu kernel: (6823,0):ocfs2_broadcast_vote:725 ERROR: status = -92
Sep 25 07:55:10 bustech-bu kernel: (6823,0):ocfs2_do_request_vote:798 ERROR: status = -92
Sep 25 07:55:10 bustech-bu kernel: (6823,0):ocfs2_rename:1099 ERROR: status = -92
Sep 25 08:30:01 bustech-bu run-crons[2562]: time.cron returned 1
Sep 25 08:42:38 bustech-bu syslog-ng[3901]: STATS: dropped 7647
Sep 25 09:30:01 bustech-bu run-crons[4528]: time.cron returned 1
Sep 25 09:42:39 bustech-bu syslog-ng[3901]: STATS: dropped 0
Sep 25 10:30:01 bustech-bu run-crons[6600]: time.cron returned 1
Sep 25 10:40:55 bustech-bu sshd[7010]: Accepted publickey for root from 192.168.100.3 port 42585 ssh2
Sep 25 10:42:39 bustech-bu syslog-ng[3901]: STATS: dropped 0
Sep 25 10:43:54 bustech-bu kernel: ocfs2_dlm: Node 1 joins domain 967C5C3174B341A399FAF031B4D544FE
Sep 25 10:43:54 bustech-bu kernel: ocfs2_dlm: Nodes in domain ("967C5C3174B341A399FAF031B4D544FE"): 0 1
Sep 25 10:43:58 bustech-bu kernel: ocfs2_dlm: Node 1 joins domain 655FEE13B3604FCE8E780BA2F525EB6A
Sep 25 10:43:58 bustech-bu kernel: ocfs2_dlm: Nodes in domain ("655FEE13B3604FCE8E780BA2F525EB6A"): 0 1
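For reference, after the rename the /etc/ocfs2/cluster.conf on both nodes
would look something like the sketch below; the o2cb stack generally expects
each node's "name" entry to match that node's hostname, which is why the
hostname change required editing this file. The node numbers (0 for
bustech-bu, 1 for CN2) are taken from the mount/join messages above; the IP
addresses and port are placeholders, not values from this report:

node:
        ip_port = 7777
        ip_address = 192.168.1.1
        number = 0
        name = bustech-bu
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.1.2
        number = 1
        name = CN2
        cluster = ocfs2

cluster:
        node_count = 2
        name = ocfs2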
Automatic umount? The messages do not indicate a umount. How did you detect
that the volumes were umounted? As in, did you only look at "mount -t ocfs2",
or did you also check "cat /proc/mounts"? When ocfs2 umounts a volume, it
prints an umount message in syslog. I don't see that message here.

Charlie Sharkey wrote:
> I have seen a problem on a two node system where some (but not all) of
> the ocfs2 volumes on node 2 became unmounted. Correcting the problem only
> required issuing the mount command, but I am curious if anyone has an
> explanation of what may have happened.
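To compare the two views asked about above: mount(8) with no device argument
reads /etc/mtab, which can drift out of sync with reality, while /proc/mounts
is the kernel's own mount table. A quick check looks like this (both commands
are standard, nothing here is specific to this cluster):

  # what /etc/mtab (and hence plain "mount") believes is mounted
  mount -t ocfs2

  # what the kernel itself has mounted
  grep ocfs2 /proc/mounts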
Hi,

I'm having some dlm issues on a system. It looks like the scenario went
something like this:

19:45:29 --> 19:47:31   16 dlm locks are released using o2dlm_unlock().
                        ocfs2 logs an error into /var/log/messages, but
                        returns ok to the application.
19:49:15                A dlm lock (o2dlm_lock()) is taken on P00000 -- ok.
19:49:37                The lock on P00000 is released -- ok.
19:49:40                A lock on P00000 is attempted again and it fails.
                        The returned error is "Trylock failed".

Here is the data from /var/log/messages:

Jan 31 19:45:29 N1 kernel: (25033,1):dlmfs_unlink:512 ERROR: unlink P50005, error -16 from destroy
Jan 31 19:45:43 N1 kernel: (25038,1):dlmfs_unlink:512 ERROR: unlink P20010, error -16 from destroy
Jan 31 19:45:44 N1 kernel: (25030,1):dlmfs_unlink:512 ERROR: unlink P20002, error -16 from destroy
Jan 31 19:45:59 N1 kernel: (25034,3):dlmfs_unlink:512 ERROR: unlink P60006, error -16 from destroy
Jan 31 19:46:07 N1 kernel: (25043,0):dlmfs_unlink:512 ERROR: unlink P70015, error -16 from destroy
Jan 31 19:46:08 N1 kernel: (25035,0):dlmfs_unlink:512 ERROR: unlink P70007, error -16 from destroy
Jan 31 19:46:10 N1 kernel: (25041,1):dlmfs_unlink:512 ERROR: unlink P50013, error -16 from destroy
Jan 31 19:46:25 N1 kernel: (25040,1):dlmfs_unlink:512 ERROR: unlink P40012, error -16 from destroy
Jan 31 19:46:30 N1 kernel: (25042,1):dlmfs_unlink:512 ERROR: unlink P60014, error -16 from destroy
Jan 31 19:46:30 N1 kernel: (25032,1):dlmfs_unlink:512 ERROR: unlink P40004, error -16 from destroy
Jan 31 19:47:07 N1 kernel: (25028,1):dlmfs_unlink:512 ERROR: unlink P00000, error -16 from destroy
Jan 31 19:47:08 N1 kernel: (25036,1):dlmfs_unlink:512 ERROR: unlink P00008, error -16 from destroy
Jan 31 19:47:09 N1 kernel: (25029,1):dlmfs_unlink:512 ERROR: unlink P10001, error -16 from destroy
Jan 31 19:47:19 N1 kernel: (25037,0):dlmfs_unlink:512 ERROR: unlink P10009, error -16 from destroy
Jan 31 19:47:30 N1 kernel: (25039,1):dlmfs_unlink:512 ERROR: unlink P30011, error -16 from destroy
Jan 31 19:47:31 N1 kernel: (25031,1):dlmfs_unlink:512 ERROR: unlink P30003, error -16 from destroy

Here is the data from the application dlm log file:

01/31/2008 19:42:50 C000: Dlm Lock fd/id 150/P00000, returning: ok
01/31/2008 19:42:50 C001: Dlm Lock fd/id 152/P10001, returning: ok
01/31/2008 19:42:50 C002: Dlm Lock fd/id 154/P20002, returning: ok
01/31/2008 19:42:51 C003: Dlm Lock fd/id 156/P30003, returning: ok
01/31/2008 19:42:51 C004: Dlm Lock fd/id 158/P40004, returning: ok
01/31/2008 19:42:52 C005: Dlm Lock fd/id 160/P50005, returning: ok
01/31/2008 19:42:52 C006: Dlm Lock fd/id 162/P60006, returning: ok
01/31/2008 19:42:52 C007: Dlm Lock fd/id 164/P70007, returning: ok
01/31/2008 19:42:53 C008: Dlm Lock fd/id 166/P00008, returning: ok
01/31/2008 19:42:53 C009: Dlm Lock fd/id 168/P10009, returning: ok
01/31/2008 19:42:53 C00A: Dlm Lock fd/id 170/P20010, returning: ok
01/31/2008 19:42:54 C00B: Dlm Lock fd/id 172/P30011, returning: ok
01/31/2008 19:42:54 C00C: Dlm Lock fd/id 174/P40012, returning: ok
01/31/2008 19:42:54 C00D: Dlm Lock fd/id 178/P50013, returning: ok
01/31/2008 19:42:55 C00E: Dlm Lock fd/id 180/P60014, returning: ok
01/31/2008 19:42:58 C00F: Dlm Lock fd/id 182/P70015, returning: ok
01/31/2008 19:45:29 C005: Dlm UnLock. fd/id 160/P50005, returning ok
01/31/2008 19:45:43 C00A: Dlm UnLock. fd/id 170/P20010, returning ok
01/31/2008 19:45:44 C002: Dlm UnLock. fd/id 154/P20002, returning ok
01/31/2008 19:45:59 C006: Dlm UnLock. fd/id 162/P60006, returning ok
01/31/2008 19:46:07 C00F: Dlm UnLock. fd/id 182/P70015, returning ok
01/31/2008 19:46:08 C007: Dlm UnLock. fd/id 164/P70007, returning ok
01/31/2008 19:46:10 C00D: Dlm UnLock. fd/id 178/P50013, returning ok
01/31/2008 19:46:25 C00C: Dlm UnLock. fd/id 174/P40012, returning ok
01/31/2008 19:46:30 C00E: Dlm UnLock. fd/id 180/P60014, returning ok
01/31/2008 19:46:30 C004: Dlm UnLock. fd/id 158/P40004, returning ok
01/31/2008 19:47:07 C000: Dlm UnLock. fd/id 150/P00000, returning ok
01/31/2008 19:47:08 C008: Dlm UnLock. fd/id 166/P00008, returning ok
01/31/2008 19:47:09 C001: Dlm UnLock. fd/id 152/P10001, returning ok
01/31/2008 19:47:19 C009: Dlm UnLock. fd/id 168/P10009, returning ok
01/31/2008 19:47:30 C00B: Dlm UnLock. fd/id 172/P30011, returning ok
01/31/2008 19:47:31 C003: Dlm UnLock. fd/id 156/P30003, returning ok
01/31/2008 19:49:15 C000: Dlm Lock fd/id 150/P00000, returning: ok
01/31/2008 19:49:37 C000: Dlm UnLock. fd/id 150/P00000, returning ok
01/31/2008 19:49:40 C000: Dlm Lock fd/id 150/P00000, returning: Trylock failed
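For context on how these application-level lock calls map onto the dlmfs
operations in the kernel log above: o2dlm userspace locking goes through
ocfs2's dlmfs, where taking a lock is an open(2) of a file under the dlmfs
mount, closing the file drops the lock, and unlink(2) destroys the lock
resource (error -16 from destroy is EBUSY). The following is only a minimal
sketch of that pattern, not the application's actual code; the /dlm mount
point and the "appdomain" directory name are assumptions:

/* Minimal sketch of ocfs2 dlmfs userspace locking (not the application's code).
 * Assumes ocfs2_dlmfs is mounted at /dlm and that a domain directory,
 * e.g. /dlm/appdomain, has already been created with mkdir(2). */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        const char *lockfile = "/dlm/appdomain/P00000"; /* domain is a placeholder */

        /* Take an exclusive lock: opening the file for write acquires it.
         * O_NONBLOCK makes this a trylock; if the lock cannot be granted
         * immediately, open() fails (the o2dlm library reports that as
         * "Trylock failed", as in the application log above). */
        int fd = open(lockfile, O_CREAT | O_RDWR | O_NONBLOCK, 0644);
        if (fd < 0) {
                fprintf(stderr, "trylock %s: %s\n", lockfile, strerror(errno));
                return 1;
        }

        /* ... the lock is held for as long as the file stays open ... */

        /* Dropping the lock is just a close(2). */
        close(fd);

        /* Destroying the lock resource is an unlink(2). If the DLM still
         * holds references on the resource, the unlink fails with EBUSY,
         * which is the "error -16 from destroy" in /var/log/messages. */
        if (unlink(lockfile) < 0)
                fprintf(stderr, "destroy %s: %s\n", lockfile, strerror(errno));

        return 0;
}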
I also had a problem with this system the day before this. Here is the data
from that:

     SYSTEM MAP: /boot/System.map-2.6.16.46-0.14.PTF.284042.0-smp
   DEBUG KERNEL: ../vmlinux.debug (2.6.16.46-0.14.PTF.284042.0-smp)
       DUMPFILE: vmcore
           CPUS: 4
           DATE: Wed Jan 30 17:44:49 2008
         UPTIME: 9 days, 00:55:51
   LOAD AVERAGE: 1.10, 1.06, 1.01
          TASKS: 341
       NODENAME: N1
        RELEASE: 2.6.16.46-0.14.PTF.284042.0-smp
        VERSION: #1 SMP Thu May 17 14:00:09 UTC 2007
        MACHINE: i686 (2327 Mhz)
         MEMORY: 2 GB
          PANIC: "kernel BUG at fs/ocfs2/dlm/dlmmaster.c:2780!"
            PID: 31585
        COMMAND: "masx"
           TASK: dab912d0 [THREAD_INFO: d5840000]
            CPU: 0
          STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 31585  TASK: dab912d0  CPU: 0  COMMAND: "masx"
 #0 [d5841d78] crash_kexec at c013bb1a
 #1 [d5841dbc] die at c01055fe
 #2 [d5841dec] do_invalid_op at c0105ce2
 #3 [d5841e9c] error_code (via invalid_op) at c0104e4d
    EAX: 00000051  EBX: ca668280  ECX: 00000000  EDX: 00000296  EBP: da0c1c00
    DS:  007b      ESI: da0c1c00  ES:  007b      EDI: ca668280
    CS:  0060      EIP: fb835e95  ERR: ffffffff  EFLAGS: 00010296
 #4 [d5841ed0] dlm_empty_lockres at fb835e95
 #5 [d5841ee0] dlm_unregister_domain at fb827305
 #6 [d5841f18] dlmfs_clear_inode at fb6c2eae
 #7 [d5841f24] clear_inode at c0175dfe
 #8 [d5841f30] generic_delete_inode at c0175eee
 #9 [d5841f3c] iput at c0175838
#10 [d5841f48] dput at c01744e0
#11 [d5841f54] do_rmdir at c016e63d
#12 [d5841fb8] sysenter_entry at c0103bd4
    EAX: 00000028  EBX: 08299988  ECX: 00000000  EDX: 08273be4
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 082ebf28
    SS:  007b      ESP: bf999f7c  EBP: bf999fa8
    CS:  0073      EIP: ffffe410  ERR: 00000028  EFLAGS: 00000246

This is a SuSE SLES10 SP1 system, with a SuSE nfs patch.
Ocfs2 tools version 1.2.3-0.7
Ocfs2 version 1.2.5-SLES-r2997

I was hoping you would have some ideas on this.

Also, another question. I have been trying to run one of the debugging
scripts, for example, scanlocks. I keep getting the message 'Module debugfs
not loaded'. I don't see any debugfs.ko on the system. Isn't it a part of
the ocfs2 tools?

Thank you,
Charlie
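For reference on that last question: the debugfs in that message is the
kernel's debug filesystem (CONFIG_DEBUG_FS), which is built into the kernel
rather than shipped as a debugfs.ko with ocfs2-tools, so there is no module
file to find. If it is enabled in the kernel, it only needs to be mounted
before the debugging scripts can use it, typically with:

  mount -t debugfs debugfs /sys/kernel/debug

(or an equivalent entry in /etc/fstab).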