First of all, make o2cb dependent on iSCSI, so that it starts AFTER it and
STOPS before it. I also recommend making sshd start BEFORE both - that gives
you emergency access to the system if you did anything wrong.
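On SLES that ordering is driven by the LSB headers in the init scripts and by
insserv. A minimal sketch of what I mean is below; I'm assuming the iSCSI
initiator service is called "open-iscsi" on your box, and the header in your
/etc/init.d/o2cb may look different - treat it as an illustration, not the
shipped script:

    ### BEGIN INIT INFO
    # Provides:       o2cb
    # Required-Start: $network sshd open-iscsi
    # Required-Stop:  $network sshd open-iscsi
    ### END INIT INFO

    # after editing the header, let insserv recompute the S*/K* links:
    insserv o2cb

Listing open-iscsi (and sshd) in Required-Start makes o2cb start after them;
listing them in Required-Stop keeps them running until o2cb has stopped.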
Second, iSCSI is very reluctant to shut down.
I would manually remove the iscsi shutdown from the K* files altogether, so
that it never stops (see the sketch below). You are lucky that
your system did not freeze (when I experimented with LVM2 on iSCSI, I ran into
many such scenarios).
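Something along these lines; again the service name "open-iscsi" is an
assumption, so check what your rc?.d directories actually contain first:

    # list the shutdown links for the iSCSI initiator
    ls -l /etc/init.d/rc?.d/K*iscsi*

    # remove them so iSCSI is never stopped on a runlevel change
    rm /etc/init.d/rc?.d/K*iscsi*

Keep in mind that a package update or a later insserv run may recreate those
links, so you may have to repeat this.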
In all other respects, this combination works fine for me (except that I was
not able to make OCFSv2 work stably as document storage on i386 servers).
----- Original Message -----
From: "Steve Feehan" <sfeehan at gmail.com>
To: <ocfs2-users at oss.oracle.com>
Sent: Friday, July 14, 2006 6:40 AM
Subject: [Ocfs2-users] kernel panics on sles 10 rc3
> I've just set up ocfs2 on a shared iSCSI disk (from a NetApp) on SLES
> 10 RC3. Both clients are Xen guests. Perhaps I should direct this
> question to a SUSE list, but I hoped that someone here might be able
> to offer guidance.
>
> The configuration was simple and I had a working setup very quickly.
> Unfortunately each time I reboot one of the nodes it panics during
> shutdown. For example, I've included the shutdown output at the end of
> this mail.
>
> I can often (not always) trigger the panic by doing:
>
> slesvm1:~ # /etc/init.d/o2cb status
> Module "configfs": Loaded
> Filesystem "configfs": Mounted
> Module "ocfs2_nodemanager": Loaded
> Module "ocfs2_dlm": Loaded
> Module "ocfs2_dlmfs": Loaded
> Filesystem "ocfs2_dlmfs": Mounted
> Checking cluster ocfs2: Online
> Checking heartbeat: Active
> slesvm1:~ #
> slesvm1:~ # mount | grep ocfs
> ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)
> /dev/sda1 on /oracle type ocfs2 (rw,_netdev,heartbeat=local)
> slesvm1:~ #
> slesvm1:~ # /etc/init.d/ocfs2 stop
> Stopping Oracle Cluster File System (OCFS2) done
> slesvm1:~ #
> slesvm1:~ # mount | grep ocfs
> ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)
> (reverse-i-search)`stop': /etc/init.d/ocfs2 stop
> (reverse-i-search)`':
> slesvm1:~ #
> slesvm1:~ # /etc/init.d/o2cb stop
> Cleaning heartbeat on ocfs2: OK
> Stopping cluster ocfs2: OK
> Unloading module "ocfs2": OK
> Unmounting ocfs2_dlmfs filesystem: OK
> Unloading module "ocfs2_dlmfs": OK
> Unmounting configfs filesystem: OK
> Unloading module "configfs": OK
> slesvm1:~ #
> slesvm1:~ #
> slesvm1:~ # Oops: 0000 [#1]
> SMP
> last sysfs file: /block/sda/removable
> Modules linked in: sg sd_mod ipv6 iscsi_tcp libiscsi
> scsi_transport_iscsi scsi_mod apparmor aamatch_pcre loop dm_mod
> reiserfs xenblk xennet
> CPU: 0
> EIP: 0061:[<c0127491>] Not tainted VLI
> EFLAGS: 00210083 (2.6.16.20-0.12-xen #1)
> EIP is at cascade+0x11/0x40
> eax: c1213c80 ebx: d1241d6c ecx: 0000000a edx: c121448c
> esi: c12144dc edi: c1213c80 ebp: 0000000a esp: c0383ec8
> ds: 007b es: 007b ss: 0069
> Process swapper (pid: 0, threadinfo=c0382000 task=c03265c0)
> Stack: <0>00000000 c1214478 c1213c80 c0383ef8 c0128510 00000000 00000000 00000000
>        44b79c9c 00002b39 00000000 00000000 c0383ef8 c0383ef8 00000001 c036e108
>        c0382000 c03ab180 c01234f5 c03ade40 0000000a 00000000 c0382000 00000001
> Call Trace:
> [<c0128510>] run_timer_softirq+0xb0/0x1c0
> [<c01234f5>] __do_softirq+0x85/0x110
> [<c0123605>] do_softirq+0x85/0x90
> [<c010687c>] do_IRQ+0x3c/0x70
> [<c024d111>] evtchn_do_upcall+0x91/0xb0
> [<c01050e8>] hypervisor_callback+0x2c/0x34
> [<c0102f5d>] xen_idle+0x4d/0xb0
> [<c01030e6>] cpu_idle+0x66/0xe0
> [<c038476f>] start_kernel+0x2ef/0x3a0
> [<c0384210>] unknown_bootoption+0x0/0x270
> Code: 71 14 e8 f3 fd ff ff 8b 0b 39 cb 75 dd 5b 5e c3 8d 76 00 8d bc 27 00 00
> 00 00 55 89 cd 57 89 c7 56 8d 34 ca 53 8b 1e 39 de 74 14 <39> 7b 14 89 da 75
> 19 8b 1b 89 f8 e8 bf fd ff ff 39 de 75 ec 89
> <0>Kernel panic - not syncing: Fatal exception in interrupt
>
> Does anyone have an idea what the problem might be? Any additional
> information I can provide that might help to track it down?
>
> Thanks in advance for any input.
>
> Steve
>
>
>
> Example shutdown output:
> ------------------------------------------------------------------------------------------
> INIT: Switching to runlevel: 6
> INIT: Sending processes the TERM signal
> Boot logging started on /dev/tty1(/dev/console) at Thu Jul 13 08:28:20 2006
> Master Resource Control: previous runlevel: 5, switching to runlevel: 6
> Shutting down CRON daemon done
> Shutting down auditd done
> Shutting down irqbalance done
> Shutting down cupsd done
> Unloading AppArmor profiles done
> Shutting down ZENworks Management Daemon done
> Shutting down Name Service Cache Daemon done
> Shutting down mail service (Postfix) done
> Saving random seed done
> Umount SMB/ CIFS File Systems done
> Shutting down slpd done
> Shutting down service gdm done
> Shutting down powersaved done
> Stopping Oracle Cluster File System (OCFS2) done
> Cleaning heartbeat on ocfs2: OK
> Stopping cluster ocfs2: OK
> Unloading module "ocfs2": OK
> Unmounting ocfs2_dlmfs filesystem: OK
> Unloading module "ocfs2_dlmfs": OK
> Unmounting configfs filesystem: OK
> Unloading module "configfs": OK
> Shutting down SSH daemon done
> Remove Net File System (NFS) unused
> Shutting down RPC portmap daemon done
> Logging out from iqn.1992-08.com.netapp:sn.84166997: done
> Stopping iSCSI initiator service: done
> Shutting down syslog services done
> Shutting down network interfaces:
> eth0
> eth0 configuration: eth-id-00:16:3e:dc:9b:b8 done
> Shutting down service network . . . . . . . . . . . . . done.
> Shutting down HAL daemon done
> Shutting down D-BUS daemon done
> Shutting down resource manager done
> Running /etc/init.d/halt.local done
> Sending all processes the TERM signal... done
> Sending all processes the KILL signal... done
> Turning off swap done
> Unloading AppArmor profiles done
> done
> Unmounting file systems
> securityfs umounted
> devpts umounted
> debugfs umounted
> sysfs umounted
> /dev/hda2 umounted done
> done
> Shutting down MD Raid done
> Stopping udevd: done
> proc umounted
> Unable to handle kernel paging request at virtual address d13f1d6c
> printing eip:
> c01272c1
> *pde = ma 06093067 pa 009cc067
> *pte = ma 00000000 pa fffff000
> Oops: 0002 [#1]
> SMP
> last sysfs file: /class/net/eth0/address
> Modules linked in: joydev st sr_mod ide_cd cdrom ide_core xfs_quota
> xfs exportfs sg sd_mod xt_pkttype ipt_LOG xt_limit scsi_mod
> ip6t_REJECT xt_tcpudp ipt_REJECT xt_state iptable_mangle iptable_nat
> ip_nat iptable_filter ip6table_mangle ip_conntrack nfnetlink ip_tables
> ip6table_filter ip6_tables x_tables ipv6 apparmor aamatch_pcre loop
> dm_mod reiserfs xenblk xennet
> CPU: 0
> EIP: 0061:[<c01272c1>] Not tainted VLI
> EFLAGS: 00010006 (2.6.16.20-0.12-xen #1)
> EIP is at internal_add_timer+0x61/0xa0
> eax: d13f1d6c ebx: c1213c80 ecx: c12144d4 edx: ce09527c
> esi: 036c82ec edi: 036c894b ebp: 00000000 esp: c0383e70
> ds: 007b es: 007b ss: 0069
> Process swapper (pid: 0, threadinfo=c0382000 task=c03265c0)
> Stack: <0>ce09527c c1213c80 c012777d 00000000 ce095080 00000008 c1213c80 ce095080
>        c02734ac 00000001 c02b2879 00000001 c1214de0 c013698d c1214e00 c0382000
>        00000000 000cd1f9 00000000 c1214de4 35147d9a 0000d143 ce095080 00000100
> Call Trace:
> [<c012777d>] __mod_timer+0x8d/0xc0
> [<c02734ac>] sk_reset_timer+0xc/0x20
> [<c02b2879>] tcp_write_timer+0x119/0x650
> [<c013698d>] hrtimer_run_queues+0x4d/0x180
> [<c01285c9>] run_timer_softirq+0x169/0x1c0
> [<c02b2760>] tcp_write_timer+0x0/0x650
> [<c01234f5>] __do_softirq+0x85/0x110
> [<c0123605>] do_softirq+0x85/0x90
> [<c010687c>] do_IRQ+0x3c/0x70
> [<c024d111>] evtchn_do_upcall+0x91/0xb0
> [<c01050e8>] hypervisor_callback+0x2c/0x34
> [<c0102f5d>] xen_idle+0x4d/0xb0
> [<c01030e6>] cpu_idle+0x66/0xe0
> [<c038476f>] start_kernel+0x2ef/0x3a0
> [<c0384210>] unknown_bootoption+0x0/0x270
> Code: c1 e8 11 25 f8 01 00 00 8d 8c 18 0c 0c 00 00 eb 12 85 c9 79 48 89 f0 8d
> 76 00 25 ff 00 00 00 8d 4c c3 0c 8b 41 04 89 0a 89 51 04 <89> 10 8b 1c 24 8b
> 74 24 04 89 42 04 83 c4 08 c3 c1 e8 05 25 f8
> <0>Kernel panic - not syncing: Fatal exception in interrupt
>
> --
> Steve Feehan
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>