Nathanael Burton
2006-Jun-13 23:48 UTC
[zfs-discuss] ZFS panic while mounting lofi device?
I believe ZFS is causing a panic whenever I attempt to mount an iso image (SXCR build 39) that happens to reside on a ZFS file system. The problem is 100% reproducible. I'm quite new to OpenSolaris, so I may be incorrect in saying it's ZFS's fault. Also, let me know if you need any additional information or debug output to help diagnose things.

Config:

  bash-3.00# uname -a
  SunOS mathrock-opensolaris 5.11 opensol-20060605 i86pc i386 i86pc

Scenario:

  bash-3.00# mount -F hsfs -o ro `lofiadm -a /data/OS/Solaris/sol-nv-b39-x86-dvd.iso` /tmp/test

After typing that, the system hangs, the network drops, and then the machine panics and reboots. "/data" is a ZFS file system built on a raidz pool of 3 disks.

  bash-3.00# zpool status sata
    pool: sata
   state: ONLINE
   scrub: none requested
  config:

          NAME        STATE     READ WRITE CKSUM
          sata        ONLINE       0     0     0
            raidz1    ONLINE       0     0     0
              c2t0d0  ONLINE       0     0     0
              c2t1d0  ONLINE       0     0     0
              c2t2d0  ONLINE       0     0     0

  errors: No known data errors

  bash-3.00# zfs list sata/data
  NAME        USED  AVAIL  REFER  MOUNTPOINT
  sata/data  16.9G   533G  16.9G  /data

Error:

  Jun 13 19:33:01 mathrock-opensolaris pseudo: [ID 129642 kern.info] pseudo-device: lofi0
  Jun 13 19:33:01 mathrock-opensolaris genunix: [ID 936769 kern.info] lofi0 is /pseudo/lofi@0
  Jun 13 19:33:04 mathrock-opensolaris unix: [ID 836849 kern.notice]
  Jun 13 19:33:04 mathrock-opensolaris panic[cpu1]/thread=d1fafde0:
  Jun 13 19:33:04 mathrock-opensolaris genunix: [ID 920532 kern.notice] page_unlock: page c51b29e0 is not locked
  Jun 13 19:33:04 mathrock-opensolaris unix: [ID 100000 kern.notice]
  Jun 13 19:33:04 mathrock-opensolaris genunix: [ID 353471 kern.notice] d1fafb54 unix:page_unlock+160 (c51b29e0)
  Jun 13 19:33:04 mathrock-opensolaris genunix: [ID 353471 kern.notice] d1fafbb0 zfs:zfs_getpage+27a (d1e897c0, 3000, 0, )
  Jun 13 19:33:04 mathrock-opensolaris genunix: [ID 353471 kern.notice] d1fafc0c genunix:fop_getpage+36 (d1e897c0, 8000, 0, )
  Jun 13 19:33:04 mathrock-opensolaris genunix: [ID 353471 kern.notice] d1fafca0 genunix:segmap_fault+202 (ce043f58, fec23310,)
  Jun 13 19:33:04 mathrock-opensolaris genunix: [ID 353471 kern.notice] d1fafd08 genunix:segmap_getmapflt+6fc (fec23310, d1e897c0,)
  Jun 13 19:33:04 mathrock-opensolaris genunix: [ID 353471 kern.notice] d1fafd78 lofi:lofi_strategy_task+2c8 (d2b6bee0, 0, 0, 0, )
  Jun 13 19:33:04 mathrock-opensolaris genunix: [ID 353471 kern.notice] d1fafdc8 genunix:taskq_thread+194 (c5e87f30, 0)
  Jun 13 19:33:04 mathrock-opensolaris genunix: [ID 353471 kern.notice] d1fafdd8 unix:thread_start+8 ()
  Jun 13 19:33:04 mathrock-opensolaris unix: [ID 100000 kern.notice]
  Jun 13 19:33:04 mathrock-opensolaris genunix: [ID 672855 kern.notice] syncing file systems...
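A panic stack like the one above can normally be pulled back out of the crash dump that savecore writes after the reboot. A minimal sketch, assuming the default /var/crash/<hostname> location and dump number 0 (paths here are illustrative, not from the post):

  # Assumed paths: savecore's default directory and the first saved dump (unix.0/vmcore.0).
  # ::status prints the panic string ("page_unlock: page ... is not locked");
  # ::stack prints the panicking thread's stack, matching the kern.notice lines above.
  cd /var/crash/mathrock-opensolaris
  printf '::status\n::stack\n' | mdb -k unix.0 vmcore.0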
Nathanael,

This looks like a bug: we trigger this panic while trying to clean up after an error in zfs_getpage(). Can you make a core file available? I'd like to take a closer look.

I've filed a bug to track this:

  6438702 error handling in zfs_getpage() can trigger "page not locked" panic

-Mark

Nathanael Burton wrote:
> I believe ZFS is causing a panic whenever I attempt to mount an iso image (SXCR build 39) that happens to reside on a ZFS file system. The problem is 100% reproducible. [...]
Nathanael Burton
2006-Jun-14 22:58 UTC
[zfs-discuss] Re: ZFS panic while mounting lofi device?
Do you want the vmcore file from /var/crash or something else? Where can I upload it to, supportfiles.sun.com? The bzip'd vmcore file is ~35MB.

Thanks,

Nate
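Both the unix.N and vmcore.N files are needed to analyze a dump, so a hypothetical packaging sequence (default savecore location assumed, output file name illustrative) might look like:

  # Check where savecore is configured to write dumps, then bundle both files for upload.
  dumpadm
  cd /var/crash/mathrock-opensolaris
  tar cf - unix.0 vmcore.0 | bzip2 -9 > crash-mathrock.0.tar.bz2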
Nathanael Burton
2006-Jun-16 02:18 UTC
[zfs-discuss] Re: ZFS panic while mounting lofi device?
Mark,

I might know a little bit more about what's causing this particular panic. I'm currently running OpenSolaris as a guest OS under VMware Server RC1 on a CentOS 4.3 host OS. I have three 300GB (~280GB usable) SATA disks in the server that are all partitioned under CentOS like so:

  [root@mathrock-centos sdb]# fdisk -l /dev/sda

  Disk /dev/sda: 300.0 GB, 300069052416 bytes
  255 heads, 63 sectors/track, 36481 cylinders
  Units = cylinders of 16065 * 512 = 8225280 bytes

     Device Boot      Start         End      Blocks   Id  System
  /dev/sda1   *           1        3187    25599546   fd  Linux raid autodetect
  /dev/sda2            3188       36481   267434055   bf  Solaris

So I use the first ~25GB of each disk in a Linux software RAID 5; the rest of the disk, ~240GB usable, is given to OpenSolaris (via VMware) as a raw physical disk partition. OpenSolaris still thinks the disks it has been given are the full size (~280GB) -- PROBLEM 1.

Next I create a simple ZFS pool using one of the SATA disks like so:

  bash-3.00# zpool create sata c2t0d0

Then I copy an iso file from my OpenBSD file server via ftp... As soon as data starts writing into the ZFS file system I notice zpool CKSUM errors -- PROBLEM 2. The first time I saw this problem occur I never checked the output of zpool status, and I believe I must have had a bunch of CKSUM errors then too. Current info:

  bash-3.00# pwd
  /data
  bash-3.00# ls -al
  total 1423398
  drwxr-xr-x   2 root     sys             3 Jun 15 20:55 .
  drwxr-xr-x  43 root     root         1024 Jun 15 20:57 ..
  -rw-r--r--   1 root     root    728190976 Sep 23  2005 KNOPPIX_V4.0.2CD-2005-09-23-EN.iso

  bash-3.00# zpool status
    pool: sata
   state: ONLINE
  status: One or more devices has experienced an unrecoverable error.  An
          attempt was made to correct the error.  Applications are unaffected.
  action: Determine if the device needs to be replaced, and clear the errors
          using 'zpool clear' or replace the device with 'zpool replace'.
     see: http://www.sun.com/msg/ZFS-8000-9P
   scrub: none requested
  config:

          NAME        STATE     READ WRITE CKSUM
          sata        ONLINE       0     0    20
            c2t0d0    ONLINE       0     0    20

  errors: No known data errors

  bash-3.00# zfs list
  NAME        USED  AVAIL  REFER  MOUNTPOINT
  sata        695M   273G  24.5K  /sata
  sata/data   695M   273G   695M  /data
  sata/mp3s  24.5K   273G  24.5K  /mp3s

Now, I attempt to mount the iso file via lofiadm and the panic occurs:

  bash-3.00# mount -F hsfs `lofiadm -a /data/KNOPPIX_V4.0.2CD-2005-09-23-EN.iso` /tmp/test

I have also tested the above scenario, but instead of giving OpenSolaris raw physical access to the SATA disk I created a VMware vmdk disk image file on the SATA disk and gave that to OpenSolaris. In that case I can successfully create a ZFS file system, copy the same iso to it, and mount it via lofiadm.

So I have a new panic/crash dump -- it's absolutely huge, ~400MB after tar and bzip. If you still want it I can upload it to sunsolve as you requested. Or if there is a way to make it smaller, let me know.

Thanks,

Nate
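A sketch of checks that could confirm the two problems above, assuming the device and pool names from the post (these commands are illustrative additions, not from the thread):

  # PROBLEM 1: compare the capacity the Solaris disk driver reports with what ZFS sees.
  iostat -En c2t0d0
  zpool list sata

  # PROBLEM 2: have ZFS re-read and verify every allocated block, then watch the
  # CKSUM counters; a count that keeps climbing points at corruption beneath the pool.
  zpool scrub sata
  zpool status -v sata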
Nate,

Thanks for investigating this. It sounds like ZFS is either conflicting with the Linux partition or running off the end of its partition in the VMware configuration you set up; that would explain the CKSUM errors you are observing, and it could well lead to errors when we try to page-fault in the iso image blocks at mount time.

There is still a bug here, I think, in the way ZFS handles these pagefault errors: we should not be panicking. Given your analysis, I don't think I need your crash dump.

Thanks for using (and finding a bug in) ZFS!

-Mark

Nathanael Burton wrote:
> I might know a little bit more about what's causing this particular panic. I'm currently running OpenSolaris as a guest OS under VMware Server RC1 on a CentOS 4.3 host OS. [...]
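One possible way to keep ZFS inside the Solaris fdisk partition in a setup like this -- a hedged sketch, not something suggested in the thread, with device and slice names purely illustrative -- is to build the pool on an explicitly sized slice instead of handing ZFS the whole disk:

  # format(1M) is interactive: define a slice (here s0) that fits within the ~240GB
  # Solaris fdisk partition, write the label, then create the pool on that slice.
  format c2t0d0
  zpool create sata c2t0d0s0

When given a whole disk, ZFS writes its own label sized to whatever capacity the device reports; giving it a slice confines it to the space the label actually describes.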