thr3ads.net - zfs discuss - [zfs-discuss] zfs/lofi/share panic [May 2010]

Frank Middleton

2010-May-24 20:52 UTC

[zfs-discuss] zfs/lofi/share panic

Many many moons ago, I submitted a CR into bugs about a
highly reproducible panic that occurs if you try to re-share
a lofi-mounted image. That CR has AFAIK long since
disappeared - I even forget what it was called.

This server is used for doing network installs. Let's say
you have a 64 bit iso lofi-mounted and shared. You do the
install, and then wish to switch to a 32 bit iso. You unshare,
umount, delete the loopback, and then lofiadm the new iso,
mount it and then share it. Panic, every time.
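
To make the steps concrete, the swap sequence described above can be sketched as a small shell helper. This is only an illustration: the function names `run` and `swap_iso`, the `SWAP_ISO_RUN` variable, and the example paths are assumptions, not anything from the original report. By default it only prints the commands, since this is the sequence reported to panic the server.

```shell
#!/bin/sh
# Sketch of the ISO swap sequence (assumed helper names and paths).
# Dry run by default: commands are printed, not executed.
# Set SWAP_ISO_RUN=1 to execute them for real (on a test box only).

run() {
    if [ "${SWAP_ISO_RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

swap_iso() {
    new_iso=$1    # absolute path to the new image (hypothetical)
    lofi_dev=$2   # loopback device, e.g. /dev/lofi/1
    mnt=$3        # mountpoint, e.g. /mnt

    run unshare "$mnt"                          # stop exporting the old image
    run umount "$mnt"                           # unmount the old image
    run lofiadm -d "$lofi_dev"                  # delete the old loopback device
    run lofiadm -a "$new_iso" "$lofi_dev"       # attach the new image
    run mount -F hsfs -o ro "$lofi_dev" "$mnt"  # mount it read-only
    run share -F nfs -o ro "$mnt"               # export it again
}
```

Calling `swap_iso /export/iso/new.iso /dev/lofi/1 /mnt` prints the six commands in order, which makes it easy to review the sequence before running anything.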

Is this such a rare use-case that no one is interested? I have
the backtrace and cores if anyone wants them, although
such were submitted with the original CR. This is pretty
frustrating since you start to run out of ideas for mountpoint
names after a while unless you forget and get the panic.

FWIW (even on a freshly booted system after a panic)
# lofiadm zyzzy.iso /dev/lofi/1
# mount -F hsfs /dev/lofi/1 /mnt
mount: /dev/lofi/1 is already mounted or /mnt is busy
# mount -O -F hsfs /dev/lofi/1 /mnt
# share /mnt
#

If you unshare /mnt and then do this again, it will panic.
This has been a bug since before Open Solaris came out.

It doesn't happen if the iso is originally on UFS, but
UFS really isn't an option any more.  FWIW the dataset
containing the isos has the sharenfs attribute set,
although it doesn't have to be actually mounted by
any remote NFS client for this panic to occur.

Suggestions for a workaround most welcome!

Thanks

Jan Kryl

2010-May-27 18:45 UTC

[zfs-discuss] zfs/lofi/share panic

Hi Frank,

On 24/05/10 16:52 -0400, Frank Middleton wrote:
>  Many many moons ago, I submitted a CR into bugs about a
>  highly reproducible panic that occurs if you try to re-share
>  a  lofi mounted image. That CR has AFAIK long since
>  disappeared - I even forget what it was called.
> 
>  This server is used for doing network installs. Let's say
>  you have a 64 bit iso lofi-mounted and shared. You do the
>  install, and then wish to switch to a 32 bit iso. You unshare,
>  umount, delete the loopback, and then lofiadm the new iso,
>  mount it and then share it. Panic, every time.
> 
>  Is this such a rare use-case that no one is interested? I have
>  the backtrace and cores if anyone wants them, although
>  such were submitted with the original CR. This is pretty
>  frustrating since you start to run out of ideas for mountpoint
>  names after a while unless you forget and get the panic.
> 
>  FWIW (even on a freshly booted system after a panic)
>  # lofiadm zyzzy.iso /dev/lofi/1
>  # mount -F hsfs /dev/lofi/1 /mnt
>  mount: /dev/lofi/1 is already mounted or /mnt is busy
>  # mount -O -F hsfs /dev/lofi/1 /mnt
>  # share /mnt
>  #
> 
>  If you unshare /mnt and then do this again, it will panic.
>  This has been a bug since before Open Solaris came out.
> 
>  It doesn't happen if the iso is originally on UFS, but
>  UFS really isn't an option any more.  FWIW the dataset
>  containing the isos has the sharenfs attribute set,
>  although it doesn't have to be actually mounted by
>  any remote NFS for this panic to occur.
> 
>  Suggestions for a workaround most welcome!

The bug (6798273) has been closed as incomplete with the following
note:

"I cannot reproduce any issue with the given testcase on b137."

So you should test this with b137 or a newer build. There have
been some extensive changes going into the treeclimb_* functions,
so the bug is probably fixed or will be in the near future.

Let us know if you can still reproduce the panic on a
recent build.

thanks
-jan

Kyle McDonald

2010-May-27 19:09 UTC

[zfs-discuss] zfs/lofi/share panic

On 5/27/2010 2:45 PM, Jan Kryl wrote:
> Hi Frank,
>
> On 24/05/10 16:52 -0400, Frank Middleton wrote:
>   
>>  Many many moons ago, I submitted a CR into bugs about a
>>  highly reproducible panic that occurs if you try to re-share
>>  a  lofi mounted image. That CR has AFAIK long since
>>  disappeared - I even forget what it was called.
>>
>>  This server is used for doing network installs. Let's say
>>  you have a 64 bit iso lofi-mounted and shared. You do the
>>  install, and then wish to switch to a 32 bit iso. You unshare,
>>  umount, delete the loopback, and then lofiadm the new iso,
>>  mount it and then share it. Panic, every time.
>>
>>  Is this such a rare use-case that no one is interested? I have
>>  the backtrace and cores if anyone wants them, although
>>  such were submitted with the original CR. This is pretty
>>  frustrating since you start to run out of ideas for mountpoint
>>  names after a while unless you forget and get the panic.
>>
>>  FWIW (even on a freshly booted system after a panic)
>>  # lofiadm zyzzy.iso /dev/lofi/1
>>  # mount -F hsfs /dev/lofi/1 /mnt
>>  mount: /dev/lofi/1 is already mounted or /mnt is busy
>>  # mount -O -F hsfs /dev/lofi/1 /mnt
>>  # share /mnt
>>  #
>>
>>  If you unshare /mnt and then do this again, it will panic.
>>  This has been a bug since before Open Solaris came out.
>>
>>  It doesn't happen if the iso is originally on UFS, but
>>  UFS really isn't an option any more.  FWIW the dataset
>>  containing the isos has the sharenfs attribute set,
>>  although it doesn't have to be actually mounted by
>>  any remote NFS for this panic to occur.
>>
>>  Suggestions for a workaround most welcome!
>>
>>     
> the bug (6798273) has been closed as incomplete with following
> note:
>
> "I cannot reproduce any issue with the given testcase on b137."
>
> So you should test this with b137 or newer build. There have
> been some extensive changes going to treeclimb_* functions,
> so the bug is probably fixed or will be in near future.
>
> Let us know if you can still reproduce the panic on
> recent build.
>

I don't know if the code path is similar enough, but you should also try
it like this:

# mount -F hsfs zyzzy.iso /mnt

For many builds now, (Open)Solaris hasn't needed the 'lofiadm' step for
ISOs (and possibly other filesystems whose type can be guessed).

I now put ISOs (for installs, just like you) directly in my
/etc/vfstab.
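
As a rough illustration of what such an entry might look like (the image path is hypothetical, and whether a plain file is accepted in the device field depends on the build, as noted above), a vfstab line uses the standard seven fields:

```
#device to mount        device to fsck  mount point  FS type  fsck pass  mount at boot  mount options
/export/iso/zyzzy.iso   -               /mnt         hsfs     -          yes            ro
```

This is a sketch under those assumptions, not a tested configuration; check vfstab(4) on your build before relying on it.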

  -Kyle

Carson Gaspar

2010-May-27 19:21 UTC

[zfs-discuss] zfs/lofi/share panic

Jan Kryl wrote:
> the bug (6798273) has been closed as incomplete with following
> note:
> 
> "I cannot reproduce any issue with the given testcase on b137."
> 
> So you should test this with b137 or newer build. There have
> been some extensive changes going to treeclimb_* functions,
> so the bug is probably fixed or will be in near future.
> 
> Let us know if you can still reproduce the panic on
> recent build.
The most recent build available outside of Oracle is still 134, or am I 
missing something?

-- 
Carson

Garrett D'Amore

2010-May-27 19:25 UTC

[zfs-discuss] zfs/lofi/share panic

On 5/27/2010 12:21 PM, Carson Gaspar wrote:
> Jan Kryl wrote:
>> the bug (6798273) has been closed as incomplete with following
>> note:
>>
>> "I cannot reproduce any issue with the given testcase on b137."
>>
>> So you should test this with b137 or newer build. There have
>> been some extensive changes going to treeclimb_* functions,
>> so the bug is probably fixed or will be in near future.
>>
>> Let us know if you can still reproduce the panic on
>> recent build.
>
> The most recent build available outside of Oracle is still 134, or am 
> I missing something?

That's the latest binary build.  It is possible to build something newer
yourself, but doing so will take some unusual effort.

     - Garrett

Dennis Clarke

2010-May-27 21:16 UTC

[zfs-discuss] zfs/lofi/share panic

>>  FWIW (even on a freshly booted system after a panic)
>>  # lofiadm zyzzy.iso /dev/lofi/1
>>  # mount -F hsfs /dev/lofi/1 /mnt
>>  mount: /dev/lofi/1 is already mounted or /mnt is busy
>>  # mount -O -F hsfs /dev/lofi/1 /mnt
>>  # share /mnt
>>  #
>>
>>  If you unshare /mnt and then do this again, it will panic.
>>  This has been a bug since before Open Solaris came out.
>>
I just tried this with a UFS based filesystem just for a lark.

root@aequitas:/# mkdir /testfs
root@aequitas:/# mount -F ufs -o noatime,nologging /dev/dsk/c0d1s0 /testfs
root@aequitas:/# ls -l /testfs/sol\-nv\-b130\-x86\-dvd.iso
-rw-r--r-- 1 root root 3818782720 Feb  5 16:02 /testfs/sol-nv-b130-x86-dvd.iso

root@aequitas:/# lofiadm -a /testfs/sol-nv-b130-x86-dvd.iso
May 27 21:08:58 aequitas pseudo: pseudo-device: lofi0
May 27 21:08:58 aequitas genunix: lofi0 is /pseudo/lofi@0
May 27 21:08:58 aequitas rootnex: xsvc0 at root: space 0 offset 0
May 27 21:08:58 aequitas genunix: xsvc0 is /xsvc@0,0
May 27 21:08:58 aequitas pseudo: pseudo-device: devinfo0
May 27 21:08:58 aequitas genunix: devinfo0 is /pseudo/devinfo@0
/dev/lofi/1
root@aequitas:/# mount -F hsfs -o ro /dev/lofi/1 /mnt
root@aequitas:/# share -F nfs -o nosub,nosuid,sec=sys,ro,anon=0 /mnt

Then, on a Solaris 10 server:

# uname -a
SunOS jupiter 5.10 Generic_142900-11 sun4u sparc SUNW,Sun-Fire-480R

# dfshares aequitas
RESOURCE                                  SERVER ACCESS    TRANSPORT
  aequitas:/mnt                         aequitas  -         -
#
# mount -F nfs -o bg,intr,nosuid,ro,vers=4 aequitas:/mnt /mnt

# ls /mnt
Copyright                    autorun.inf
JDS-THIRDPARTYLICENSEREADME  autorun.sh
License                      boot
README.txt                   installer
Solaris_11                   sddtool
Sun_HPC_ClusterTools
# umount aequitas:/mnt
# dfshares aequitas
RESOURCE                                  SERVER ACCESS    TRANSPORT
  aequitas:/mnt                         aequitas  -         -

Then back at the snv_138 box I unshare and re-share and ... nothing bad
happens.

root@aequitas:/# unshare /mnt
root@aequitas:/# share -F nfs -o nosub,nosuid,sec=sys,ro,anon=0 /mnt
root@aequitas:/# unshare /mnt
root@aequitas:/#

Guess I must now try this with a ZFS fs under that iso file.


-- 
Dennis Clarke
dclarke at opensolaris.ca  <- Email related to the open source Solaris
dclarke at blastwave.org   <- Email related to open source for Solaris

Frank Middleton

2010-May-30 18:20 UTC

[zfs-discuss] zfs/lofi/share panic

On 05/27/10 05:16 PM, Dennis Clarke wrote:
> I just tried this with a UFS based filesystem just for a lark.

It never failed on UFS, regardless of the contents of /etc/dfs/dfstab.

> Guess I must now try this with a ZFS fs under that iso file.

Just tried it again with b134 *with* "share /mnt" in /etc/dfs/dfstab.

# mount -O -F hsfs /export/iso_images/moblin-2.1-PR-Final-ivi-201002090924.img /mnt
# ls /mnt
isolinux  LiveOS
# unshare /mnt
/mnt: path doesn't exist
# share /mnt
# unshare /mnt
# share /mnt

Panic ensues (the following observed on the serial console); note that
the dataset is not UFS!

# May 30 13:35:44 host5 ufs: NOTICE: mount: not a UFS magic number (0x0)

panic[cpu1]/thread=30001f5f560: BAD TRAP: type=31 rp=2a1014769a0 addr=218
mmu_fsr=0 occurred in module "nfssrv" due to a NULL pointer dereference

Tried again after it rebooted.

Edited /etc/dfs/dfstab to remove the "share /mnt" entry:
# unshare /mnt
# mount -O -F hsfs /backups/icon/moblin-2.1-PR-Final-ivi-201002090924.img /mnt
# ls /mnt
isolinux  LiveOS
# unshare /mnt
/mnt: bad path
# share /mnt
# unshare  /mnt
# share /mnt

No panic. So the problem all along appears to be what happens if you
mount -O to an already shared mountpoint. Deliberately sharing before
mounting (but with nothing in /etc/dfs/dfstab) resulted in a slightly
different panic (more like the ones documented in the CR):

panic[cpu1]/thread=30002345e0: BAD TRAP: type=34 rp=2a100f84460
addr=ffffff6f6c2f5267 mmu_fsr=0

unshare: alignment error:

So CR6798273 should be amended to show the following:

To reproduce, share (say) /mnt
mount -O some-image-file /mnt
share /mnt
unshare /mnt
share /mnt
unshare /mnt
Highly reproducible panic ensues.

Workaround - make sure mountpoints are not shared before
mounting iso images stored on a ZFS dataset.
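
One way to apply this workaround in a script is to refuse to mount when the mountpoint is still listed as shared. The sketch below is an assumption-laden illustration: the function names `is_shared` and `safe_mount_iso` are made up here, it assumes a POSIX shell, and it assumes the active shares are listed in /etc/dfs/sharetab with the shared path as the first whitespace-separated field.

```shell
#!/bin/sh
# is_shared MOUNTPOINT [SHARETAB]
# Returns 0 if MOUNTPOINT is the first field of any line in SHARETAB
# (assumed default location: /etc/dfs/sharetab), 1 otherwise.
is_shared() {
    mnt=$1
    tab=${2:-/etc/dfs/sharetab}
    [ -r "$tab" ] || return 1
    while read -r path rest; do
        [ "$path" = "$mnt" ] && return 0
    done < "$tab"
    return 1
}

# safe_mount_iso IMAGE MOUNTPOINT [SHARETAB]
# Mounts IMAGE on MOUNTPOINT only if MOUNTPOINT is not currently shared.
safe_mount_iso() {
    iso=$1
    mnt=$2
    tab=${3:-/etc/dfs/sharetab}
    if is_shared "$mnt" "$tab"; then
        echo "refusing to mount: $mnt is shared; unshare it first" >&2
        return 1
    fi
    # Direct image mount without lofiadm, as suggested earlier in the thread.
    mount -F hsfs -o ro "$iso" "$mnt"
}
```

Note that a "share /mnt" line in /etc/dfs/dfstab will re-share the mountpoint at boot, so the dfstab entry needs to go as well for the check to be meaningful.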

So the problem, now seen to be relatively trivial, isn't fixed, at least
in b134. For all of you who responded both off and on the list and
motivated this experiment, many thanks. Perhaps someone with
access to a more recent build could try this, and if it still happens,
update and reopen CR6798273, although it doesn't seem very
important now.

Regards -- Frank
