Jordan Vaughan
2009-Nov-11 07:17 UTC
[zfs-discuss] libzfs zfs_create() fails on sun4u daily bits (daily.1110)
I encountered a strange libzfs behavior while testing a zone fix and want to make sure that I found a genuine bug. I'm creating zones whose zonepaths reside in ZFS datasets (i.e., the parent directories of the zones' zonepaths are ZFS datasets). In this scenario, zoneadm(1M) attempts to create ZFS datasets for zonepaths. zoneadm(1M) has done this for a long time (since zones started working on ZFS?) and worked fine until recently. Now I'm seeing zoneadm(1M) fail to create ZFS datasets for new zones while running sparc bits from the daily.1110 ONNV nightly build:

----8<----
root krb-v210-4 [20:05:57 0]# zoneadm list -cv
ID  NAME    STATUS      PATH            BRAND   IP
0   global  running     /               native  shared
-   godel   installed   /export/godel   native  shared
-   turing  configured  /export/turing  native  shared
root krb-v210-4 [20:05:58 0]# zonecfg -z turing info
zonename: turing
zonepath: /export/turing
brand: native
autoboot: false
bootargs:
pool:
limitpriv:
scheduling-class:
ip-type: shared
hostid: 900d833f
inherit-pkg-dir:
        dir: /lib
inherit-pkg-dir:
        dir: /platform
inherit-pkg-dir:
        dir: /sbin
inherit-pkg-dir:
        dir: /usr
root krb-v210-4 [20:06:07 0]# zfs list
NAME                USED   AVAIL  REFER  MOUNTPOINT
rpool               17.2G  16.0G  66K    /rpool
rpool/ROOT          8.51G  16.0G  21K    legacy
rpool/ROOT/snv_126  8.51G  16.0G  8.51G  /
rpool/dump          4.00G  16.0G  4.00G  -
rpool/export        688M   16.0G  688M   /export
rpool/export/home   21K    16.0G  21K    /export/home
rpool/swap          4G     20.0G  5.60M  -
zonepool            9.28M  33.2G  28K    /zonepool
zonepool/zones      21K    33.2G  21K    /zonepool/zones
root krb-v210-4 [20:06:11 0]# uname -a
SunOS krb-v210-4 5.11 onnv-gate:2009-11-10 sun4u sparc SUNW,Sun-Fire-V210
root krb-v210-4 [20:06:30 0]# zpool list
NAME      SIZE   ALLOC  FREE   CAP  DEDUP  HEALTH  ALTROOT
rpool     33.8G  13.2G  20.6G  39%  1.00x  ONLINE  -
zonepool  33.8G  9.43M  33.7G  0%   1.00x  ONLINE  -
root krb-v210-4 [20:06:46 0]# zpool get all rpool
NAME   PROPERTY       VALUE                 SOURCE
rpool  size           33.8G                 -
rpool  capacity       39%                   -
rpool  altroot        -                     default
rpool  health         ONLINE                -
rpool  guid           12880404862636496284  local
rpool  version        19                    local
rpool  bootfs         rpool/ROOT/snv_126    local
rpool  delegation     on                    default
rpool  autoreplace    off                   default
rpool  cachefile      -                     default
rpool  failmode       continue              local
rpool  listsnapshots  off                   default
rpool  autoexpand     off                   default
rpool  dedupratio     1.00x                 -
rpool  free           20.6G                 -
rpool  allocated      13.2G                 -
root krb-v210-4 [20:06:51 0]# zfs get all rpool/export
NAME          PROPERTY              VALUE                  SOURCE
rpool/export  type                  filesystem             -
rpool/export  creation              Sun Oct 25 19:43 2009  -
rpool/export  used                  688M                   -
rpool/export  available             16.0G                  -
rpool/export  referenced            688M                   -
rpool/export  compressratio         1.00x                  -
rpool/export  mounted               yes                    -
rpool/export  quota                 none                   default
rpool/export  reservation           none                   default
rpool/export  recordsize            128K                   default
rpool/export  mountpoint            /export                local
rpool/export  sharenfs              off                    default
rpool/export  checksum              on                     default
rpool/export  compression           off                    default
rpool/export  atime                 on                     default
rpool/export  devices               on                     default
rpool/export  exec                  on                     default
rpool/export  setuid                on                     default
rpool/export  readonly              off                    default
rpool/export  zoned                 off                    default
rpool/export  snapdir               hidden                 default
rpool/export  aclmode               groupmask              default
rpool/export  aclinherit            restricted             default
rpool/export  canmount              on                     default
rpool/export  shareiscsi            off                    default
rpool/export  xattr                 on                     default
rpool/export  copies                1                      default
rpool/export  version               4                      -
rpool/export  utf8only              off                    -
rpool/export  normalization         none                   -
rpool/export  casesensitivity       sensitive              -
rpool/export  vscan                 off                    default
rpool/export  nbmand                off                    default
rpool/export  sharesmb              off                    default
rpool/export  refquota              none                   default
rpool/export  refreservation        none                   default
rpool/export  primarycache          all                    default
rpool/export  secondarycache        all                    default
rpool/export  usedbysnapshots       0                      -
rpool/export  usedbydataset         688M                   -
rpool/export  usedbychildren        21K                    -
rpool/export  usedbyrefreservation  0                      -
rpool/export  logbias               latency                default
rpool/export  dedup                 off                    default
rpool/export  mlslabel              none                   default
----8<----

The above information looks normal to me.
If I try to install the zone named "turing," then I get the following:

----8<----
root krb-v210-4 [20:07:22 0]# zoneadm -z turing install
cannot create ZFS dataset rpool/export/turing: Unknown error
Preparing to install zone <turing>.
Creating list of files to copy from the global zone.
[...]
----8<----

The install continues and succeeds, but no ZFS dataset was created. I traced the error message in the latest ON source and located it in create_zfs_zonepath() in usr/src/cmd/zoneadm/zfs.c:

----8<----
if (zfs_create(g_zfs, zfs_name, ZFS_TYPE_FILESYSTEM, props) != 0 ||
    (zhp = zfs_open(g_zfs, zfs_name, ZFS_TYPE_DATASET)) == NULL) {
        (void) fprintf(stderr, gettext("cannot create ZFS dataset %s: "
            "%s\n"), zfs_name, libzfs_error_description(g_zfs));
        nvlist_free(props);
        return;
}
----8<----

zfs_open() can't be at fault because zoneadm(1M) is single-threaded, zoneadm(1M) doesn't destroy the new dataset if it successfully creates it, and the dataset never appears in the output of `zfs list` during installation. I also stepped through zoneadm(1M)'s execution of create_zfs_zonepath() and observed that zfs_create() returns -1 (i.e., it fails):

----8<----
root krb-v210-4 [20:27:25 1]# stoppedproc zoneadm -z turing install
root krb-v210-4 [20:27:32 0]# mdb -p `pgrep zoneadm`
Loading modules: [ ld.so.1 libc.so.1 ]
> create_zfs_zonepath::bp
> :c
mdb: stop at create_zfs_zonepath
mdb: target stopped at:
create_zfs_zonepath:        save   %sp, -0x468, %sp
mdb: You've got symbols!
Loading modules: [ libumem.so.1 libtopo.so.1 libavl.so.1 ]
> create_zfs_zonepath+0xd8::dis -n 3
create_zfs_zonepath+0xcc:   ld     [%fp - 0x4], %o3
create_zfs_zonepath+0xd0:   mov    %i5, %o1
create_zfs_zonepath+0xd4:   ld     [%l2 + 0xb8], %o0
create_zfs_zonepath+0xd8:   call   +0x165d8 <PLT:zfs_create>
create_zfs_zonepath+0xdc:   mov    0x1, %o2
create_zfs_zonepath+0xe0:   tst    %o0
create_zfs_zonepath+0xe4:   bne,pn %icc, +0x104 <create_zfs_zonepath+0x1e8>
> create_zfs_zonepath+0xd8::bp
> :c
mdb: stop at create_zfs_zonepath+0xd8
mdb: target stopped at:
create_zfs_zonepath+0xd8:   call   +0x165d8 <PLT:zfs_create>
mdb: You've got symbols!
Loading modules: [ libuutil.so.1 libnvpair.so.1 ]
> ::regs
%g0 = 0x0000000000000000 %l0 = 0x00025400
%g1 = 0x0000000000067740 %l1 = 0x00000000
%g2 = 0x0000000000000000 %l2 = 0x00037c00 zoneadm`_iob+0x88
%g3 = 0x0000000000000000 %l3 = 0x00000000
%g4 = 0xfffffffffffffc14 %l4 = 0x00000000
%g5 = 0x0000000000067718 %l5 = 0x00000000
%g6 = 0x0000000000000000 %l6 = 0x00000000
%g7 = 0x00000000ff1f2a00 %l7 = 0x00000000
%o0 = 0x0000000000038b20 %i0 = 0xffbff830
%o1 = 0x00000000ffbfeb34 %i1 = 0x00000001
%o2 = 0x0000000000000000 %i2 = 0x00052c30
%o3 = 0x000000000005af78 %i3 = 0x000254ac
%o4 = 0x0000000000025478 %i4 = 0x00000000
%o5 = 0x00000000feb87a3c %i5 = 0xffbfeb34
%o6 = 0x00000000ffbfead0 %i6 = 0xffbfef38
%o7 = 0x0000000000020474 create_zfs_zonepath+0x80 %i7 = 0x0001a224 install_func+0x488

 %psr = 0xfe401001 impl=0xf ver=0xe icc=nZvc
                   ec=0 ef=4096 pil=0 s=0 ps=0 et=0 cwp=0x1
   %y = 0x00000000
  %pc = 0x000204cc create_zfs_zonepath+0xd8
 %npc = 0x000204d0 create_zfs_zonepath+0xdc
  %sp = 0xffbfead0
  %fp = 0xffbfef38
 %wim = 0x00000000
 %tbr = 0x00000000
> 0x00000000ffbfeb34/S
0xffbfeb34:     rpool/export/turing
> 0x000000000005af78::nvlist
sharenfs='off'
> create_zfs_zonepath+0xe0::bp
> :c
mdb: stop at create_zfs_zonepath+0xe0
mdb: target stopped at:
create_zfs_zonepath+0xe0:   tst    %o0
> ::regs !fgrep %o0
%o0 = 0xffffffffffffffff %i0 = 0xffbff830
----8<----

Clearly, zfs_create() is
failing even though its parameters are fine. No one has changed anything about zonepath ZFS dataset creation in zoneadm(1M) in a long time.

I also discovered that although creating ZFS datasets with zfs(1M) works, creating ZFS datasets with user-specified properties via zfs(1M) fails:

----8<----
root krb-v210-4 [23:06:06 0]# zfs list
NAME                USED   AVAIL  REFER  MOUNTPOINT
rpool               17.2G  16.0G  66K    /rpool
rpool/ROOT          8.52G  16.0G  21K    legacy
rpool/ROOT/snv_126  8.52G  16.0G  8.52G  /
rpool/dump          4.00G  16.0G  4.00G  -
rpool/export        688M   16.0G  688M   /export
rpool/export/home   21K    16.0G  21K    /export/home
rpool/swap          4G     20.0G  5.60M  -
zonepool            139K   33.2G  28K    /zonepool
zonepool/zones      21K    33.2G  21K    /zonepool/zones
root krb-v210-4 [23:06:07 0]# zfs create rpool/export/test
root krb-v210-4 [23:06:17 0]# zfs list
NAME                USED   AVAIL  REFER  MOUNTPOINT
rpool               17.2G  16.0G  66K    /rpool
rpool/ROOT          8.52G  16.0G  21K    legacy
rpool/ROOT/snv_126  8.52G  16.0G  8.52G  /
rpool/dump          4.00G  16.0G  4.00G  -
rpool/export        688M   16.0G  688M   /export
rpool/export/home   21K    16.0G  21K    /export/home
rpool/export/test   21K    16.0G  21K    /export/test
rpool/swap          4G     20.0G  5.60M  -
zonepool            139K   33.2G  28K    /zonepool
zonepool/zones      21K    33.2G  21K    /zonepool/zones
root krb-v210-4 [23:06:24 0]# zfs create -o sharenfs=off rpool/export/test2
internal error: Unknown error
zsh: IOT instruction (core dumped)  zfs create -o sharenfs=off rpool/export/test2
root krb-v210-4 [23:06:39 134]# zfs create -o sharenfs=on rpool/export/test2
internal error: Unknown error
zsh: IOT instruction (core dumped)  zfs create -o sharenfs=on rpool/export/test2
root krb-v210-4 [23:06:47 134]# zfs create -o mountpoint=/mnt rpool/export/test2
internal error: Unknown error
zsh: IOT instruction (core dumped)  zfs create -o mountpoint=/mnt rpool/export/test2
----8<----

Notice that the case in which sharenfs is set to off does exactly the same thing as zoneadm(1M): It attempts to create a dataset with sharenfs=off.
If this is a genuine bug, then it will be a serious issue for OpenSolaris zones (which require ZFS-backed zonepaths) and ZFS dataset creation in general.

NOTE: I'm only seeing this aberrant behavior on sparc (both sun4u and sun4v). I haven't observed it on x86 (though I haven't tested it on 32-bit x86). Here's output that I gathered from a sun4v machine:

----8<----
root t5120-sfb-01 [23:00:49 134]# uname -a
SunOS t5120-sfb-01 5.11 onnv-gate:2009-11-10 sun4v sparc SUNW,SPARC-Enterprise-T5120
root t5120-sfb-01 [23:04:43 0]# zfs list
NAME                USED   AVAIL  REFER  MOUNTPOINT
rpool               40.2G  93.7G  66K    /rpool
rpool/ROOT          8.43G  93.7G  21K    legacy
rpool/ROOT/snv_126  8.43G  93.7G  8.43G  /
rpool/dump          15.9G  93.7G  15.9G  -
rpool/export        44K    93.7G  23K    /export
rpool/export/home   21K    93.7G  21K    /export/home
rpool/swap          15.9G  110G   4.59M  -
root t5120-sfb-01 [23:04:46 0]# zfs create rpool/export/test
root t5120-sfb-01 [23:04:56 0]# zfs list
NAME                USED   AVAIL  REFER  MOUNTPOINT
rpool               40.2G  93.7G  66K    /rpool
rpool/ROOT          8.43G  93.7G  21K    legacy
rpool/ROOT/snv_126  8.43G  93.7G  8.43G  /
rpool/dump          15.9G  93.7G  15.9G  -
rpool/export        65K    93.7G  23K    /export
rpool/export/home   21K    93.7G  21K    /export/home
rpool/export/test   21K    93.7G  21K    /export/test
rpool/swap          15.9G  110G   4.59M  -
root t5120-sfb-01 [23:04:59 0]# zfs create -o sharenfs=off rpool/export/test2
internal error: Unknown error
zsh: IOT instruction (core dumped)  zfs create -o sharenfs=off rpool/export/test2
root t5120-sfb-01 [23:05:16 134]# zfs list
NAME                USED   AVAIL  REFER  MOUNTPOINT
rpool               40.2G  93.7G  66K    /rpool
rpool/ROOT          8.43G  93.7G  21K    legacy
rpool/ROOT/snv_126  8.43G  93.7G  8.43G  /
rpool/dump          15.9G  93.7G  15.9G  -
rpool/export        66K    93.7G  24K    /export
rpool/export/home   21K    93.7G  21K    /export/home
rpool/export/test   21K    93.7G  21K    /export/test
rpool/swap          15.9G  110G   4.59M  -
root t5120-sfb-01 [23:05:26 0]# zfs create -o mountpoint=/mnt rpool/export/test2
internal error: Unknown error
zsh: IOT instruction (core dumped)  zfs create -o mountpoint=/mnt rpool/export/test2
----8<----
I'd appreciate any advice you might have regarding this problem.

Thanks!
Jordan