Naveen Nalam
2006-Mar-22 02:13 UTC
[zfs-discuss] zfs panic when untarring over nfs to a zpool ramdisk
I've created a ramdisk using 'ramdiskadm', and then from a Linux NFS client I do an untar of an xemacs source tarball. My ramdisk is 100 MB, and the source tar file is 57 MB. I later tried untarring two 35 MB tarballs: the first one untars fine, then the second untar causes the server to panic. I was doing a 'zfs list' just before the panic, and it didn't appear that the pool was near full capacity (it was around 60% used or so).

Am I using zfs incorrectly? Or is this a bug in the interaction between a ramdisk and zfs? (I can upload the core dump somewhere if needed)

Thanks,
Naveen

This is on a dual dual-core Opteron with 4 GB RAM, BFU'd from SXCR b34 to opensol-20060320.

Sun Microsystems Inc.   SunOS 5.11      opensol-20060320        Mar. 21, 2006
SunOS Internal Development: stevel 2006-03-21 [tonic.20060320]
bfu'ed from /nn/320/archives-20060320/i386 on 2006-03-21
Sun Microsystems Inc.   SunOS 5.11      snv_34  October 2007

--------------------
Setting up the server zpool and NFS share:

bash-3.00# ramdiskadm -a myramdisk 100m
/dev/ramdisk/myramdisk
bash-3.00# zpool create ramtank /dev/ramdisk/myramdisk
warning: device in use checking failed: No such device
bash-3.00# zfs list ramtank
NAME      USED  AVAIL  REFER  MOUNTPOINT
ramtank  23.5K  79.5M    512  /ramtank
bash-3.00# zfs set sharenfs='root=@10.10/16' ramtank

--------------------
From the client:

[root@qa8 ~]# mount pfs1:/ramtank /nnram
[root@qa8 ~]# cd /nnram/
[root@qa8 nnram]# tar -xf /tmp/xemacs-21.5.18.tar

--------------------
Kernel panic:

bash-3.00# mdb unix.7 vmcore.7
Loading modules: [ unix krtld genunix specfs dtrace cpu.AuthenticAMD.15 uppc pcplusmp ufs ip sctp usba fcp fctl emlxs nca lofs cpc fcip random zfs logindmux ptm sppp nfs ]
> ::status
debugging crash dump vmcore.7 (64-bit) from pfs1
operating system: 5.11 opensol-20060320 (i86pc)
panic message: really out of space
dump content: kernel pages only
> ::stack
vpanic()
zio_write_allocate_gang_members+0x39a(ffffffff9ad94580)
zio_dva_allocate+0xa7(ffffffff9ad94580)
zio_next_stage+0x12a(ffffffff9ad94580)
zio_checksum_generate+0x96(ffffffff9ad94580)
zio_next_stage+0x12a(ffffffff9ad94580)
zio_wait_for_children+0x5e(ffffffff9ad94580, 1, ffffffff9ad947c0)
zio_wait_children_ready+0x22(ffffffff9ad94580)
zio_next_stage_async+0x196(ffffffff9ad94580)
zio_nowait+0x13(ffffffff9ad94580)
zio_write_allocate_gang_members+0x1eb(ffffffff99cedb80)
zio_dva_allocate+0xa7(ffffffff99cedb80)
zio_next_stage+0x12a(ffffffff99cedb80)
zio_checksum_generate+0x96(ffffffff99cedb80)
zio_next_stage+0x12a(ffffffff99cedb80)
zio_wait_for_children+0x5e(ffffffff99cedb80, 1, ffffffff99ceddc0)
zio_wait_children_ready+0x22(ffffffff99cedb80)
zio_next_stage_async+0x196(ffffffff99cedb80)
zio_nowait+0x13(ffffffff99cedb80)
zio_write_allocate_gang_members+0x1eb(ffffffff99100640)
zio_dva_allocate+0xa7(ffffffff99100640)
zio_next_stage+0x12a(ffffffff99100640)
zio_checksum_generate+0x96(ffffffff99100640)
zio_next_stage+0x12a(ffffffff99100640)
zio_write_compress+0x2a4(ffffffff99100640)
zio_next_stage+0x12a(ffffffff99100640)
zio_wait_for_children+0x5e(ffffffff99100640, 1, ffffffff99100880)
zio_wait_children_ready+0x22(ffffffff99100640)
zio_next_stage_async+0x196(ffffffff99100640)
zio_nowait+0x13(ffffffff99100640)
arc_write+0x12e(ffffffff938e09c0, ffffffff82de4600, 6, 2, 30, ffffffffa4f59040)
dbuf_sync+0x94c(ffffffff9f085c98, ffffffff938e09c0, ffffffff8d10f200)
dnode_sync+0x47c(ffffffff9eea0f80, 0, ffffffff938e09c0, ffffffff8d10f200)
dmu_objset_sync_dnodes+0xb0(ffffffff93782740, ffffffff93782820, ffffffff8d10f200)
dmu_objset_sync+0x10d(ffffffff93782740, ffffffff8d10f200)
dsl_dataset_sync+0x59(ffffffff9ae2da00, ffffffff8d10f200)
dsl_pool_sync+0xa3(ffffffff9a3a7a00, 30)
spa_sync+0x122(ffffffff82de4600, 30)
txg_sync_thread+0x230(ffffffff9a3a7a00)
thread_start+8()
> *panic_thread::findstack -v
stack pointer for thread fffffe8001591c80: fffffe8001590bc0
  fffffe8001590df0 __dprintf+0xf9()
  fffffe8001590ea0 metaslab_alloc+0xfe(fffffffff07c8658, 200, fffffe8001590dc0, fffffe8001590eb0)
  fffffe8001590f10 zio_write_allocate_gang_members+0x39a(ffffffff9ad94580)
  fffffe8001590f50 zio_dva_allocate+0xa7(ffffffff9ad94580)
  fffffe8001590f80 zio_next_stage+0x12a(ffffffff9ad94580)
  fffffe8001590fc0 zio_checksum_generate+0x96(ffffffff9ad94580)
  fffffe8001590ff0 zio_next_stage+0x12a(ffffffff9ad94580)
  fffffe8001591040 zio_wait_for_children+0x5e(ffffffff9ad94580, 1, ffffffff9ad947c0)
  ...
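(If it helps to have the dump handy, one way to bundle it up for upload might be the following. This is only a sketch: it assumes savecore(1M) wrote the dump to the default /var/crash/<hostname> directory; the host name pfs1 and dump number 7 are taken from the mdb session above.)

bash-3.00# cd /var/crash/pfs1
bash-3.00# tar cf - unix.7 vmcore.7 | gzip > /var/tmp/pfs1-dump7.tar.gz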
Matthew Ahrens
2006-Mar-22 02:42 UTC
[zfs-discuss] zfs panic when untarring over nfs to a zpool ramdisk
On Tue, Mar 21, 2006 at 06:13:45PM -0800, Naveen Nalam wrote:
> My ramdisk is 100 MB, and the source tar file is 57 MB.
>
> Am I using zfs incorrectly? Or is this a bug in the interaction between
> a ramdisk and zfs? (I can upload the core dump somewhere if needed)
>
> > ::status
> panic message: really out of space

The problem is (as the message implies) that you have run out of space. You should have gotten an ENOSPC error back to the application, but space accounting can be tricky, and any errors are particularly likely to bite you on a very small pool (e.g. the 100 MB one you are using).

What version of zfs are you running? In particular, the fix for 6391873 "metadata compression should be turned back on" fixes a bug of this ilk. That fix was putback in build 36.

Another question is: why would you even be close to running out if the source tar file is only 57% of the size of the pool? Again, 6391873 may be the culprit.

If you can send me the output of the following commands, that would help diagnose the problem:

# echo "::walk spa | ::spa_space" | mdb <dump>
# zdb -bb <pool>

Hrm, I guess the pool is probably gone by the time you reboot, since it is a ramdisk, which will make it impossible to run zdb on it. In that case, you can either (a) trust my guess that you are hitting 6391873 (assuming you do not have the fix for it), or (b) re-do your experiment with a 100 MB file backing your pool ("mkfile -n 100m /var/tmp/zfile; zpool create poolname /var/tmp/zfile"), then run zdb -bb after tickling the bug.

--matt
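Spelled out end to end, option (b) might look roughly like the following. This is only a sketch: the pool name "poolname" follows Matt's example, the share options and client paths are reused from the original report, and <n> stands for whatever dump number savecore assigns after the panic.

On the server:

# mkfile -n 100m /var/tmp/zfile
# zpool create poolname /var/tmp/zfile
# zfs set sharenfs='root=@10.10/16' poolname

From the Linux client:

# mount pfs1:/poolname /nnram
# cd /nnram && tar -xf /tmp/xemacs-21.5.18.tar

After the panic and reboot, the file-backed pool should still be present, so the requested diagnostics can be gathered on the server:

# echo "::walk spa | ::spa_space" | mdb unix.<n> vmcore.<n>
# zdb -bb poolname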
Naveen Nalam
2006-Mar-22 03:25 UTC
[zfs-discuss] Re: zfs panic when untarring over nfs to a zpool ramdisk
I should also add that when I untar the file locally on the ramdisk, things are fine. I'm only getting the panic when I do it over NFS.

I then redid my test using a 100 MB file as my pool (via mkfile). This also panics when doing it over NFS. The server then went into a reboot loop after that; I had to boot into safe mode, delete the /etc/zfs/zpool.cache file, and reboot. It produces a similar stack trace in the core file.

I'm using the latest BFU kernel that was put out yesterday (0320). I was also hitting this bug with the BFU kernel that was put out on 0313.

-Naveen

-----
This is the output from when I untar it locally on the ramdisk (no NFS). It shows that I don't end up using the whole pool.

bash-3.00# tar -xf /nn/xemacs-21.5.18.tar
bash-3.00# zfs list ramtank
NAME      USED  AVAIL  REFER  MOUNTPOINT
ramtank  60.7M  18.8M  60.6M  /ramtank
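For anyone who ends up in the same reboot loop, the recovery Naveen describes amounts to roughly the following. This is a sketch under assumptions: it presumes an x86 Nevada install where the GRUB failsafe entry finds the root file system and mounts it read-write on /a; adjust the paths to your own setup.

(from the GRUB menu, boot the "Solaris failsafe" entry and let it mount root on /a)
# rm /a/etc/zfs/zpool.cache    # keep the damaged pool from being imported at next boot
# reboot

Removing zpool.cache only stops the automatic import; the pool itself is untouched and can be imported again later with 'zpool import' if you want to poke at it.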