David Dyer-Bennet
2009-Jan-13 03:17 UTC
[zfs-discuss] zfs null pointer deref, getting data out of single-user mode
My home NAS box, which I'd upgraded to OpenSolaris 2008.11 after a series of crashes left the smf database damaged, and which then ran cleanly for 4 days, has suddenly fallen right back to where the old one was before.

Looking at the logs, I see something similar to this (manually transcribed to paper and retyped):

    Bad trap: type=e (page fault) rp=f..f00050e3250 addr=28
    module ZFS null pointer dereference

(That's "rp=" followed by some number of 'f's and then those exact hex digits.) Lots of other data was logged, and it looks as if a kernel dump was written. I have multiple instances of this crash in my logs now.

I don't know how the kernel dump space works; I don't know if I only have the latest dump, or what. Who needs this information? And how can I get it off the system? I've played with various attempts to mount a thumb drive, and googled around, and I can't find any clues on how to do it. This is in the mode text-mode boot gets into when the smf database is corrupt -- maintenance mode, or some such; I have to give the username and password I established for the admin user at installation. Not that the thumb drive will help for the kernel dump anyway, if I read the logs correctly.

So now I've been down for more than a week, *and* I think I destroyed all my file permissions last night trying to get the final steps done and the system back into service. I really don't know what I'm going to do; I got to this point because I decided to reinstall over the old nv76 I was running rather than try to recover it (I did recover the data zpool), and it seems not to have advanced me any. On the one hand, that sounds like hardware; but the log entries for the crash are about ZFS null pointer derefs, which does NOT sound like hardware.

I'll be reinstalling tomorrow night, unless somebody says they need the data, in which case I can work on getting it out -- if anybody can give me some clues on *how* to get the data out.

-- 
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
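On the thumb-drive question, in case it helps: removable-media automounting isn't running in maintenance mode, so the stick has to be mounted by hand. A rough sketch, assuming a FAT-formatted stick that rmformat reports as c2t0d0 (the actual device name will differ on your box):

    rmformat -l                                   # list removable devices and their device paths
    mkdir -p /mnt/usb
    mount -F pcfs /dev/dsk/c2t0d0p0:c /mnt/usb    # :c = first FAT partition on the stick
    cp /var/adm/messages* /mnt/usb/               # logs; the kernel dump itself needs savecore (below)
    umount /mnt/usb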
If you've got enough space on /var, and you had a dump partition configured, you should find a bunch of "vmcore.[n]" files in /var/crash by now. The system normally dumps the kernel core into the dump partition (which can be the swap partition) and then copies it into /var/crash on the next successful reboot.

There's likely also a stack printed at the time of the crash; that might be enough for the ZFS developers to determine if this is a known (or even fixed) bug. It's also retrievable from the core. If it's not a known bug, or if more data is needed, the developers might want a copy of the core....
-- 
This message posted from opensolaris.org
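For reference, a rough sketch of pulling that stack back out of a saved core with mdb, assuming savecore has already written a unix.0/vmcore.0 pair into the savecore directory (the 0 is the dump number and will vary):

    cd /var/crash/`hostname`                   # or wherever dumpadm says the savecore directory is
    echo '::status' | mdb -k unix.0 vmcore.0   # panic string, OS release, dump details
    echo '::msgbuf' | mdb -k unix.0 vmcore.0   # console messages leading up to the panic
    echo '$c'       | mdb -k unix.0 vmcore.0   # stack trace of the panicking thread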
Sigh. Richard points out in private email that automatic savecore functionality is disabled in OpenSolaris; you need to manually set up a dump device and save core files if you want them. However, the stack may be sufficient to ID the bug.
-- 
This message posted from opensolaris.org
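A quick way to see what the installer actually set up is dumpadm with no arguments; it reports the current dump device, the savecore directory, and whether savecore is enabled:

    dumpadm          # no arguments: just print the current crash dump configuration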
Anton B. Rang wrote:
> Sigh. Richard points out in private email that automatic savecore
> functionality is disabled in OpenSolaris; you need to manually set up
> a dump device and save core files if you want them. However, the
> stack may be sufficient to ID the bug.

The dump device is there; you just need to copy the data from it to a file system, using savecore.
-- richard
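For that one-time copy, something like the following should work, assuming a destination directory that already exists and has enough free space for the dump (it can run to several GB):

    savecore -v /var/crash/`hostname`     # reads the dump device, writes unix.N / vmcore.N there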
> Sigh. Richard points out in private email that automatic savecore
> functionality is disabled in OpenSolaris; you need to manually
> set up a dump device and save core files if you want them.
> However, the stack may be sufficient to ID the bug.

The dump device is present, so there is no need to set one up. If you enable savecore using dumpadm(1m), you must create the configured savecore directory manually, though.
-- julien.
http://blog.thilelli.net/
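Roughly, to make that permanent for future panics (the directory name here is just the conventional default; adjust to suit):

    mkdir -p /var/crash/`hostname`          # dumpadm/savecore will not create this for you
    dumpadm -s /var/crash/`hostname` -y     # -s sets the savecore directory, -y runs savecore on reboot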