> My current zfs setup looks like this:
>
>   homepool                   3.63G  34.1G      8K  /homepool
>   homepool/db                61.6M  34.1G   8.50K  /var/db
>   homepool/db/pgsql          61.5M  34.1G   61.5M  /var/db/pgsql
>   homepool/home              3.57G  34.1G   10.0K  /users
>   homepool/home/carrie          8K  34.1G      8K  /users/carrie
>   homepool/home/posssumhaw      8K  34.1G      8K  /users/posssumhaw
>   homepool/home/weekleyj     3.57G  34.1G   3.57G  /users/weekleyj
>
>   NAME       SIZE   USED   AVAIL  CAP  HEALTH  ALTROOT
>   homepool   38.0G  3.63G  34.4G   9%  ONLINE  -
>
> I was copying over the b27 boot cd via "find . -print | cpio -pdmv
> /home/weekleyj/CD" and terminated it with a ^C. At that point the
> machine panicked with all sorts of ZFS info, and dropped core in
> /var/crash/fugly. So far, I haven't been able to reproduce.

Relevant info from /var/adm/messages:

Nov 19 09:09:34 fugly genunix: [ID 809409 kern.notice] ZFS: I/O failure (write on /dev/dsk/c0d1 off 8504e8200: zio d53a5600 [L0 unallocated] vdev=0 offset=8500e8200 size=200L/200P/200A fletcher4 uncompressed LE contiguous birth=41837 fill=0 cksum=bcd3026a:5d6fdb3032:174cfede75e7:3e7752513e700): error 6
Nov 19 09:09:34 fugly unix: [ID 100000 kern.notice]
Nov 19 09:09:34 fugly genunix: [ID 353471 kern.notice] d51e5c7c zfs:zio_done+199 (d53a5600)
Nov 19 09:09:34 fugly genunix: [ID 353471 kern.notice] d51e5c9c zfs:zio_next_stage+73 (d53a5600)
Nov 19 09:09:34 fugly genunix: [ID 353471 kern.notice] d51e5cbc zfs:zio_wait_for_children+58 (d53a5600, 13, d53a5)
Nov 19 09:09:34 fugly genunix: [ID 353471 kern.notice] d51e5cdc zfs:zio_wait_children_done+18 (d53a5600)
Nov 19 09:09:34 fugly genunix: [ID 353471 kern.notice] d51e5cf8 zfs:zio_next_stage+73 (d53a5600)
Nov 19 09:09:34 fugly genunix: [ID 353471 kern.notice] d51e5d2c zfs:zio_vdev_io_assess+c6 (d53a5600)
Nov 19 09:09:34 fugly genunix: [ID 353471 kern.notice] d51e5d40 zfs:zio_next_stage+73 (d53a5600)
Nov 19 09:09:34 fugly genunix: [ID 353471 kern.notice] d51e5d54 zfs:vdev_disk_io_done+2b (d53a5600, 0, d51e5d)
Nov 19 09:09:34 fugly genunix: [ID 353471 kern.notice] d51e5d64 zfs:vdev_io_done+18 (d53a5600, fe964d5d,)
Nov 19 09:09:34 fugly genunix: [ID 353471 kern.notice] d51e5d78 zfs:zio_vdev_io_done+e (d53a5600, 0, 0, 0, )
Nov 19 09:09:34 fugly genunix: [ID 353471 kern.notice] d51e5dc8 genunix:taskq_thread+16c (d45d51f0, 0)
Nov 19 09:09:34 fugly genunix: [ID 353471 kern.notice] d51e5dd8 unix:thread_start+8 ()
Nov 19 09:09:34 fugly unix: [ID 100000 kern.notice]
Nov 19 09:09:34 fugly genunix: [ID 672855 kern.notice] syncing file systems...
Nov 19 09:09:34 fugly genunix: [ID 904073 kern.notice] done
Nov 19 09:09:35 fugly genunix: [ID 111219 kern.notice] dumping to /dev/dsk/c0d0s1, offset 123863040, content: kernel
Nov 19 09:09:56 fugly genunix: [ID 409368 kern.notice] 100% done: 69052 pages dumped, compression ratio 1.58,
Nov 19 09:09:56 fugly genunix: [ID 851671 kern.notice] dump succeeded

I've got the core if anyone wants it.

Thanks,
John
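For anyone who grabs the dump, the panic stack and message buffer can be pulled back out of it with mdb. A minimal sketch, assuming savecore wrote unix.0/vmcore.0 into /var/crash/fugly (the numeric suffix may differ on your box):

  # cd /var/crash/fugly
  # mdb unix.0 vmcore.0
  > ::status     # panic string and dump summary
  > ::stack      # stack of the panicking thread
  > ::msgbuf     # kernel messages leading up to the panic
  > $q           # quit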
> I've got the core if anyone wants it.

Yes, I want it. Just let me know where I can get it. Also, any
information about the hardware would be helpful. This wouldn't be a
Tyan 2885 by any chance, would it?

Thanks,
Jeff
Hi Jeff,

Thanks for the reply. I'll tar them up and put them on my server for
you, and email you the account info separately.

The motherboard is an old Asus K7M, Athlon 750MHz, 384 MB RAM,
onboard ATA controller:

  Primary master:   Seagate 40G ATA disk
  Secondary master: Maxtor 40G ATA
  Secondary slave:  12x Lite-On DVD-RW

Ancient hardware, but Solaris is quite usable on this stuff!
Did you receive the login info to retrieve the core, Jeff?
John Weekley wrote:
> Did you receive the login info to retrieve the core, Jeff?

I've had the same thing happen to me; same stack trace. By any chance
were you plugging or unplugging any USB/FireWire disks when this
happened?

- Bart

--
Bart Smaalders                 Solaris Kernel Performance
barts at cyber.eng.sun.com     http://blogs.sun.com/barts
Nope, these are all internal ATA disks. I just killed a cpio from CD
to a ZFS filesystem with a ^C.
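For reference, the reproduction attempt boils down to this (the cpio
invocation is from the original report; the CD mount point is an
assumption):

  # cd /cdrom/cdrom0
  # find . -print | cpio -pdmv /home/weekleyj/CD
  ^C    # interrupt the copy mid-stream; the panic followed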
Ugh. I've managed to hit this issue twice today on oceana.central. I
hit it on 11/15 Nevada bits, so I upgraded to last night's build and
hit it again. The core is NFS exported...

(Yes, this is the flaky machine that is producing sporadic
uncorrectable errors on my raidz. But it's still rude of the box to
just panic instead of retrying.)
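An aside for anyone chasing similar symptoms: the per-vdev error
counters, and the FMA telemetry behind them, are visible without
waiting for a panic. A sketch (pool name here is hypothetical):

  # zpool status -v tank     # READ/WRITE/CKSUM error counts per vdev
  # fmdump -eV               # raw FMA error-report telemetry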
> Ugh. I've managed to hit this issue twice today on oceana.central.
> [...] it's still rude of the box to just panic instead of retrying.

Actually, ZFS does retry. The problem is that x86 disk I/O, across
the board, is very broken in 27a. For more details see these bugs:

  6354389 cmlb_partinfo can return bogus info for apparently any device
  6205971 ON bits shouldn't be using obsoleted DDI DMA interfaces

All it takes is either low memory or a hot-plug event to cause the
disk geometry information to just *disappear* and render the device
inaccessible. There's really nothing we can do about it in ZFS.
Both of these are P1 driver bugs with fixes in progress.

Jeff
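For anyone decoding the original panic message: the "error 6" is
ENXIO, which is consistent with the device vanishing underneath ZFS
as Jeff describes:

  $ grep -w ENXIO /usr/include/sys/errno.h
  #define ENXIO   6       /* No such device or address */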
I've offered the dump, but so far there haven't been any takers...
On Thu, Dec 01, 2005 at 10:15:43PM -0800, Jeff Bonwick wrote:
| Actually, ZFS does retry. The problem is that x86 disk I/O, across
| the board, is very broken in 27a. For more details see here:
|
|   6354389 cmlb_partinfo can return bogus info for apparently any device
|   6205971 ON bits shouldn't be using obsoleted DDI DMA interfaces
|
| All it takes is either low memory or a hot-plug event to cause the
| disk geometry information to just *disappear* and render the device
| inaccessible. There's really nothing we can do about it in ZFS.
| Both of these are P1 driver bugs with fixes in progress.

Jeff, thanks for the info; this looks like exactly my problem.
Certainly no hotplug here: I was burning CDs with the images on ZFS
using an ATA burner. This may just be a failure mode nobody else has
seen yet.

--
Eric Lowe   Solaris Kernel Development   Austin, Texas
Sun Microsystems. We make the net work.  x64155/+1(512)401-1155
Why are you asking about a Tyan 2885? Just asking because I have a
2881 (Thunder K8SR) and I experienced some similar problems (this
board complains about the 131 errata of the Opterons, but the latest
BIOS does not have a fix for that).

System setup:

  Tyan Thunder K8SR
  2 x Opteron 270
  4 GB RAM
  1 SCSI (73GB) boot & system disk (Adaptec 29160 SCSI adapter)
  4 x WD250 SATA (as raidz pool)

The first installation went quite well until I mounted a ZFS
filesystem on /export while that mount point was already occupied by
a SCSI partition. It complained, but the system ran until the next
reboot. Then it panicked constantly, complaining that it cannot mount
the ZFS filesystem on a directory which is not empty. I reinstalled,
because I was just playing around anyway.
(Note: a zone was configured on a ZFS filesystem.)

The next install went fine. Once everything was installed I started
copying my old fileserver stuff to the new raidz pool. After 4 hours
I came back home and the machine was rebooting constantly and
complaining about bad raidz blocks...
(Note: I was using a ZFS filesystem for the 2 zones I configured.)

Any ideas?
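One way out of the busy-mountpoint trap, sketched under the
assumption that the box still boots to single-user mode and the
dataset is called pool/export (your names will differ): ZFS refuses
to mount over a non-empty directory, so free the directory first.

  # zfs set mountpoint=none pool/export    # stop ZFS fighting over /export
  # umount /export                         # detach the slice mounted there
  # (move any leftover contents of /export aside so it is empty)
  # zfs set mountpoint=/export pool/export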
It's me again...

I installed Solaris 10 (01/06) and experienced the same problems
(total freezes), but it could be the X server, as I found out in
various posts on Yahoo groups. I stopped working on the desktop and
disabled dtlogin... no freezes so far.

Jens
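For reference, dtlogin can be kept off across reboots in either of
these ways (the SMF service name is my recollection for Solaris 10
and may need checking on your build):

  # /usr/dt/bin/dtconfig -d      # classic CDE way; takes effect at next boot
  # svcadm disable cde-login     # SMF way (svc:/application/graphical-login/cde-login)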