Siegfried Nikolaivich
2007-Jan-10 16:26 UTC
[zfs-discuss] Saving scrub results before scrub completes
On 27-Dec-06, at 9:45 PM, George Wilson wrote:

> Siegfried,
>
> Can you provide the panic string that you are seeing? We should be
> able to pull out the persistent error log information from the
> corefile. You can take a look at the spa_get_errlog() function as a
> starting point.

This is the panic string that I am seeing:

Dec 26 18:55:51 FServe unix: [ID 836849 kern.notice]
Dec 26 18:55:51 FServe panic[cpu1]/thread=fffffe8000929c80:
Dec 26 18:55:51 FServe genunix: [ID 683410 kern.notice] BAD TRAP: type=e (#pf Page fault) rp=fffffe8000929980 addr=ffffff00b3e621f0
Dec 26 18:55:51 FServe unix: [ID 100000 kern.notice]
Dec 26 18:55:51 FServe unix: [ID 839527 kern.notice] sched:
Dec 26 18:55:51 FServe unix: [ID 753105 kern.notice] #pf Page fault
Dec 26 18:55:51 FServe unix: [ID 532287 kern.notice] Bad kernel fault at addr=0xffffff00b3e621f0
Dec 26 18:55:51 FServe unix: [ID 243837 kern.notice] pid=0, pc=0xfffffffff3eaa2b0, sp=0xfffffe8000929a78, eflags=0x10282
Dec 26 18:55:51 FServe unix: [ID 211416 kern.notice] cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f0<xmme,fxsr,pge,mce,pae,pse>
Dec 26 18:55:51 FServe unix: [ID 354241 kern.notice] cr2: ffffff00b3e621f0 cr3: a3ec000 cr8: c
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] rdi: fffffe80dd69ad40 rsi: ffffff00b3e62040 rdx: 0
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] rcx: ffffffff9c6bd6ce r8: 1 r9: ffffffff
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] rax: ffffff00b3e62208 rbx: ffffff00b3e62040 rbp: fffffe8000929ab0
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] r10: ffffffff982421c8 r11: 1 r12: ffffff00b3e62208
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] r13: ffffffff81204468 r14: 1c8 r15: fffffe80dd69ad40
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] fsb: ffffffff80000000 gsb: ffffffff80f1d000 ds: 43
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] es: 43 fs: 0 gs: 1c3
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] trp: e err: 0 rip: fffffffff3eaa2b0
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] cs: 28 rfl: 10282 rsp: fffffe8000929a78
Dec 26 18:55:51 FServe unix: [ID 266532 kern.notice] ss: 30
Dec 26 18:55:51 FServe unix: [ID 100000 kern.notice]
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929890 unix:real_mode_end+6ad1 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929970 unix:trap+d77 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929980 unix:cmntrap+13f ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929ab0 zfs:vdev_queue_offset_compare+0 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929ae0 genunix:avl_add+1f ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929b60 zfs:vdev_queue_io_to_issue+1ec ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929ba0 zfs:zfsctl_ops_root+33bc48b1 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929bc0 zfs:vdev_disk_io_done+11 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929bd0 zfs:vdev_io_done+12 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929be0 zfs:zio_vdev_io_done+1b ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929c60 genunix:taskq_thread+bc ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929c70 unix:thread_start+8 ()
Dec 26 18:55:51 FServe unix: [ID 100000 kern.notice]
Dec 26 18:55:51 FServe genunix: [ID 672855 kern.notice] syncing file systems...
Dec 26 18:55:51 FServe genunix: [ID 733762 kern.notice] 3
Dec 26 18:55:52 FServe genunix: [ID 904073 kern.notice] done
Dec 26 18:55:53 FServe genunix: [ID 111219 kern.notice] dumping to /dev/dsk/c1d0s1, offset 1719074816, content: kernel

Additionally, though perhaps not related, I came across this while looking at the logs:

Dec 26 17:53:00 FServe marvell88sx: [ID 812950 kern.warning] WARNING: marvell88sx0: error on port 1:
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info] SError interrupt
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info] EDMA self disabled
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info] command request queue parity error
Dec 26 17:53:00 FServe marvell88sx: [ID 131198 kern.info] SErrors:
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info] Recovered communication error
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info] PHY ready change
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info] 10-bit to 8-bit decode error
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info] Disparity error

This happened right before a system hang. I have another strange problem where, if I send certain files over the network (CIFS or NFS), the machine slows to a crawl until it is "hung". This is reproducible every time with the same "special" files, but it does not happen locally, only over the network. I already posted about this in network-discuss and am currently investigating the issue.

> Additionally, you can look at the corefile using mdb and take a
> look at the vdev error stats. Here's an example (hopefully the
> formatting doesn't get messed up):

Excellent information, thanks! It looks like there are no read/write/chksum errors. I now at least have a way of checking the scrub results until the panic is fixed (hopefully someday).

Siegfried

> > ::spa -v
> ADDR                 STATE     NAME
> 0000060004473680     ACTIVE    test
>
>     ADDR             STATE     AUX          DESCRIPTION
>     0000060004bcb500 HEALTHY   -            root
>     0000060004bcafc0 HEALTHY   -            /dev/dsk/c0t2d0s0
>
> > 0000060004bcb500::vdev -re
> ADDR             STATE     AUX          DESCRIPTION
> 0000060004bcb500 HEALTHY   -            root
>
>              READ     WRITE      FREE     CLAIM     IOCTL
> OPS             0         0         0         0         0
> BYTES           0         0         0         0         0
> EREAD           0
> EWRITE          0
> ECKSUM          0
>
> 0000060004bcafc0 HEALTHY   -            /dev/dsk/c0t2d0s0
>
>              READ     WRITE      FREE     CLAIM     IOCTL
> OPS          0x17     0x1d2         0         0         0
> BYTES    0x19c000  0x11da00         0         0         0
> EREAD           0
> EWRITE          0
> ECKSUM          0
>
> This will show you any read/write/cksum errors.
>
> Thanks,
> George
>
> Siegfried Nikolaivich wrote:
>> Hello All,
>>
>> I am wondering if there is a way to save the scrub results right
>> before the scrub is complete.
>>
>> After upgrading to Solaris 10U3 I still have ZFS panicking right as
>> the scrub completes. The scrub results seem to be "cleared" when the
>> system boots back up, so I never get a chance to see them.
>>
>> Does anyone know of a simple way?
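(A minimal sketch of how the corefile checks above can be run, not from the original mail: it assumes the crash dump was saved by savecore(1M) under the default /var/crash/<hostname> directory, and that unix.0/vmcore.0 is the dump pair you want. ::status prints the panic summary for the dump; ::spa -v and ::vdev -re are the dcmds from George's example, with the vdev address taken from the ::spa output.)

    # cd /var/crash/FServe
    # mdb unix.0 vmcore.0
    > ::status
    > ::spa -v
    > 0000060004bcb500::vdev -re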
George Wilson
2007-Jan-10 16:26 UTC
[zfs-discuss] Saving scrub results before scrub completes
Siegfried,

Can you provide the panic string that you are seeing? We should be able to pull out the persistent error log information from the corefile. You can take a look at the spa_get_errlog() function as a starting point.

Additionally, you can look at the corefile using mdb and take a look at the vdev error stats. Here's an example (hopefully the formatting doesn't get messed up):

> ::spa -v
ADDR                 STATE     NAME
0000060004473680     ACTIVE    test

    ADDR             STATE     AUX          DESCRIPTION
    0000060004bcb500 HEALTHY   -            root
    0000060004bcafc0 HEALTHY   -            /dev/dsk/c0t2d0s0

> 0000060004bcb500::vdev -re
ADDR             STATE     AUX          DESCRIPTION
0000060004bcb500 HEALTHY   -            root

             READ     WRITE      FREE     CLAIM     IOCTL
OPS             0         0         0         0         0
BYTES           0         0         0         0         0
EREAD           0
EWRITE          0
ECKSUM          0

0000060004bcafc0 HEALTHY   -            /dev/dsk/c0t2d0s0

             READ     WRITE      FREE     CLAIM     IOCTL
OPS          0x17     0x1d2         0         0         0
BYTES    0x19c000  0x11da00         0         0         0
EREAD           0
EWRITE          0
ECKSUM          0

This will show you any read/write/cksum errors.

Thanks,
George

Siegfried Nikolaivich wrote:
> Hello All,
>
> I am wondering if there is a way to save the scrub results right before the scrub is complete.
>
> After upgrading to Solaris 10U3 I still have ZFS panicking right as the scrub completes. The scrub results seem to be "cleared" when the system boots back up, so I never get a chance to see them.
>
> Does anyone know of a simple way?
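(A rough illustration of the spa_get_errlog() pointer above, not from the original mail: in the OpenSolaris source of that era, the spa_t tracks the object numbers of the two persistent error-log objects, so they can be printed from the corefile with mdb's ::print. The field names spa_errlog_last and spa_errlog_scrub are taken from that source and are an assumption here; they may differ in other builds. The spa_t address comes from the ::spa output above.)

    > 0000060004473680::print spa_t spa_errlog_last spa_errlog_scrub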
Siegfried Nikolaivich
2007-Jan-10 16:26 UTC
[zfs-discuss] Saving scrub results before scrub completes
Hello All,

I am wondering if there is a way to save the scrub results right before the scrub is complete.

After upgrading to Solaris 10U3 I still have ZFS panicking right as the scrub completes. The scrub results seem to be "cleared" when the system boots back up, so I never get a chance to see them.

Does anyone know of a simple way?
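(One simple workaround sketch, not from the original thread: poll the pool status while the scrub runs and keep the last snapshot on disk, so whatever was captured just before the panic survives the reboot. "tank" is a placeholder pool name; write the output somewhere outside the pool being scrubbed, and sync so the file makes it to disk before the machine goes down.)

    # while true; do
    >     zpool status -v tank > /var/tmp/scrub-status.last
    >     sync
    >     sleep 60
    > done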