thr3ads.net - zfs discuss - [zfs-discuss] marvell88sx2 driver build126 [Nov 2009]

If this information is useful, please help other people find it:
Share via:

Tim Cook

2009-Nov-02 05:27 UTC

[zfs-discuss] marvell88sx2 driver build126

I''ve sent this to the driver list as well, but since the zfs folks tend
to
be intimately involved with the marvell driver stack, I figured I''d
give you
guys a shot too.



Does anyone happen to know if there was a driver change with build 126?  I
had a pool that was 2x5+1 raidz vdev''s.  I moved all the data off
temporarily, changed it to one 10+2 raidz2 vdev, and am in the process of
moving all the data back.

I''ve had two drives "fail" in the last 3 hours that have been
running fine
for over a year, and presented absolutely no issues moving the data out of
the original zpool.  My first inclination is this is a driver issue.

I''m currently running 2xMarvell SAT2-MV8 SATA controllers.  6 disks on
the
first controller, 7 on the second (one hot spare).


zpool status
  pool: fserv
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using ''zpool online'' or replace the
device with
        ''zpool replace''.
 scrub: resilver completed after 1h38m with 0 errors on Sun Nov  1 18:42:16
2009
config:

        NAME          STATE     READ WRITE CKSUM
        fserv         DEGRADED     0     0     0
          raidz2-0    DEGRADED     0     0     0
            c8t0d0    ONLINE       0     0     0
            c8t1d0    ONLINE       0     0     0
            spare-2   DEGRADED     0     0 2.83M
              c8t2d0  REMOVED      0     0     0
              c7t6d0  ONLINE       0     0     0  35.6G resilvered
            c8t3d0    ONLINE       0     0     0
            c8t4d0    ONLINE       0     0     0
            c8t5d0    ONLINE       0     0     0
            c7t0d0    ONLINE       0     0     0
            c7t1d0    ONLINE       0     0     0
            c7t2d0    ONLINE       0     0     0
            c7t3d0    ONLINE       0     0     0
            c7t4d0    REMOVED      0     0     0
            c7t5d0    ONLINE       0     0     0
        spares
          c7t6d0      INUSE     currently in use



Nov  1 16:21:34 fserv sata: [ID 801593 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/
pci1033,125 at 0,1/pci11ab,11ab at 6:
Nov  1 16:21:34 fserv  SATA device at port 2 - device failed
Nov  1 16:21:34 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 16:21:34 fserv   Command failed to complete...Device is gone
Nov  1 16:21:34 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 16:21:34 fserv   drive offline
Nov  1 16:21:34 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 16:21:34 fserv   SYNCHRONIZE CACHE command failed (5)
Nov  1 16:21:34 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 16:21:34 fserv   drive offline
Nov  1 16:21:34 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 16:21:34 fserv   drive offline
Nov  1 16:21:34 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 16:21:34 fserv   drive offline
Nov  1 16:21:34 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 16:21:34 fserv   drive offline
Nov  1 16:21:40 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 16:21:40 fserv   drive offline
Nov  1 16:21:40 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 16:21:40 fserv   drive offline
Nov  1 16:21:40 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 16:21:40 fserv   drive offline
Nov  1 16:21:40 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 16:21:40 fserv   drive offline
Nov  1 17:03:38 fserv marvell88sx: [ID 268337 kern.warning] WARNING:
marvell88sx2:device on port 4 failed to reset
Nov  1 17:04:08 fserv sata: [ID 801593 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0/pci11ab,11ab at 4:
Nov  1 17:04:08 fserv  SATA device at port 4 - device failed
Nov  1 17:04:08 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0/pci11ab,11ab at 4/disk at 4,0 (sd30):
Nov  1 17:04:08 fserv   Command failed to complete...Device is gone
Nov  1 17:04:08 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0/pci11ab,11ab at 4/disk at 4,0 (sd30):
Nov  1 17:04:08 fserv   drive offline
Nov  1 17:04:09 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0/pci11ab,11ab at 4/disk at 4,0 (sd30):
Nov  1 17:04:09 fserv   SYNCHRONIZE CACHE command failed (5)
Nov  1 17:04:09 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0/pci11ab,11ab at 4/disk at 4,0 (sd30):
Nov  1 17:04:09 fserv   drive offline
Nov  1 17:04:09 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0/pci11ab,11ab at 4/disk at 4,0 (sd30):
Nov  1 17:04:09 fserv   drive offline
Nov  1 17:04:09 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0/pci11ab,11ab at 4/disk at 4,0 (sd30):
Nov  1 17:04:09 fserv   drive offline
Nov  1 17:04:09 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0/pci11ab,11ab at 4/disk at 4,0 (sd30):
Nov  1 17:04:09 fserv   drive offline
Nov  1 18:31:59 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 18:31:59 fserv   SYNCHRONIZE CACHE command failed (5)
Nov  1 18:32:11 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 18:32:11 fserv   SYNCHRONIZE CACHE command failed (5)
Nov  1 18:35:00 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 18:35:00 fserv   SYNCHRONIZE CACHE command failed (5)
Nov  1 18:35:12 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 18:35:12 fserv   SYNCHRONIZE CACHE command failed (5)
Nov  1 18:35:21 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 18:35:21 fserv   SYNCHRONIZE CACHE command failed (5)
Nov  1 18:38:36 fserv scsi: [ID 107833 kern.warning] WARNING: /pci at 7c
,0/pci10de,376 at a/pci1033,125 at 0,1/pci11ab,11ab at 6/disk at 2,0 (sd26):
Nov  1 18:38:36 fserv   SYNCHRONIZE CACHE command failed (5)
Nov  1 21:06:31 fserv pcplusmp: [ID 805372 kern.info] pcplusmp: ide (ata)
instance 2 irq 0xe vector 0x44 ioapic 0x4 intin 0xe is bound to cpu 3
Nov  1 21:06:31 fserv pcplusmp: [ID 805372 kern.info] pcplusmp: ide (ata)
instance 3 irq 0xf vector 0x44 ioapic 0x4 intin 0xf is bound to cpu 0
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20091102/2a2e993e/attachment.html>

Orvar Korvar

2009-Nov-02 13:34 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

I have the same card and might have seen the same problem. Yesterday I upgraded
to b126 and started to migrate all my data to 8 disc raidz2 connected to such a
card. And suddenly ZFS reported checksum errors. I thought the drives were
faulty. But you suggest the problem could have been the driver? I also noticed
that one of the drives had resilvered a small amount, just like yours.

I now use b125 and there are no checksum errors. So, is there a bug in the new
b126 driver?
-- 
This message posted from opensolaris.org

Tim Cook

2009-Nov-03 18:14 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

On Mon, Nov 2, 2009 at 6:34 AM, Orvar Korvar <knatte_fnatte_tjatte at
yahoo.com> wrote:
> I have the same card and might have seen the same problem. Yesterday I
> upgraded to b126 and started to migrate all my data to 8 disc raidz2
> connected to such a card. And suddenly ZFS reported checksum errors. I
> thought the drives were faulty. But you suggest the problem could have been
> the driver? I also noticed that one of the drives had resilvered a small
> amount, just like yours.
>
> I now use b125 and there are no checksum errors. So, is there a bug in the
> new b126 driver?
>

Can any of you Sun folks comment on this?

--Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20091103/f9f089f8/attachment.html>

Orvar Korvar

2009-Nov-06 16:38 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

Noone has noticed this?
-- 
This message posted from opensolaris.org

Rob Logan

2009-Nov-06 19:38 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

> Nov 1 16:21:34 fserv Command failed to complete...Device is gone > Nov 1 17:04:08 fserv Command failed to complete...Device is gone

kinda looks like drive FW or cable issue... if it was a driver
issue it might be a lost command or reset for phase resync.

 > driver change with build 126?
not for the SATA framework, but for HBAs there is:
http://hub.opensolaris.org/bin/view/Community+Group+on/2009093001

			Rob

Orvar Korvar

2009-Nov-06 20:10 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

Right now I do not dare to use builds later than 125, because in b126 the
problem showed up. Maybe a coincidence, maybe not. But I think it is best to not
use b126 or later, until someone has confirmed there are no driver changes.

So, to confirm, there are no driver changes in b126 for the marvell88sx2, right?
So I should safely be able to use b126 and later?
-- 
This message posted from opensolaris.org

Tim Cook

2009-Nov-06 23:39 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

On Fri, Nov 6, 2009 at 2:10 PM, Orvar Korvar <knatte_fnatte_tjatte at
yahoo.com> wrote:
> Right now I do not dare to use builds later than 125, because in b126 the
> problem showed up. Maybe a coincidence, maybe not. But I think it is best
to
> not use b126 or later, until someone has confirmed there are no driver
> changes.
>
> So, to confirm, there are no driver changes in b126 for the marvell88sx2,
> right? So I should safely be able to use b126 and later?
>
>
Let me know what your results are if you decide to upgrade.  I''ve
already
replaced both drives that were having issues, I''ll do cables later but
I''m
still having a hard time believing my cables magically went bad right when I
upgraded to build 126.  The new drives have the same issues the old drives
did.  New brand and model.

And from what I can tell, I''m getting checksum errors through the roof
on
the replace as well...



  pool: fserv
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist
for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using ''zpool
online''.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: resilver in progress for 0h34m, 22.60% done, 1h57m to go
config:

        NAME                        STATE     READ WRITE CKSUM
        fserv                       DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            c8t0d0                  ONLINE       0     0     0
            c8t1d0                  ONLINE       0     0     0
            spare-2                 DEGRADED     0     0     0
              14340903866396142118  UNAVAIL      0     0     0  was
/dev/dsk/c8t2d0s0
              c7t6d0                ONLINE       0     0     0
            c8t3d0                  REMOVED      0     0     0
            c8t4d0                  ONLINE       0     0     0
            c8t5d0                  ONLINE       0     0     0
            c7t0d0                  ONLINE       0     0     0
            c7t1d0                  ONLINE       0     0     0
            c7t2d0                  ONLINE       0     0     0
            c7t3d0                  ONLINE       0     0     0
            replacing-10            DEGRADED     0     0  816K
              15401866802517339500  FAULTED      0     0     0  was
/dev/dsk/c7t4d0s0/old
              c7t4d0                ONLINE       0     0     0  52.3G
resilvered
            c7t5d0                  ONLINE       0     0     0
        spares
          c7t6d0                    INUSE     currently in use





--Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20091106/b92d1043/attachment.html>

Orvar Korvar

2009-Nov-07 10:27 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

Ok, so you changed drives and you still see errors? Are the drives brand new or
used? What kind of drives, which brand? 2TB? And if you reboot into an earlier
build such as b125 you dont see any errors, right?

Right now I am running b125. I dont dare to run b126, if your observation is
correct. Could you just rip out drivers from b125? I could post them drivers
here for you, if you tell me which files you need. And then you can see if it is
the drivers causing the problem or not.
-- 
This message posted from opensolaris.org

Tim Cook

2009-Nov-07 17:06 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

On Sat, Nov 7, 2009 at 4:27 AM, Orvar Korvar <knatte_fnatte_tjatte at
yahoo.com> wrote:
> Ok, so you changed drives and you still see errors? Are the drives brand
> new or used? What kind of drives, which brand? 2TB? And if you reboot into
> an earlier build such as b125 you dont see any errors, right?
>
Brand new. I''ve tried both 1TB hitachi and 1.5TB seagate (not the
"bad"
ones).

I can''t boot into an older version because the last version I had was
b118
which doesn''t have zfs version 19 support.  I''ve been looking
to see if
there''s a way to downgrade via IPS but that''s turned up a lot
of nothing.

>
> Right now I am running b125. I dont dare to run b126, if your observation
> is correct. Could you just rip out drivers from b125? I could post them
> drivers here for you, if you tell me which files you need. And then you can
> see if it is the drivers causing the problem or not.

It''s tough to say what exactly is causing the problems.  I would
imagine
ripping something like sd from the older version would break more than it
would fix.

--Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20091107/57257bb1/attachment.html>

Cindy Swearingen

2009-Nov-07 18:02 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

Hi Tim and all,

I believe you are saying that marvell88sx2 driver error messages started 
in build 126, along with new disk errors in RAIDZ pools. 

Is this correct? If so, please send me the following information:

1. Hardware you are running

2. If you are also seeing new disk errors in your RAIDZ pools
include your zpool status output.

I''m not the right person to be diagnosing driver-level issues but I
will
investigate.

Thanks,

Cindy


----- Original Message -----
From: Tim Cook <tim at cook.ms>
Date: Saturday, November 7, 2009 10:08 am
Subject: Re: [zfs-discuss] marvell88sx2 driver build126
To: Orvar Korvar <knatte_fnatte_tjatte at yahoo.com>
Cc: zfs-discuss at opensolaris.org
> On Sat, Nov 7, 2009 at 4:27 AM, Orvar Korvar <knatte_fnatte_tjatte at
yahoo.com
> > wrote:
> 
> > Ok, so you changed drives and you still see errors? Are the drives
brand
> > new or used? What kind of drives, which brand? 2TB? And if you 
> reboot into
> > an earlier build such as b125 you dont see any errors, right?
> >
> 
> Brand new. I''ve tried both 1TB hitachi and 1.5TB seagate (not the
"bad"
> ones).
> 
> I can''t boot into an older version because the last version I had
was
> b118
> which doesn''t have zfs version 19 support.  I''ve been
looking to see if
> there''s a way to downgrade via IPS but that''s turned up a
lot of nothing.
> 
> 
> 
> 
> >
> > Right now I am running b125. I dont dare to run b126, if your
observation
> > is correct. Could you just rip out drivers from b125? I could post
them
> > drivers here for you, if you tell me which files you need. And then 
> you can
> > see if it is the drivers causing the problem or not.
> 
> 
> 
> It''s tough to say what exactly is causing the problems.  I would
imagine
> ripping something like sd from the older version would break more than 
> it
> would fix.
> 
> --Tim
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Tim Cook

2009-Nov-07 18:44 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

On Sat, Nov 7, 2009 at 12:02 PM, Cindy Swearingen
<Cindy.Swearingen at sun.com>wrote:
> Hi Tim and all,
>
> I believe you are saying that marvell88sx2 driver error messages started
> in build 126, along with new disk errors in RAIDZ pools.
>
> Is this correct? If so, please send me the following information:
>
Yes.
>
> 1. Hardware you are running
>
Motherboard: SUPERMICRO MBD-H8DAE-2-O
2xAMD opteron 22xx CPU''s  (forget the exact model, they''re
2010mhz)
8GB crucial ECC ddr2 memory
2xSupermicro AOC-SAT2-MV8 SATA adapters
Supermicro SC932T-R760B case** with 15xSATA passthrough backplane

I also have an nvidia video card in it, but I''m not sure of the model,
and
doubt it has any role in this troubleshooting.



>
> 2. If you are also seeing new disk errors in your RAIDZ pools
> include your zpool status output.
>
Well, I can give you a current one, but I''ve done about a hundred
things
troubleshooting, so it isn''t representative of what the issues were a
few
days ago.  I''m still trying to figure out why it''s choking on
any drive I
put into c8t2d0.  It''s stopped generating errors on c7t4d0, but I
haven''t
changed a thing with that slot outside of stopping the zpool replace and
restarting it a few times... which is also extremely odd to me.

r00t at fserv:~$ zpool status
  pool: fserv
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist
for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using ''zpool
online''.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: resilver completed after 2h53m with 0 errors on Fri Nov  6 22:09:08
2009
config:

        NAME                        STATE     READ WRITE CKSUM
        fserv                       DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            c8t0d0                  ONLINE       0     0     0
            c8t1d0                  ONLINE       0     0     0
            spare-2                 DEGRADED     0     0     0
              14340903866396142118  UNAVAIL      0     0     0  was
/dev/dsk/c8t2d0s0
              c7t6d0                ONLINE       0     0     0
            c8t3d0                  ONLINE       0     0     0  2.68G
resilvered
            c8t4d0                  ONLINE       0     0     0
            c8t5d0                  ONLINE       0     0     0
            c7t0d0                  ONLINE       0     0     0
            c7t1d0                  ONLINE       0     0     0
            c7t2d0                  ONLINE       0     0     0
            c7t3d0                  ONLINE       0     0     0
            c7t4d0                  ONLINE       0     0     0  231G
resilvered
            c7t5d0                  ONLINE       0     0     0
        spares
          c7t6d0                    INUSE     currently in use

errors: No known data errors
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20091107/126ce595/attachment.html>

Orvar Korvar

2009-Nov-07 21:33 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

I saw the same checksum error problem when I booted into b126. I havent dared
try b126 again, I use b125 now, without problems. Here is my hardware
Intel Q9450 + P45 Gigabyte EP45-DS3P motherboard + Ati 4850
I have the same AOC SATA controller card. And some Samsung Spinpoint F1, 1TB
drives. Brand new.
-- 
This message posted from opensolaris.org

Orvar Korvar

2009-Nov-08 11:57 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

"I can''t boot into an older version because the last version I had
was b118 which doesn''t have zfs version 19 support.  I''ve been
looking to see if there''s a way to downgrade via IPS but
that''s turned up a lot of nothing."

If someone can tell me which files are needed for the driver I can extract them
from my b125 and post them here for you, so you can try out. Then we can know if
the problem is in b126 drivers or not. If b125 drivers work, we know the problem
is in b126. Otherwise there might be some other problem.

Another solution could be that you install SCXE b125. There are links to that
DVD b125. And from SCXE you can upgrade to later Opensolaris builds. I think.

Is it possible to upgrade to a specific build via IPS? When I use the Update
Manager, I always upgrade to the latest build. Can I target, say, bXXX? Or is
the only way to get bXXX, by installing SXCE?
-- 
This message posted from opensolaris.org

Ross

2009-Nov-08 14:31 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

> "I can''t boot into an older version because the last
> version I had was b118 which doesn''t have zfs version
> 19 support.  I''ve been looking to see if there''s a
> way to downgrade via IPS but that''s turned up a lot
> of nothing."
> 
> If someone can tell me which files are needed for the
> driver I can extract them from my b125 and post them
> here for you, so you can try out. Then we can know if
> the problem is in b126 drivers or not. If b125
> drivers work, we know the problem is in b126.
> Otherwise there might be some other problem.
> 
> Another solution could be that you install SCXE b125.
> There are links to that DVD b125. And from SCXE you
> can upgrade to later Opensolaris builds. I think.
> 
> Is it possible to upgrade to a specific build via
> IPS? When I use the Update Manager, I always upgrade
> to the latest build. Can I target, say, bXXX? Or is
> the only way to get bXXX, by installing SXCE?
Here are some notes i stole from the list earlier.  I think they might be on a
wiki somewhere now, but it seems relatively easy to upgrade to a specific
version:

Starting from OpenSolaris 2009.06 (snv_111b) active BE.

1) beadm create snv_111b-dev
2) beadm activate snv_111b-dev
3) reboot
4) pkg set-authority -O http://pkg.opensolaris.org/dev opensolaris.org
5) pkg install SUNWipkg
6) pkg list ''entire*''
7) beadm create snv_118
8) beadm mount snv_118 /mnt
9) pkg -R /mnt refresh
10) pkg -R /mnt install entire at 0.5.11-0.118
11) bootadm update-archive -R /mnt
12) beadm umount snv_118
13) beadm activate snv_118
14) reboot

Now you have a snv_118 development environment.
-- 
This message posted from opensolaris.org

Orvar Korvar

2009-Nov-08 15:47 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

Great! So if I want another build, for instance b125, I just change step 10?
10) pkg -R /mnt install entire at 0.5.11-0.125
Yes? 

What is this "0.5.11" thing? Should that be changed too, if I try to
install b125? Like "0.5.12-0.125"?
-- 
This message posted from opensolaris.org

Tim Cook

2009-Nov-08 16:54 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

On Sun, Nov 8, 2009 at 9:47 AM, Orvar Korvar <knatte_fnatte_tjatte at
yahoo.com> wrote:
> Great! So if I want another build, for instance b125, I just change step
> 10?
> 10) pkg -R /mnt install entire at 0.5.11-0.125
> Yes?
>
> What is this "0.5.11" thing? Should that be changed too, if I try
to
> install b125? Like "0.5.12-0.125"?
>

No.  That''s the SunOS version number, and you should always use 0.5.11-
for
anything in opensolaris today.  Solaris 10= "5.10". 
Opensolaris="5.11".
9=5.9 etc. etc. etc.

http://en.wikipedia.org/wiki/Solaris_%28operating_system%29

--Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20091108/482d7f89/attachment.html>

Nigel Smith

2009-Nov-08 20:08 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

I think you can work out the files for the driver by looking here:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/pkgdefs/SUNWmv88sx/prototype_i386

So the 32 bit driver is:

 kernel/drv/marvell88sx

And the 64 bit driver is:

 kernel/drv/amd64/marvell88sx

It a pity that the marvell driver is not open source.
For the sata drivers that are open source,

  ahci, nv_sata, si3124

..you can see the history of all the changes to the source code
of the drivers, all cross referenced to the bug numbers, using OpenGrok:

 
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/sata/adapters/

Regards
Nigel Smith
-- 
This message posted from opensolaris.org

Orvar Korvar

2009-Nov-08 22:04 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

Ok, here I attached the 64 bit variant. You can try it if you wish and see if
the checksum errors disappear.
-- 
This message posted from opensolaris.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: marvell88sx
Type: application/octet-stream
Size: 100256 bytes
Desc: not available
URL:
<http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20091108/98df9ddc/attachment.obj>

Orvar Korvar

2009-Nov-08 22:05 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

This is from build 125.
-- 
This message posted from opensolaris.org

Cindy Swearingen

2009-Nov-09 20:51 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

Hi,

I can''t find any bug-related issues with marvell88sx2 in b126.

I looked over Dave Hollister''s shoulder while he searched for
marvell in his webrevs of this putback and nothing came up:

 > driver change with build 126?
not for the SATA framework, but for HBAs there is:
http://hub.opensolaris.org/bin/view/Community+Group+on/2009093001

I will find a thumper, load build 125, create a raidz pool, and
upgrade to b126.

I''ll also send the error messages that Tim provided to someone who
works in the driver group.

Thanks,

Cindy

On 11/07/09 14:33, Orvar Korvar wrote:> I saw the same checksum error problem when I booted into b126. I havent
dared try b126 again, I use b125 now, without problems. Here is my hardware
> Intel Q9450 + P45 Gigabyte EP45-DS3P motherboard + Ati 4850
> I have the same AOC SATA controller card. And some Samsung Spinpoint F1,
1TB drives. Brand new.

Tim Cook

2009-Nov-10 07:59 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

On Mon, Nov 9, 2009 at 2:51 PM, Cindy Swearingen
<Cindy.Swearingen at sun.com>wrote:
> Hi,
>
> I can''t find any bug-related issues with marvell88sx2 in b126.
>
> I looked over Dave Hollister''s shoulder while he searched for
> marvell in his webrevs of this putback and nothing came up:
>
> > driver change with build 126?
> not for the SATA framework, but for HBAs there is:
> http://hub.opensolaris.org/bin/view/Community+Group+on/2009093001
>
> I will find a thumper, load build 125, create a raidz pool, and
> upgrade to b126.
>
> I''ll also send the error messages that Tim provided to someone who
> works in the driver group.
>
> Thanks,
>
> Cindy
>
I tried the build 125 driver and it didn''t make a difference.  The odd
part
I''ve just noticed is that it''s port 4 on both cards that have
been giving me
issues.  I guess it''s possible it''s just a coincidence/bad
luck.

I''ve grabbed the b125 ISO from genunix and am going to try booting off
the
livecd to see if it produces different results.

--Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20091110/13fe5518/attachment.html>

Orvar Korvar

2009-Nov-10 09:25 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

Does this mean that there are no driver changes in marvell88sx2, between b125
and b126? If no driver changes, then it means that we both had extreme unluck
with our drives, because we both had checksum errors? And my discs were brand
new.

How probable is this? Something is weird here. What is your opinion on this?
Should we agree that there was a hardware error, and it was just a coincidence?
-- 
This message posted from opensolaris.org

Cindy Swearingen

2009-Nov-10 15:56 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

Hi Orvar,

Correct, I don''t see any marvell8sx2 driver changes between b125-126.

So far, only you and Tim are reporting these issues.

Generally, we see bugs filed by the internal test teams if they see
similar problems.

I will try to reproduce the RAIDZ checksum errors separately from the
marvell88sx2 issue.

Thanks,

Cindy

On 11/10/09 02:25, Orvar Korvar wrote:> Does this mean that there are no driver changes in marvell88sx2, between
b125 and b126? If no driver changes, then it means that we both had extreme
unluck with our drives, because we both had checksum errors? And my discs were
brand new.
> 
> How probable is this? Something is weird here. What is your opinion on
this? Should we agree that there was a hardware error, and it was just a
coincidence?

Richard Elling

2009-Nov-10 16:55 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

On Nov 10, 2009, at 1:25 AM, Orvar Korvar wrote:
> Does this mean that there are no driver changes in marvell88sx2,  
> between b125 and b126? If no driver changes, then it means that we  
> both had extreme unluck with our drives, because we both had  
> checksum errors? And my discs were brand new.
There are other drivers in the software stack that may have changed.
  -- richard
>
> How probable is this? Something is weird here. What is your opinion  
> on this? Should we agree that there was a hardware error, and it was  
> just a coincidence?
> -- 
> This message posted from opensolaris.org
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Tim Cook

2009-Nov-10 23:15 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

On Tue, Nov 10, 2009 at 10:55 AM, Richard Elling
<richard.elling at gmail.com>wrote:
>
> On Nov 10, 2009, at 1:25 AM, Orvar Korvar wrote:
>
>  Does this mean that there are no driver changes in marvell88sx2, between
>> b125 and b126? If no driver changes, then it means that we both had
extreme
>> unluck with our drives, because we both had checksum errors? And my
discs
>> were brand new.
>>
>
> There are other drivers in the software stack that may have changed.
>  -- richard
>
>
>
>> How probable is this? Something is weird here. What is your opinion on
>> this? Should we agree that there was a hardware error, and it was just
a
>> coincidence?
>>
>
So... I do appear to have reached somewhat of a truce with the system and
b126 at the moment.  I''m now going through and replacing the last of my
old
maxtor 300GB drives with brand new hitachi 1TB drives.  One thing I''m
noticing is a lot of checksum errors being generated during the resilver.
Is this normal?  Furthermore, since I see "no known data errors", is
it safe
to assume it''s all being corrected, and I''m not losing any
data?  I still do
have a separate copy of this data on a box at work that should be completely
consistent... but I will need to re-purpose that storage soon, and will be
without a known good backup for a while (I know, I know).  I''d rather
do a
fresh zfs send/receive than find out 6 months from now I lost something.

  pool: fserv
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h8m, 0.89% done, 15h14m to go
config:

        NAME                        STATE     READ WRITE CKSUM
        fserv                       DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            c8t0d0                  ONLINE       0     0     0
            c8t1d0                  ONLINE       0     0     0
            c8t2d0                  ONLINE       0     0     0
            c8t3d0                  ONLINE       0     0     0
            c8t4d0                  ONLINE       0     0     0
            c8t5d0                  ONLINE       0     0     0
            c7t0d0                  ONLINE       0     0     0
            c7t1d0                  ONLINE       0     0     0
            c7t2d0                  ONLINE       0     0     0
            replacing-9             DEGRADED     0     0  161K
              14274451003165180679  FAULTED      0     0     0  was
/dev/dsk/c7t3d0s0/old
              c7t3d0                ONLINE       0     0     0  2.05G
resilvered
            c7t4d0                  ONLINE       0     0     0
            c7t5d0                  ONLINE       0     0     0
        spares
          c7t6d0                    AVAIL

errors: No known data errors


--Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20091110/1f9351d9/attachment.html>

Tim Cook

2009-Nov-11 05:01 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

On Tue, Nov 10, 2009 at 5:15 PM, Tim Cook <tim at cook.ms> wrote:
>
>
> On Tue, Nov 10, 2009 at 10:55 AM, Richard Elling <richard.elling at
gmail.com
> > wrote:
>
>>
>> On Nov 10, 2009, at 1:25 AM, Orvar Korvar wrote:
>>
>>  Does this mean that there are no driver changes in marvell88sx2,
between
>>> b125 and b126? If no driver changes, then it means that we both had
extreme
>>> unluck with our drives, because we both had checksum errors? And my
discs
>>> were brand new.
>>>
>>
>> There are other drivers in the software stack that may have changed.
>>  -- richard
>>
>>
>>
>>> How probable is this? Something is weird here. What is your opinion
on
>>> this? Should we agree that there was a hardware error, and it was
just a
>>> coincidence?
>>>
>>
>
> So... I do appear to have reached somewhat of a truce with the system and
> b126 at the moment.  I''m now going through and replacing the last
of my old
> maxtor 300GB drives with brand new hitachi 1TB drives.  One thing
I''m
> noticing is a lot of checksum errors being generated during the resilver.
> Is this normal?  Furthermore, since I see "no known data errors",
is it safe
> to assume it''s all being corrected, and I''m not losing
any data?  I still do
> have a separate copy of this data on a box at work that should be
completely
> consistent... but I will need to re-purpose that storage soon, and will be
> without a known good backup for a while (I know, I know).  I''d
rather do a
> fresh zfs send/receive than find out 6 months from now I lost something.
>
>   pool: fserv
>  state: DEGRADED
> status: One or more devices is currently being resilvered.  The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>  scrub: resilver in progress for 0h8m, 0.89% done, 15h14m to go
>
> config:
>
>         NAME                        STATE     READ WRITE CKSUM
>         fserv                       DEGRADED     0     0     0
>           raidz2-0                  DEGRADED     0     0     0
>             c8t0d0                  ONLINE       0     0     0
>             c8t1d0                  ONLINE       0     0     0
>             c8t2d0                  ONLINE       0     0     0
>
>             c8t3d0                  ONLINE       0     0     0
>             c8t4d0                  ONLINE       0     0     0
>             c8t5d0                  ONLINE       0     0     0
>             c7t0d0                  ONLINE       0     0     0
>             c7t1d0                  ONLINE       0     0     0
>             c7t2d0                  ONLINE       0     0     0
>             replacing-9             DEGRADED     0     0  161K
>               14274451003165180679  FAULTED      0     0     0  was
> /dev/dsk/c7t3d0s0/old
>               c7t3d0                ONLINE       0     0     0  2.05G
> resilvered
>
>             c7t4d0                  ONLINE       0     0     0
>             c7t5d0                  ONLINE       0     0     0
>         spares
>           c7t6d0                    AVAIL
>
>
> errors: No known data errors
>
>
> --Tim
>
>
Anyone?  It''s up to 7.35M checksum errors and it''s rebuilding
extremely
slowly (as evidenced by the 10 hour time).  The errors are only showing on
the "replacing-9" line, not the individual drive.


  pool: fserv
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 6h56m, 39.61% done, 10h34m to go
config:

        NAME                        STATE     READ WRITE CKSUM
        fserv                       DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            c8t0d0                  ONLINE       0     0     0
            c8t1d0                  ONLINE       0     0     0
            c8t2d0                  ONLINE       0     0     0
            c8t3d0                  ONLINE       0     0     0
            c8t4d0                  ONLINE       0     0     0
            c8t5d0                  ONLINE       0     0     0
            c7t0d0                  ONLINE       0     0     0
            c7t1d0                  ONLINE       0     0     0
            c7t2d0                  ONLINE       0     0     0
            replacing-9             DEGRADED     0     0 7.35M
              14274451003165180679  FAULTED      0     0     0  was
/dev/dsk/c7t3d0s0/old
              c7t3d0                ONLINE       0     0     0  91.9G
resilvered
            c7t4d0                  ONLINE       0     0     0
            c7t5d0                  ONLINE       0     0     0
        spares
          c7t6d0                    AVAIL

errors: No known data errors



--Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20091110/803d26e7/attachment.html>

Orvar Korvar

2009-Nov-11 09:38 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

Other drivers in the stack? Which drivers? And have anyone of them been changed
between b125 and b126?
-- 
This message posted from opensolaris.org

rwalists at washdcmail.com

2009-Nov-11 13:24 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

On Nov 11, 2009, at 12:01 AM, Tim Cook wrote:
> On Tue, Nov 10, 2009 at 5:15 PM, Tim Cook <tim at cook.ms> wrote:
>>   One thing I''m
>> noticing is a lot of checksum errors being generated during the  
>> resilver.
>> Is this normal?
> Anyone?  It''s up to 7.35M checksum errors and it''s
rebuilding
> extremely
> slowly (as evidenced by the 10 hour time).  The errors are only  
> showing on
> the "replacing-9" line, not the individual drive.
I''ve only replaced a drive once, but it didn''t show any
checksum
errors during the resilver.  This was a 2 TB WD Green drive in a  
mirror pool that had started to show write errors.  It was attached to  
a SuperMicro AOC-SAT2-MV8.

Good luck,
Ware

Tim Cook

2009-Nov-11 15:59 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

On Wed, Nov 11, 2009 at 3:38 AM, Orvar Korvar <
knatte_fnatte_tjatte at yahoo.com> wrote:
> Other drivers in the stack? Which drivers? And have anyone of them been
> changed between b125 and b126?
>

Looks like the sd drive for one.

http://dlc.sun.com/osol/on/downloads/b126/on-changelog-b126.html

--Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20091111/3c295503/attachment.html>

Eric C. Taylor

2009-Nov-11 17:23 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

The checksum errors are fixed in build 128 with:

6807339 spurious checksum errors when replacing a vdev

No; you''re not losing any data due to this.

-  Eric
-- 
This message posted from opensolaris.org

Orvar Korvar

2009-Nov-11 21:23 UTC

head link

[zfs-discuss] marvell88sx2 driver build126

So he did actually hit a bug? But the bug is not dangerous as it doesnt destroy
data?

But I did not replace any devices and still it showed checksum errors. I think I
did a zfs send | zfs receive? I dont remember. But I just copied things back and
forth, and the checksum errors showed up. So what does that mean?
-- 
This message posted from opensolaris.org

zfs discuss - Nov 2009 - marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126

[zfs-discuss] marvell88sx2 driver build126