thr3ads.net - freebsd stable - Concern: ZFS Mirror issues (12.STABLE and firmware 19 .v. 20) [Apr 2019]


Karl Denninger

2019-Apr-10 14:38 UTC

Concern: ZFS Mirror issues (12.STABLE and firmware 19 .v. 20)

On 4/10/2019 08:45, Andriy Gapon wrote:
> On 10/04/2019 04:09, Karl Denninger wrote:
>> Specifically, I *explicitly* OFFLINE the disk in question, which is a
>> controlled operation and *should* result in a cache flush out of the ZFS
>> code into the drive before it is OFFLINE'd.
>>
>> This should result in the "last written" TXG that the remaining online
>> members have, and the one in the offline member, being consistent.
>>
>> Then I "camcontrol standby" the involved drive, which forces a writeback
>> cache flush and a spindown; in other words, re-ordered or not, the
>> on-platter data *should* be consistent with what the system thinks
>> happened before I yank the physical device.
> This may not be enough for a specific [RAID] controller and a specific
> configuration.  It should be enough for a dumb HBA.  But, for example, mrsas(9)
> can simply ignore the synchronize cache command (meaning neither the on-board
> cache is flushed nor the command is propagated to a disk).  So, if you use some
> advanced controller it would make sense to use its own management tool to
> offline a disk before pulling it.
>
> I do not preclude a possibility of an issue in ZFS.  But it's not the only
> possibility either.
In this specific case the adapter in question is...

mps0: <Avago Technologies (LSI) SAS2116> port 0xc000-0xc0ff mem 0xfbb3c000-0xfbb3ffff,0xfbb40000-0xfbb7ffff irq 30 at device 0.0 on pci3
mps0: Firmware: 20.00.07.00, Driver: 21.02.00.00-fbsd
mps0: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>

Which is indeed a "dumb" HBA (in IT mode), and Zaphod says he connects
his drives via dumb on-MoBo direct SATA connections.
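
For anyone comparing HBA firmware across several hosts, here is a minimal Python sketch that pulls the firmware and driver versions out of boot messages. It assumes the line has exactly the shape shown in the dmesg output above; the function names are made up for illustration and are not part of any FreeBSD tool.

```python
import re

# Matches the version line printed at attach time in the dmesg above, e.g.
# "mps0: Firmware: 20.00.07.00, Driver: 21.02.00.00-fbsd".
# Assumption: the line always has this exact shape.
MPS_VERSION_RE = re.compile(
    r"^(?P<unit>mps\d+): Firmware: (?P<firmware>[\d.]+), Driver: (?P<driver>\S+)$"
)


def parse_mps_versions(dmesg_text):
    """Return {unit: (firmware, driver)} for every mps version line found."""
    versions = {}
    for line in dmesg_text.splitlines():
        match = MPS_VERSION_RE.match(line.strip())
        if match:
            versions[match["unit"]] = (match["firmware"], match["driver"])
    return versions


def firmware_tuple(firmware):
    """Turn '20.00.07.00' into (20, 0, 7, 0) so versions compare numerically."""
    return tuple(int(part) for part in firmware.split("."))
```

Feeding it the text of `dmesg` (or /var/run/dmesg.boot) and comparing `firmware_tuple()` values gives a quick way to tell a host on 19.00.00.00 from one on 20.00.07.00.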

What I don't know (yet) is if the update to firmware 20.00.07.00 in the
HBA has fixed it.  The 11.2 and 12.0 revs of FreeBSD through some
mechanism changed timing quite materially in the mps driver; prior to
11.2 I ran with a Lenovo SAS expander connected to SATA disks without
any problems at all, even across actual disk failures through the years,
but in 11.2 and 12.0 doing this resulted in spurious retries out of the
CAM layer that allegedly came from timeouts on individual units (which
looked very much like a lost command sent to the disk), but only on
mirrored volume sets -- yet there were no errors reported by the drive
itself, nor did either of my RaidZ2 pools (one spinning rust, one SSD)
experience problems of any sort.  Flashing the HBA forward to
20.00.07.00 with the expander in resulted in the *driver* (mps) taking
disconnects and resets instead of the targets, which in turn caused
random drive fault events across all of the pools.  For obvious reasons
that got backed out *fast*.

Without the expander 19.00.00.00 has been stable over the last few
months *except* for this circumstance, where an intentionally OFFLINE'd
disk in a mirror that is brought back online after some reasonably long
period of time (days to a week) results in a successful resilver but
then a small number of checksum errors on that drive -- always on the
one that was OFFLINEd, never on the one(s) not taken OFFLINE -- appear
and are corrected when a scrub is subsequently performed.  I am now on
20.00.07.00 and so far -- no problems.  But I've yet to do the backup
disk swap on 20.00.07.00 (scheduled for late week or Monday) so I do not
know if the 20.00.07.00 roll-forward addresses the scrub issue or not.
I have no reason to believe it is involved, but given the previous
"iffy" nature of 11.2 and 12.0 on 19.0 with the expander it very well
might be due to what appear to be timing changes in the driver architecture.
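
The controlled removal and return described in this message can be written down as an explicit command sequence. The Python sketch below only builds the argument lists and runs nothing. The pool, vdev and CAM device names are placeholders supplied by the caller, and both function names are hypothetical. The comments repeat what the thread says each step is expected to do, not a guarantee of what the hardware actually does.

```python
def offline_plan(pool, vdev, cam_device):
    """Commands for a controlled removal of one mirror member.

    Nothing is executed here; the caller decides whether to run them.
    All three arguments are placeholders (hypothetical names).
    """
    return [
        # Controlled OFFLINE of the member, expected to flush ZFS writes first.
        ["zpool", "offline", pool, vdev],
        # Per the thread: forces a write-back cache flush and a spindown.
        ["camcontrol", "standby", cam_device],
    ]


def return_plan(pool, vdev):
    """Commands for bringing the member back and checking it afterwards."""
    return [
        ["zpool", "online", pool, vdev],  # starts the resilver
        ["zpool", "status", pool],        # repeat until the resilver is done
        ["zpool", "scrub", pool],         # the step where CKSUM errors surfaced
    ]
```

Keeping the steps as data makes it easy to log exactly what was run before a disk was pulled, which helps when comparing a firmware 19 run against a firmware 20 run.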

-- 
Karl Denninger
karl at denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/

Zaphod Beeblebrox

2019-Apr-11 18:52 UTC

On Wed, Apr 10, 2019 at 10:41 AM Karl Denninger <karl at denninger.net> wrote:

> In this specific case the adapter in question is...
>
> mps0: <Avago Technologies (LSI) SAS2116> port 0xc000-0xc0ff mem 0xfbb3c000-0xfbb3ffff,0xfbb40000-0xfbb7ffff irq 30 at device 0.0 on pci3
> mps0: Firmware: 20.00.07.00, Driver: 21.02.00.00-fbsd
> mps0: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>
>
> Which is indeed a "dumb" HBA (in IT mode), and Zaphod says he connects
> his drives via dumb on-MoBo direct SATA connections.
>
Maybe I'm in good company.  My current setup has 8 of the disks connected
to:

mps0: <Avago Technologies (LSI) SAS2308> port 0xb000-0xb0ff mem 0xfe240000-0xfe24ffff,0xfe200000-0xfe23ffff irq 32 at device 0.0 on pci6
mps0: Firmware: 19.00.00.00, Driver: 21.02.00.00-fbsd
mps0: IOCCapabilities: 5a85c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,MSIXIndex,HostDisc>

... just with a cable that breaks out each of the 2 connectors into 4
SATA-style connectors, and the other 8 disks (plus boot disks and SSD
cache/log) connected to ports on...

- ahci0: <ASMedia ASM1062 AHCI SATA controller> port 0xd050-0xd057,0xd040-0xd043,0xd030-0xd037,0xd020-0xd023,0xd000-0xd01f mem 0xfe900000-0xfe9001ff irq 44 at device 0.0 on pci2
- ahci2: <Marvell 88SE9230 AHCI SATA controller> port 0xa050-0xa057,0xa040-0xa043,0xa030-0xa037,0xa020-0xa023,0xa000-0xa01f mem 0xfe610000-0xfe6107ff irq 40 at device 0.0 on pci7
- ahci3: <AMD SB7x0/SB8x0/SB9x0 AHCI SATA controller> port 0xf040-0xf047,0xf030-0xf033,0xf020-0xf027,0xf010-0xf013,0xf000-0xf00f mem 0xfea07000-0xfea073ff irq 19 at device 17.0 on pci0

... each drive connected to a single port.

I can actually reproduce this at will.  Because I have 16 drives, when one
fails, I need to find it.  I pull the SATA cable for a drive, determine if
it's the drive in question, and if not, reconnect it, "ONLINE" it, and wait
for the resilver to stop... usually only a minute or two.

... if I do this 4 to 6 odd times to find a drive (I can tell, in general,
whether a drive is on the SAS controller or the SATA controllers... so
I'm only looking among 8, ever) ... then I "REPLACE" the problem drive.
More often than not, a scrub will then find a few problems.  In fact, it
appears that the most recent scrub is an example:

[1:7:306]dgilbert at vr:~> zpool status
  pool: vr1
 state: ONLINE
  scan: scrub repaired 32K in 47h16m with 0 errors on Mon Apr  1 23:12:03 2019
config:

        NAME            STATE     READ WRITE CKSUM
        vr1             ONLINE       0     0     0
          raidz2-0      ONLINE       0     0     0
            gpt/v1-d0   ONLINE       0     0     0
            gpt/v1-d1   ONLINE       0     0     0
            gpt/v1-d2   ONLINE       0     0     0
            gpt/v1-d3   ONLINE       0     0     0
            gpt/v1-d4   ONLINE       0     0     0
            gpt/v1-d5   ONLINE       0     0     0
            gpt/v1-d6   ONLINE       0     0     0
            gpt/v1-d7   ONLINE       0     0     0
          raidz2-2      ONLINE       0     0     0
            gpt/v1-e0c  ONLINE       0     0     0
            gpt/v1-e1b  ONLINE       0     0     0
            gpt/v1-e2b  ONLINE       0     0     0
            gpt/v1-e3b  ONLINE       0     0     0
            gpt/v1-e4b  ONLINE       0     0     0
            gpt/v1-e5a  ONLINE       0     0     0
            gpt/v1-e6a  ONLINE       0     0     0
            gpt/v1-e7c  ONLINE       0     0     0
        logs
          gpt/vr1log    ONLINE       0     0     0
        cache
          gpt/vr1cache  ONLINE       0     0     0

errors: No known data errors

... it doesn't say it now, but there were 5 CKSUM errors on one of the
drives that I had trial-removed (and not on the one replaced).
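
Since the counters are gone by the time anyone looks, it helps to capture `zpool status` right after each scrub and record which devices had non-zero counters. This is a rough Python sketch, assuming the column layout shown in the output above (NAME, STATE, READ, WRITE, CKSUM) and the "scrub repaired ... with N errors" wording of the scan line. The function names are made up for illustration.

```python
import re

# A config row looks like "gpt/v1-d0   ONLINE   0   0   0".
# Counters are kept as strings because zpool may abbreviate large values.
# Requiring each counter to start with a digit skips the header row.
ROW_RE = re.compile(
    r"^\s*(?P<name>\S+)\s+(?P<state>[A-Z]+)\s+"
    r"(?P<read>\d\S*)\s+(?P<write>\d\S*)\s+(?P<cksum>\d\S*)"
)

# Matches e.g. "scrub repaired 32K in 47h16m with 0 errors".
SCRUB_RE = re.compile(
    r"scrub repaired (?P<repaired>\S+) in (?P<elapsed>\S+) "
    r"with (?P<errors>\d+) errors"
)


def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows with any non-zero counter."""
    flagged = {}
    for line in status_text.splitlines():
        match = ROW_RE.match(line)
        if not match:
            continue
        counters = (match["read"], match["write"], match["cksum"])
        if any(value != "0" for value in counters):
            flagged[match["name"]] = counters
    return flagged


def scrub_result(status_text):
    """Return (repaired, elapsed, errors) from the scan line, or None."""
    match = SCRUB_RE.search(status_text)
    if match is None:
        return None
    return match["repaired"], match["elapsed"], int(match["errors"])
```

Run against the output above, `devices_with_errors()` would return an empty dict, because the counters had already been cleared, while `scrub_result()` would still report the 32K that the scrub repaired.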
