On 07/05/2015 09:07, Slawa Olhovchenkov wrote:
> I have zpool of 12 vdev (zmirrors).
> One disk in one vdev out of service and stop serving request:
>
> dT: 1.036s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
> 0 0 0 0 0.0 0 0 0.0 0.0| ada0
> 0 0 0 0 0.0 0 0 0.0 0.0| ada1
> 1 0 0 0 0.0 0 0 0.0 0.0| ada2
> 0 0 0 0 0.0 0 0 0.0 0.0| ada3
> 0 0 0 0 0.0 0 0 0.0 0.0| da0
> 0 0 0 0 0.0 0 0 0.0 0.0| da1
> 0 0 0 0 0.0 0 0 0.0 0.0| da2
> 0 0 0 0 0.0 0 0 0.0 0.0| da3
> 0 0 0 0 0.0 0 0 0.0 0.0| da4
> 0 0 0 0 0.0 0 0 0.0 0.0| da5
> 0 0 0 0 0.0 0 0 0.0 0.0| da6
> 0 0 0 0 0.0 0 0 0.0 0.0| da7
> 0 0 0 0 0.0 0 0 0.0 0.0| da8
> 0 0 0 0 0.0 0 0 0.0 0.0| da9
> 0 0 0 0 0.0 0 0 0.0 0.0| da10
> 0 0 0 0 0.0 0 0 0.0 0.0| da11
> 0 0 0 0 0.0 0 0 0.0 0.0| da12
> 0 0 0 0 0.0 0 0 0.0 0.0| da13
> 0 0 0 0 0.0 0 0 0.0 0.0| da14
> 0 0 0 0 0.0 0 0 0.0 0.0| da15
> 0 0 0 0 0.0 0 0 0.0 0.0| da16
> 0 0 0 0 0.0 0 0 0.0 0.0| da17
> 0 0 0 0 0.0 0 0 0.0 0.0| da18
> 24 0 0 0 0.0 0 0 0.0 0.0| da19
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 0 0 0 0 0.0 0 0 0.0 0.0| da20
> 0 0 0 0 0.0 0 0 0.0 0.0| da21
> 0 0 0 0 0.0 0 0 0.0 0.0| da22
> 0 0 0 0 0.0 0 0 0.0 0.0| da23
> 0 0 0 0 0.0 0 0 0.0 0.0| da24
> 0 0 0 0 0.0 0 0 0.0 0.0| da25
> 0 0 0 0 0.0 0 0 0.0 0.0| da26
> 0 0 0 0 0.0 0 0 0.0 0.0| da27
>
> As result zfs operation on this pool stopped too.
> `zpool list -v` don't worked.
> `zpool detach tank da19` don't worked.
> Application worked with this pool sticking in `zfs` wchan and don't
> killed.
>
> # camcontrol tags da19 -v
> (pass19:isci0:0:3:0): dev_openings 7
> (pass19:isci0:0:3:0): dev_active 25
> (pass19:isci0:0:3:0): allocated 25
> (pass19:isci0:0:3:0): queued 0
> (pass19:isci0:0:3:0): held 0
> (pass19:isci0:0:3:0): mintags 2
> (pass19:isci0:0:3:0): maxtags 255
>
> How I can cancel this 24 request?
> Why this requests don't timeout (3 hours already)?
> How I can forced detach this disk? (I am already try `camcontrol reset`,
> `camcontrol rescan`).
> Why ZFS (or geom) don't timeout on request and don't rerouted to
> da18?
>
If they are in mirrors, in theory you can just pull the disk: isci will
report the removal to CAM, and CAM will report it to ZFS, which should
then recover.
With regard to the requests not timing out, this could be an issue with
the defaults, but from a very quick look that's not obvious in the code,
as isci_io_request_construct etc. do indeed set a timeout when
CAM_TIME_INFINITY hasn't been requested.
The sysctl hw.isci.debug_level may be able to provide more information,
but be aware this can be spammy.
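While the extra logging is on, it may help to watch whether the
outstanding commands on da19 ever drain. A minimal sketch, assuming the
`camcontrol tags <dev> -v` output keeps the format you pasted above;
`tags_active` is a hypothetical helper name, not an existing tool:

```shell
# Hypothetical helper: read `camcontrol tags <dev> -v` output on stdin and
# print the dev_active count (commands currently outstanding at the device).
tags_active() {
    awk '$2 == "dev_active" { print $3 }'
}

# Possible use on the affected host (not run here):
#   while :; do camcontrol tags da19 -v | tags_active; sleep 10; done
```

If the number never drops below the 25 you saw, the requests are still
held by the driver rather than being completed or timed out. For the
accepted values of hw.isci.debug_level, and whether it can be changed at
runtime or only from loader.conf, check isci(4) on your release.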
Regards
Steve