thr3ads.net - Virtualization - [PATCH] Revert "virtio-scsi: Send "REPORTED LUNS CHANGED" sense data upon disk hotplug events" [Jul 2023]

Paolo Bonzini

2023-Jul-12 08:06 UTC

[PATCH] Revert "virtio-scsi: Send "REPORTED LUNS CHANGED" sense data upon disk hotplug events"

On 7/11/23 22:21, Mike Christie wrote:
> What was the issue you are seeing?
> 
> Was it something like you get the UA. We retry then on one of the
> retries the sense is not setup correctly, so the scsi error handler
> runs? That fails and the device goes offline?
> 
> If you turn on scsi debugging you would see:
> 
> 
> [  335.445922] sd 0:0:0:0: [sda] tag#15 Add. Sense: Reported luns data has changed
> [  335.445922] sd 0:0:0:0: [sda] tag#16 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  335.445925] sd 0:0:0:0: [sda] tag#16 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  335.445929] sd 0:0:0:0: [sda] tag#17 Done: FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
> [  335.445932] sd 0:0:0:0: [sda] tag#17 CDB: Write(10) 2a 00 00 db 4f c0 00 00 20 00
> [  335.445934] sd 0:0:0:0: [sda] tag#17 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  335.445936] sd 0:0:0:0: [sda] tag#17 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  335.445938] sd 0:0:0:0: [sda] tag#17 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  335.445940] sd 0:0:0:0: [sda] tag#17 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  335.445942] sd 0:0:0:0: [sda] tag#17 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  335.445945] sd 0:0:0:0: [sda] tag#17 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  335.451447] scsi host0: scsi_eh_0: waking up 0/2/2
> [  335.451453] scsi host0: Total of 2 commands on 1 devices require eh work
> [  335.451457] sd 0:0:0:0: [sda] tag#16 scsi_eh_0: requesting sense

Does this log come from internal discussions within Oracle?

> I don't know the qemu scsi code well, but I scanned the code for my co-worker
> and my guess was commit 8cc5583abe6419e7faaebc9fbd109f34f4c850f2 had a race in it.
> 
> How is locking done? when it is a bus level UA but there are multiple devices
> on the bus?

No locking should be necessary, the code is single threaded.  However, 
what can happen is that two consecutive calls to 
virtio_scsi_handle_cmd_req_prepare use the unit attention ReqOps, and 
then the second virtio_scsi_handle_cmd_req_submit finds no unit 
attention (see the loop in virtio_scsi_handle_cmd_vq).  That can 
definitely explain the log above.

Paolo
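
[Editorial illustration, not part of the original message.] The sequence Paolo describes can be sketched with a toy model. This is not QEMU code: `Bus`, `Request`, `prepare`, `submit` and `handle_queue` are hypothetical names, and the only thing taken from the thread is the shape of the problem, namely that every request in a batch is prepared before any of them is submitted, and that the pending unit attention can only be reported once.

```python
from dataclasses import dataclass
from typing import List, Optional

# Sense text as it appears in the kernel log quoted above.
UA_SENSE = "REPORTED LUNS DATA HAS CHANGED"


@dataclass
class Bus:
    # Hypothetical stand-in for a bus-level pending unit attention.
    unit_attention: Optional[str] = None


@dataclass
class Request:
    tag: int
    uses_ua_ops: bool = False    # chosen at prepare time
    sense: Optional[str] = None  # filled in at submit time


def prepare(bus: Bus, req: Request) -> None:
    """Phase 1: pick the request handler from the bus state as it is now."""
    req.uses_ua_ops = bus.unit_attention is not None


def submit(bus: Bus, req: Request) -> None:
    """Phase 2: a unit-attention request reports the pending UA and clears it."""
    if req.uses_ua_ops:
        req.sense = bus.unit_attention  # None if an earlier request consumed it
        bus.unit_attention = None


def handle_queue(bus: Bus, reqs: List[Request]) -> None:
    """Prepare the whole batch first, then submit it (two separate loops)."""
    for req in reqs:
        prepare(bus, req)
    for req in reqs:
        submit(bus, req)


if __name__ == "__main__":
    bus = Bus(unit_attention=UA_SENSE)
    batch = [Request(tag=16), Request(tag=17)]
    handle_queue(bus, batch)
    for req in batch:
        print(req)
```

In this model the first request carries the sense data, while the second was also committed to the unit-attention path during `prepare` but finds nothing left to report in `submit`, so it completes with empty sense. That would be consistent with the all-zero sense buffers in the log, although the model says nothing about how the real code should be fixed.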

Stefano Garzarella

2023-Jul-12 10:14 UTC

[PATCH] Revert "virtio-scsi: Send "REPORTED LUNS CHANGED" sense data upon disk hotplug events"

On Wed, Jul 12, 2023 at 10:06:56AM +0200, Paolo Bonzini wrote:
>On 7/11/23 22:21, Mike Christie wrote:
>>What was the issue you are seeing?
>>
>>Was it something like you get the UA. We retry then on one of the
>>retries the sense is not setup correctly, so the scsi error handler
>>runs? That fails and the device goes offline?
>>
>>If you turn on scsi debugging you would see:
>>
>>
>>[  335.445922] sd 0:0:0:0: [sda] tag#15 Add. Sense: Reported luns data has changed
>>[  335.445922] sd 0:0:0:0: [sda] tag#16 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>[  335.445925] sd 0:0:0:0: [sda] tag#16 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>[  335.445929] sd 0:0:0:0: [sda] tag#17 Done: FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
>>[  335.445932] sd 0:0:0:0: [sda] tag#17 CDB: Write(10) 2a 00 00 db 4f c0 00 00 20 00
>>[  335.445934] sd 0:0:0:0: [sda] tag#17 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>[  335.445936] sd 0:0:0:0: [sda] tag#17 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>[  335.445938] sd 0:0:0:0: [sda] tag#17 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>[  335.445940] sd 0:0:0:0: [sda] tag#17 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>[  335.445942] sd 0:0:0:0: [sda] tag#17 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>[  335.445945] sd 0:0:0:0: [sda] tag#17 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>[  335.451447] scsi host0: scsi_eh_0: waking up 0/2/2
>>[  335.451453] scsi host0: Total of 2 commands on 1 devices require eh work
>>[  335.451457] sd 0:0:0:0: [sda] tag#16 scsi_eh_0: requesting sense
>
>Does this log come from internal discussions within Oracle?
>
>>I don't know the qemu scsi code well, but I scanned the code for my co-worker
>>and my guess was commit 8cc5583abe6419e7faaebc9fbd109f34f4c850f2 had a race in it.
>>
>>How is locking done? when it is a bus level UA but there are multiple devices
>>on the bus?
>
>No locking should be necessary, the code is single threaded.  However, 
>what can happen is that two consecutive calls to 
>virtio_scsi_handle_cmd_req_prepare use the unit attention ReqOps, and 
>then the second virtio_scsi_handle_cmd_req_submit finds no unit 
>attention (see the loop in virtio_scsi_handle_cmd_vq).  That can 
>definitely explain the log above.

Yes, this seems to be the case!
Thank you both for the help!

Following Paolo's advice, I'm preparing a series for QEMU to solve the
problem!

Stefano
