thr3ads.net - Linux Virtualization - [PATCH 1/1] virtio/s390: fix vritio-ccw device teardown [Sep 2021]

Halil Pasic

2021-Sep-19 22:39 UTC

[PATCH 1/1] virtio/s390: fix vritio-ccw device teardown

On Fri, 17 Sep 2021 10:40:20 +0200
Cornelia Huck <cohuck at redhat.com> wrote:
> On Thu, Sep 16 2021, Halil Pasic <pasic at linux.ibm.com> wrote:
> 
> > On Thu, 16 Sep 2021 10:59:15 +0200
> > Cornelia Huck <cohuck at redhat.com> wrote:
> >  
> >> > Since commit 48720ba56891 ("virtio/s390: use DMA memory for ccw I/O and
> >> > classic notifiers") we were supposed to make sure that
> >> > virtio_ccw_release_dev() completes before the ccw device, and the
> >> > attached dma pool are torn down, but unfortunately we did not.
> >> > Before that commit it used to be OK to delay cleaning up the memory
> >> > allocated by virtio-ccw indefinitely (which isn't really intuitive for
> >> > guys used to destruction happens in reverse construction order).
> >> >
> >> > To accomplish this let us take a reference on the ccw device before we
> >> > allocate the dma_area and give it up after dma_area was freed.
> >> >
> >> > Signed-off-by: Halil Pasic <pasic at linux.ibm.com>
> >> > Fixes: 48720ba56891 ("virtio/s390: use DMA memory for ccw I/O and
> >> > classic notifiers")
> >> > Reported-by: bfu at redhat.com
> >> > ---
> >> >
> >> > I'm not certain this is the only hot-unplug and teardown related problem
> >> > with virtio-ccw.
> >> >
> >> > Some things that are not perfectly clear to me:
> >> > * What would happen if we observed an hot-unplug while we are doing
> >> >   wait_event() in ccw_io_helper()? Do we get stuck? I don't think we
> >> >   are guaranteed to receive an irq for a subchannel that is gone.
> >> 
> >> Hm. I think we may need to do a wake_up during remove handling.  
> >
> > My guess is that the BQL is saving us from ever seeing this with QEMU
> > as the hypervisor-userspace. Nevertheless I don't think we should rely
> > on that.
> 
> I agree. Let's do that via a separate patch.
> 
I understand you would like us to finish the discussion on the alternate
approach before giving an r-b for this patch, right?
> >  
> >>   
> >> > * cdev->online seems to be manipulated under cdev->ccwlock, but
> >> >   in virtio_ccw_remove() we look at it to decide should we clean up
> >> >   or not. What is the idea there? I guess we want to avoid doing
> >> >   if nothing is there or twice. But I don't understand how stuff
> >> >   interlocks.
> >> 
> >> We only created the virtio device when we onlined the ccw device. Do you
> >> have a better idea how to check for that? (And yes, I'm not sure the
> >> locking is correct.)
> >>   
> >
> > Thanks, if I find time for it, I will try to understand this better and
> > come back with my findings.
> >  
> >> > * Can virtio_ccw_remove() get called while !cdev->online and
> >> >   virtio_ccw_online() is running on a different cpu? If yes, what would
> >> >   happen then?
> >> 
> >> All of the remove/online/... etc. callbacks are invoked via the ccw bus
> >> code. We have to trust that it gets it correct :) (Or have the common
> >> I/O layer maintainers double-check it.)
> >>   
> >
> > Vineeth, what is your take on this? Are the struct ccw_driver
> > virtio_ccw_remove and the virtio_ccw_online callbacks mutually
> > exclusive? Please notice that we may initiate the onlining by
> > calling ccw_device_set_online() from a workqueue.
> >
> > @Conny: I'm not sure what is your definition of 'it gets it correct'...
> > I doubt CIO can make things 100% foolproof in this area.
> 
> Not 100% foolproof, but "don't online a device that is in the process
> of going away" seems pretty basic to me.
> 
I hope Vineeth will chime in on this.
> >  
> >> >  
> >> > The main addressee of these questions is Conny ;).
> >
> > In any case, I think we can go step by step. I would like the issue
> > this patch intends to address, addressed first. Then we can think
> > about the rest.
> >  
> >> >
> >> > An alternative to this approach would be to inc and dec the refcount
> >> > in ccw_device_dma_zalloc() and ccw_device_dma_free() respectively.
> >> 
> >> Yeah, I also thought about that. This would give us more get/put
> >> operations, but might be the safer option.  
> >
> > My understanding is that having the ccw device go away while in the
> > middle of doing ccw stuff (about to submit, or waiting for a channel
> > program, or whatever) was bad before.
> 
> > What do you mean with "was bad before"?

Using an already invalid pointer to the ccw device is always bad. I'm
not sure what prevented this from happening before commit 48720ba56891.
I'm aware of the fact that virtio_ccw_release_dev() didn't use to
dereference the vcdev->cdev before that commit, so we didn't have this
exact problem. Can you tell me how we used to ensure that all
dereferences of vcdev->cdev were legit, i.e. happened while the
ccw device was still fully alive, before commit 48720ba56891?
> 
> > So my intuition tells me that
> > drivers should manage explicitly. Yes virtio_ccw happens to have dma
> > memory whose lifetime is more or less the lifetime of struct virtio_ccw,
> > but that may not be always the case.
> 
> I'm not sure what you're getting at here. Regardless of the lifetime of
> the dma memory, it depends on the presence of the ccw device to which it
> is tied. This means that the ccw device must not be released while the
> dma memory is alive. We can use the approach in your patch here due to
> the lifetime of the dma memory that virtio-ccw allocates when we start
> using the device and frees when we stop using the device, or we can use
> get/put with every allocate/release dma memory pair, which should be
> safe for everyone?
> 
What I mean is that ccw_device_dma_[zalloc,free]() take a pointer to the
ccw_device. If we get/put in those we can ensure that, provided the
alloc and the free calls are properly paired, the device will be still
alive (and the pointer valid) for the free, if it was valid for the
alloc. But it does not ensure that each and every call to alloc is with
a valid pointer, or that other uses of the pointer are OK. So I don't
think it is completely safe for everyone, because we could try to use
a pointer to a ccw device when not having any dma memory allocated from
its pool.
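
For illustration only, here is a minimal userspace sketch of the get/put-in-alloc/free alternative described above. It is not kernel code: struct model_dev, model_get(), model_put(), model_dma_zalloc() and model_dma_free() are hypothetical stand-ins for the ccw device, its reference counting and ccw_device_dma_zalloc()/ccw_device_dma_free(), and the assumption is simply that the device is released when its count reaches zero.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted ccw device (not a kernel type). */
struct model_dev {
    int refcount;   /* number of outstanding references */
    int released;   /* set once the last reference is dropped */
};

static void model_get(struct model_dev *d)
{
    d->refcount++;
}

static void model_put(struct model_dev *d)
{
    if (--d->refcount == 0)
        d->released = 1;   /* real code would tear down the device and its pool */
}

/* A successful allocation pins the device ... */
static void *model_dma_zalloc(struct model_dev *d, size_t size)
{
    void *p = calloc(1, size);

    if (p)
        model_get(d);
    return p;
}

/* ... and the matching free unpins it. */
static void model_dma_free(struct model_dev *d, void *p)
{
    free(p);
    model_put(d);
}

/* Returns 1 if the device stayed alive until its buffer was freed. */
static int model_demo(void)
{
    struct model_dev dev = { .refcount = 1, .released = 0 };
    void *buf = model_dma_zalloc(&dev, 64);
    int alive_while_allocated;

    if (!buf)
        return -1;
    model_put(&dev);                    /* the original holder lets go early */
    alive_while_allocated = !dev.released;
    model_dma_free(&dev, buf);          /* the last reference goes away here */
    return alive_while_allocated && dev.released;
}
```

Note what the sketch does not cover: once the last buffer is freed, nothing pins the device any more, so any other use of a stored device pointer remains unprotected. That is the gap pointed out above.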

This patch takes a reference to cdev before the pointer is published via
vcdev->cdev and drops the reference after *vcdev is freed. The idea is
that the pointee basically outlives the pointer. (Without having a full
understanding of how things are synchronized).
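
The ordering described in this paragraph can be sketched in userspace as follows. This is an illustrative model under stated assumptions, not the patch itself: struct toy_cdev, struct toy_vcdev and the toy_* helpers are hypothetical names, and the only behaviour modelled is "reference taken before the pointer is stored, dropped after the structure holding the pointer is freed".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct toy_cdev {
    int refcount;
    int released;           /* set when the last reference is dropped */
};

struct toy_vcdev {
    struct toy_cdev *cdev;  /* published pointer to the parent device */
    void *dma_area;
};

static void toy_get(struct toy_cdev *c)
{
    c->refcount++;
}

static void toy_put(struct toy_cdev *c)
{
    if (--c->refcount == 0)
        c->released = 1;
}

/* Take the reference first, publish the pointer second. */
static struct toy_vcdev *toy_vcdev_create(struct toy_cdev *c)
{
    struct toy_vcdev *v;

    toy_get(c);
    v = calloc(1, sizeof(*v));
    if (v)
        v->dma_area = calloc(1, 64);
    if (!v || !v->dma_area) {
        free(v);            /* free(NULL) is a no-op */
        toy_put(c);
        return NULL;
    }
    v->cdev = c;
    return v;
}

/* Free everything that uses the pointer, then drop the reference. */
static void toy_vcdev_release(struct toy_vcdev *v)
{
    struct toy_cdev *c = v->cdev;

    free(v->dma_area);
    free(v);
    toy_put(c);
}

/* Returns 1 if a delayed release still found the device alive. */
static int toy_teardown_order_ok(void)
{
    struct toy_cdev dev = { .refcount = 1, .released = 0 };
    struct toy_vcdev *v = toy_vcdev_create(&dev);
    int alive_before_release;

    if (!v)
        return -1;
    toy_put(&dev);                      /* remove path drops its own reference */
    alive_before_release = !dev.released;
    toy_vcdev_release(v);               /* delayed release runs afterwards */
    return alive_before_release && dev.released;
}
```

In this model the device cannot be released while any toy_vcdev pointing at it exists, regardless of how long the release of the toy_vcdev is delayed.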

Regards,
Halil

Cornelia Huck

2021-Sep-20 10:30 UTC

[PATCH 1/1] virtio/s390: fix vritio-ccw device teardown

On Mon, Sep 20 2021, Halil Pasic <pasic at linux.ibm.com> wrote:
> On Fri, 17 Sep 2021 10:40:20 +0200
> Cornelia Huck <cohuck at redhat.com> wrote:
>
>> On Thu, Sep 16 2021, Halil Pasic <pasic at linux.ibm.com> wrote:
>> 
>> > On Thu, 16 Sep 2021 10:59:15 +0200
>> > Cornelia Huck <cohuck at redhat.com> wrote:
>> >  
>> >> > Since commit 48720ba56891 ("virtio/s390: use DMA memory for ccw I/O and
>> >> > classic notifiers") we were supposed to make sure that
>> >> > virtio_ccw_release_dev() completes before the ccw device, and the
>> >> > attached dma pool are torn down, but unfortunately we did not.
>> >> > Before that commit it used to be OK to delay cleaning up the memory
>> >> > allocated by virtio-ccw indefinitely (which isn't really intuitive for
>> >> > guys used to destruction happens in reverse construction order).
>> >> >
>> >> > To accomplish this let us take a reference on the ccw device before we
>> >> > allocate the dma_area and give it up after dma_area was freed.
>> >> >
>> >> > Signed-off-by: Halil Pasic <pasic at linux.ibm.com>
>> >> > Fixes: 48720ba56891 ("virtio/s390: use DMA memory for ccw I/O and
>> >> > classic notifiers")
>> >> > Reported-by: bfu at redhat.com
>> >> > ---
>> >> >
>> >> > I'm not certain this is the only hot-unplug and teardown related problem
>> >> > with virtio-ccw.
>> >> >
>> >> > Some things that are not perfectly clear to me:
>> >> > * What would happen if we observed an hot-unplug while we are doing
>> >> >   wait_event() in ccw_io_helper()? Do we get stuck? I don't think we
>> >> >   are guaranteed to receive an irq for a subchannel that is gone.
>> >> 
>> >> Hm. I think we may need to do a wake_up during remove handling.
>> >
>> > My guess is that the BQL is saving us from ever seeing this with QEMU
>> > as the hypervisor-userspace. Nevertheless I don't think we should rely
>> > on that.
>> 
>> I agree. Let's do that via a separate patch.
>> 
>
> I understand you would like us to finish the discussion on the alternate
> approach before giving an r-b for this patch, right?

Yes, exactly.

(...)
>> >> > An alternative to this approach would be to inc and dec the refcount
>> >> > in ccw_device_dma_zalloc() and ccw_device_dma_free() respectively.
>> >> 
>> >> Yeah, I also thought about that. This would give us more get/put
>> >> operations, but might be the safer option.
>> >
>> > My understanding is that having the ccw device go away while in the
>> > middle of doing ccw stuff (about to submit, or waiting for a channel
>> > program, or whatever) was bad before.
>> 
>> What do you mean with "was bad before"?
>
> Using an already invalid pointer to the ccw device is always bad. I'm
> not sure what prevented this from happening before commit 48720ba56891.
> I'm aware of the fact that virtio_ccw_release_dev() didn't use to
> dereference the vcdev->cdev before that commit, so we didn't have this
> exact problem. Can you tell me how we used to ensure that all
> dereferences of vcdev->cdev were legit, i.e. happened while the
> ccw device was still fully alive, before commit 48720ba56891?

I'm not sure what that commit has to do with lifetimes; it did not
change anything there, it only added the extra interaction for the dma buffer.

Basically, the vcdev is supposed to be around while the ccw device is
online (with a tail end until references have been given up, of course.)
It embeds a virtio device that has the ccw device as a parent, which
will give us a reference on the ccw device as long as the virtio device
is alive. Any interactions with the ccw device (except freeing the dma
buffer) are limited to the time where we still have a reference to it
via the virtio device.
>
>> 
>> > So my intuition tells me that
>> > drivers should manage explicitly. Yes virtio_ccw happens to have dma
>> > memory whose lifetime is more or less the lifetime of struct virtio_ccw,
>> > but that may not be always the case.
>> 
>> I'm not sure what you're getting at here. Regardless of the lifetime of
>> the dma memory, it depends on the presence of the ccw device to which it
>> is tied. This means that the ccw device must not be released while the
>> dma memory is alive. We can use the approach in your patch here due to
>> the lifetime of the dma memory that virtio-ccw allocates when we start
>> using the device and frees when we stop using the device, or we can use
>> get/put with every allocate/release dma memory pair, which should be
>> safe for everyone?
>> 
>
> What I mean is that ccw_device_dma_[zalloc,free]() take a pointer to the
> ccw_device. If we get/put in those we can ensure that, provided the
> alloc and the free calls are properly paired, the device will be still
> alive (and the pointer valid) for the free, if it was valid for the
> alloc. But it does not ensure that each and every call to alloc is with
> a valid pointer, or that other uses of the pointer are OK. So I don't
> think it is completely safe for everyone, because we could try to use
> a pointer to a ccw device when not having any dma memory allocated from
> its pool.

But the problem is the dma memory, right? Also, it is the same issue for
any potential caller of the ccw_device_dma_* interfaces.
>
> This patch takes a reference to cdev before the pointer is published via
> vcdev->cdev and drops the reference after *vcdev is freed. The idea is
> that the pointee basically outlives the pointer. (Without having a full
> understanding of how things are synchronized).

I don't think we have to care about accessing ->cdev (see above.) Plus,
as we give up the dma memory at the very last point, we would also give
up the reference via that memory at the very last point, so I'm not sure
what additional problems could come up.
