Michael S. Tsirkin

2015-Apr-21 05:22 UTC

[Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"

On Tue, Apr 21, 2015 at 10:37:00AM +0800, Fam Zheng wrote:
> On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> > On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > data with vring.
> > > That has drawbacks such as losing unsaved data (e.g. when
> > > guest user is writing a very long email), or possible denial of service in
> > > a nested vm use case where virtio device is passed through.
> > >
> > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > improve this by communicating the error state between virtio devices and
> > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > should detect this bit and report to userspace, or recover the device by
> > > resetting it.
> > 
> > Unfortunately, virtio 1 spec does not have a conformance statement
> > that requires driver to recover. We merely have a non-normative looking
> > text:
> > 	Note: For example, the driver can't assume requests in flight
> > 	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> > 	they have not been completed. A good implementation will try to recover
> > 	by issuing a reset.
> >
> > Implementing this reset for all devices in a race-free manner might also
> > be far from trivial.  I think we'd need a feature bit for this.
> > OTOH as long as we make this a new feature, would an ability to
> > reset a single VQ be a better match for what you are trying to
> > achieve?
>
> I think that is too complicated as a recovery measure, a device level resetting
> will be better to get to a deterministic state, at least.

Question would be, how hard is it to stop host from using all queues,
retrieve all host OS state and re-program it into the device.
If we need to shadow all OS state within the driver, then that's a lot
of not well tested code with a possibility of introducing more bugs.

> >
> > > This series makes necessary changes in virtio core code, based on which
> > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > passing in "error_abort". They will be converted in following series. The Linux
> > > driver part will also be worked on.
> > >
> > > One concern with this behavior change is that it's now harder to notice the
> > > actual driver bug that caused the error, as the guest continues to run.  To
> > > address that, we could probably add a new error action option to virtio
> > > devices, similar to the "read/write werror" in block layer, so the vm could be
> > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > be done on top.
> > 
> > At the architectural level, that's only one concern. Others would be
> > - workloads such as openstack handle guest crash better than
> >   a guest that's e.g. slow because of a memory leak
>
> What memory leak are you referring to?

That was just an example.  If host detects a malformed ring, it will
crash.  But often it doesn't, result is buffers not being used, so guest
can't free them up.

> > - it's easier for guests to probe host for security issues
> >   if guest isn't killed
> > - guest can flood host log with guest-triggered errors
>
> We can still abort() if guest is triggering error too quickly.
>
> Fam

Absolutely, and if it looked like I'm against error detection and
recovery, this was not my intent.

I am merely saying we can't apply this patchset as is, deferring
addressing the issues to patches on top.

But I have an idea: refactor the code to use error_abort. This way we
can apply the patchset without making functional changes, and you can
make progress to complete this, on top.



-- 
MST
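
The status-bit handshake discussed above (the device sets NEEDS_RESET and notifies the guest, then the driver notices the bit and resets the device) can be sketched roughly as follows. This is only an illustration under stated assumptions: the bit values follow the virtio 1.0 device status field, but `struct toy_virtio_dev` and every `toy_*` function are hypothetical names made up for this sketch and are not QEMU or Linux code. The FEATURES_OK step of real device initialization is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Device status bits as listed in the virtio 1.0 specification. */
#define VIRTIO_CONFIG_S_ACKNOWLEDGE  0x01
#define VIRTIO_CONFIG_S_DRIVER       0x02
#define VIRTIO_CONFIG_S_DRIVER_OK    0x04
#define VIRTIO_CONFIG_S_NEEDS_RESET  0x40

/* Hypothetical, heavily simplified device model (not QEMU's VirtIODevice). */
struct toy_virtio_dev {
    uint8_t status;
    bool    config_irq_pending; /* stands in for a config change notification */
};

/* Device side: instead of killing the VM, flag the error and tell the guest. */
static void toy_dev_mark_broken(struct toy_virtio_dev *dev)
{
    dev->status |= VIRTIO_CONFIG_S_NEEDS_RESET;
    if (dev->status & VIRTIO_CONFIG_S_DRIVER_OK) {
        dev->config_irq_pending = true;
    }
}

/* Device side: a broken device must not pop any more requests. */
static bool toy_dev_can_process(const struct toy_virtio_dev *dev)
{
    return (dev->status & VIRTIO_CONFIG_S_DRIVER_OK) &&
           !(dev->status & VIRTIO_CONFIG_S_NEEDS_RESET);
}

/* Driver side: writing 0 to the status field resets the device. */
static void toy_drv_reset(struct toy_virtio_dev *dev)
{
    dev->status = 0;
    dev->config_irq_pending = false;
}

/* Driver side: on a config interrupt, recover if the device asked for it.
 * Returns true if a reset and re-initialization were performed. */
static bool toy_drv_handle_config_irq(struct toy_virtio_dev *dev)
{
    if (!(dev->status & VIRTIO_CONFIG_S_NEEDS_RESET)) {
        return false;
    }
    toy_drv_reset(dev);
    assert(!(dev->status & VIRTIO_CONFIG_S_NEEDS_RESET)); /* reset cleared it */
    /* Simplified re-initialization; requests in flight are NOT replayed here. */
    dev->status = VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER |
                  VIRTIO_CONFIG_S_DRIVER_OK;
    return true;
}
```

The sketch deliberately skips the hard part raised in the reply: after the reset the driver cannot tell which in-flight requests completed, so a real driver would have to quiesce its queues and decide what to resubmit.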

Fam Zheng

2015-Apr-21 05:50 UTC

[Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"

On Tue, 04/21 07:22, Michael S. Tsirkin wrote:
> On Tue, Apr 21, 2015 at 10:37:00AM +0800, Fam Zheng wrote:
> > On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> > > On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > > data with vring.
> > > > That has drawbacks such as losing unsaved data (e.g. when
> > > > guest user is writing a very long email), or possible denial of service in
> > > > a nested vm use case where virtio device is passed through.
> > > >
> > > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > > improve this by communicating the error state between virtio devices and
> > > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > > should detect this bit and report to userspace, or recover the device by
> > > > resetting it.
> > > 
> > > Unfortunately, virtio 1 spec does not have a conformance statement
> > > that requires driver to recover. We merely have a non-normative looking
> > > text:
> > > 	Note: For example, the driver can't assume requests in flight
> > > 	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> > > 	they have not been completed. A good implementation will try to recover
> > > 	by issuing a reset.
> > >
> > > Implementing this reset for all devices in a race-free manner might also
> > > be far from trivial.  I think we'd need a feature bit for this.
> > > OTOH as long as we make this a new feature, would an ability to
> > > reset a single VQ be a better match for what you are trying to
> > > achieve?
> >
> > I think that is too complicated as a recovery measure, a device level resetting
> > will be better to get to a deterministic state, at least.
>
> Question would be, how hard is it to stop host from using all queues,
> retrieve all host OS state and re-program it into the device.
> If we need to shadow all OS state within the driver, then that's a lot
> of not well tested code with a possibility of introducing more bugs.

I don't understand the question. In this series the virtio-blk device will not
pop any more requests, and as long as the reset is properly handled, both guest
and host should go back to a good state.

> > >
> > > > This series makes necessary changes in virtio core code, based on which
> > > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > > passing in "error_abort". They will be converted in following series. The Linux
> > > > driver part will also be worked on.
> > > >
> > > > One concern with this behavior change is that it's now harder to notice the
> > > > actual driver bug that caused the error, as the guest continues to run.  To
> > > > address that, we could probably add a new error action option to virtio
> > > > devices, similar to the "read/write werror" in block layer, so the vm could be
> > > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > > be done on top.
> > > 
> > > At the architectural level, that's only one concern. Others would be
> > > - workloads such as openstack handle guest crash better than
> > >   a guest that's e.g. slow because of a memory leak
> >
> > What memory leak are you referring to?
>
> That was just an example.  If host detects a malformed ring, it will
> crash.  But often it doesn't, result is buffers not being used, so guest
> can't free them up.
>
> > > - it's easier for guests to probe host for security issues
> > >   if guest isn't killed
> > > - guest can flood host log with guest-triggered errors
> >
> > We can still abort() if guest is triggering error too quickly.
>
>
> Absolutely, and if it looked like I'm against error detection and
> recovery, this was not my intent.
>
> I am merely saying we can't apply this patchset as is, deferring
> addressing the issues to patches on top.
>
> But I have an idea: refactor the code to use error_abort.

That is patch 1-9 of this series. Or do you mean also refactor and pass
error_abort to the memory core?

Fam

> This way we
> can apply the patchset without making functional changes, and you can
> make progress to complete this, on top.
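
The "refactor the code to use error_abort" idea relies on a calling convention in which a helper reports failure through an error out-parameter, and each caller chooses whether to handle the error or to abort. The sketch below mimics that convention in a self-contained way. It is not QEMU's actual Error API: `ToyError`, `toy_error_abort`, and all `toy_*` functions are hypothetical stand-ins, and the descriptor range check is an invented example of "invalid data with vring".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error object; only carries a static message. */
typedef struct ToyError {
    const char *msg;
} ToyError;

/* Sentinel: only its address matters. Passing &toy_error_abort as errp
 * means "treat any error as fatal", which keeps the old behavior. */
static ToyError *toy_error_abort;

static void toy_error_set(ToyError **errp, const char *msg)
{
    assert(msg != NULL);
    if (errp == &toy_error_abort) {
        fprintf(stderr, "fatal: %s\n", msg);
        abort();
    }
    if (errp != NULL) {
        ToyError *err = malloc(sizeof(*err));
        if (err == NULL) {
            abort();
        }
        err->msg = msg;
        *errp = err; /* caller owns and frees the error */
    }
}

/* Core helper: validates guest-supplied data and reports instead of exiting. */
static bool toy_vring_check_desc(uint32_t desc_index, uint32_t queue_size,
                                 ToyError **errp)
{
    if (desc_index >= queue_size) {
        toy_error_set(errp, "descriptor index out of range");
        return false;
    }
    return true;
}

/* Converted device: handles the error and asks for a reset. */
static bool toy_blk_handle_request(uint32_t desc_index, uint32_t queue_size,
                                   bool *needs_reset)
{
    ToyError *err = NULL;

    if (!toy_vring_check_desc(desc_index, queue_size, &err)) {
        fprintf(stderr, "toy-blk: %s\n", err->msg);
        free(err);
        *needs_reset = true;
        return false;
    }
    return true;
}

/* Unconverted device: no functional change, a bad ring still aborts. */
static void toy_legacy_handle_request(uint32_t desc_index, uint32_t queue_size)
{
    toy_vring_check_desc(desc_index, queue_size, &toy_error_abort);
}
```

With this split, the refactoring of the shared helper can land first without changing behavior for any caller that passes the abort sentinel, and individual devices can be converted to real error handling afterwards.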

Michael S. Tsirkin

2015-Apr-21 06:09 UTC

[Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"

On Tue, Apr 21, 2015 at 01:50:33PM +0800, Fam Zheng wrote:
> On Tue, 04/21 07:22, Michael S. Tsirkin wrote:
> > On Tue, Apr 21, 2015 at 10:37:00AM +0800, Fam Zheng wrote:
> > > On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> > > > On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > > > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > > > data with vring.
> > > > > That has drawbacks such as losing unsaved data (e.g. when
> > > > > guest user is writing a very long email), or possible denial of service in
> > > > > a nested vm use case where virtio device is passed through.
> > > > >
> > > > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > > > improve this by communicating the error state between virtio devices and
> > > > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > > > should detect this bit and report to userspace, or recover the device by
> > > > > resetting it.
> > > > 
> > > > Unfortunately, virtio 1 spec does not have a conformance statement
> > > > that requires driver to recover. We merely have a non-normative looking
> > > > text:
> > > > 	Note: For example, the driver can't assume requests in flight
> > > > 	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> > > > 	they have not been completed. A good implementation will try to recover
> > > > 	by issuing a reset.
> > > >
> > > > Implementing this reset for all devices in a race-free manner might also
> > > > be far from trivial.  I think we'd need a feature bit for this.
> > > > OTOH as long as we make this a new feature, would an ability to
> > > > reset a single VQ be a better match for what you are trying to
> > > > achieve?
> > >
> > > I think that is too complicated as a recovery measure, a device level resetting
> > > will be better to get to a deterministic state, at least.
> >
> > Question would be, how hard is it to stop host from using all queues,
> > retrieve all host OS state and re-program it into the device.
> > If we need to shadow all OS state within the driver, then that's a lot
> > of not well tested code with a possibility of introducing more bugs.
>
> I don't understand the question. In this series the virtio-blk device will not
> pop any more requests, and as long as the reset is properly handled, both guest
> and host should go back to a good state.
> >
> > > >
> > > > > This series makes necessary changes in virtio core code, based on which
> > > > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > > > passing in "error_abort". They will be converted in following series. The Linux
> > > > > driver part will also be worked on.
> > > > >
> > > > > One concern with this behavior change is that it's now harder to notice the
> > > > > actual driver bug that caused the error, as the guest continues to run.  To
> > > > > address that, we could probably add a new error action option to virtio
> > > > > devices, similar to the "read/write werror" in block layer, so the vm could be
> > > > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > > > be done on top.
> > > > 
> > > > At the architectural level, that's only one concern. Others would be
> > > > - workloads such as openstack handle guest crash better than
> > > >   a guest that's e.g. slow because of a memory leak
> > >
> > > What memory leak are you referring to?
> >
> > That was just an example.  If host detects a malformed ring, it will
> > crash.  But often it doesn't, result is buffers not being used, so guest
> > can't free them up.
> >
> > > > - it's easier for guests to probe host for security issues
> > > >   if guest isn't killed
> > > > - guest can flood host log with guest-triggered errors
> > >
> > > We can still abort() if guest is triggering error too quickly.
> >
> >
> > Absolutely, and if it looked like I'm against error detection and
> > recovery, this was not my intent.
> >
> > I am merely saying we can't apply this patchset as is, deferring
> > addressing the issues to patches on top.
> >
> > But I have an idea: refactor the code to use error_abort.
>
> That is patch 1-9 of this series. Or do you mean also refactor and pass
> error_abort to the memory core?
>
> Fam

So if you like just patches 1-9 applied, this sounds
reasonable. I'll provide review comments on the individual patches.

> > This way we
> > can apply the patchset without making functional changes, and you can
> > make progress to complete this, on top.
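
One way to read the suggestion that QEMU "can still abort() if guest is triggering error too quickly" is a simple error-rate limiter: tolerate a bounded number of guest-triggered errors per time window and escalate beyond that, which also bounds how much a guest can write to the host log. The thread does not say how this would be implemented, so the sketch below is purely an assumption. The fixed-window scheme, the struct, its field names, and the threshold values are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed-window limiter for guest-triggered device errors. */
struct toy_err_limiter {
    uint64_t window_start_ms; /* start of the current window */
    uint32_t window_len_ms;   /* window length, must be non-zero */
    uint32_t max_errors;      /* errors tolerated per window */
    uint32_t count;           /* errors seen in the current window */
};

/* Record one error at time now_ms (monotonic clock assumed).
 * Returns true when the caller should escalate, e.g. by calling abort()
 * or pausing the VM, because the guest exceeded the tolerated rate. */
static bool toy_err_limiter_hit(struct toy_err_limiter *l, uint64_t now_ms)
{
    assert(l->window_len_ms > 0);
    if (now_ms - l->window_start_ms >= l->window_len_ms) {
        /* The previous window has expired: start a new one. */
        l->window_start_ms = now_ms;
        l->count = 0;
    }
    l->count++;
    return l->count > l->max_errors;
}
```

A device model could keep one such limiter per device and log only while the function returns false, so a misbehaving guest gets a few recoverable errors but cannot probe or spam the host indefinitely.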
