On Wed, Oct 20, 2021 at 1:02 AM Dongli Zhang <dongli.zhang at oracle.com>
wrote:>
>
>
> On 10/18/21 6:33 PM, Jason Wang wrote:
> > On Sat, Oct 16, 2021 at 1:27 AM Michael S. Tsirkin <mst at
redhat.com> wrote:
> >>
> >> On Fri, Oct 15, 2021 at 05:09:38AM -0700, Dongli Zhang wrote:
> >>> Hi Jason,
> >>>
> >>> On 10/11/21 11:52 PM, Jason Wang wrote:
> >>>> We used to synchronize pending MSI-X irq handlers via
> >>>> synchronize_irq(), this may not work for the untrusted
device which
> >>>> may keep sending interrupts after reset which may lead
unexpected
> >>>> results. Similarly, we should not enable MSI-X interrupt
until the
> >>>
> >>> About "unexpected results", while you mentioned
below in v1 ...
> >>>
> >>> "My understanding is that e.g in the case of SEV/TDX we
don't trust the
> >>> hypervisor. So the hypervisor can keep sending interrupts even
if the
> >>> device is reset. The guest can only trust its own software
interrupt
> >>> management logic to avoid call virtio callback in this
case."
> >>>
> >>> .. and you also mentioned to avoid the panic (due to untrusted
device) in as
> >>> many scenarios as possible.
> >>>
> >>>
> >>> Here is my understanding.
> >>>
> >>> The reason we do not trust hypervisor is to protect (1)
data/code privacy for
> >>> most of times and sometimes (2) program execution integrity.
> >>>
> >>> The bad thing is: the hypervisor is able to panic/destroy the
VM in the worst case.
> >>>
> >>> It is reasonable to re-configure/recover if we assume there is
BUG at
> >>> hypervisor/device side. That is, the hypervisor/device is
buggy, but not malicious.
> >>>
> >>> However, how about to just detect and report the
hypervisor/device is malicious
> >>> and shutdown/panic the VM immediately? If we have detected the
malicious
> >>> hypervisor, we should avoid running VMs on the malicious
hypervisor further. At
> >>> least how about to print error message to reminder users that
the hypervisor is
> >>> malicious?
> >
> > I understand your point, but several questions needs to be answered:
> >
> > 1) how can we easily differentiate "malicious" from
"buggy"?
> > 2) If we want to detect malicious hypervisor, virtio is not the only
> > place that we want to do this
> > 3) Is there a way that "malicious" hypervisor can prevent
guest from
> > shutting down itself?
> >
> >>>
> >>>
> >>> About "unexpected results", it should not be
hang/panic. I suggest ...
> >>>
> >
> > It's just the phenomenon not the logic behind that. It could be
e.g
> > OOB which may lead the unexpected kernel codes to be executed in
> > unexpected ways (e.g mark the page as shared or go with IOTLB etc).
> > Sometimes we can see panic finally but it's not always.
>
> This is what I was trying to explain.
>
> The objective is to protect "data privacy" (or code execution
integrity in some
> case), but not to prevent DoS attack. That is, the 'malicious'
hypervisor should
> not be able to derive VM data privacy.
>
> Suppose the hypervisor did something buggy/malicious when SEV/TDX is used
by VM,
> the "unexpected results" should never reveal secure/private data
to the hypervisor.
>
> In my own opinion, the threat model is:
>
> Attacker: 'malicious' hypervisor
>
> Victim: VM with SEV/TDX/SGX
>
> The attacker should not be able to steal secure/private data from VM, when
the
> hypervisor's action is unexpected. DoS is out of the scope.
>
> My concern is: it is very hard to clearly explain in the patchset how the
> hypervisor is able to steal VM's data, by setting queue=0 or injecting
unwanted
> interrupts to VM.
Yes, it's a hard question but instead of trying to answer that, we can
just fix the case of e.g unexpected interrupts.
Thanks
>
> Thank you very much!
>
> Dongli Zhang
>
> >
> >>> Assuming SEV/TDX is involved, the hypervisor should never be
able to derive the
> >>> VM privacy (at least secure memory data) by providing
malicious configuration,
> >>> e.g., num_queues=0.
> >
> > Yes.
> >
> >>> If we detect hypervisor is malicious, the VM is
> >>> panic/shutdown immediately.
> >
> > This seems to enforce the policy into the kernel, we need to leave
> > this to userspace to decide.
> >
> >>> At least how about to print error message to
> >>> reminder users.
> >
> > This is fine.
> >
> >>>
> >>>
> >>> BTW, while I always do care about the loss of interrupt issue,
the malicious
> >>> device is able to hang a VM by dropping a single interrupt on
purpose for
> >>> virtio-scsi :)
> >>>
> >
> > Right.
> >
> >>>
> >>> Thank you very much!
> >>
> >>
> >> Can't say I understand what it's about. TDX does not
protect against
> >> hypervisor DoS attacks.
> >
> > Yes, I think what Dongli wants to discuss is how to behave if we
> > detect a malicious hypervisor. He suggested a shutdown instead of
> > failing the probe.
> >
> > Thanks
> >
> >>
> >>> Dongli Zhang
> >>>
> >>>> device is ready. So this patch fixes those two issues by:
> >>>>
> >>>> 1) switching to use disable_irq() to prevent the virtio
interrupt
> >>>> handlers to be called after the device is reset.
> >>>> 2) using IRQF_NO_AUTOEN and enable the MSI-X irq during
.ready()
> >>>>
> >>>> This can make sure the virtio interrupt handler won't
be called before
> >>>> virtio_device_ready() and after reset.
> >>>>
> >>>> Cc: Thomas Gleixner <tglx at linutronix.de>
> >>>> Cc: Peter Zijlstra <peterz at infradead.org>
> >>>> Cc: Paul E. McKenney <paulmck at kernel.org>
> >>>> Signed-off-by: Jason Wang <jasowang at redhat.com>
> >>>> ---
> >>>> drivers/virtio/virtio_pci_common.c | 27
+++++++++++++++++++++------
> >>>> drivers/virtio/virtio_pci_common.h | 6 ++++--
> >>>> drivers/virtio/virtio_pci_legacy.c | 5 +++--
> >>>> drivers/virtio/virtio_pci_modern.c | 6 ++++--
> >>>> 4 files changed, 32 insertions(+), 12 deletions(-)
> >>>>
> >>>> diff --git a/drivers/virtio/virtio_pci_common.c
b/drivers/virtio/virtio_pci_common.c
> >>>> index b35bb2d57f62..0b9523e6dd39 100644
> >>>> --- a/drivers/virtio/virtio_pci_common.c
> >>>> +++ b/drivers/virtio/virtio_pci_common.c
> >>>> @@ -24,8 +24,8 @@ MODULE_PARM_DESC(force_legacy,
> >>>> "Force legacy mode for transitional
virtio 1 devices");
> >>>> #endif
> >>>>
> >>>> -/* wait for pending irq handlers */
> >>>> -void vp_synchronize_vectors(struct virtio_device *vdev)
> >>>> +/* disable irq handlers */
> >>>> +void vp_disable_vectors(struct virtio_device *vdev)
> >>>> {
> >>>> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> >>>> int i;
> >>>> @@ -34,7 +34,20 @@ void vp_synchronize_vectors(struct
virtio_device *vdev)
> >>>> synchronize_irq(vp_dev->pci_dev->irq);
> >>>>
> >>>> for (i = 0; i < vp_dev->msix_vectors; ++i)
> >>>> -
synchronize_irq(pci_irq_vector(vp_dev->pci_dev, i));
> >>>> + disable_irq(pci_irq_vector(vp_dev->pci_dev,
i));
> >>>> +}
> >>>> +
> >>>> +/* enable irq handlers */
> >>>> +void vp_enable_vectors(struct virtio_device *vdev)
> >>>> +{
> >>>> + struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> >>>> + int i;
> >>>> +
> >>>> + if (vp_dev->intx_enabled)
> >>>> + return;
> >>>> +
> >>>> + for (i = 0; i < vp_dev->msix_vectors; ++i)
> >>>> + enable_irq(pci_irq_vector(vp_dev->pci_dev,
i));
> >>>> }
> >>>>
> >>>> /* the notify function used when creating a virt queue */
> >>>> @@ -141,7 +154,8 @@ static int
vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
> >>>> snprintf(vp_dev->msix_names[v], sizeof
*vp_dev->msix_names,
> >>>> "%s-config", name);
> >>>> err = request_irq(pci_irq_vector(vp_dev->pci_dev,
v),
> >>>> - vp_config_changed, 0,
vp_dev->msix_names[v],
> >>>> + vp_config_changed, IRQF_NO_AUTOEN,
> >>>> + vp_dev->msix_names[v],
> >>>> vp_dev);
> >>>> if (err)
> >>>> goto error;
> >>>> @@ -160,7 +174,8 @@ static int
vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
> >>>> snprintf(vp_dev->msix_names[v], sizeof
*vp_dev->msix_names,
> >>>> "%s-virtqueues", name);
> >>>> err =
request_irq(pci_irq_vector(vp_dev->pci_dev, v),
> >>>> - vp_vring_interrupt, 0,
vp_dev->msix_names[v],
> >>>> + vp_vring_interrupt,
IRQF_NO_AUTOEN,
> >>>> + vp_dev->msix_names[v],
> >>>> vp_dev);
> >>>> if (err)
> >>>> goto error;
> >>>> @@ -337,7 +352,7 @@ static int vp_find_vqs_msix(struct
virtio_device *vdev, unsigned nvqs,
> >>>> "%s-%s",
> >>>> dev_name(&vp_dev->vdev.dev),
names[i]);
> >>>> err =
request_irq(pci_irq_vector(vp_dev->pci_dev, msix_vec),
> >>>> - vring_interrupt, 0,
> >>>> + vring_interrupt,
IRQF_NO_AUTOEN,
> >>>>
vp_dev->msix_names[msix_vec],
> >>>> vqs[i]);
> >>>> if (err)
> >>>> diff --git a/drivers/virtio/virtio_pci_common.h
b/drivers/virtio/virtio_pci_common.h
> >>>> index beec047a8f8d..a235ce9ff6a5 100644
> >>>> --- a/drivers/virtio/virtio_pci_common.h
> >>>> +++ b/drivers/virtio/virtio_pci_common.h
> >>>> @@ -102,8 +102,10 @@ static struct virtio_pci_device
*to_vp_device(struct virtio_device *vdev)
> >>>> return container_of(vdev, struct virtio_pci_device,
vdev);
> >>>> }
> >>>>
> >>>> -/* wait for pending irq handlers */
> >>>> -void vp_synchronize_vectors(struct virtio_device *vdev);
> >>>> +/* disable irq handlers */
> >>>> +void vp_disable_vectors(struct virtio_device *vdev);
> >>>> +/* enable irq handlers */
> >>>> +void vp_enable_vectors(struct virtio_device *vdev);
> >>>> /* the notify function used when creating a virt queue */
> >>>> bool vp_notify(struct virtqueue *vq);
> >>>> /* the config->del_vqs() implementation */
> >>>> diff --git a/drivers/virtio/virtio_pci_legacy.c
b/drivers/virtio/virtio_pci_legacy.c
> >>>> index d62e9835aeec..bdf6bc667ab5 100644
> >>>> --- a/drivers/virtio/virtio_pci_legacy.c
> >>>> +++ b/drivers/virtio/virtio_pci_legacy.c
> >>>> @@ -97,8 +97,8 @@ static void vp_reset(struct
virtio_device *vdev)
> >>>> /* Flush out the status write, and flush in device
writes,
> >>>> * including MSi-X interrupts, if any. */
> >>>> ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS);
> >>>> - /* Flush pending VQ/configuration callbacks. */
> >>>> - vp_synchronize_vectors(vdev);
> >>>> + /* Disable VQ/configuration callbacks. */
> >>>> + vp_disable_vectors(vdev);
> >>>> }
> >>>>
> >>>> static u16 vp_config_vector(struct virtio_pci_device
*vp_dev, u16 vector)
> >>>> @@ -194,6 +194,7 @@ static void del_vq(struct
virtio_pci_vq_info *info)
> >>>> }
> >>>>
> >>>> static const struct virtio_config_ops
virtio_pci_config_ops = {
> >>>> + .ready = vp_enable_vectors,
> >>>> .get = vp_get,
> >>>> .set = vp_set,
> >>>> .get_status = vp_get_status,
> >>>> diff --git a/drivers/virtio/virtio_pci_modern.c
b/drivers/virtio/virtio_pci_modern.c
> >>>> index 30654d3a0b41..acf0f6b6381d 100644
> >>>> --- a/drivers/virtio/virtio_pci_modern.c
> >>>> +++ b/drivers/virtio/virtio_pci_modern.c
> >>>> @@ -172,8 +172,8 @@ static void vp_reset(struct
virtio_device *vdev)
> >>>> */
> >>>> while (vp_modern_get_status(mdev))
> >>>> msleep(1);
> >>>> - /* Flush pending VQ/configuration callbacks. */
> >>>> - vp_synchronize_vectors(vdev);
> >>>> + /* Disable VQ/configuration callbacks. */
> >>>> + vp_disable_vectors(vdev);
> >>>> }
> >>>>
> >>>> static u16 vp_config_vector(struct virtio_pci_device
*vp_dev, u16 vector)
> >>>> @@ -380,6 +380,7 @@ static bool vp_get_shm_region(struct
virtio_device *vdev,
> >>>> }
> >>>>
> >>>> static const struct virtio_config_ops
virtio_pci_config_nodev_ops = {
> >>>> + .ready = vp_enable_vectors,
> >>>> .get = NULL,
> >>>> .set = NULL,
> >>>> .generation = vp_generation,
> >>>> @@ -397,6 +398,7 @@ static const struct virtio_config_ops
virtio_pci_config_nodev_ops = {
> >>>> };
> >>>>
> >>>> static const struct virtio_config_ops
virtio_pci_config_ops = {
> >>>> + .ready = vp_enable_vectors,
> >>>> .get = vp_get,
> >>>> .set = vp_set,
> >>>> .generation = vp_generation,
> >>>>
> >>
> >
>