On Tue, Nov 27, 2018 at 05:50:50PM +0000, Jean-Philippe Brucker wrote:
> On 23/11/2018 22:02, Michael S. Tsirkin wrote:
> >> +/*
> >> + * __viommu_sync_req - Complete all in-flight requests
> >> + *
> >> + * Wait for all added requests to complete. When this function returns, all
> >> + * requests that were in-flight at the time of the call have completed.
> >> + */
> >> +static int __viommu_sync_req(struct viommu_dev *viommu)
> >> +{
> >> +	int ret = 0;
> >> +	unsigned int len;
> >> +	size_t write_len;
> >> +	struct viommu_request *req;
> >> +	struct virtqueue *vq = viommu->vqs[VIOMMU_REQUEST_VQ];
> >> +
> >> +	assert_spin_locked(&viommu->request_lock);
> >> +
> >> +	virtqueue_kick(vq);
> >> +
> >> +	while (!list_empty(&viommu->requests)) {
> >> +		len = 0;
> >> +		req = virtqueue_get_buf(vq, &len);
> >> +		if (!req)
> >> +			continue;
> >> +
> >> +		if (!len)
> >> +			viommu_set_req_status(req->buf, req->len,
> >> +					      VIRTIO_IOMMU_S_IOERR);
> >> +
> >> +		write_len = req->len - req->write_offset;
> >> +		if (req->writeback && len == write_len)
> >> +			memcpy(req->writeback, req->buf + req->write_offset,
> >> +			       write_len);
> >> +
> >> +		list_del(&req->list);
> >> +		kfree(req);
> >> +	}
> >
> > I didn't notice this in the past but it seems this will spin
> > with interrupts disabled until the host handles the request.
> > Please do not do this - host execution can be another
> > task that needs the same host CPU. This will then disable
> > interrupts for a very, very long time.
>
> In the guest yes, but that doesn't prevent the host from running another
> task, right?

Doesn't prevent it, but it will delay it significantly
until the scheduler decides to kick the VCPU task out.

> My tests run fine when QEMU is bound to a single CPU, even
> though vcpu and viommu run in different threads.
>
> > What to do then? Queue in software and wake up task.
>
> Unfortunately I can't do anything here, because IOMMU drivers can't
> sleep in the iommu_map() or iommu_unmap() path.
>
> The problem is the same for all IOMMU drivers. That's because the DMA
> API allows drivers to call some functions with interrupts disabled. For
> example, Documentation/DMA-API-HOWTO.txt allows dma_alloc_coherent() and
> dma_unmap_single() to be called in interrupt context.

In fact I don't really understand how it's supposed to
work at all: you only sync when the ring is full.
So the host may not have seen your map request if the ring
is not full.
Why is it safe to use the address with a device then?

> > As kick is a vm exit, kick under interrupts disabled is discouraged too:
> > better to prepare for kick, enable interrupts, then kick.
>
> That was on my list of things to look at, because it could relax
> things for device drivers that don't call us with interrupts disabled. I
> just tried it and I can see some performance improvement (7% and 4% on
> tcp_stream and tcp_maerts respectively, +/-2.5%).
>
> Since it's an optimization I'll leave it for later (ACPI and module
> support is higher on my list). The resulting change is complicated
> because we now need to deal with threads adding new requests while
> sync() is running. With my current prototype one thread could end up
> staying in sync() while other threads add new async requests, so I need
> to find a way to bound it.
>
> Thanks,
> Jean
On 27/11/2018 18:04, Michael S. Tsirkin wrote:
> On Tue, Nov 27, 2018 at 05:50:50PM +0000, Jean-Philippe Brucker wrote:
>> On 23/11/2018 22:02, Michael S. Tsirkin wrote:
>>>> +/*
>>>> + * __viommu_sync_req - Complete all in-flight requests
>>>> + *
>>>> + * Wait for all added requests to complete. When this function returns, all
>>>> + * requests that were in-flight at the time of the call have completed.
>>>> + */
>>>> +static int __viommu_sync_req(struct viommu_dev *viommu)
>>>> +{
>>>> +	int ret = 0;
>>>> +	unsigned int len;
>>>> +	size_t write_len;
>>>> +	struct viommu_request *req;
>>>> +	struct virtqueue *vq = viommu->vqs[VIOMMU_REQUEST_VQ];
>>>> +
>>>> +	assert_spin_locked(&viommu->request_lock);
>>>> +
>>>> +	virtqueue_kick(vq);
>>>> +
>>>> +	while (!list_empty(&viommu->requests)) {
>>>> +		len = 0;
>>>> +		req = virtqueue_get_buf(vq, &len);
>>>> +		if (!req)
>>>> +			continue;
>>>> +
>>>> +		if (!len)
>>>> +			viommu_set_req_status(req->buf, req->len,
>>>> +					      VIRTIO_IOMMU_S_IOERR);
>>>> +
>>>> +		write_len = req->len - req->write_offset;
>>>> +		if (req->writeback && len == write_len)
>>>> +			memcpy(req->writeback, req->buf + req->write_offset,
>>>> +			       write_len);
>>>> +
>>>> +		list_del(&req->list);
>>>> +		kfree(req);
>>>> +	}
>>>
>>> I didn't notice this in the past but it seems this will spin
>>> with interrupts disabled until host handles the request.
>>> Please do not do this - host execution can be another
>>> task that needs the same host CPU. This will then disable
>>> interrupts for a very very long time.
>>
>> In the guest yes, but that doesn't prevent the host from running another
>> task right?
>
> Doesn't prevent it but it will delay it significantly
> until scheduler decides to kick the VCPU task out.
>
>> My tests run fine when QEMU is bound to a single CPU, even
>> though vcpu and viommu run in different threads
>>
>>> What to do then? Queue in software and wake up task.
>>
>> Unfortunately I can't do anything here, because IOMMU drivers can't
>> sleep in the iommu_map() or iommu_unmap() path.
>>
>> The problem is the same for all IOMMU drivers. That's because the DMA
>> API allows drivers to call some functions with interrupts disabled. For
>> example, Documentation/DMA-API-HOWTO.txt allows dma_alloc_coherent() and
>> dma_unmap_single() to be called in interrupt context.
>
> In fact I don't really understand how it's supposed to
> work at all: you only sync when ring is full.
> So host may not have seen your map request if ring
> is not full.
> Why is it safe to use the address with a device then?

viommu_map() calls viommu_send_req_sync(), which does the sync
immediately after adding the MAP request.

Thanks,
Jean
On Tue, Nov 27, 2018 at 06:10:46PM +0000, Jean-Philippe Brucker wrote:
> On 27/11/2018 18:04, Michael S. Tsirkin wrote:
> > On Tue, Nov 27, 2018 at 05:50:50PM +0000, Jean-Philippe Brucker wrote:
> >> On 23/11/2018 22:02, Michael S. Tsirkin wrote:
> >>>> +/*
> >>>> + * __viommu_sync_req - Complete all in-flight requests
> >>>> + *
> >>>> + * Wait for all added requests to complete. When this function returns, all
> >>>> + * requests that were in-flight at the time of the call have completed.
> >>>> + */
> >>>> +static int __viommu_sync_req(struct viommu_dev *viommu)
> >>>> +{
> >>>> +	int ret = 0;
> >>>> +	unsigned int len;
> >>>> +	size_t write_len;
> >>>> +	struct viommu_request *req;
> >>>> +	struct virtqueue *vq = viommu->vqs[VIOMMU_REQUEST_VQ];
> >>>> +
> >>>> +	assert_spin_locked(&viommu->request_lock);
> >>>> +
> >>>> +	virtqueue_kick(vq);
> >>>> +
> >>>> +	while (!list_empty(&viommu->requests)) {
> >>>> +		len = 0;
> >>>> +		req = virtqueue_get_buf(vq, &len);
> >>>> +		if (!req)
> >>>> +			continue;
> >>>> +
> >>>> +		if (!len)
> >>>> +			viommu_set_req_status(req->buf, req->len,
> >>>> +					      VIRTIO_IOMMU_S_IOERR);
> >>>> +
> >>>> +		write_len = req->len - req->write_offset;
> >>>> +		if (req->writeback && len == write_len)
> >>>> +			memcpy(req->writeback, req->buf + req->write_offset,
> >>>> +			       write_len);
> >>>> +
> >>>> +		list_del(&req->list);
> >>>> +		kfree(req);
> >>>> +	}
> >>>
> >>> I didn't notice this in the past but it seems this will spin
> >>> with interrupts disabled until host handles the request.
> >>> Please do not do this - host execution can be another
> >>> task that needs the same host CPU. This will then disable
> >>> interrupts for a very very long time.
> >>
> >> In the guest yes, but that doesn't prevent the host from running another
> >> task right?
> >
> > Doesn't prevent it but it will delay it significantly
> > until scheduler decides to kick the VCPU task out.
> >
> >> My tests run fine when QEMU is bound to a single CPU, even
> >> though vcpu and viommu run in different threads
> >>
> >>> What to do then? Queue in software and wake up task.
> >>
> >> Unfortunately I can't do anything here, because IOMMU drivers can't
> >> sleep in the iommu_map() or iommu_unmap() path.
> >>
> >> The problem is the same for all IOMMU drivers. That's because the DMA
> >> API allows drivers to call some functions with interrupts disabled. For
> >> example, Documentation/DMA-API-HOWTO.txt allows dma_alloc_coherent() and
> >> dma_unmap_single() to be called in interrupt context.
> >
> > In fact I don't really understand how it's supposed to
> > work at all: you only sync when ring is full.
> > So host may not have seen your map request if ring
> > is not full.
> > Why is it safe to use the address with a device then?
>
> viommu_map() calls viommu_send_req_sync(), which does the sync
> immediately after adding the MAP request.
>
> Thanks,
> Jean

I see. So it happens on every request. Maybe you should clear
event index then. This way, if exits are disabled, you know that
the host is processing the ring. Event index is good for when you
don't care when it will be processed, you just want to reduce the
number of exits as much as possible.

-- 
MST