Xuan Zhuo
2023-Mar-15 06:01 UTC
[PATCH vhost v2 06/12] virtio_ring: split-indirect: support premapped
On Wed, 15 Mar 2023 12:47:29 +0800, Jason Wang <jasowang at redhat.com> wrote:
> On Tue, Mar 14, 2023 at 5:17 PM Xuan Zhuo <xuanzhuo at linux.alibaba.com> wrote:
> >
> > On Tue, 14 Mar 2023 15:57:06 +0800, Jason Wang <jasowang at redhat.com> wrote:
> > > On Wed, Mar 8, 2023 at 2:44 PM Xuan Zhuo <xuanzhuo at linux.alibaba.com> wrote:
> > > >
> > > > virtio core only supports virtual addresses, dma is completed in virtio
> > > > core.
> > > >
> > > > In some scenarios (such as the AF_XDP), the memory is allocated
> > > > and DMA is completed in advance, so it is necessary for us to support
> > > > passing the DMA address to virtio core.
> > > >
> > > > Drivers can use sg->dma_address to pass the mapped dma address to virtio
> > > > core. If one sg->dma_address is used then all sgs must use sg->dma_address,
> > > > otherwise all dma_address must be null.
> > > >
> > > > On the indirect path, if dma_address is used, desc_state.indir_desc will
> > > > be mixed with VRING_INDIRECT_PREMAPPED. So when we do unmap, we can pass it.
> > >
> > > It's better to mention why indirect descriptors can't be done in the
> > > same way as direct descriptors.
> > >
> > > Btw, if we change the semantics of desc_extra.dma_addr and
> > > desc_state.indir_desc, we should add comments to the definitions of those
> > > structures.
> >
> > Will fix.
> >
> > > > Signed-off-by: Xuan Zhuo <xuanzhuo at linux.alibaba.com>
> > > > ---
> > > >  drivers/virtio/virtio_ring.c | 28 ++++++++++++++++++++++------
> > > >  1 file changed, 22 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > index 66a071e3bdef..11827d2e56a8 100644
> > > > --- a/drivers/virtio/virtio_ring.c
> > > > +++ b/drivers/virtio/virtio_ring.c
> > > > @@ -231,6 +231,18 @@ static void vring_free(struct virtqueue *_vq);
> > > >   * Helpers.
> > > >   */
> > > >
> > > > +#define VRING_INDIRECT_PREMAPPED	BIT(0)
> > > > +
> > > > +#define desc_mix_dma_map(do_map, desc) \
> > > > +	(do_map ? desc : (typeof(desc))((unsigned long)(desc) | VRING_INDIRECT_PREMAPPED))
> > > > +
> > > > +#define desc_rm_dma_map(desc) \
> > > > +	((typeof(desc))((unsigned long)(desc) & ~VRING_INDIRECT_PREMAPPED))
> > > > +
> > > > +#define desc_map_inter(desc) \
> > > > +	!((unsigned long)(desc) & VRING_INDIRECT_PREMAPPED)
> > > > +
> > > > +
> > > >  #define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
> > > >
> > > >  static inline bool virtqueue_use_indirect(struct vring_virtqueue *vq,
> > > > @@ -725,7 +737,7 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
> > > >  	/* Store token and indirect buffer state. */
> > > >  	vq->split.desc_state[head].data = data;
> > > >  	if (indirect)
> > > > -		vq->split.desc_state[head].indir_desc = desc;
> > > > +		vq->split.desc_state[head].indir_desc = desc_mix_dma_map(do_map, desc);
> > >
> > > So using indir_desc is kind of hacky (since we don't use indirect for
> > > rx with extra context).
> > >
> > > But at least I think we should seek a way to use the same metadata for
> > > both direct and indirect descriptors.
> > >
> > > E.g. can we make them all use indir_desc?
> >
> > I think it may not. My original idea was to use indir_desc uniformly, but
> > for the case of saving ctx, we cannot guarantee that ctx has space for us.
>
> Ok, but the problem is that the code becomes even more hacky (imagine
> one day we may want to use indirect for RX?).

I think it may have nothing to do with RX, but rather with whether ctx is used.
Because ctx and indirect cannot coexist.

static inline int virtqueue_add_split(struct virtqueue *_vq,
				      struct scatterlist *sgs[],
				      unsigned int total_sg,
				      unsigned int out_sgs,
				      unsigned int in_sgs,
				      void *data,
				      void *ctx,
				      gfp_t gfp)
{
	struct vring_virtqueue *vq = to_vvq(_vq);
	struct scatterlist *sg;
	struct vring_desc *desc;
	unsigned int i, n, avail, descs_used, prev, err_idx;
	int head;
	bool indirect;

	START_USE(vq);

	BUG_ON(data == NULL);
>	BUG_ON(ctx && vq->indirect);

If we want to use ctx with indirect, we must add dedicated metadata for ctx.

So I think the current plan is OK.

Thanks.

>
> So I tend to change my mind to introduce dedicated metadata, instead
> of trying to be packed with two types of the existing ones.
>
> Thanks
>
> >
> > Thanks.
> >
> > >
> > > Thanks
> > >
> > > >  	else
> > > >  		vq->split.desc_state[head].indir_desc = ctx;
> > > >
> > > > @@ -820,22 +832,26 @@ static void detach_buf_split(struct vring_virtqueue *vq, unsigned int head,
> > > >  	vq->vq.num_free++;
> > > >
> > > >  	if (vq->indirect) {
> > > > -		struct vring_desc *indir_desc =
> > > > -			vq->split.desc_state[head].indir_desc;
> > > > +		struct vring_desc *mix = vq->split.desc_state[head].indir_desc;
> > > > +		struct vring_desc *indir_desc;
> > > >  		u32 len;
> > > >
> > > >  		/* Free the indirect table, if any, now that it's unmapped. */
> > > > -		if (!indir_desc)
> > > > +		if (!mix)
> > > >  			return;
> > > >
> > > > +		indir_desc = desc_rm_dma_map(mix);
> > > > +
> > > >  		len = vq->split.desc_extra[head].len;
> > > >
> > > >  		BUG_ON(!(vq->split.desc_extra[head].flags &
> > > >  				VRING_DESC_F_INDIRECT));
> > > >  		BUG_ON(len == 0 || len % sizeof(struct vring_desc));
> > > >
> > > > -		for (j = 0; j < len / sizeof(struct vring_desc); j++)
> > > > -			vring_unmap_one_split_indirect(vq, &indir_desc[j]);
> > > > +		if (desc_map_inter(mix)) {
> > > > +			for (j = 0; j < len / sizeof(struct vring_desc); j++)
> > > > +				vring_unmap_one_split_indirect(vq, &indir_desc[j]);
> > > > +		}
> > > >
> > > >  		kfree(indir_desc);
> > > >  		vq->split.desc_state[head].indir_desc = NULL;
> > > > --
> > > > 2.32.0.3.g01195cf9f
Jason Wang
2023-Mar-16 02:49 UTC
[PATCH vhost v2 06/12] virtio_ring: split-indirect: support premapped
On Wed, Mar 15, 2023 at 2:06 PM Xuan Zhuo <xuanzhuo at linux.alibaba.com> wrote:
>
> On Wed, 15 Mar 2023 12:47:29 +0800, Jason Wang <jasowang at redhat.com> wrote:
> > On Tue, Mar 14, 2023 at 5:17 PM Xuan Zhuo <xuanzhuo at linux.alibaba.com> wrote:
> > >
> > > On Tue, 14 Mar 2023 15:57:06 +0800, Jason Wang <jasowang at redhat.com> wrote:
> > > > On Wed, Mar 8, 2023 at 2:44 PM Xuan Zhuo <xuanzhuo at linux.alibaba.com> wrote:
> > > > >
> > > > > virtio core only supports virtual addresses, dma is completed in virtio
> > > > > core.
> > > > >
> > > > > In some scenarios (such as the AF_XDP), the memory is allocated
> > > > > and DMA is completed in advance, so it is necessary for us to support
> > > > > passing the DMA address to virtio core.
> > > > >
> > > > > Drivers can use sg->dma_address to pass the mapped dma address to virtio
> > > > > core. If one sg->dma_address is used then all sgs must use sg->dma_address,
> > > > > otherwise all dma_address must be null.
> > > > >
> > > > > On the indirect path, if dma_address is used, desc_state.indir_desc will
> > > > > be mixed with VRING_INDIRECT_PREMAPPED. So when we do unmap, we can pass it.
> > > >
> > > > It's better to mention why indirect descriptors can't be done in the
> > > > same way as direct descriptors.
> > > >
> > > > Btw, if we change the semantics of desc_extra.dma_addr and
> > > > desc_state.indir_desc, we should add comments to the definitions of those
> > > > structures.
> > >
> > > Will fix.
> > >
> > > > > Signed-off-by: Xuan Zhuo <xuanzhuo at linux.alibaba.com>
> > > > > ---
> > > > >  drivers/virtio/virtio_ring.c | 28 ++++++++++++++++++++++------
> > > > >  1 file changed, 22 insertions(+), 6 deletions(-)
> > > > >
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index 66a071e3bdef..11827d2e56a8 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -231,6 +231,18 @@ static void vring_free(struct virtqueue *_vq);
> > > > >   * Helpers.
> > > > >   */
> > > > >
> > > > > +#define VRING_INDIRECT_PREMAPPED	BIT(0)
> > > > > +
> > > > > +#define desc_mix_dma_map(do_map, desc) \
> > > > > +	(do_map ? desc : (typeof(desc))((unsigned long)(desc) | VRING_INDIRECT_PREMAPPED))
> > > > > +
> > > > > +#define desc_rm_dma_map(desc) \
> > > > > +	((typeof(desc))((unsigned long)(desc) & ~VRING_INDIRECT_PREMAPPED))
> > > > > +
> > > > > +#define desc_map_inter(desc) \
> > > > > +	!((unsigned long)(desc) & VRING_INDIRECT_PREMAPPED)
> > > > > +
> > > > > +
> > > > >  #define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
> > > > >
> > > > >  static inline bool virtqueue_use_indirect(struct vring_virtqueue *vq,
> > > > > @@ -725,7 +737,7 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
> > > > >  	/* Store token and indirect buffer state. */
> > > > >  	vq->split.desc_state[head].data = data;
> > > > >  	if (indirect)
> > > > > -		vq->split.desc_state[head].indir_desc = desc;
> > > > > +		vq->split.desc_state[head].indir_desc = desc_mix_dma_map(do_map, desc);
> > > >
> > > > So using indir_desc is kind of hacky (since we don't use indirect for
> > > > rx with extra context).
> > > >
> > > > But at least I think we should seek a way to use the same metadata for
> > > > both direct and indirect descriptors.
> > > >
> > > > E.g. can we make them all use indir_desc?
> > >
> > > I think it may not. My original idea was to use indir_desc uniformly, but
> > > for the case of saving ctx, we cannot guarantee that ctx has space for us.
> >
> > Ok, but the problem is that the code becomes even more hacky (imagine
> > one day we may want to use indirect for RX?).
>
> I think it may have nothing to do with RX, but rather with whether ctx is used.
> Because ctx and indirect cannot coexist.
>
> static inline int virtqueue_add_split(struct virtqueue *_vq,
> 				      struct scatterlist *sgs[],
> 				      unsigned int total_sg,
> 				      unsigned int out_sgs,
> 				      unsigned int in_sgs,
> 				      void *data,
> 				      void *ctx,
> 				      gfp_t gfp)
> {
> 	struct vring_virtqueue *vq = to_vvq(_vq);
> 	struct scatterlist *sg;
> 	struct vring_desc *desc;
> 	unsigned int i, n, avail, descs_used, prev, err_idx;
> 	int head;
> 	bool indirect;
>
> 	START_USE(vq);
>
> 	BUG_ON(data == NULL);
> >	BUG_ON(ctx && vq->indirect);
>
> If we want to use ctx with indirect, we must add dedicated metadata for ctx.

The reason for this BUG_ON() is that we do a hack: a vq that uses ctx can't
use indirect. The only user so far is the mergeable RX path.

This path adds one more hack on top, which seems a burden for future
maintenance. Consider that one day we may want to use a virtqueue with both
indirect and extra context.

We can hear from others.

Thanks

> So I think the current plan is OK.
>
> Thanks.
>
> >
> > So I tend to change my mind to introduce dedicated metadata, instead
> > of trying to be packed with two types of the existing ones.
> >
> > Thanks
> >
> > >
> > > Thanks.
> > >
> > > >
> > > > Thanks
> > > >
> > > > >  	else
> > > > >  		vq->split.desc_state[head].indir_desc = ctx;
> > > > >
> > > > > @@ -820,22 +832,26 @@ static void detach_buf_split(struct vring_virtqueue *vq, unsigned int head,
> > > > >  	vq->vq.num_free++;
> > > > >
> > > > >  	if (vq->indirect) {
> > > > > -		struct vring_desc *indir_desc =
> > > > > -			vq->split.desc_state[head].indir_desc;
> > > > > +		struct vring_desc *mix = vq->split.desc_state[head].indir_desc;
> > > > > +		struct vring_desc *indir_desc;
> > > > >  		u32 len;
> > > > >
> > > > >  		/* Free the indirect table, if any, now that it's unmapped. */
> > > > > -		if (!indir_desc)
> > > > > +		if (!mix)
> > > > >  			return;
> > > > >
> > > > > +		indir_desc = desc_rm_dma_map(mix);
> > > > > +
> > > > >  		len = vq->split.desc_extra[head].len;
> > > > >
> > > > >  		BUG_ON(!(vq->split.desc_extra[head].flags &
> > > > >  				VRING_DESC_F_INDIRECT));
> > > > >  		BUG_ON(len == 0 || len % sizeof(struct vring_desc));
> > > > >
> > > > > -		for (j = 0; j < len / sizeof(struct vring_desc); j++)
> > > > > -			vring_unmap_one_split_indirect(vq, &indir_desc[j]);
> > > > > +		if (desc_map_inter(mix)) {
> > > > > +			for (j = 0; j < len / sizeof(struct vring_desc); j++)
> > > > > +				vring_unmap_one_split_indirect(vq, &indir_desc[j]);
> > > > > +		}
> > > > >
> > > > >  		kfree(indir_desc);
> > > > >  		vq->split.desc_state[head].indir_desc = NULL;
> > > > > --
> > > > > 2.32.0.3.g01195cf9f