On Thu, Oct 17, 2019 at 10:17:18AM +0800, Jason Wang wrote:> On 2019/10/17 ??7:33, Will Deacon wrote: > > In an attempt to remove the remaining traces of [smp_]read_barrier_depends() > > following my previous patches to strengthen READ_ONCE() for Alpha [1], I > > ended up trying to decipher the read_barrier_depends() usage in the vhost > > driver: > > > > --->8 > > > > // drivers/vhost/vhost.c > > static int get_indirect(struct vhost_virtqueue *vq, > > struct iovec iov[], unsigned int iov_size, > > unsigned int *out_num, unsigned int *in_num, > > struct vhost_log *log, unsigned int *log_num, > > struct vring_desc *indirect) > > { > > [...] > > > > /* We will use the result as an address to read from, so most > > * architectures only need a compiler barrier here. */ > > read_barrier_depends(); > > > > --->8 > > > > Unfortunately, although the barrier is commented (hurrah!), it's not > > particularly enlightening about the accesses making up the dependency > > chain, and I don't understand the supposed need for a compiler barrier > > either (read_barrier_depends() doesn't generally provide this). > > > > Does anybody know which accesses are being ordered here? Usually you'd need > > a READ_ONCE()/rcu_dereference() beginning the chain, but I haven't managed > > to find one... > > > > I guess it was because we will read from the address stored in the iov like: > > 1) trasnlate_desc() that stores the userspace buffer pointer in the iov > > 2) copy_from_iter() that reads from those pointersIsn't that exactly the same flow as vhost_copy_from_user(), which doesn't have the barrier? Staring at the code some more, my best bet at the moment is that the load of 'indirect->addr' is probably the one to worry about, since it's part of the vring and can be updated concurrently.> So we need a data dependency barrier in the middle as explained in the > memory-barriers.txt? (since READ_ONCE is not used in iov iterator).If the barrier is actually required, then there must be a concurrent access involved, in which case READ_ONCE should also be used. So I would propose something like the diff below, but I'd still be glad to hear whether I'm barking up the wrong tree. Will --->8 diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 36ca2cf419bf..2e370a229fea 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -2107,6 +2107,7 @@ static int get_indirect(struct vhost_virtqueue *vq, { struct vring_desc desc; unsigned int i = 0, count, found = 0; + __virtio64 addr = READ_ONCE(indirect->addr); u32 len = vhost32_to_cpu(vq, indirect->len); struct iov_iter from; int ret, access; @@ -2120,7 +2121,7 @@ static int get_indirect(struct vhost_virtqueue *vq, return -EINVAL; } - ret = translate_desc(vq, vhost64_to_cpu(vq, indirect->addr), len, vq->indirect, + ret = translate_desc(vq, vhost64_to_cpu(vq, addr), len, vq->indirect, UIO_MAXIOV, VHOST_ACCESS_RO); if (unlikely(ret < 0)) { if (ret != -EAGAIN) @@ -2129,10 +2130,6 @@ static int get_indirect(struct vhost_virtqueue *vq, } iov_iter_init(&from, READ, vq->indirect, ret, len); - /* We will use the result as an address to read from, so most - * architectures only need a compiler barrier here. */ - read_barrier_depends(); - count = len / sizeof desc; /* Buffers are chained via a 16 bit next field, so * we can have at most 2^16 of these. */ @@ -2152,12 +2149,12 @@ static int get_indirect(struct vhost_virtqueue *vq, } if (unlikely(!copy_from_iter_full(&desc, sizeof(desc), &from))) { vq_err(vq, "Failed indirect descriptor: idx %d, %zx\n", - i, (size_t)vhost64_to_cpu(vq, indirect->addr) + i * sizeof desc); + i, (size_t)vhost64_to_cpu(vq, addr) + i * sizeof desc); return -EINVAL; } if (unlikely(desc.flags & cpu_to_vhost16(vq, VRING_DESC_F_INDIRECT))) { vq_err(vq, "Nested indirect descriptor: idx %d, %zx\n", - i, (size_t)vhost64_to_cpu(vq, indirect->addr) + i * sizeof desc); + i, (size_t)vhost64_to_cpu(vq, addr) + i * sizeof desc); return -EINVAL; }
On 2019/10/19 ??4:58, Will Deacon wrote:> On Thu, Oct 17, 2019 at 10:17:18AM +0800, Jason Wang wrote: >> On 2019/10/17 ??7:33, Will Deacon wrote: >>> In an attempt to remove the remaining traces of [smp_]read_barrier_depends() >>> following my previous patches to strengthen READ_ONCE() for Alpha [1], I >>> ended up trying to decipher the read_barrier_depends() usage in the vhost >>> driver: >>> >>> --->8 >>> >>> // drivers/vhost/vhost.c >>> static int get_indirect(struct vhost_virtqueue *vq, >>> struct iovec iov[], unsigned int iov_size, >>> unsigned int *out_num, unsigned int *in_num, >>> struct vhost_log *log, unsigned int *log_num, >>> struct vring_desc *indirect) >>> { >>> [...] >>> >>> /* We will use the result as an address to read from, so most >>> * architectures only need a compiler barrier here. */ >>> read_barrier_depends(); >>> >>> --->8 >>> >>> Unfortunately, although the barrier is commented (hurrah!), it's not >>> particularly enlightening about the accesses making up the dependency >>> chain, and I don't understand the supposed need for a compiler barrier >>> either (read_barrier_depends() doesn't generally provide this). >>> >>> Does anybody know which accesses are being ordered here? Usually you'd need >>> a READ_ONCE()/rcu_dereference() beginning the chain, but I haven't managed >>> to find one... >>> >> I guess it was because we will read from the address stored in the iov like: >> >> 1) trasnlate_desc() that stores the userspace buffer pointer in the iov >> >> 2) copy_from_iter() that reads from those pointers > Isn't that exactly the same flow as vhost_copy_from_user(), which doesn't > have the barrier?There's a rmb before calling vhost_copy_from_user() (vhost_get_desc()).> Staring at the code some more, my best bet at the moment > is that the load of 'indirect->addr' is probably the one to worry about, > since it's part of the vring and can be updated concurrently.I'm also confused about the barrier here, basically in driver side we did: 1) allocate pages 2) store pages in indirect->addr 3) smp_wmb() 4) increase the avail idx (somehow a tail pointer of vring) in vhost we did: 1) read avail idx 2) smp_rmb() 3) read indirect->addr 4) read from indirect->addr It looks to me even the data dependency barrier is not necessary since we have rmb() which is sufficient for us to the correct indirect->addr and driver are not expected to do any writing to indirect->addr after avail idx is increased ? Thanks> >> So we need a data dependency barrier in the middle as explained in the >> memory-barriers.txt? (since READ_ONCE is not used in iov iterator). > If the barrier is actually required, then there must be a concurrent access > involved, in which case READ_ONCE should also be used. So I would propose > something like the diff below, but I'd still be glad to hear whether I'm > barking up the wrong tree. > > Will > > --->8 > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c > index 36ca2cf419bf..2e370a229fea 100644 > --- a/drivers/vhost/vhost.c > +++ b/drivers/vhost/vhost.c > @@ -2107,6 +2107,7 @@ static int get_indirect(struct vhost_virtqueue *vq, > { > struct vring_desc desc; > unsigned int i = 0, count, found = 0; > + __virtio64 addr = READ_ONCE(indirect->addr); > u32 len = vhost32_to_cpu(vq, indirect->len); > struct iov_iter from; > int ret, access; > @@ -2120,7 +2121,7 @@ static int get_indirect(struct vhost_virtqueue *vq, > return -EINVAL; > } > > - ret = translate_desc(vq, vhost64_to_cpu(vq, indirect->addr), len, vq->indirect, > + ret = translate_desc(vq, vhost64_to_cpu(vq, addr), len, vq->indirect, > UIO_MAXIOV, VHOST_ACCESS_RO); > if (unlikely(ret < 0)) { > if (ret != -EAGAIN) > @@ -2129,10 +2130,6 @@ static int get_indirect(struct vhost_virtqueue *vq, > } > iov_iter_init(&from, READ, vq->indirect, ret, len); > > - /* We will use the result as an address to read from, so most > - * architectures only need a compiler barrier here. */ > - read_barrier_depends(); > - > count = len / sizeof desc; > /* Buffers are chained via a 16 bit next field, so > * we can have at most 2^16 of these. */ > @@ -2152,12 +2149,12 @@ static int get_indirect(struct vhost_virtqueue *vq, > } > if (unlikely(!copy_from_iter_full(&desc, sizeof(desc), &from))) { > vq_err(vq, "Failed indirect descriptor: idx %d, %zx\n", > - i, (size_t)vhost64_to_cpu(vq, indirect->addr) + i * sizeof desc); > + i, (size_t)vhost64_to_cpu(vq, addr) + i * sizeof desc); > return -EINVAL; > } > if (unlikely(desc.flags & cpu_to_vhost16(vq, VRING_DESC_F_INDIRECT))) { > vq_err(vq, "Nested indirect descriptor: idx %d, %zx\n", > - i, (size_t)vhost64_to_cpu(vq, indirect->addr) + i * sizeof desc); > + i, (size_t)vhost64_to_cpu(vq, addr) + i * sizeof desc); > return -EINVAL; > } >
Hi Jason, On Mon, Oct 21, 2019 at 01:48:53PM +0800, Jason Wang wrote:> On 2019/10/19 ??4:58, Will Deacon wrote: > > Staring at the code some more, my best bet at the moment > > is that the load of 'indirect->addr' is probably the one to worry about, > > since it's part of the vring and can be updated concurrently. > > I'm also confused about the barrier here, basically in driver side we did: > > 1) allocate pages > > 2) store pages in indirect->addr > > 3) smp_wmb() > > 4) increase the avail idx (somehow a tail pointer of vring) > > in vhost we did: > > 1) read avail idx > > 2) smp_rmb() > > 3) read indirect->addr > > 4) read from indirect->addr > > It looks to me even the data dependency barrier is not necessary since we > have rmb() which is sufficient for us to the correct indirect->addr and > driver are not expected to do any writing to indirect->addr after avail idx > is increased ?That makes sense to me: the synchronization is done via vq->avail_idx() and so the subsequent access of the indirect pages doesn't need additional barriers, even on Alpha. Thanks. I'll write up a patch and adopt your explanation above. Will