thr3ads.net - Linux Virtualization - [PULL 2/2] vhost: replace rcu with mutex [Jun 2014]

If this information is useful, please help other people find it:
Share via:

Michael S. Tsirkin

2014-Jun-02 21:30 UTC

[PULL 0/2] vhost enhancements for 3.16

Reposting with actual patches included.

The following changes since commit 96b2e73c5471542cb9c622c4360716684f8797ed:

  Revert "net/mlx4_en: Use affinity hint" (2014-06-02 00:18:48 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost-next

for you to fetch changes up to 2ae76693b8bcabf370b981cd00c36cd41d33fabc:

  vhost: replace rcu with mutex (2014-06-02 23:47:59 +0300)

----------------------------------------------------------------
Michael S. Tsirkin (2):
      vhost-net: extend device allocation to vmalloc
      vhost: replace rcu with mutex

 drivers/vhost/net.c   | 23 ++++++++++++++++++-----
 drivers/vhost/vhost.c | 10 +++++++++-
 2 files changed, 27 insertions(+), 6 deletions(-)

Michael S. Tsirkin

2014-Jun-02 21:30 UTC

head link

[PULL 1/2] vhost-net: extend device allocation to vmalloc

Michael Mueller provided a patch to reduce the size of
vhost-net structure as some allocations could fail under
memory pressure/fragmentation. We are still left with
high order allocations though.

This patch is handling the problem at the core level, allowing
vhost structures to use vmalloc() if kmalloc() failed.

As vmalloc() adds overhead on a critical network path, add __GFP_REPEAT
to kzalloc() flags to do this fallback only when really needed.

People are still looking at cleaner ways to handle the problem
at the API level, probably passing in multiple iovecs.
This hack seems consistent with approaches
taken since then by drivers/vhost/scsi.c and net/core/dev.c

Based on patch by Romain Francoise.

Cc: Michael Mueller <mimu at linux.vnet.ibm.com>
Signed-off-by: Romain Francoise <romain at orebokech.com>
Acked-by: Michael S. Tsirkin <mst at redhat.com>
---
 drivers/vhost/net.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index be414d2b..e489161 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -17,6 +17,7 @@
 #include <linux/workqueue.h>
 #include <linux/file.h>
 #include <linux/slab.h>
+#include <linux/vmalloc.h>
 
 #include <linux/net.h>
 #include <linux/if_packet.h>
@@ -699,18 +700,30 @@ static void handle_rx_net(struct vhost_work *work)
 	handle_rx(net);
 }
 
+static void vhost_net_free(void *addr)
+{
+	if (is_vmalloc_addr(addr))
+		vfree(addr);
+	else
+		kfree(addr);
+}
+
 static int vhost_net_open(struct inode *inode, struct file *f)
 {
-	struct vhost_net *n = kmalloc(sizeof *n, GFP_KERNEL);
+	struct vhost_net *n;
 	struct vhost_dev *dev;
 	struct vhost_virtqueue **vqs;
 	int i;
 
-	if (!n)
-		return -ENOMEM;
+	n = kmalloc(sizeof *n, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT);
+	if (!n) {
+		n = vmalloc(sizeof *n);
+		if (!n)
+			return -ENOMEM;
+	}
 	vqs = kmalloc(VHOST_NET_VQ_MAX * sizeof(*vqs), GFP_KERNEL);
 	if (!vqs) {
-		kfree(n);
+		vhost_net_free(n);
 		return -ENOMEM;
 	}
 
@@ -827,7 +840,7 @@ static int vhost_net_release(struct inode *inode, struct
file *f)
 	 * since jobs can re-queue themselves. */
 	vhost_net_flush(n);
 	kfree(n->dev.vqs);
-	kfree(n);
+	vhost_net_free(n);
 	return 0;
 }
 
-- 
MST

Michael S. Tsirkin

2014-Jun-02 21:30 UTC

head link

[PULL 2/2] vhost: replace rcu with mutex

All memory accesses are done under some VQ mutex.
So lock/unlock all VQs is a faster equivalent of synchronize_rcu()
for memory access changes.
Some guests cause a lot of these changes, so it's helpful
to make them faster.

Reported-by: "Gonglei (Arei)" <arei.gonglei at huawei.com>
Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
---
 drivers/vhost/vhost.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 78987e4..1c05e60 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -593,6 +593,7 @@ static long vhost_set_memory(struct vhost_dev *d, struct
vhost_memory __user *m)
 {
 	struct vhost_memory mem, *newmem, *oldmem;
 	unsigned long size = offsetof(struct vhost_memory, regions);
+	int i;
 
 	if (copy_from_user(&mem, m, size))
 		return -EFAULT;
@@ -619,7 +620,14 @@ static long vhost_set_memory(struct vhost_dev *d, struct
vhost_memory __user *m)
 	oldmem = rcu_dereference_protected(d->memory,
 					   lockdep_is_held(&d->mutex));
 	rcu_assign_pointer(d->memory, newmem);
-	synchronize_rcu();
+
+	/* All memory accesses are done under some VQ mutex.
+	 * So below is a faster equivalent of synchronize_rcu()
+	 */
+	for (i = 0; i < d->nvqs; ++i) {
+		mutex_lock(&d->vqs[i]->mutex);
+		mutex_unlock(&d->vqs[i]->mutex);
+	}
 	kfree(oldmem);
 	return 0;
 }
-- 
MST

Eric Dumazet

2014-Jun-02 21:58 UTC

head link

[PULL 2/2] vhost: replace rcu with mutex

On Tue, 2014-06-03 at 00:30 +0300, Michael S. Tsirkin
wrote:> All memory accesses are done under some VQ mutex.
> So lock/unlock all VQs is a faster equivalent of synchronize_rcu()
> for memory access changes.
> Some guests cause a lot of these changes, so it's helpful
> to make them faster.
> 
> Reported-by: "Gonglei (Arei)" <arei.gonglei at huawei.com>
> Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
> ---
>  drivers/vhost/vhost.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 78987e4..1c05e60 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -593,6 +593,7 @@ static long vhost_set_memory(struct vhost_dev *d,
struct vhost_memory __user *m)
>  {
>  	struct vhost_memory mem, *newmem, *oldmem;
>  	unsigned long size = offsetof(struct vhost_memory, regions);
> +	int i;
>  
>  	if (copy_from_user(&mem, m, size))
>  		return -EFAULT;
> @@ -619,7 +620,14 @@ static long vhost_set_memory(struct vhost_dev *d,
struct vhost_memory __user *m)
>  	oldmem = rcu_dereference_protected(d->memory,
>  					   lockdep_is_held(&d->mutex));
>  	rcu_assign_pointer(d->memory, newmem);
> -	synchronize_rcu();
> +
> +	/* All memory accesses are done under some VQ mutex.
> +	 * So below is a faster equivalent of synchronize_rcu()
> +	 */
> +	for (i = 0; i < d->nvqs; ++i) {
> +		mutex_lock(&d->vqs[i]->mutex);
> +		mutex_unlock(&d->vqs[i]->mutex);
> +	}
>  	kfree(oldmem);
>  	return 0;
>  }
This looks dubious

What about using kfree_rcu() instead ?

translate_desc() still uses rcu_read_lock(), its not clear if the mutex
is really held.

Apparently Analagous Threads

Search for more possibly parallel threads

Linux Virtualization - Jun 2014 - [PULL 2/2] vhost: replace rcu with mutex

[PULL 0/2] vhost enhancements for 3.16

[PULL 1/2] vhost-net: extend device allocation to vmalloc

[PULL 2/2] vhost: replace rcu with mutex

[PULL 2/2] vhost: replace rcu with mutex

Apparently Analagous Threads