Eric Blake
2023-Apr-27 15:35 UTC
[Libguestfs] [PATCH nbdkit 2/2] allocators: sparse: Split the highly contended mutex
On Thu, Apr 27, 2023 at 03:19:12PM +0100, Richard W.M. Jones wrote:
> We've long known that nbdkit-memory-plugin with the default sparse
> array allocator suffers because a global lock must be taken whenever
> any read or write operation is performed. This commit aims to safely
> improve performance by converting the lock into a read-write lock.

I searched for other approaches, and found this to be an interesting
read:

https://stackoverflow.com/questions/2407558/pthreads-reader-writer-locks-upgrading-read-lock-to-write-lock

One of the suggestions there is using fcntl() locking (which does
support upgrade attempts) instead of pthread_rwlock*. Might be
interesting to code up and compare an approach along those lines.

> +++ b/common/allocators/sparse.c
> @@ -129,11 +129,20 @@ DEFINE_VECTOR_TYPE (l1_dir, struct l1_entry);
>  struct sparse_array {
>    struct allocator a;           /* Must come first. */
>
> -  /* This lock is highly contended.  When hammering the memory plugin
> -   * with 8 fio threads, about 30% of the total system time was taken
> -   * just waiting for this lock.  Fixing this is quite hard.
> +  /* The shared (read) lock must be held if you just want to access
> +   * the data without modifying any of the L1/L2 metadata or
> +   * allocating or freeing any page.
> +   *
> +   * To modify the L1/L2 metadata including allocating or freeing any
> +   * page you must hold the exclusive (write) lock.
> +   *
> +   * Because POSIX rwlocks are not upgradable this presents a problem.
> +   * We solve it below by speculatively performing the request while
> +   * holding the shared lock, but if we run into an operation that
> +   * needs to update the metadata, we restart the entire request
> +   * holding the exclusive lock.
>    */
> -  pthread_mutex_t lock;
> +  pthread_rwlock_t lock;

But as written, this does look like a nice division of labor that
reduces serialization in the common case.
And yes, it does look like a POSIX rwlock is a bit heavier-weight than
a bare mutex, which explains the slowdown on the read-heavy workload
even while improving the write-heavy workload.

Reviewed-by: Eric Blake <eblake at redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.  +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
Richard W.M. Jones
2023-Apr-28 11:25 UTC
[Libguestfs] [PATCH nbdkit 2/2] allocators: sparse: Split the highly contended mutex
On Thu, Apr 27, 2023 at 10:35:30AM -0500, Eric Blake wrote:
> Reviewed-by: Eric Blake <eblake at redhat.com>

Thanks, I pushed the two patches as e397f0643..9b5b93a07

About using fcntl locks: it seems troublesome because we'd then need
to have an actual file to lock (even one in /dev/shm). That makes the
implementation significantly more complicated in unexpected ways. I'm
thinking in particular of error handling, as we'd need to handle file
errors rather than basically aborting on pthread errors as we do now.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v