Richard W.M. Jones
2019-Mar-05 10:38 UTC
[Libguestfs] [PATCH nbdkit] Add new filter for rate-limiting connections.
For virt-v2v we have been discussing how to limit network bandwidth.
The initial discussion has been around how to use cgroups to do this
limiting, and that is still probably what we will go with in the end.

However this patch gives us another possibility for certain virt-v2v
inputs, especially VDDK.  We could apply a filter on top of the nbdkit
plugin which limits the rate at which it copies data.  For example, to
limit the rate to 1 Mbps (megabit per second) we could now do:

  nbdkit --filter=rate vddk [etc] rate=1M

The filter is implemented using a simple Token Bucket
(https://en.wikipedia.org/wiki/Token_bucket) and is quite simple while
at the same time using the fully parallel thread model.

Rich.
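For readers unfamiliar with the Token Bucket idea mentioned above, here
is a minimal sketch in C.  It is not the code from the patch: the names
(struct tb, tb_init, tb_take) are hypothetical, and the current time is
passed in as a parameter (an assumption made here so the behaviour is
deterministic) rather than read from the system clock.  Tokens are
bits, the bucket refills at "rate" tokens per second up to "capacity",
and a caller that cannot get all its tokens is told how long to wait.

```c
#include <stdint.h>

/* Hypothetical token bucket, for illustration only. */
struct tb {
  uint64_t rate;       /* tokens (bits) added per second; 0 = no limit */
  uint64_t capacity;   /* maximum number of tokens the bucket can hold */
  uint64_t level;      /* tokens currently available */
  uint64_t last_usec;  /* time of the last refill, in microseconds */
};

void
tb_init (struct tb *b, uint64_t rate, uint64_t capacity, uint64_t now_usec)
{
  b->rate = rate;
  b->capacity = capacity;
  b->level = capacity;          /* this sketch starts the bucket full */
  b->last_usec = now_usec;
}

/* Try to take *n tokens at time now_usec.  On return *n holds the
 * number of tokens still owed.  The return value is the estimated
 * number of microseconds to wait before calling again (0 = proceed).
 */
uint64_t
tb_take (struct tb *b, uint64_t *n, uint64_t now_usec)
{
  uint64_t elapsed, add;

  if (b->rate == 0) {           /* no limit being enforced */
    *n = 0;
    return 0;
  }

  /* Refill lazily, based on the time elapsed since the last call. */
  elapsed = now_usec > b->last_usec ? now_usec - b->last_usec : 0;
  add = b->rate * elapsed / 1000000;
  if (add > b->capacity - b->level)
    add = b->capacity - b->level;
  b->level += add;
  b->last_usec = now_usec;

  if (b->level >= *n) {         /* enough tokens: no wait needed */
    b->level -= *n;
    *n = 0;
    return 0;
  }

  /* Drain what is there and estimate the wait for the remainder. */
  *n -= b->level;
  b->level = 0;
  return *n * 1000000 / b->rate;
}
```

A caller would loop: call tb_take, sleep for the returned time if it is
non-zero, then call again with the updated *n until it reaches 0.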
Richard W.M. Jones
2019-Mar-05 10:38 UTC
[Libguestfs] [PATCH nbdkit] Add new filter for rate-limiting connections.
--- filters/delay/nbdkit-delay-filter.pod | 4 +- filters/rate/nbdkit-rate-filter.pod | 84 +++++++++ configure.ac | 2 + filters/rate/bucket.h | 62 +++++++ filters/rate/bucket.c | 173 +++++++++++++++++++ filters/rate/rate.c | 235 ++++++++++++++++++++++++++ TODO | 9 + filters/rate/Makefile.am | 64 +++++++ tests/Makefile.am | 6 +- tests/test-rate.sh | 60 +++++++ 10 files changed, 697 insertions(+), 2 deletions(-) diff --git a/filters/delay/nbdkit-delay-filter.pod b/filters/delay/nbdkit-delay-filter.pod index 7009a8c..c2eb172 100644 --- a/filters/delay/nbdkit-delay-filter.pod +++ b/filters/delay/nbdkit-delay-filter.pod @@ -17,6 +17,7 @@ nbdkit-delay-filter - nbdkit delay filter C<nbdkit-delay-filter> is a filter that delays read and write requests by some seconds or milliseconds. This is used to simulate a slow or remote server, or to test certain kinds of race conditions in Linux. +To limit server bandwidth use L<nbdkit-rate-filter(1)> instead. =head1 EXAMPLES @@ -74,7 +75,8 @@ milliseconds. =head1 SEE ALSO L<nbdkit(1)>, -L<nbdkit-filter(3)>. +L<nbdkit-filter(3)>, +L<nbdkit-rate-filter(1)>. =head1 AUTHORS diff --git a/filters/rate/nbdkit-rate-filter.pod b/filters/rate/nbdkit-rate-filter.pod new file mode 100644 index 0000000..795c7b8 --- /dev/null +++ b/filters/rate/nbdkit-rate-filter.pod @@ -0,0 +1,84 @@ +=head1 NAME + +nbdkit-rate-filter - limit bandwidth by connection or server + +=head1 SYNOPSIS + + nbdkit --filter=rate PLUGIN [PLUGIN-ARGS...] + [rate=BITSPERSEC] + [connection-rate=BITSPERSEC] + +=head1 DESCRIPTION + +C<nbdkit-rate-filter> is a filter that limits the bandwidth that can +be used by the server. Limits can be applied per connection and/or +for the server as a whole. + +=head1 EXAMPLES + +=over 4 + +=item nbdkit --filter=rate memory size=64M rate=1M + +Create a 64M RAM disk and limit server bandwidth as a whole to a +maximum of S<1 Mbps> (megabit per second). 
+ +=item nbdkit --filter=rate memory size=64M connection-rate=50K + +Limit each connection to S<50 Kbps> (kilobits per second). However as +there is no limit to the number of simultaneous connections this does +not limit overall server bandwidth. + +=item nbdkit --filter=rate memory size=64M connection-rate=50K rate=1M + +Limit each connection to S<50 Kbps>. Additionally the total bandwidth +across all connections to the server is limited to S<1 Mbps>. + +=back + +=head1 PARAMETERS + +=over 4 + +=item B<connection-rate=>BITSPERSEC + +Limit each connection to C<BITSPERSEC>. + +=item B<rate=>BITSPERSEC + +Limit total bandwidth across all connections to C<BITSPERSEC>. + +=back + +You can specify C<rate> and C<connection-rate> on their own or +together. If you specify neither then the filter is turned off. + +C<BITSPERSEC> can be specified as a simple number, or you can use a +number followed by C<K>, C<M> etc to mean kilobits, megabits and so +on. + +=head1 NOTES + +The rate filter approximates the bandwidth used by the NBD protocol on +the wire. Some operations such as zeroing and trimming are +effectively free (because only a tiny NBD message is sent over the +network) and so do not count against the bandwidth limit. NBD and TCP +protocol overhead is not included, so you may find that other tools +such as L<tc(8)> and L<iptables(8)> give more accurate results. + +There are separate bandwidth limits for read and write (ie. upload and +download to the server). + +=head1 SEE ALSO + +L<nbdkit(1)>, +L<nbdkit-delay-filter(1)>, +L<nbdkit-filter(3)>. + +=head1 AUTHORS + +Richard W.M. Jones + +=head1 COPYRIGHT + +Copyright (C) 2019 Red Hat Inc. 
diff --git a/configure.ac b/configure.ac index 9e7e5ca..467d48f 100644 --- a/configure.ac +++ b/configure.ac @@ -819,6 +819,7 @@ filters="\ nozero \ offset \ partition \ + rate \ truncate \ xz \ " @@ -889,6 +890,7 @@ AC_CONFIG_FILES([Makefile filters/nozero/Makefile filters/offset/Makefile filters/partition/Makefile + filters/rate/Makefile filters/truncate/Makefile filters/xz/Makefile fuzzing/Makefile diff --git a/filters/rate/bucket.h b/filters/rate/bucket.h new file mode 100644 index 0000000..0e8c8f7 --- /dev/null +++ b/filters/rate/bucket.h @@ -0,0 +1,62 @@ +/* nbdkit + * Copyright (C) 2018-2019 Red Hat Inc. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are + * met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * * Neither the name of Red Hat nor the names of its contributors may be + * used to endorse or promote products derived from this software without + * specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, + * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A + * PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL RED HAT OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF + * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, + * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT + * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#ifndef NBDKIT_BUCKET_H +#define NBDKIT_BUCKET_H + +#include <stdint.h> +#include <time.h> +#include <sys/time.h> + +/* A token bucket. */ +struct bucket { + uint64_t rate; /* Fill rate. 0 = no limit set. */ + uint64_t capacity; /* Maximum capacity of the bucket. */ + uint64_t level; /* How full is the bucket now? */ + struct timeval tv; /* Last time we updated the level. */ +}; + +/* Initialize the bucket structure. Capacity is expressed in + * rate-equivalent seconds. + */ +extern void init_bucket (struct bucket *bucket, + uint64_t rate, double capacity); + +/* Take up to N tokens from the bucket. If we couldn't take N tokens, + * then *N is updated with the number of remaining tokens that we are + * waiting for. Returns the length of time we should sleep (which may + * be 0). + */ +extern struct timespec bucket_run (struct bucket *bucket, uint64_t *n); + +#endif /* NBDKIT_BUCKET_H */ diff --git a/filters/rate/bucket.c b/filters/rate/bucket.c new file mode 100644 index 0000000..ccd657e --- /dev/null +++ b/filters/rate/bucket.c @@ -0,0 +1,173 @@ +/* nbdkit + * Copyright (C) 2018-2019 Red Hat Inc. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are + * met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. 
+ * + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * * Neither the name of Red Hat nor the names of its contributors may be + * used to endorse or promote products derived from this software without + * specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, + * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A + * PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF + * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, + * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT + * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +/* This filter is implemented using a Token Bucket + * (https://en.wikipedia.org/wiki/Token_bucket). There are two + * buckets per connection (one each for reading and writing) and two + * global buckets (also for reading and writing). + * + * We add tokens at the desired rate (the per-connection rate for the + * connection buckets, and the global rate for the global buckets). + * Note that we don't actually keep the buckets updated in real time + * because as a filter we are called asynchronously. Instead for each + * bucket we store the last time we were called and add the + * appropriate number of tokens when we are called next. + * + * The bucket capacity controls the burstiness allowed. This is + * hard-coded at the moment but could be configurable. 
All buckets + * start off full. + * + * When a packet is to be read or written, if there are sufficient + * tokens in the bucket then the packet may be immediately passed + * through to the underlying plugin. The number of bits used is + * deducted from the appropriate per-connection and global bucket. + * + * If there are insufficient tokens then the packet must be delayed. + * This is done by inserting a sleep which has an estimated length + * that is long enough based on the rate at which enough tokens will + * replenish the bucket to allow the packet to be sent next time. + */ + +#include <config.h> + +#include <stdio.h> +#include <stdlib.h> +#include <stdint.h> +#include <inttypes.h> +#include <string.h> +#include <time.h> +#include <sys/time.h> + +#include <nbdkit-filter.h> + +#include "minmax.h" + +#include "bucket.h" + +int rate_debug_bucket; /* -D rate.bucket=1 */ + +void +init_bucket (struct bucket *bucket, uint64_t rate, double capacity) +{ + bucket->rate = rate; + + /* Capacity is expressed in seconds, but we want to know the + * capacity in tokens, so multiply by the rate to get this. + */ + bucket->capacity = rate * capacity; + + /* Buckets start off full. */ + bucket->level = capacity; + + gettimeofday (&bucket->tv, NULL); +} + +/* Return the number of microseconds in y - x. */ +static int64_t +tvdiff (const struct timeval *x, const struct timeval *y) +{ + int64_t usec; + + usec = (y->tv_sec - x->tv_sec) * 1000000; + usec += y->tv_usec - x->tv_usec; + return usec; +} + +struct timespec +bucket_run (struct bucket *bucket, uint64_t *n) +{ + struct timespec ts; + struct timeval now; + int64_t usec; + uint64_t add, nsec; + + /* rate == 0 is a special case meaning that there is no limit being + * enforced. + */ + if (bucket->rate == 0) { + *n = 0; + ts.tv_sec = ts.tv_nsec = 0; + return ts; + } + + gettimeofday (&now, NULL); + + /* Can we add tokens to the bucket? Work out how much time has + * elapsed since we last did this. 
+ */ + usec = tvdiff (&bucket->tv, &now); + if (usec < 0) /* Maybe happens if system time not monotonic? */ + usec = 0; + + add = bucket->rate * usec / 1000000; + add = MIN (add, bucket->capacity - bucket->level); + if (rate_debug_bucket) + nbdkit_debug ("bucket %p: adding %" PRIu64 " tokens, new level %" PRIu64, + bucket, add, bucket->level + add); + bucket->level += add; + bucket->tv = now; + + /* Can we deduct N tokens from the bucket? If yes then we're good, + * we can return with TS = 0 which means the caller won't sleep. + */ + if (bucket->level >= *n) { + if (rate_debug_bucket) + nbdkit_debug ("bucket %p: deducting %" PRIu64 " tokens, no sleep", + bucket, *n); + bucket->level -= *n; + *n = 0; + ts.tv_sec = ts.tv_nsec = 0; + return ts; + } + + if (rate_debug_bucket) + nbdkit_debug ("bucket %p: deducting %" PRIu64 " tokens, empty, " + "need another %" PRIu64 " tokens", + bucket, bucket->level, *n - bucket->level); + + *n -= bucket->level; + bucket->level = 0; + + /* Now we need to estimate how long it will take to add N tokens to + * the bucket, which is how long the caller must sleep for. + */ + nsec = 1000000000 * (*n) / bucket->rate; + ts.tv_sec = nsec / 1000000000; + ts.tv_nsec = nsec % 1000000000; + + if (rate_debug_bucket) + nbdkit_debug ("bucket %p: sleeping for %.1f seconds", bucket, + nsec / 1000000000.); + + return ts; +} diff --git a/filters/rate/rate.c b/filters/rate/rate.c new file mode 100644 index 0000000..6827e16 --- /dev/null +++ b/filters/rate/rate.c @@ -0,0 +1,235 @@ +/* nbdkit + * Copyright (C) 2018-2019 Red Hat Inc. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are + * met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. 
+ * + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * * Neither the name of Red Hat nor the names of its contributors may be + * used to endorse or promote products derived from this software without + * specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, + * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A + * PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF + * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, + * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT + * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +/* For a note on the implementation of this filter, see bucket.c. */ + +#include <config.h> + +#include <stdio.h> +#include <stdlib.h> +#include <stdint.h> +#include <string.h> +#include <time.h> +#include <sys/time.h> + +#include <pthread.h> + +#include <nbdkit-filter.h> + +#include "bucket.h" + +#define THREAD_MODEL NBDKIT_THREAD_MODEL_PARALLEL + +/* Per-connection and global limit, both in bits per second, with zero + * meaning not set / not enforced. + */ +static uint64_t connection_rate = 0; +static uint64_t rate = 0; + +/* Bucket capacity controls the burst rate. It is expressed as the + * length of time in "rate-equivalent seconds" that the client can + * burst for after a period of inactivity. This could be adjustable + * in future. 
+ */ +#define BUCKET_CAPACITY 2.0 + +/* Global read and write buckets. */ +static struct bucket read_bucket; +static pthread_mutex_t read_bucket_lock = PTHREAD_MUTEX_INITIALIZER; +static struct bucket write_bucket; +static pthread_mutex_t write_bucket_lock = PTHREAD_MUTEX_INITIALIZER; + +/* Per-connection handle. */ +struct rate_handle { + /* Per-connection read and write buckets. */ + struct bucket read_bucket; + pthread_mutex_t read_bucket_lock; + struct bucket write_bucket; + pthread_mutex_t write_bucket_lock; +}; + +/* Called for each key=value passed on the command line. */ +static int +rate_config (nbdkit_next_config *next, void *nxdata, + const char *key, const char *value) +{ + if (strcmp (key, "rate") == 0) { + if (rate > 0) { + nbdkit_error ("rate set twice on the command line"); + return -1; + } + rate = nbdkit_parse_size (value); + if (rate == -1) + return -1; + if (rate == 0) { + nbdkit_error ("rate cannot be set to 0"); + return -1; + } + return 0; + } + else if (strcmp (key, "connection-rate") == 0) { + if (connection_rate > 0) { + nbdkit_error ("connection-rate set twice on the command line"); + return -1; + } + connection_rate = nbdkit_parse_size (value); + if (connection_rate == -1) + return -1; + if (connection_rate == 0) { + nbdkit_error ("connection-rate cannot be set to 0"); + return -1; + } + return 0; + } + else + return next (nxdata, key, value); +} + +static int +rate_config_complete (nbdkit_next_config_complete *next, void *nxdata) +{ + /* Initialize the global buckets. */ + init_bucket (&read_bucket, rate, BUCKET_CAPACITY); + init_bucket (&write_bucket, rate, BUCKET_CAPACITY); + + return next (nxdata); +} + +#define rate_config_help \ + "rate=BITSPERSEC Limit total bandwidth.\n" \ + "connection-rate=BITSPERSEC Limit per-connection bandwidth." + +/* Create the per-connection handle. 
*/ +static void * +rate_open (nbdkit_next_open *next, void *nxdata, int readonly) +{ + struct rate_handle *h; + + if (next (nxdata, readonly) == -1) + return NULL; + + h = malloc (sizeof *h); + if (h == NULL) { + nbdkit_error ("malloc: %m"); + return NULL; + } + + init_bucket (&h->read_bucket, connection_rate, BUCKET_CAPACITY); + init_bucket (&h->write_bucket, connection_rate, BUCKET_CAPACITY); + pthread_mutex_init (&h->read_bucket_lock, NULL); + pthread_mutex_init (&h->write_bucket_lock, NULL); + + return h; +} + +/* Free up the per-connection handle. */ +static void +rate_close (void *handle) +{ + struct rate_handle *h = handle; + + pthread_mutex_destroy (&h->read_bucket_lock); + pthread_mutex_destroy (&h->write_bucket_lock); + free (h); +} + +static inline void +maybe_sleep (struct bucket *bucket, pthread_mutex_t *lock, uint32_t count) +{ + struct timespec ts; + uint64_t bits; + + /* Count is in bytes, but we rate limit using bits. We could + * multiply this by 10 to include start/stop but let's not + * second-guess the transport layers underneath. + */ + bits = count * UINT64_C(8); + + /* Run the token bucket algorithm. It will set ts to the sleep time + * required, if any, and update bits with the remaining number of + * bits we have to wait for. + */ + again: + pthread_mutex_lock (lock); + ts = bucket_run (bucket, &bits); + pthread_mutex_unlock (lock); + + if (ts.tv_sec > 0 || ts.tv_nsec > 0) { + nanosleep (&ts, NULL); + goto again; + } +} + +/* Read data. */ +static int +rate_pread (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle, void *buf, uint32_t count, uint64_t offset, + uint32_t flags, int *err) +{ + struct rate_handle *h = handle; + + maybe_sleep (&read_bucket, &read_bucket_lock, count); + maybe_sleep (&h->read_bucket, &h->read_bucket_lock, count); + + return next_ops->pread (nxdata, buf, count, offset, flags, err); +} + +/* Write data. 
*/ +static int +rate_pwrite (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle, + const void *buf, uint32_t count, uint64_t offset, uint32_t flags, + int *err) +{ + struct rate_handle *h = handle; + + maybe_sleep (&write_bucket, &write_bucket_lock, count); + maybe_sleep (&h->write_bucket, &h->write_bucket_lock, count); + + return next_ops->pwrite (nxdata, buf, count, offset, flags, err); +} + +static struct nbdkit_filter filter = { + .name = "rate", + .longname = "nbdkit rate filter", + .version = PACKAGE_VERSION, + .config = rate_config, + .config_complete = rate_config_complete, + .config_help = rate_config_help, + .open = rate_open, + .close = rate_close, + .pread = rate_pread, + .pwrite = rate_pwrite, +}; + +NBDKIT_REGISTER_FILTER(filter) diff --git a/TODO b/TODO index 59590a1..b589127 100644 --- a/TODO +++ b/TODO @@ -129,6 +129,15 @@ Suggestions for filters * nbdkit-cache-filter should handle ENOSPC errors automatically by reclaiming blocks from the cache +nbdkit-rate-filter: + +* allow other kinds of traffic shaping such as VBR + +* limit traffic per client (ie. per IP address) + +* split large requests to avoid long, lumpy sleeps when request size + is much larger than rate limit + Filters for security -------------------- diff --git a/filters/rate/Makefile.am b/filters/rate/Makefile.am new file mode 100644 index 0000000..c39aa01 --- /dev/null +++ b/filters/rate/Makefile.am @@ -0,0 +1,64 @@ +# nbdkit +# Copyright (C) 2018-2019 Red Hat Inc. +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions are +# met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. 
+# +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# * Neither the name of Red Hat nor the names of its contributors may be +# used to endorse or promote products derived from this software without +# specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND +# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, +# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A +# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR +# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF +# USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND +# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT +# OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +# SUCH DAMAGE. 
+ +include $(top_srcdir)/common-rules.mk + +EXTRA_DIST = nbdkit-rate-filter.pod + +filter_LTLIBRARIES = nbdkit-rate-filter.la + +nbdkit_rate_filter_la_SOURCES = \ + bucket.c \ + bucket.h \ + rate.c \ + $(top_srcdir)/include/nbdkit-filter.h + +nbdkit_rate_filter_la_CPPFLAGS = \ + -I$(top_srcdir)/include \ + -I$(top_srcdir)/common/include +nbdkit_rate_filter_la_CFLAGS = \ + $(WARNINGS_CFLAGS) +nbdkit_rate_filter_la_LDFLAGS = \ + -module -avoid-version -shared \ + -Wl,--version-script=$(top_srcdir)/filters/filters.syms + +if HAVE_POD + +man_MANS = nbdkit-rate-filter.1 +CLEANFILES += $(man_MANS) + +nbdkit-rate-filter.1: nbdkit-rate-filter.pod + $(PODWRAPPER) --section=1 --man $@ \ + --html $(top_builddir)/html/$@.html \ + $< + +endif HAVE_POD diff --git a/tests/Makefile.am b/tests/Makefile.am index 3992d9b..d1e6f0e 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -1,5 +1,5 @@ # nbdkit -# Copyright (C) 2013-2018 Red Hat Inc. +# Copyright (C) 2013-2019 Red Hat Inc. # All rights reserved. # # Redistribution and use in source and binary forms, with or without @@ -97,6 +97,7 @@ EXTRA_DIST = \ test-python-exception.sh \ test.pl \ test.py \ + test-rate.sh \ test.rb \ test.tcl \ test-shebang-perl.sh \ @@ -810,6 +811,9 @@ if HAVE_GUESTFISH TESTS += test-partition2.sh endif HAVE_GUESTFISH +# rate filter test. +TESTS += test-rate.sh + # truncate filter tests. TESTS += \ test-truncate1.sh \ diff --git a/tests/test-rate.sh b/tests/test-rate.sh new file mode 100755 index 0000000..010ef19 --- /dev/null +++ b/tests/test-rate.sh @@ -0,0 +1,60 @@ +#!/usr/bin/env bash +# nbdkit +# Copyright (C) 2018-2019 Red Hat Inc. +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions are +# met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. 
+# +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# * Neither the name of Red Hat nor the names of its contributors may be +# used to endorse or promote products derived from this software without +# specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND +# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, +# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A +# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR +# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF +# USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND +# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT +# OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +# SUCH DAMAGE. + +source ./functions.sh +set -e +set -x + +requires qemu-img --version + +files="rate.img rate.time rate.err" +rm -f $files +cleanup_fn rm -f $files + +# This should take no less than 20 seconds to run: +# (8 * 25 * 1024 * 1024) / (10 * 1024 * 1024) = 20 + +# We are using the bash time builtin, so setting TIMEFORMAT will +# control the output format of the time builtin. For strange use of +# { ; } here, see: https://stackoverflow.com/a/13356654 +set +x +{ TIMEFORMAT="%0R" ; time nbdkit --filter=rate memory size=25M rate=10M --run 'qemu-img convert -p $nbd rate.img' 2>rate.err ; } 2>rate.time +set -x + +cat rate.err ||: + +seconds="$( cat rate.time )" +if [ "$seconds" -lt 20 ]; then + echo "$0: rate filter failed: command took $seconds seconds, expected > 20" + exit 1 +fi -- 2.20.1
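The 20 second figure in test-rate.sh comes from dividing the number of
bits to transfer by the rate.  The sketch below restates that
arithmetic as a small C helper.  The function name and the
initial_tokens parameter are hypothetical (not part of the patch); the
parameter is there because the lower bound depends on how many tokens
the bucket holds when the transfer begins.  In the patch as posted,
init_bucket appears to assign the capacity argument (2.0, in seconds)
to level rather than the computed token capacity, so buckets would
start almost empty, which is consistent with expecting 20 seconds.  If
buckets instead started with 2 rate-equivalent seconds of tokens, the
same arithmetic would give 18.

```c
#include <stdint.h>

/* Lower bound, in whole seconds, on the time needed to pass `bytes`
 * bytes through a limiter refilling at `rate` bits per second, given
 * `initial_tokens` bits already in the bucket.  Hypothetical helper.
 */
uint64_t
min_transfer_seconds (uint64_t bytes, uint64_t rate, uint64_t initial_tokens)
{
  uint64_t bits = bytes * 8;

  if (rate == 0)                /* rate 0 means no limit */
    return 0;
  if (bits <= initial_tokens)   /* the initial burst covers everything */
    return 0;
  return (bits - initial_tokens) / rate;
}
```

With size=25M and rate=10M (both using 1024-based multipliers, as in
the comment in the test), an empty starting bucket gives 20 and a
bucket pre-filled with 2 seconds' worth of tokens gives 18.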
Eric Blake
2019-Mar-08 12:38 UTC
Re: [Libguestfs] [PATCH nbdkit] Add new filter for rate-limiting connections.
On 3/5/19 4:38 AM, Richard W.M. Jones wrote:
> ---
>  filters/delay/nbdkit-delay-filter.pod |   4 +-
>  filters/rate/nbdkit-rate-filter.pod   |  84 +++++++++
>  configure.ac                          |   2 +
>  filters/rate/bucket.h                 |  62 +++++++
>  filters/rate/bucket.c                 | 173 +++++++++++++++++++
>  filters/rate/rate.c                   | 235 ++++++++++++++++++++++++++
>  TODO                                  |   9 +
>  filters/rate/Makefile.am              |  64 +++++++
>  tests/Makefile.am                     |   6 +-
>  tests/test-rate.sh                    |  60 +++++++
>  10 files changed, 697 insertions(+), 2 deletions(-)

I see I was too slow in reviewing this before you pushed (that's okay,
overall it looks reasonable).

> +
> +=item nbdkit --filter=rate memory size=64M connection-rate=50K rate=1M
> +
> +Limit each connection to S<50 Kbps>.  Additionally the total bandwidth
> +across all connections to the server is limited to S<1 Mbps>.

If there are more than 40 clients, does this setup fairly service all of
them at a lower per-client rate?

> +=head1 NOTES
> +
> +The rate filter approximates the bandwidth used by the NBD protocol on
> +the wire.  Some operations such as zeroing and trimming are
> +effectively free (because only a tiny NBD message is sent over the
> +network) and so do not count against the bandwidth limit.  NBD and TCP
> +protocol overhead is not included, so you may find that other tools
> +such as L<tc(8)> and L<iptables(8)> give more accurate results.
> +
> +There are separate bandwidth limits for read and write (ie. upload and
> +download to the server).
> +

Is it worth mentioning that the blocksize filter can be used to ensure
smoother spreading of the bandwidth?  (If a large client request is
broken into 128k max packets, that hits the network in a smoother
pattern than a single request).

> +++ b/TODO
> @@ -129,6 +129,15 @@ Suggestions for filters
>  * nbdkit-cache-filter should handle ENOSPC errors automatically by
>    reclaiming blocks from the cache
>
> +nbdkit-rate-filter:
> +
> +* allow other kinds of traffic shaping such as VBR
> +
> +* limit traffic per client (ie. per IP address)
> +
> +* split large requests to avoid long, lumpy sleeps when request size
> +  is much larger than rate limit

Can't you pair the blocksize filter in front of this one to accomplish
that already?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
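To put rough numbers on the "long, lumpy sleeps" point, here is a small
C sketch.  The helper names are hypothetical, and it assumes (as in the
patch) that a request of N bytes costs 8*N tokens and that the bucket
is empty when the request arrives.  It only models the arithmetic; it
does not model what the blocksize filter itself does.

```c
#include <stdint.h>

/* Seconds a request of `bytes` bytes must wait to earn its tokens at
 * `rate` bits per second, starting from an empty bucket.
 */
double
sleep_seconds (uint64_t bytes, uint64_t rate)
{
  return (double) (bytes * 8) / (double) rate;
}

/* Number of pieces when a request of `bytes` bytes is split into
 * chunks of at most `max_chunk` bytes.
 */
uint64_t
chunk_count (uint64_t bytes, uint64_t max_chunk)
{
  return (bytes + max_chunk - 1) / max_chunk;
}
```

For example, with rate=1M (1048576 bits per second) a single 32M
request would sleep once for 256 seconds, whereas the same data split
into 128k requests becomes 256 requests that each sleep for about 1
second.  The total delay is the same; only the granularity changes.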