William Dauchy
2013-Aug-05 15:13 UTC
[PATCH v4 0/3][xen-netback][toolstack] add a pseudo pps limit to netback
VM traffic is already limited by a throughput limit, but there is no control over the maximum packet per second (PPS). In DDOS attack the major issue is rather PPS than throughput. With provider offering more bandwidth to VMs, it becames easy to coordinate a massive attack using VMs. Example: 100Mbits ~ 200kpps using 64B packets. This patch provides a new option to limit VMs maximum packets per second emission rate. It follows the same credits logic used for throughput shaping. For the moment we have considered each "txreq" as a packet. PPS limits is passed to VIF at connection time via xenstore. PPS credit uses the same usecond period used by rate shaping check. known limitations: - by using the same usecond period, PPS shaping depends on throughput shaping. - it is not always true that a "txreq" correspond to a paquet (fragmentation cases) but as this shaping is meant to avoid DDOS (small paquets) such an pproximation should not impact the results. - Some help on burst handling will be appreciated. v2: - fix some typo v3: - fix some typo - add toolstack patch v4: - fix toolstack memleak Ahmed Amamou (1): xen netback: add a pseudo pps rate limit drivers/net/xen-netback/common.h | 2 ++ drivers/net/xen-netback/interface.c | 1 + drivers/net/xen-netback/netback.c | 41 +++++++++++++++++++++++++++++++++++ drivers/net/xen-netback/xenbus.c | 31 +++++++++++++++++++++----- 4 files changed, 70 insertions(+), 5 deletions(-) [toolstack] This patch will update the libxl in order to provide the new pps limit new pps limit can be defined as follow YYMb/s&XXKpps@ZZms or YYMb/s@ZZms&XXKpps or YYMb/s&XXKpps in such case default 50ms interval will be used Ahmed Amamou (2): handle pps limit parameter netif documentation docs/misc/xl-network-configuration.markdown | 18 +++++-- tools/libxl/libxl.c | 3 ++ tools/libxl/libxl_types.idl | 1 + tools/libxl/libxlu_vif.c | 70 +++++++++++++++++++++++++-- xen/include/public/io/netif.h | 27 +++++++++++ 5 files changed, 111 insertions(+), 8 deletions(-) -- 1.7.9.5
William Dauchy
2013-Aug-05 15:13 UTC
[PATCH v4 1/3] xen netback: add a pseudo pps rate limit
This patch provides a new option to limit VMs maximum packets per second emission rate. It follows the same credits logic used for throughput shaping. For the moment we have considered each "txreq" as a packet. PPS limits is passed to VIF at connection time via xenstore. PPS credit uses the same usecond period used by rate shaping check. known limitations: - by using the same usecond period, PPS shaping depends on throughput shaping. - it is not always true that a "txreq" correspond to a packet (fragmentation cases) but as this shaping is meant to avoid DDOS (small packets) such an approximation should not impact the results. - Some help on burst handling will be appreciated. Signed-off-by: Ahmed Amamou <ahmed@gandi.net> Signed-off-by: William Dauchy <william@gandi.net> Signed-off-by: Kamel Haddadou <kamel@gandi.net> --- drivers/net/xen-netback/common.h | 2 ++ drivers/net/xen-netback/interface.c | 1 + drivers/net/xen-netback/netback.c | 41 +++++++++++++++++++++++++++++++++++ drivers/net/xen-netback/xenbus.c | 35 ++++++++++++++++++++++++------ 4 files changed, 72 insertions(+), 7 deletions(-) diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 8a4d77e..e1a2d4f 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -89,8 +89,10 @@ struct xenvif { /* Transmit shaping: allow ''credit_bytes'' every ''credit_usec''. */ unsigned long credit_bytes; + unsigned long credit_packets; unsigned long credit_usec; unsigned long remaining_credit; + unsigned long remaining_packets; struct timer_list credit_timeout; /* Statistics */ diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index 087d2db..43c2da7 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -295,6 +295,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, INIT_LIST_HEAD(&vif->notify_list); vif->credit_bytes = vif->remaining_credit = ~0UL; + vif->credit_packets = vif->remaining_packets = ~0UL; vif->credit_usec = 0UL; init_timer(&vif->credit_timeout); /* Initialize ''expires'' now: it''s used to track the credit window. */ diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 64828de..6ab3cb2 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -912,10 +912,16 @@ static void tx_add_credit(struct xenvif *vif) vif->remaining_credit = min(max_credit, max_burst); } +static void tx_add_packets(struct xenvif *vif) +{ + vif->remaining_packets = vif->credit_packets; +} + static void tx_credit_callback(unsigned long data) { struct xenvif *vif = (struct xenvif *)data; tx_add_credit(vif); + tx_add_packets(vif); xen_netbk_check_rx_xenvif(vif); } @@ -1426,6 +1432,34 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size) return false; } +static bool tx_packets_exceeded(struct xenvif *vif) +{ + unsigned long now = jiffies; + unsigned long next_credit + vif->credit_timeout.expires + + msecs_to_jiffies(vif->credit_usec / 1000); + + /* Timer could already be pending in rare cases. */ + if (timer_pending(&vif->credit_timeout)) + return true; + + /* Passed the point where we can replenish credit? */ + if (time_after_eq(now, next_credit)) { + vif->credit_timeout.expires = now; + tx_add_packets(vif); + } + + /* Not enough slot to send right now? Set a callback. */ + if (vif->remaining_packets < 1) { + vif->credit_timeout.data = (unsigned long)vif; + vif->credit_timeout.function = tx_credit_callback; + mod_timer(&vif->credit_timeout, next_credit); + return true; + } + + return false; +} + static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) { struct gnttab_copy *gop = netbk->tx_copy_ops, *request_gop; @@ -1477,6 +1511,12 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) rmb(); /* Ensure that we see the request before we copy it. */ memcpy(&txreq, RING_GET_REQUEST(&vif->tx, idx), sizeof(txreq)); + /* pps-based scheduling. */ + if(vif->remaining_packets < 1 && tx_packets_exceeded(vif)) { + xenvif_put(vif); + continue; + } + /* Credit-based scheduling. */ if (txreq.size > vif->remaining_credit && tx_credit_exceeded(vif, txreq.size)) { @@ -1485,6 +1525,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) } vif->remaining_credit -= txreq.size; + vif->remaining_packets--; work_to_do--; vif->tx.req_cons = ++idx; diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c index 1fe48fe3..2b52a09 100644 --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -276,15 +276,18 @@ static void frontend_changed(struct xenbus_device *dev, static void xen_net_read_rate(struct xenbus_device *dev, - unsigned long *bytes, unsigned long *usec) + unsigned long *bytes, + unsigned long *packets, + unsigned long *usec) { char *s, *e; - unsigned long b, u; - char *ratestr; + unsigned long b, u, pps; + char *ratestr, *ppsstr; /* Default to unlimited bandwidth. */ *bytes = ~0UL; *usec = 0; + *packets = ~0UL; ratestr = xenbus_read(XBT_NIL, dev->nodename, "rate", NULL); if (IS_ERR(ratestr)) @@ -293,22 +296,39 @@ static void xen_net_read_rate(struct xenbus_device *dev, s = ratestr; b = simple_strtoul(s, &e, 10); if ((s == e) || (*e != '','')) - goto fail; + goto fail_ratestr; s = e + 1; u = simple_strtoul(s, &e, 10); if ((s == e) || (*e != ''\0'')) - goto fail; + goto fail_ratestr; *bytes = b; *usec = u; + ppsstr = xenbus_read(XBT_NIL, dev->nodename, "pps", NULL); + if (IS_ERR(ppsstr)) + return; + s = ppsstr; + pps = simple_strtoul(s, &e, 10); + if ((s == e) || (*e != ''\0'')) + goto fail_ppsstr; + *packets = pps; + kfree(ratestr); + kfree(ppsstr); return; - fail: + fail_ppsstr: + pr_warn("Failed to parse network PPS limit. PPS unlimited.\n"); + kfree(ppsstr); + goto free_ratestr; + + fail_ratestr: pr_warn("Failed to parse network rate limit. Traffic unlimited.\n"); + free_ratestr: kfree(ratestr); + return; } static int xen_net_read_mac(struct xenbus_device *dev, u8 mac[]) @@ -379,8 +399,9 @@ static void connect(struct backend_info *be) } xen_net_read_rate(dev, &be->vif->credit_bytes, - &be->vif->credit_usec); + &be->vif->credit_packets, &be->vif->credit_usec); be->vif->remaining_credit = be->vif->credit_bytes; + be->vif->remaining_packets = be->vif->credit_packets; unregister_hotplug_status_watch(be); err = xenbus_watch_pathfmt(dev, &be->hotplug_status_watch, -- 1.7.9.5
adapt libxl to handle pps limit parameter the new pps limit can be defined using a ''&'' symbol after the rate limit for example: YYMb/s&XXKpps@ZZms or YYMb/s@ZZms&XXKpps or YYMb/s&XXKpps in such case default 50ms interval will be used Signed-off-by: Ahmed Amamou <ahmed@gandi.net> Signed-off-by: William Dauchy <william@gandi.net> Signed-off-by: Kamel Haddadou <kamel@gandi.net> --- tools/libxl/libxl.c | 3 ++ tools/libxl/libxl_types.idl | 1 + tools/libxl/libxlu_vif.c | 70 +++++++++++++++++++++++++++++++++++++++++-- 3 files changed, 71 insertions(+), 3 deletions(-) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index 3236aa9..11572bc 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -2920,6 +2920,9 @@ void libxl__device_nic_add(libxl__egc *egc, uint32_t domid, flexarray_append(back, libxl__sprintf(gc, "%"PRIu64",%"PRIu32"", nic->rate_bytes_per_interval, nic->rate_interval_usecs)); + flexarray_append(back, "pps"); + flexarray_append(back, libxl__sprintf(gc, "%"PRIu64"", + nic->rate_packets_per_interval)); } flexarray_append(back, "bridge"); diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index d218a2d..be2ae94 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -390,6 +390,7 @@ libxl_device_nic = Struct("device_nic", [ ("script", string), ("nictype", libxl_nic_type), ("rate_bytes_per_interval", uint64), + ("rate_packets_per_interval", uint64), ("rate_interval_usecs", uint32), ("gatewaydev", string), ]) diff --git a/tools/libxl/libxlu_vif.c b/tools/libxl/libxlu_vif.c index 3b3de0f..3d491d8 100644 --- a/tools/libxl/libxlu_vif.c +++ b/tools/libxl/libxlu_vif.c @@ -3,6 +3,7 @@ static const char *vif_bytes_per_sec_re = "^[0-9]+[GMK]?[Bb]/s$"; static const char *vif_internal_usec_re = "^[0-9]+[mu]?s?$"; +static const char *vif_packets_per_sec_re = "^[0-9]+[GMK]?pps$"; static void xlu__vif_err(XLU_Config *cfg, const char *msg, const char *rate) { fprintf(cfg->report, @@ -49,6 +50,43 @@ out: return rc; } +static int vif_parse_rate_packets_per_sec(XLU_Config *cfg, const char *packet, + uint64_t *packets_per_sec) +{ + regex_t rec; + uint64_t tmp = 0; + const char *p; + int rc = 0; + + regcomp(&rec, vif_packets_per_sec_re, REG_EXTENDED|REG_NOSUB); + if (regexec(&rec, packet, 0, NULL, 0)) { + xlu__vif_err(cfg, "invalid pps", packet); + rc = EINVAL; + goto out; + } + + p = packet; + tmp = strtoull(p, (char**)&p, 0); + if (tmp == 0 || tmp > UINT32_MAX || errno == ERANGE) { + xlu__vif_err(cfg, "pps overflow", packet); + rc = EOVERFLOW; + goto out; + } + + if (*p == ''G'') + tmp *= 1000 * 1000 * 1000; + else if (*p == ''M'') + tmp *= 1000 * 1000; + else if (*p == ''K'') + tmp *= 1000; + + *packets_per_sec = tmp; + +out: + regfree(&rec); + return rc; +} + static int vif_parse_rate_interval_usecs(XLU_Config *cfg, const char *interval, uint32_t *interval_usecs) { @@ -94,22 +132,35 @@ int xlu_vif_parse_rate(XLU_Config *cfg, const char *rate, libxl_device_nic *nic) { uint64_t bytes_per_sec = 0; uint64_t bytes_per_interval = 0; + uint64_t packets_per_sec = 0; + uint64_t packets_per_interval = 0; uint32_t interval_usecs = 50000UL; /* Default to 50ms */ - char *ratetok, *tmprate; + char *ratetok, *tmprate, *tmp_pps, *tmpint; int rc = 0; + /* rate string need to be duplicated because strtok may change it */ tmprate = strdup(rate); + tmp_pps = strdup(rate); + tmpint = strdup(rate); + if (!strcmp(tmprate,"")) { xlu__vif_err(cfg, "no rate specified", rate); rc = EINVAL; goto out; } - ratetok = strtok(tmprate, "@"); + /* accepted rate string are as follow: + * rate&pps@interval or rate@interval&pps + */ + + /* ratetok contains the first token */ + ratetok = strtok(tmprate, "@&"); rc = vif_parse_rate_bytes_per_sec(cfg, ratetok, &bytes_per_sec); if (rc) goto out; - ratetok = strtok(NULL, "@"); + ratetok = strtok(tmpint, "@"); + ratetok = strtok(NULL, "@&"); + /* ratetok contains the first token following the ''@'' */ if (ratetok != NULL) { rc = vif_parse_rate_interval_usecs(cfg, ratetok, &interval_usecs); if (rc) goto out; @@ -121,14 +172,27 @@ int xlu_vif_parse_rate(XLU_Config *cfg, const char *rate, libxl_device_nic *nic) goto out; } + ratetok = strtok(tmp_pps, "&"); + ratetok = strtok(NULL, "&@"); + /* ratetok contains the first token following the ''&'' */ + if (ratetok != NULL) { + rc = vif_parse_rate_packets_per_sec(cfg, ratetok, &packets_per_sec); + if (rc) goto out; + } + bytes_per_interval (((uint64_t) bytes_per_sec * (uint64_t) interval_usecs) / 1000000UL); + packets_per_interval + (((uint64_t) packets_per_sec * (uint64_t) interval_usecs) / 1000000UL); nic->rate_interval_usecs = interval_usecs; nic->rate_bytes_per_interval = bytes_per_interval; + nic->rate_packets_per_interval = packets_per_interval; out: free(tmprate); + free(tmp_pps); + free(tmpint); return rc; } -- 1.7.9.5
add netif old rate limit documentation add new pps limit documentation Signed-off-by: Ahmed Amamou <ahmed@gandi.net> Signed-off-by: William Dauchy <william@gandi.net> Signed-off-by: Kamel Haddadou <kamel@gandi.net> --- docs/misc/xl-network-configuration.markdown | 18 +++++++++++++----- xen/include/public/io/netif.h | 27 +++++++++++++++++++++++++++ 2 files changed, 40 insertions(+), 5 deletions(-) diff --git a/docs/misc/xl-network-configuration.markdown b/docs/misc/xl-network-configuration.markdown index e0d3d2a..1f4ffff 100644 --- a/docs/misc/xl-network-configuration.markdown +++ b/docs/misc/xl-network-configuration.markdown @@ -141,25 +141,33 @@ domain which is outside the scope of this document. Specifies the rate at which the outgoing traffic will be limited to. The default if this keyword is not specified is unlimited. -The rate may be specified as "<RATE>/s" or optionally "<RATE>/s@<INTERVAL>". +The rate may be specified as "<RATE>/s" or optionally +"<RATE>/s@<INTERVAL>" or "<RATE>/s&<PPS>pps" or "<RATE>/s&<PPS>pps@<INTERVAL>". * `RATE` is in bytes and can accept suffixes: * GB, MB, KB, B for bytes. * Gb, Mb, Kb, b for bits. + + * `PPS` is in packet and can accept suffixes: + * Gpps, Mpps, Kpps, pps. + It determines the packets per second that outgoing traffic will be + limited to. + * `INTERVAL` is in microseconds and can accept suffixes: ms, us, s. It determines the frequency at which the vif transmission credit is replenished. The default is 50ms. -Vif rate limiting is credit-based. It means that for "1MB/s@20ms", the +Vif rate limiting is credit-based. It means that for "1MB/s&50Kpps@20ms", the available credit will be equivalent of the traffic you would have done -at "1MB/s" during 20ms. This will results in a credit of 20,000 bytes -replenished every 20,000 us. +at "1MB/s" or 50,000 pps during 20ms. This will results in a credit of 20,000 +bytes or 1000 packets replenished every 20,000 us. For example: ''rate=10Mb/s'' -- meaning up to 10 megabits every second ''rate=250KB/s'' -- meaning up to 250 kilobytes every second - ''rate=1MB/s@20ms'' -- meaning 20,000 bytes in every 20 millisecond period + ''rate=1MB/s&10Kpps@20ms'' -- meaning 20,000 bytes or 200 packets in +every 20 millisecond period NOTE: The actual underlying limits of rate limiting are dependent on the underlying netback implementation. diff --git a/xen/include/public/io/netif.h b/xen/include/public/io/netif.h index d477751..8bd112f 100644 --- a/xen/include/public/io/netif.h +++ b/xen/include/public/io/netif.h @@ -79,6 +79,33 @@ * Request N: netif_tx_request -- 0 */ +/* + * ------------------------- Backend Device Properties ------------------------- + * + * interval + * Values: <uint64_t> + * Default Value: 0 + * + * Used time interval in usecond for throughput and pps limits. The + * same interval is used for both of them in order to simplify the + * implementation. Using the same check interval also avoid possible bugs + * that invole bypassing these limits + * + * rate + * Values: <uint64_t> + * Default Value: ~0 + * + * The netback reading rate in bytes from the shared ring. This rate is + * represented in bytes per interval. + * + * pps + * Values: <uint64_t> + * Default Value: ~0 + * + * The netback reading rate in tx slots from the shared ring.This rate is + * represented in request per interval. + */ + /* Protocol checksum field is blank in the packet (hardware offload)? */ #define _NETTXF_csum_blank (0) #define NETTXF_csum_blank (1U<<_NETTXF_csum_blank) -- 1.7.9.5
Suggest adding "libxl: " prefix in subject line. Plus, you should CC Ian Jackson for toolstack patches (which I''ve already done for this mail). On Mon, Aug 05, 2013 at 05:13:09PM +0200, William Dauchy wrote:> adapt libxl to handle pps limit parameter > the new pps limit can be defined using a ''&'' symbol after > the rate limit for example: > YYMb/s&XXKpps@ZZms > or > YYMb/s@ZZms&XXKpps > or > YYMb/s&XXKpps in such case default 50ms interval will be used >One question I need to ask is that is it possible for user to only specify PPS? From previous email and the code below it doesn''t seem to allow users to do so. I know rate and PPS use the same interval but that doesn''t mean that the later depend on the former, right? Wei.> Signed-off-by: Ahmed Amamou <ahmed@gandi.net> > Signed-off-by: William Dauchy <william@gandi.net> > Signed-off-by: Kamel Haddadou <kamel@gandi.net> > --- > tools/libxl/libxl.c | 3 ++ > tools/libxl/libxl_types.idl | 1 + > tools/libxl/libxlu_vif.c | 70 +++++++++++++++++++++++++++++++++++++++++-- > 3 files changed, 71 insertions(+), 3 deletions(-) > > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c > index 3236aa9..11572bc 100644 > --- a/tools/libxl/libxl.c > +++ b/tools/libxl/libxl.c > @@ -2920,6 +2920,9 @@ void libxl__device_nic_add(libxl__egc *egc, uint32_t domid, > flexarray_append(back, libxl__sprintf(gc, "%"PRIu64",%"PRIu32"", > nic->rate_bytes_per_interval, > nic->rate_interval_usecs)); > + flexarray_append(back, "pps"); > + flexarray_append(back, libxl__sprintf(gc, "%"PRIu64"", > + nic->rate_packets_per_interval)); > } > > flexarray_append(back, "bridge"); > diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl > index d218a2d..be2ae94 100644 > --- a/tools/libxl/libxl_types.idl > +++ b/tools/libxl/libxl_types.idl > @@ -390,6 +390,7 @@ libxl_device_nic = Struct("device_nic", [ > ("script", string), > ("nictype", libxl_nic_type), > ("rate_bytes_per_interval", uint64), > + ("rate_packets_per_interval", uint64), > ("rate_interval_usecs", uint32), > ("gatewaydev", string), > ]) > diff --git a/tools/libxl/libxlu_vif.c b/tools/libxl/libxlu_vif.c > index 3b3de0f..3d491d8 100644 > --- a/tools/libxl/libxlu_vif.c > +++ b/tools/libxl/libxlu_vif.c > @@ -3,6 +3,7 @@ > > static const char *vif_bytes_per_sec_re = "^[0-9]+[GMK]?[Bb]/s$"; > static const char *vif_internal_usec_re = "^[0-9]+[mu]?s?$"; > +static const char *vif_packets_per_sec_re = "^[0-9]+[GMK]?pps$"; > > static void xlu__vif_err(XLU_Config *cfg, const char *msg, const char *rate) { > fprintf(cfg->report, > @@ -49,6 +50,43 @@ out: > return rc; > } > > +static int vif_parse_rate_packets_per_sec(XLU_Config *cfg, const char *packet, > + uint64_t *packets_per_sec) > +{ > + regex_t rec; > + uint64_t tmp = 0; > + const char *p; > + int rc = 0; > + > + regcomp(&rec, vif_packets_per_sec_re, REG_EXTENDED|REG_NOSUB); > + if (regexec(&rec, packet, 0, NULL, 0)) { > + xlu__vif_err(cfg, "invalid pps", packet); > + rc = EINVAL; > + goto out; > + } > + > + p = packet; > + tmp = strtoull(p, (char**)&p, 0); > + if (tmp == 0 || tmp > UINT32_MAX || errno == ERANGE) { > + xlu__vif_err(cfg, "pps overflow", packet); > + rc = EOVERFLOW; > + goto out; > + } > + > + if (*p == ''G'') > + tmp *= 1000 * 1000 * 1000; > + else if (*p == ''M'') > + tmp *= 1000 * 1000; > + else if (*p == ''K'') > + tmp *= 1000; > + > + *packets_per_sec = tmp; > + > +out: > + regfree(&rec); > + return rc; > +} > + > static int vif_parse_rate_interval_usecs(XLU_Config *cfg, const char *interval, > uint32_t *interval_usecs) > { > @@ -94,22 +132,35 @@ int xlu_vif_parse_rate(XLU_Config *cfg, const char *rate, libxl_device_nic *nic) > { > uint64_t bytes_per_sec = 0; > uint64_t bytes_per_interval = 0; > + uint64_t packets_per_sec = 0; > + uint64_t packets_per_interval = 0; > uint32_t interval_usecs = 50000UL; /* Default to 50ms */ > - char *ratetok, *tmprate; > + char *ratetok, *tmprate, *tmp_pps, *tmpint; > int rc = 0; > > + /* rate string need to be duplicated because strtok may change it */ > tmprate = strdup(rate); > + tmp_pps = strdup(rate); > + tmpint = strdup(rate); > + > if (!strcmp(tmprate,"")) { > xlu__vif_err(cfg, "no rate specified", rate); > rc = EINVAL; > goto out; > } > > - ratetok = strtok(tmprate, "@"); > + /* accepted rate string are as follow: > + * rate&pps@interval or rate@interval&pps > + */ > + > + /* ratetok contains the first token */ > + ratetok = strtok(tmprate, "@&"); > rc = vif_parse_rate_bytes_per_sec(cfg, ratetok, &bytes_per_sec); > if (rc) goto out; > > - ratetok = strtok(NULL, "@"); > + ratetok = strtok(tmpint, "@"); > + ratetok = strtok(NULL, "@&"); > + /* ratetok contains the first token following the ''@'' */ > if (ratetok != NULL) { > rc = vif_parse_rate_interval_usecs(cfg, ratetok, &interval_usecs); > if (rc) goto out; > @@ -121,14 +172,27 @@ int xlu_vif_parse_rate(XLU_Config *cfg, const char *rate, libxl_device_nic *nic) > goto out; > } > > + ratetok = strtok(tmp_pps, "&"); > + ratetok = strtok(NULL, "&@"); > + /* ratetok contains the first token following the ''&'' */ > + if (ratetok != NULL) { > + rc = vif_parse_rate_packets_per_sec(cfg, ratetok, &packets_per_sec); > + if (rc) goto out; > + } > + > bytes_per_interval > (((uint64_t) bytes_per_sec * (uint64_t) interval_usecs) / 1000000UL); > + packets_per_interval > + (((uint64_t) packets_per_sec * (uint64_t) interval_usecs) / 1000000UL); > > nic->rate_interval_usecs = interval_usecs; > nic->rate_bytes_per_interval = bytes_per_interval; > + nic->rate_packets_per_interval = packets_per_interval; > > out: > free(tmprate); > + free(tmp_pps); > + free(tmpint); > return rc; > } > > -- > 1.7.9.5
Wei Liu
2013-Aug-09 06:00 UTC
Re: [PATCH v4 0/3][xen-netback][toolstack] add a pseudo pps limit to netback
On Mon, Aug 05, 2013 at 05:13:07PM +0200, William Dauchy wrote:> VM traffic is already limited by a throughput limit, but there is no > control over the maximum packet per second (PPS). > In DDOS attack the major issue is rather PPS than throughput. > With provider offering more bandwidth to VMs, it becames easy to > coordinate a massive attack using VMs. Example: 100Mbits ~ 200kpps using > 64B packets. > This patch provides a new option to limit VMs maximum packets per second > emission rate. > It follows the same credits logic used for throughput shaping. For the > moment we have considered each "txreq" as a packet. > PPS limits is passed to VIF at connection time via xenstore. > PPS credit uses the same usecond period used by rate shaping check. > > known limitations: > - by using the same usecond period, PPS shaping depends on throughput > shaping. > - it is not always true that a "txreq" correspond to a paquet > (fragmentation cases) but as this shaping is meant to avoid DDOS > (small paquets) such an pproximation should not impact the results.^^^^^^^ ^ packets? extra "p"?> - Some help on burst handling will be appreciated. >Is this series RFC? I don''t see "RFC" in subject line. Do you intend to address this problem (burst handling)? Wei.> v2: > - fix some typo > > v3: > > - fix some typo > - add toolstack patch > > v4: > - fix toolstack memleak > Ahmed Amamou (1): > xen netback: add a pseudo pps rate limit > > drivers/net/xen-netback/common.h | 2 ++ > drivers/net/xen-netback/interface.c | 1 + > drivers/net/xen-netback/netback.c | 41 +++++++++++++++++++++++++++++++++++ > drivers/net/xen-netback/xenbus.c | 31 +++++++++++++++++++++----- > 4 files changed, 70 insertions(+), 5 deletions(-) > > [toolstack] > This patch will update the libxl in order to provide the new pps limit > new pps limit can be defined as follow > YYMb/s&XXKpps@ZZms > or > YYMb/s@ZZms&XXKpps > or > YYMb/s&XXKpps in such case default 50ms interval will be used > > Ahmed Amamou (2): > handle pps limit parameter > netif documentation > > docs/misc/xl-network-configuration.markdown | 18 +++++-- > tools/libxl/libxl.c | 3 ++ > tools/libxl/libxl_types.idl | 1 + > tools/libxl/libxlu_vif.c | 70 +++++++++++++++++++++++++-- > xen/include/public/io/netif.h | 27 +++++++++++ > 5 files changed, 111 insertions(+), 8 deletions(-) > > -- > 1.7.9.5
On Mon, Aug 05, 2013 at 05:13:10PM +0200, William Dauchy wrote:> add netif old rate limit documentation > add new pps limit documentation > > Signed-off-by: Ahmed Amamou <ahmed@gandi.net> > Signed-off-by: William Dauchy <william@gandi.net> > Signed-off-by: Kamel Haddadou <kamel@gandi.net> > --- > docs/misc/xl-network-configuration.markdown | 18 +++++++++++++----- > xen/include/public/io/netif.h | 27 +++++++++++++++++++++++++++ > 2 files changed, 40 insertions(+), 5 deletions(-) > > diff --git a/docs/misc/xl-network-configuration.markdown b/docs/misc/xl-network-configuration.markdown > index e0d3d2a..1f4ffff 100644 > --- a/docs/misc/xl-network-configuration.markdown > +++ b/docs/misc/xl-network-configuration.markdown > @@ -141,25 +141,33 @@ domain which is outside the scope of this document. > Specifies the rate at which the outgoing traffic will be limited to. > The default if this keyword is not specified is unlimited. > > -The rate may be specified as "<RATE>/s" or optionally "<RATE>/s@<INTERVAL>". > +The rate may be specified as "<RATE>/s" or optionally > +"<RATE>/s@<INTERVAL>" or "<RATE>/s&<PPS>pps" or "<RATE>/s&<PPS>pps@<INTERVAL>". > > * `RATE` is in bytes and can accept suffixes: > * GB, MB, KB, B for bytes. > * Gb, Mb, Kb, b for bits. > + > + * `PPS` is in packet and can accept suffixes: > + * Gpps, Mpps, Kpps, pps. > + It determines the packets per second that outgoing traffic will be > + limited to. > +This documentation doesn''t match the fact. Actually the "packet" here is just proximation. You should probably describe the limitation / assumption as well.> * `INTERVAL` is in microseconds and can accept suffixes: ms, us, s. > It determines the frequency at which the vif transmission credit > is replenished. The default is 50ms. > > -Vif rate limiting is credit-based. It means that for "1MB/s@20ms", the > +Vif rate limiting is credit-based. It means that for "1MB/s&50Kpps@20ms", theLine too long. Please align with previous line.> available credit will be equivalent of the traffic you would have done > -at "1MB/s" during 20ms. This will results in a credit of 20,000 bytes > -replenished every 20,000 us. > +at "1MB/s" or 50,000 pps during 20ms. This will results in a credit of 20,000Ditto.> +bytes or 1000 packets replenished every 20,000 us. > > For example: > > ''rate=10Mb/s'' -- meaning up to 10 megabits every second > ''rate=250KB/s'' -- meaning up to 250 kilobytes every second > - ''rate=1MB/s@20ms'' -- meaning 20,000 bytes in every 20 millisecond period > + ''rate=1MB/s&10Kpps@20ms'' -- meaning 20,000 bytes or 200 packets in > +every 20 millisecond periodShould indent this line with "meaning".> > NOTE: The actual underlying limits of rate limiting are dependent > on the underlying netback implementation. > diff --git a/xen/include/public/io/netif.h b/xen/include/public/io/netif.h > index d477751..8bd112f 100644 > --- a/xen/include/public/io/netif.h > +++ b/xen/include/public/io/netif.h > @@ -79,6 +79,33 @@ > * Request N: netif_tx_request -- 0 > */ > > +/* > + * ------------------------- Backend Device Properties ------------------------- > + * > + * interval > + * Values: <uint64_t> > + * Default Value: 0 > + * > + * Used time interval in usecond for throughput and pps limits. The > + * same interval is used for both of them in order to simplify theWhat is "them" referring to?> + * implementation. Using the same check interval also avoid possible bugs > + * that invole bypassing these limits^^^^^^ typo?> + * > + * rate > + * Values: <uint64_t> > + * Default Value: ~0 > + * > + * The netback reading rate in bytes from the shared ring. This rate is > + * represented in bytes per interval. > + * > + * pps > + * Values: <uint64_t> > + * Default Value: ~0 > + * > + * The netback reading rate in tx slots from the shared ring.This rate isNeed a space before "This". Also this line exceeds 80 characters.> + * represented in request per interval.This is rather confusing, because 1) pps is packets per second, but you use "request per interval" here 2) "request" is just too generic (should use netif_tx_* instead) Wei.> + */ > + > /* Protocol checksum field is blank in the packet (hardware offload)? */ > #define _NETTXF_csum_blank (0) > #define NETTXF_csum_blank (1U<<_NETTXF_csum_blank) > -- > 1.7.9.5
On Mon, Aug 05, 2013 at 05:13:08PM +0200, William Dauchy wrote:> This patch provides a new option to limit VMs maximum packets per second > emission rate. > It follows the same credits logic used for throughput shaping. For the > moment we have considered each "txreq" as a packet. > PPS limits is passed to VIF at connection time via xenstore. > PPS credit uses the same usecond period used by rate shaping check. > > known limitations: > - by using the same usecond period, PPS shaping depends on throughput > shaping. > - it is not always true that a "txreq" correspond to a packet > (fragmentation cases) but as this shaping is meant to avoid DDOS > (small packets) such an approximation should not impact the results. > - Some help on burst handling will be appreciated. > > Signed-off-by: Ahmed Amamou <ahmed@gandi.net> > Signed-off-by: William Dauchy <william@gandi.net> > Signed-off-by: Kamel Haddadou <kamel@gandi.net> > --- > drivers/net/xen-netback/common.h | 2 ++ > drivers/net/xen-netback/interface.c | 1 + > drivers/net/xen-netback/netback.c | 41 +++++++++++++++++++++++++++++++++++ > drivers/net/xen-netback/xenbus.c | 35 ++++++++++++++++++++++++------ > 4 files changed, 72 insertions(+), 7 deletions(-) > > diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h > index 8a4d77e..e1a2d4f 100644 > --- a/drivers/net/xen-netback/common.h > +++ b/drivers/net/xen-netback/common.h > @@ -89,8 +89,10 @@ struct xenvif { > > /* Transmit shaping: allow ''credit_bytes'' every ''credit_usec''. */ > unsigned long credit_bytes; > + unsigned long credit_packets; > unsigned long credit_usec; > unsigned long remaining_credit; > + unsigned long remaining_packets; > struct timer_list credit_timeout; > > /* Statistics */ > diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c > index 087d2db..43c2da7 100644 > --- a/drivers/net/xen-netback/interface.c > +++ b/drivers/net/xen-netback/interface.c > @@ -295,6 +295,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, > INIT_LIST_HEAD(&vif->notify_list); > > vif->credit_bytes = vif->remaining_credit = ~0UL; > + vif->credit_packets = vif->remaining_packets = ~0UL; > vif->credit_usec = 0UL; > init_timer(&vif->credit_timeout); > /* Initialize ''expires'' now: it''s used to track the credit window. */ > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c > index 64828de..6ab3cb2 100644 > --- a/drivers/net/xen-netback/netback.c > +++ b/drivers/net/xen-netback/netback.c > @@ -912,10 +912,16 @@ static void tx_add_credit(struct xenvif *vif) > vif->remaining_credit = min(max_credit, max_burst); > } > > +static void tx_add_packets(struct xenvif *vif) > +{ > + vif->remaining_packets = vif->credit_packets; > +} > + > static void tx_credit_callback(unsigned long data) > { > struct xenvif *vif = (struct xenvif *)data; > tx_add_credit(vif); > + tx_add_packets(vif); > xen_netbk_check_rx_xenvif(vif); > } > > @@ -1426,6 +1432,34 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size) > return false; > } > > +static bool tx_packets_exceeded(struct xenvif *vif) > +{ > + unsigned long now = jiffies; > + unsigned long next_credit > + vif->credit_timeout.expires + > + msecs_to_jiffies(vif->credit_usec / 1000); > + > + /* Timer could already be pending in rare cases. */ > + if (timer_pending(&vif->credit_timeout)) > + return true; > + > + /* Passed the point where we can replenish credit? */ > + if (time_after_eq(now, next_credit)) { > + vif->credit_timeout.expires = now; > + tx_add_packets(vif); > + } > + > + /* Not enough slot to send right now? Set a callback. */ > + if (vif->remaining_packets < 1) { > + vif->credit_timeout.data = (unsigned long)vif; > + vif->credit_timeout.function = tx_credit_callback; > + mod_timer(&vif->credit_timeout, next_credit); > + return true; > + } > + > + return false; > +} > + > static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) > { > struct gnttab_copy *gop = netbk->tx_copy_ops, *request_gop; > @@ -1477,6 +1511,12 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) > rmb(); /* Ensure that we see the request before we copy it. */ > memcpy(&txreq, RING_GET_REQUEST(&vif->tx, idx), sizeof(txreq)); > > + /* pps-based scheduling. */ > + if(vif->remaining_packets < 1 && tx_packets_exceeded(vif)) { > + xenvif_put(vif); > + continue; > + } > +Could we move this before memcpy, as this check doesn''t seem to rely on the content of txreq.> /* Credit-based scheduling. */ > if (txreq.size > vif->remaining_credit && > tx_credit_exceeded(vif, txreq.size)) { > @@ -1485,6 +1525,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk) > } > > vif->remaining_credit -= txreq.size; > + vif->remaining_packets--; > > work_to_do--; > vif->tx.req_cons = ++idx; > diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c > index 1fe48fe3..2b52a09 100644 > --- a/drivers/net/xen-netback/xenbus.c > +++ b/drivers/net/xen-netback/xenbus.c > @@ -276,15 +276,18 @@ static void frontend_changed(struct xenbus_device *dev, > > > static void xen_net_read_rate(struct xenbus_device *dev, > - unsigned long *bytes, unsigned long *usec) > + unsigned long *bytes, > + unsigned long *packets, > + unsigned long *usec) > { > char *s, *e; > - unsigned long b, u; > - char *ratestr; > + unsigned long b, u, pps; > + char *ratestr, *ppsstr; > > /* Default to unlimited bandwidth. */ > *bytes = ~0UL; > *usec = 0; > + *packets = ~0UL; > > ratestr = xenbus_read(XBT_NIL, dev->nodename, "rate", NULL); > if (IS_ERR(ratestr)) > @@ -293,22 +296,39 @@ static void xen_net_read_rate(struct xenbus_device *dev, > s = ratestr; > b = simple_strtoul(s, &e, 10); > if ((s == e) || (*e != '','')) > - goto fail; > + goto fail_ratestr; > > s = e + 1; > u = simple_strtoul(s, &e, 10); > if ((s == e) || (*e != ''\0'')) > - goto fail; > + goto fail_ratestr; > > *bytes = b; > *usec = u; > > + ppsstr = xenbus_read(XBT_NIL, dev->nodename, "pps", NULL); > + if (IS_ERR(ppsstr)) > + return; > + s = ppsstr; > + pps = simple_strtoul(s, &e, 10); > + if ((s == e) || (*e != ''\0'')) > + goto fail_ppsstr; > + *packets = pps; > + > kfree(ratestr); > + kfree(ppsstr); > return; > > - fail: > + fail_ppsstr: > + pr_warn("Failed to parse network PPS limit. PPS unlimited.\n"); > + kfree(ppsstr); > + goto free_ratestr; > + > + fail_ratestr: > pr_warn("Failed to parse network rate limit. Traffic unlimited.\n"); > + free_ratestr: > kfree(ratestr); > + return;No need to return here. :-) Wei.> } > > static int xen_net_read_mac(struct xenbus_device *dev, u8 mac[]) > @@ -379,8 +399,9 @@ static void connect(struct backend_info *be) > } > > xen_net_read_rate(dev, &be->vif->credit_bytes, > - &be->vif->credit_usec); > + &be->vif->credit_packets, &be->vif->credit_usec); > be->vif->remaining_credit = be->vif->credit_bytes; > + be->vif->remaining_packets = be->vif->credit_packets; > > unregister_hotplug_status_watch(be); > err = xenbus_watch_pathfmt(dev, &be->hotplug_status_watch, > -- > 1.7.9.5
Wei Liu writes ("Re: [PATCH v4 2/3] handle pps limit parameter"):> Suggest adding "libxl: " prefix in subject line.Yes.> Plus, you should CC Ian Jackson for toolstack patches (which I''ve > already done for this mail).Thanks. Did this patch include a change to the documentation ? (I don''t know if Wei trimmed the patch.) It introduces a new syntax. The new syntax is rather exciting. Is it used anywhere else ? What do other tools do ?> One question I need to ask is that is it possible for user to only > specify PPS? From previous email and the code below it doesn''t seem to > allow users to do so. I know rate and PPS use the same interval but that > doesn''t mean that the later depend on the former, right?Good question. Also, I have some comments about the implementation: This patch seems to involve a lot of duplication of existing code. I''m afraid I''ll have to ask you to integrate it into the existing code, factoring out common functionality as necessary. The mixture of parsing with regexps and strtok was less than ideal to start with, but it becomes worse with the new syntax complexity. If this syntax is really the one we want, I''d suggest looking again at the possibility of using flex for part of it. flex allows the syntax to be described using an interleaved mixture of regexps and C code. Ian.