Hi all:

This is a new version of the attempt to enable tx interrupts for
virtio-net.

We used to avoid tx interrupts and orphan packets before transmission
for virtio-net. This breaks socket accounting and can lead to several
other side effects, e.g.:

- Features which depend on socket accounting cannot work correctly
  (e.g. TCP Small Queues).
- No tx completion, which means BQL and the packet generator cannot
  work correctly.

This series tries to solve the issue by enabling tx interrupts. To
minimize the performance impact, several optimizations are used:

- On the guest side, use delayed callbacks as much as possible.
- On the host side, use interrupt coalescing to reduce the number of
  interrupts. This improves performance by about 10%-15%.

Performance tests show:

- A few regressions (10%-15%) were noticed for TCP_RR; they were not
  seen in the previous version, and the reason is still not clear.
- CPU utilization is increased in some cases.
- In all other cases, tx interrupts perform equal to or better than
  orphaning, especially for small packet tx.

TODO:
- Try to fix the TCP_RR regressions.
- Determine suitable coalescing parameters.

Test Environment:
- Two Intel Xeon E5620 @ 2.40GHz machines, connected back to back
  through Intel 82599EB NICs
- Both host and guest were 4.1-rc4
- Vhost zerocopy disabled
- idle=poll
- Netperf 2.6.0
- tx-frames=8 tx-usecs=64 (chosen as the best performing combination
  among those tested)
- irqbalance disabled on the host; smp affinity set manually
- Default ixgbe coalescing parameters

Test Results:

1 VCPU guest 1 Queue

Guest TX
size/session/+thu%/+normalize%
   64/ 1/   +22%/  +23%
   64/ 2/   +25%/  +26%
   64/ 4/   +24%/  +24%
   64/ 8/   +24%/  +25%
  256/ 1/  +134%/ +141%
  256/ 2/  +126%/ +132%
  256/ 4/  +126%/ +134%
  256/ 8/  +130%/ +135%
  512/ 1/  +157%/ +170%
  512/ 2/  +155%/ +169%
  512/ 4/  +153%/ +168%
  512/ 8/  +162%/ +176%
 1024/ 1/   +84%/ +119%
 1024/ 2/  +120%/ +146%
 1024/ 4/  +105%/ +131%
 1024/ 8/  +103%/ +134%
 2048/ 1/   +20%/  +97%
 2048/ 2/   +29%/  +76%
 2048/ 4/     0%/  +11%
 2048/ 8/     0%/   +3%
16384/ 1/     0%/   -5%
16384/ 2/     0%/  -10%
16384/ 4/     0%/   -3%
16384/ 8/     0%/    0%
65535/ 1/     0%/  -10%
65535/ 2/     0%/   -5%
65535/ 4/     0%/   -3%
65535/ 8/     0%/   -5%

TCP_RR
size/session/+thu%/+normalize%
    1/  1/     0%/   -9%
    1/ 25/    -5%/   -5%
    1/ 50/    -4%/   -3%
   64/  1/     0%/   -7%
   64/ 25/    -5%/   -6%
   64/ 50/    -5%/   -6%
  256/  1/     0%/   -6%
  256/ 25/   -14%/  -14%
  256/ 50/   -14%/  -14%

Guest RX
size/session/+thu%/+normalize%
   64/ 1/     0%/   -1%
   64/ 2/    +3%/   +3%
   64/ 4/     0%/   -1%
   64/ 8/     0%/    0%
  256/ 1/    +5%/   +1%
  256/ 2/    -9%/  -13%
  256/ 4/     0%/   -2%
  256/ 8/     0%/   -3%
  512/ 1/    +1%/   -2%
  512/ 2/    -3%/   -6%
  512/ 4/     0%/   -3%
  512/ 8/     0%/   -1%
 1024/ 1/   +11%/  +16%
 1024/ 2/     0%/   -3%
 1024/ 4/     0%/   -2%
 1024/ 8/     0%/   -1%
 2048/ 1/     0%/   -3%
 2048/ 2/     0%/   -1%
 2048/ 4/     0%/   -1%
 2048/ 8/     0%/   -2%
16384/ 1/     0%/   -2%
16384/ 2/     0%/   -4%
16384/ 4/     0%/   -3%
16384/ 8/     0%/   -3%
65535/ 1/     0%/   -2%
65535/ 2/     0%/   -5%
65535/ 4/     0%/   -1%
65535/ 8/    +1%/    0%

4 VCPU guest 4 QUEUE

Guest TX
size/session/+thu%/+normalize%
   64/ 1/   +42%/  +38%
   64/ 2/   +33%/  +33%
   64/ 4/   +16%/  +19%
   64/ 8/   +19%/  +22%
  256/ 1/  +139%/ +134%
  256/ 2/   +43%/  +52%
  256/ 4/    +1%/   +6%
  256/ 8/     0%/   +4%
  512/ 1/  +171%/ +175%
  512/ 2/    -1%/  +26%
  512/ 4/    +9%/   +8%
  512/ 8/   +48%/  +31%
 1024/ 1/  +162%/ +171%
 1024/ 2/     0%/   +2%
 1024/ 4/    +3%/    0%
 1024/ 8/    +6%/   +2%
 2048/ 1/   +60%/  +94%
 2048/ 2/     0%/   +2%
 2048/ 4/   +23%/  +11%
 2048/ 8/    -1%/   -6%
16384/ 1/     0%/  -12%
16384/ 2/     0%/   -8%
16384/ 4/     0%/   -9%
16384/ 8/     0%/  -11%
65535/ 1/     0%/  -15%
65535/ 2/     0%/  -10%
65535/ 4/     0%/   -6%
65535/ 8/    +1%/  -10%

TCP_RR
size/session/+thu%/+normalize%
    1/  1/     0%/  -15%
    1/ 25/   -14%/   -9%
    1/ 50/    +3%/   +3%
   64/  1/    -3%/  -10%
   64/ 25/   -13%/   -4%
   64/ 50/    -7%/   -4%
  256/  1/    -1%/  -19%
  256/ 25/   -15%/   -3%
  256/ 50/   -16%/   -9%

Guest RX
size/session/+thu%/+normalize%
   64/ 1/    +4%/  +21%
   64/ 2/   +81%/ +140%
   64/ 4/   +51%/ +196%
   64/ 8/   -10%/  +33%
  256/ 1/  +139%/ +216%
  256/ 2/   +53%/ +114%
  256/ 4/    -9%/   -5%
  256/ 8/    -9%/  -14%
  512/ 1/  +257%/ +413%
  512/ 2/   +11%/  +32%
  512/ 4/    -4%/   -6%
  512/ 8/    -7%/  -10%
 1024/ 1/   +98%/ +138%
 1024/ 2/    -6%/   -9%
 1024/ 4/    -3%/   -4%
 1024/ 8/    -7%/  -10%
 2048/ 1/   +32%/  +29%
 2048/ 2/    -7%/  -14%
 2048/ 4/    -3%/   -3%
 2048/ 8/    -7%/   -3%
16384/ 1/   -13%/  -19%
16384/ 2/    -3%/   -9%
16384/ 4/    -7%/   -9%
16384/ 8/    -9%/  -10%
65535/ 1/     0%/   -3%
65535/ 2/    -2%/  -10%
65535/ 4/    -6%/  -11%
65535/ 8/    -9%/   -9%

4 VCPU Guest 4 Queue

Guest TX
size/session/+thu%/+normalize%
   64/ 1/   +33%/  +31%
   64/ 2/   +26%/  +29%
   64/ 4/   +24%/  +29%
   64/ 8/   +19%/  +24%
  256/ 1/  +117%/ +128%
  256/ 2/   +96%/ +109%
  256/ 4/  +123%/ +198%
  256/ 8/   +54%/ +111%
  512/ 1/  +153%/ +171%
  512/ 2/   +77%/ +135%
  512/ 4/     0%/  +11%
  512/ 8/     0%/   +2%
 1024/ 1/  +133%/ +156%
 1024/ 2/   +21%/  +78%
 1024/ 4/     0%/   +3%
 1024/ 8/     0%/   -7%
 2048/ 1/   +41%/  +60%
 2048/ 2/   +50%/ +153%
 2048/ 4/     0%/  -10%
 2048/ 8/    +2%/   -3%
16384/ 1/     0%/   -7%
16384/ 2/     0%/   -3%
16384/ 4/    +1%/   -9%
16384/ 8/    +4%/   -9%
65535/ 1/     0%/   -7%
65535/ 2/     0%/   -7%
65535/ 4/    +5%/   -2%
65535/ 8/     0%/   -5%

TCP_RR
size/session/+thu%/+normalize%
    1/  1/     0%/   -6%
    1/ 25/   -17%/  -15%
    1/ 50/   -24%/  -21%
   64/  1/    -1%/   -1%
   64/ 25/   -14%/  -12%
   64/ 50/   -23%/  -21%
  256/  1/     0%/  -12%
  256/ 25/    -4%/   -8%
  256/ 50/    -7%/   -8%

Guest RX
size/session/+thu%/+normalize%
   64/ 1/    +3%/   -4%
   64/ 2/   +32%/  +41%
   64/ 4/    +5%/   -3%
   64/ 8/    +7%/    0%
  256/ 1/     0%/  -10%
  256/ 2/   -15%/  -26%
  256/ 4/     0%/   -5%
  256/ 8/    -1%/  -11%
  512/ 1/    +4%/   -7%
  512/ 2/    -6%/    0%
  512/ 4/     0%/   -8%
  512/ 8/     0%/   -8%
 1024/ 1/   +71%/   -2%
 1024/ 2/    -4%/    0%
 1024/ 4/     0%/  -11%
 1024/ 8/     0%/   -9%
 2048/ 1/    -1%/   +9%
 2048/ 2/    -2%/   -2%
 2048/ 4/     0%/   -6%
 2048/ 8/     0%/  -10%
16384/ 1/     0%/   -3%
16384/ 2/     0%/  -14%
16384/ 4/     0%/  -10%
16384/ 8/    -2%/  -13%
65535/ 1/     0%/   -4%
65535/ 2/    +1%/  -16%
65535/ 4/    +1%/   -8%
65535/ 8/    +4%/   -6%

Changes from RFCv5:
- Rebase to the current HEAD.
- Move net specific code to generic virtio/vhost code.
- Drop the wrong virtqueue_enable_cb_delayed() optimization from the
  series.
- Limit the enabling of tx interrupts to hosts with interrupt
  coalescing. This reduces the performance impact on older hosts.
- Avoid an expensive division in the vhost code.
- Avoid the overhead of the timer callback by using mutex_trylock()
  and injecting the irq directly from the timer callback.

Changes from RFCv4:
- Fix the virtqueue_enable_cb_delayed() return value when only 1
  buffer is pending.
- Try to disable callbacks by publishing the event index in
  virtqueue_disable_cb(). Tests show about 2%-3% improvement on
  multiple sessions of TCP_RR.
- Revert some of Michael's tweaks from RFC v1 (see patch 3 for
  details).
- Use netif_wake_subqueue() instead of netif_start_subqueue() in
  free_old_xmit_skbs(), since it may be called in tx napi.
- In start_xmit(), try to enable the callback only when the current
  skb is the last in the list or tx has already been stopped. This
  avoids enabling callbacks under heavy load.
- Return ns instead of us in vhost_net_check_coalesce_and_signal().
- Measure the time interval of real interrupts instead of calls to
  vhost_signal().
- Drop BQL from the series since it does not affect performance in
  the test results.
Changes from RFC V3:
- Don't free tx packets in ndo_start_xmit().
- Add interrupt coalescing support for virtio-net.

Changes from RFC v2:
- Clean up code, address issues raised by Jason.

Changes from RFC v1:
- Address comments by Jason Wang, use delayed cb everywhere.
- Rebase Jason's patch on top of mine and include it (with some
  tweaks).

Jason Wang (7):
  virtio-pci: add coalescing parameters setting
  virtio_ring: try to disable event index callbacks in virtqueue_disable_cb()
  virtio-net: optimize free_old_xmit_skbs stats
  virtio-net: add basic interrupt coalescing support
  virtio_net: enable tx interrupt
  vhost: interrupt coalescing support
  vhost_net: add interrupt coalescing support

 drivers/net/virtio_net.c           | 266 +++++++++++++++++++++++++++++++------
 drivers/vhost/net.c                |   8 ++
 drivers/vhost/vhost.c              |  88 +++++++++++-
 drivers/vhost/vhost.h              |  20 +++
 drivers/virtio/virtio_pci_modern.c |  15 +++
 drivers/virtio/virtio_ring.c       |   3 +
 include/linux/virtio_config.h      |   8 ++
 include/uapi/linux/vhost.h         |  13 +-
 include/uapi/linux/virtio_pci.h    |   4 +
 include/uapi/linux/virtio_ring.h   |   1 +
 10 files changed, 382 insertions(+), 44 deletions(-)

-- 
1.8.3.1
Jason Wang
2015-May-25 05:23 UTC
[RFC V7 PATCH 1/7] virtio-pci: add coalescing parameters setting
This patch introduces a transport specific method to set the
coalescing parameters and implements it for modern virtio-pci.

Signed-off-by: Jason Wang <jasowang at redhat.com>
---
 drivers/virtio/virtio_pci_modern.c | 15 +++++++++++++++
 include/linux/virtio_config.h      |  8 ++++++++
 include/uapi/linux/virtio_pci.h    |  4 ++++
 3 files changed, 27 insertions(+)

diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index e88e099..ce801ae 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -266,6 +266,16 @@ static void vp_set_status(struct virtio_device *vdev, u8 status)
 	vp_iowrite8(status, &vp_dev->common->device_status);
 }
 
+static void vp_set_coalesce(struct virtio_device *vdev, int n,
+			    u32 coalesce_count, u32 coalesce_us)
+{
+	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+
+	iowrite16(n, &vp_dev->common->queue_select);
+	iowrite32(coalesce_count, &vp_dev->common->queue_coalesce_count);
+	iowrite32(coalesce_us, &vp_dev->common->queue_coalesce_us);
+}
+
 static void vp_reset(struct virtio_device *vdev)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
@@ -481,6 +491,7 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
 	.generation	= vp_generation,
 	.get_status	= vp_get_status,
 	.set_status	= vp_set_status,
+	.set_coalesce	= vp_set_coalesce,
 	.reset		= vp_reset,
 	.find_vqs	= vp_modern_find_vqs,
 	.del_vqs	= vp_del_vqs,
@@ -588,6 +599,10 @@ static inline void check_offsets(void)
 		     offsetof(struct virtio_pci_common_cfg, queue_used_lo));
 	BUILD_BUG_ON(VIRTIO_PCI_COMMON_Q_USEDHI !=
 		     offsetof(struct virtio_pci_common_cfg, queue_used_hi));
+	BUILD_BUG_ON(VIRTIO_PCI_COMMON_Q_COALESCE_C !=
+		     offsetof(struct virtio_pci_common_cfg, queue_coalesce_count));
+	BUILD_BUG_ON(VIRTIO_PCI_COMMON_Q_COALESCE_U !=
+		     offsetof(struct virtio_pci_common_cfg, queue_coalesce_us));
 }
 
 /* the PCI probing function */
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 1e306f7..d100c32 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -28,6 +28,12 @@
  * @set_status: write the status byte
  *	vdev: the virtio_device
  *	status: the new status byte
+ * @set_coalesce: set coalescing parameters
+ *	vdev: the virtio_device
+ *	n: the queue index
+ *	coalesce_count: maximum coalesced count before issuing an interrupt
+ *	coalesce_us: maximum microseconds to wait if there's a
+ *	pending buffer
  * @reset: reset the device
  *	vdev: the virtio device
  *	After this, status and feature negotiation must be done again
@@ -66,6 +72,8 @@ struct virtio_config_ops {
 	u32 (*generation)(struct virtio_device *vdev);
 	u8 (*get_status)(struct virtio_device *vdev);
 	void (*set_status)(struct virtio_device *vdev, u8 status);
+	void (*set_coalesce)(struct virtio_device *vdev, int n,
+			     u32 coalesce_count, u32 coalesce_us);
 	void (*reset)(struct virtio_device *vdev);
 	int (*find_vqs)(struct virtio_device *, unsigned nvqs,
 			struct virtqueue *vqs[],
diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
index 7530146..3396026 100644
--- a/include/uapi/linux/virtio_pci.h
+++ b/include/uapi/linux/virtio_pci.h
@@ -155,6 +155,8 @@ struct virtio_pci_common_cfg {
 	__le32 queue_avail_hi;		/* read-write */
 	__le32 queue_used_lo;		/* read-write */
 	__le32 queue_used_hi;		/* read-write */
+	__le32 queue_coalesce_count;	/* read-write */
+	__le32 queue_coalesce_us;	/* read-write */
 };
 
 /* Macro versions of offsets for the Old Timers! */
@@ -187,6 +189,8 @@ struct virtio_pci_common_cfg {
 #define VIRTIO_PCI_COMMON_Q_AVAILHI	44
 #define VIRTIO_PCI_COMMON_Q_USEDLO	48
 #define VIRTIO_PCI_COMMON_Q_USEDHI	52
+#define VIRTIO_PCI_COMMON_Q_COALESCE_C	56
+#define VIRTIO_PCI_COMMON_Q_COALESCE_U	60
 
 #endif /* VIRTIO_PCI_NO_MODERN */
 
-- 
1.8.3.1
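As a usage sketch (not part of the patch): a driver sitting on top of
this transport op could program a queue roughly as below. The function
name and the 8-buffer/64us values are illustrative only, and a caller
must check that the transport actually implements the op, since only
modern virtio-pci does here.

/* Illustrative only: ask for an interrupt after at most 8 buffers
 * or 64us of data pending on vq index 0.
 */
static void demo_set_coalesce(struct virtio_device *vdev)
{
	if (vdev->config->set_coalesce)
		vdev->config->set_coalesce(vdev, 0, 8, 64);
}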
Jason Wang
2015-May-25 05:23 UTC
[RFC V7 PATCH 2/7] virtio_ring: try to disable event index callbacks in virtqueue_disable_cb()
Currently, we do nothing to prevent the callbacks in
virtqueue_disable_cb() when the event index is used. This may cause
spurious interrupts, which can hurt performance. This patch tries to
publish last_used_idx as the used event to suppress the callbacks.

Signed-off-by: Jason Wang <jasowang at redhat.com>
---
 drivers/virtio/virtio_ring.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 096b857..a83aebc 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -538,6 +538,7 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
 	struct vring_virtqueue *vq = to_vvq(_vq);
 
 	vq->vring.avail->flags |= cpu_to_virtio16(_vq->vdev, VRING_AVAIL_F_NO_INTERRUPT);
+	vring_used_event(&vq->vring) = cpu_to_virtio16(_vq->vdev, vq->last_used_idx);
 }
 EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
 
-- 
1.8.3.1
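For context, here is the existing event-index check the other side
performs before signalling, quoted from
include/uapi/linux/virtio_ring.h. The host raises an event only when
its used index steps onto the published event index; by pinning
used_event at the last_used_idx value current when callbacks are
disabled (which then goes stale as the host keeps producing), later
completions fall outside the trigger window and no interrupt is
raised.

/* Assuming a given event_idx value from the other side, if we have
 * just incremented index from old to new_idx, should we trigger an
 * event?  The arithmetic is done on free-running 16-bit indices.
 */
static inline int vring_need_event(__u16 event_idx, __u16 new_idx, __u16 old)
{
	return (__u16)(new_idx - event_idx - 1) < (__u16)(new_idx - old);
}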
Jason Wang
2015-May-25 05:24 UTC
[RFC V7 PATCH 3/7] virtio-net: optimize free_old_xmit_skbs stats
We already have counters for sent packets and sent bytes. Use them to
reduce the number of u64_stats_update_begin/end() calls. Take care not
to bother with the stats update when called speculatively.

Cc: Rusty Russell <rusty at rustcorp.com.au>
Cc: Michael S. Tsirkin <mst at redhat.com>
Signed-off-by: Jason Wang <jasowang at redhat.com>
Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
---
 drivers/net/virtio_net.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 63c7810..744f0b1 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -826,17 +826,27 @@ static void free_old_xmit_skbs(struct send_queue *sq)
 	unsigned int len;
 	struct virtnet_info *vi = sq->vq->vdev->priv;
 	struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
+	unsigned int packets = 0, bytes = 0;
 
 	while ((skb = virtqueue_get_buf(sq->vq, &len)) != NULL) {
 		pr_debug("Sent skb %p\n", skb);
 
-		u64_stats_update_begin(&stats->tx_syncp);
-		stats->tx_bytes += skb->len;
-		stats->tx_packets++;
-		u64_stats_update_end(&stats->tx_syncp);
+		bytes += skb->len;
+		packets++;
 
 		dev_kfree_skb_any(skb);
 	}
+
+	/* Avoid overhead when no packets have been processed; this
+	 * happens when called speculatively from start_xmit.
+	 */
+	if (!packets)
+		return;
+
+	u64_stats_update_begin(&stats->tx_syncp);
+	stats->tx_bytes += bytes;
+	stats->tx_packets += packets;
+	u64_stats_update_end(&stats->tx_syncp);
 }
 
 static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
-- 
1.8.3.1
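Why batching matters: every update_begin/update_end pair bumps the
seqcount that readers may have to retry on. A simplified sketch of the
matching reader side, along the lines of virtnet_stats() (the helper
name is illustrative):

/* Snapshot per-cpu tx stats, retrying if a writer raced with us. */
static void demo_read_tx_stats(struct virtnet_stats *stats,
			       u64 *tpackets, u64 *tbytes)
{
	unsigned int start;

	do {
		start = u64_stats_fetch_begin_irq(&stats->tx_syncp);
		*tpackets = stats->tx_packets;
		*tbytes = stats->tx_bytes;
	} while (u64_stats_fetch_retry_irq(&stats->tx_syncp, start));
}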
Jason Wang
2015-May-25 05:24 UTC
[RFC V7 PATCH 4/7] virtio-net: add basic interrupt coalescing support
This patch enables the interrupt coalescing setting through ethtool.

Cc: Rusty Russell <rusty at rustcorp.com.au>
Cc: Michael S. Tsirkin <mst at redhat.com>
Signed-off-by: Jason Wang <jasowang at redhat.com>
---
 drivers/net/virtio_net.c         | 62 ++++++++++++++++++++++++++++++++++++++++
 drivers/virtio/virtio_ring.c     |  2 ++
 include/uapi/linux/virtio_ring.h |  1 +
 3 files changed, 65 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 744f0b1..4ad739f 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -140,6 +140,14 @@ struct virtnet_info {
 
 	/* CPU hot plug notifier */
 	struct notifier_block nb;
+
+	/* Budget for polling tx completion */
+	u32 tx_work_limit;
+
+	__u32 rx_coalesce_usecs;
+	__u32 rx_max_coalesced_frames;
+	__u32 tx_coalesce_usecs;
+	__u32 tx_max_coalesced_frames;
 };
 
 struct padded_vnet_hdr {
@@ -1384,6 +1392,58 @@ static void virtnet_get_channels(struct net_device *dev,
 	channels->other_count = 0;
 }
 
+static int virtnet_set_coalesce(struct net_device *dev,
+				struct ethtool_coalesce *ec)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+	int i;
+
+	if (!vi->vdev->config->set_coalesce) {
+		dev_warn(&dev->dev, "Transport does not support coalescing.\n");
+		return -EINVAL;
+	}
+
+	if (vi->rx_coalesce_usecs != ec->rx_coalesce_usecs ||
+	    vi->rx_max_coalesced_frames != ec->rx_max_coalesced_frames) {
+		for (i = 0; i < vi->max_queue_pairs; i++) {
+			vi->vdev->config->set_coalesce(vi->vdev, rxq2vq(i),
+						       ec->rx_max_coalesced_frames,
+						       ec->rx_coalesce_usecs);
+		}
+		vi->rx_coalesce_usecs = ec->rx_coalesce_usecs;
+		vi->rx_max_coalesced_frames = ec->rx_max_coalesced_frames;
+	}
+
+	if (vi->tx_coalesce_usecs != ec->tx_coalesce_usecs ||
+	    vi->tx_max_coalesced_frames != ec->tx_max_coalesced_frames) {
+		for (i = 0; i < vi->max_queue_pairs; i++) {
+			vi->vdev->config->set_coalesce(vi->vdev, txq2vq(i),
+						       ec->tx_max_coalesced_frames,
+						       ec->tx_coalesce_usecs);
+		}
+		vi->tx_coalesce_usecs = ec->tx_coalesce_usecs;
+		vi->tx_max_coalesced_frames = ec->tx_max_coalesced_frames;
+	}
+
+	vi->tx_work_limit = ec->tx_max_coalesced_frames_irq;
+
+	return 0;
+}
+
+static int virtnet_get_coalesce(struct net_device *dev,
+				struct ethtool_coalesce *ec)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+
+	ec->rx_coalesce_usecs = vi->rx_coalesce_usecs;
+	ec->rx_max_coalesced_frames = vi->rx_max_coalesced_frames;
+	ec->tx_coalesce_usecs = vi->tx_coalesce_usecs;
+	ec->tx_max_coalesced_frames = vi->tx_max_coalesced_frames;
+	ec->tx_max_coalesced_frames_irq = vi->tx_work_limit;
+
+	return 0;
+}
+
 static const struct ethtool_ops virtnet_ethtool_ops = {
 	.get_drvinfo = virtnet_get_drvinfo,
 	.get_link = ethtool_op_get_link,
@@ -1391,6 +1451,8 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
 	.set_channels = virtnet_set_channels,
 	.get_channels = virtnet_get_channels,
 	.get_ts_info = ethtool_op_get_ts_info,
+	.set_coalesce = virtnet_set_coalesce,
+	.get_coalesce = virtnet_get_coalesce,
 };
 
 #define MIN_MTU 68
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index a83aebc..a2cdbe3 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -780,6 +780,8 @@ void vring_transport_features(struct virtio_device *vdev)
 			break;
 		case VIRTIO_RING_F_EVENT_IDX:
 			break;
+		case VIRTIO_RING_F_INTR_COALESCING:
+			break;
 		case VIRTIO_F_VERSION_1:
 			break;
 		default:
diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
index 915980a..e9756d8 100644
--- a/include/uapi/linux/virtio_ring.h
+++ b/include/uapi/linux/virtio_ring.h
@@ -58,6 +58,7 @@
 /* The Host publishes the avail index for which it expects a kick
  * at the end of the used ring. Guest should ignore the used->flags field. */
 #define VIRTIO_RING_F_EVENT_IDX		29
+#define VIRTIO_RING_F_INTR_COALESCING	31
 
 /* Virtio ring descriptors: 16 bytes.  These can chain together via "next". */
 struct vring_desc {
-- 
1.8.3.1
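For completeness, the tx-frames=8 tx-usecs=64 setting used in the
cover letter tests would reach virtnet_set_coalesce() through the
standard SIOCETHTOOL path. A minimal userspace sketch (error handling
trimmed; "eth0" is a placeholder, and a more careful program would
issue ETHTOOL_GCOALESCE first and only modify the fields it cares
about):

#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>

int main(void)
{
	struct ethtool_coalesce ec = { .cmd = ETHTOOL_SCOALESCE };
	struct ifreq ifr = { 0 };
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	ec.tx_max_coalesced_frames = 8;	/* interrupt after <= 8 packets... */
	ec.tx_coalesce_usecs = 64;	/* ...or after 64us with data pending */

	strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);
	ifr.ifr_data = (void *)&ec;
	ioctl(fd, SIOCETHTOOL, &ifr);	/* equivalent of "ethtool -C" */
	close(fd);
	return 0;
}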
Jason Wang
2015-May-25 05:24 UTC
[RFC V7 PATCH 5/7] virtio_net: enable tx interrupt
This patch enables tx interrupts for the virtio-net driver. This makes
socket accounting work again and helps to reduce bufferbloat. To limit
the performance impact, tx interrupts are only enabled on newer hosts
with interrupt coalescing support.

Signed-off-by: Jason Wang <jasowang at redhat.com>
---
 drivers/net/virtio_net.c | 214 ++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 164 insertions(+), 50 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 4ad739f..a48b1f9 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -72,6 +72,8 @@ struct send_queue {
 
 	/* Name of the send queue: output.$index */
 	char name[40];
+
+	struct napi_struct napi;
 };
 
 /* Internal representation of a receive virtqueue */
@@ -123,6 +125,9 @@ struct virtnet_info {
 	/* Host can handle any s/g split between our header and packet data */
 	bool any_header_sg;
 
+	/* Host can coalesce interrupts */
+	bool intr_coalescing;
+
 	/* Packet virtio header size */
 	u8 hdr_len;
 
@@ -215,15 +220,54 @@ static struct page *get_a_page(struct receive_queue *rq, gfp_t gfp_mask)
 	return p;
 }
 
+static unsigned int free_old_xmit_skbs(struct netdev_queue *txq,
+				       struct send_queue *sq, int budget)
+{
+	struct sk_buff *skb;
+	unsigned int len;
+	struct virtnet_info *vi = sq->vq->vdev->priv;
+	struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
+	unsigned int packets = 0, bytes = 0;
+
+	while (packets < budget &&
+	       (skb = virtqueue_get_buf(sq->vq, &len)) != NULL) {
+		pr_debug("Sent skb %p\n", skb);
+
+		bytes += skb->len;
+		packets++;
+
+		dev_kfree_skb_any(skb);
+	}
+
+	if (vi->intr_coalescing &&
+	    sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
+		netif_wake_subqueue(vi->dev, vq2txq(sq->vq));
+
+	u64_stats_update_begin(&stats->tx_syncp);
+	stats->tx_bytes += bytes;
+	stats->tx_packets += packets;
+	u64_stats_update_end(&stats->tx_syncp);
+
+	return packets;
+}
+
 static void skb_xmit_done(struct virtqueue *vq)
 {
 	struct virtnet_info *vi = vq->vdev->priv;
+	struct send_queue *sq = &vi->sq[vq2txq(vq)];
 
-	/* Suppress further interrupts. */
-	virtqueue_disable_cb(vq);
+	if (vi->intr_coalescing) {
+		if (napi_schedule_prep(&sq->napi)) {
+			virtqueue_disable_cb(sq->vq);
+			__napi_schedule(&sq->napi);
+		}
+	} else {
+		/* Suppress further interrupts. */
+		virtqueue_disable_cb(vq);
 
-	/* We were probably waiting for more output buffers. */
-	netif_wake_subqueue(vi->dev, vq2txq(vq));
+		/* We were probably waiting for more output buffers. */
+		netif_wake_subqueue(vi->dev, vq2txq(vq));
+	}
 }
 
 static unsigned int mergeable_ctx_to_buf_truesize(unsigned long mrg_ctx)
@@ -775,6 +819,30 @@ static int virtnet_poll(struct napi_struct *napi, int budget)
 	return received;
 }
 
+static int virtnet_poll_tx(struct napi_struct *napi, int budget)
+{
+	struct send_queue *sq =
+		container_of(napi, struct send_queue, napi);
+	struct virtnet_info *vi = sq->vq->vdev->priv;
+	struct netdev_queue *txq = netdev_get_tx_queue(vi->dev, vq2txq(sq->vq));
+	u32 limit = vi->tx_work_limit;
+	unsigned int r, sent;
+
+	__netif_tx_lock(txq, smp_processor_id());
+	sent = free_old_xmit_skbs(txq, sq, limit);
+	if (sent < limit) {
+		r = virtqueue_enable_cb_prepare(sq->vq);
+		napi_complete(napi);
+		if (unlikely(virtqueue_poll(sq->vq, r)) &&
+		    napi_schedule_prep(napi)) {
+			virtqueue_disable_cb(sq->vq);
+			__napi_schedule(napi);
+		}
+	}
+	__netif_tx_unlock(txq);
+	return sent < limit ? 0 : budget;
+}
+
 #ifdef CONFIG_NET_RX_BUSY_POLL
 /* must be called with local_bh_disable()d */
 static int virtnet_busy_poll(struct napi_struct *napi)
@@ -823,40 +891,12 @@ static int virtnet_open(struct net_device *dev)
 		if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
 			schedule_delayed_work(&vi->refill, 0);
 		virtnet_napi_enable(&vi->rq[i]);
+		napi_enable(&vi->sq[i].napi);
 	}
 
 	return 0;
 }
 
-static void free_old_xmit_skbs(struct send_queue *sq)
-{
-	struct sk_buff *skb;
-	unsigned int len;
-	struct virtnet_info *vi = sq->vq->vdev->priv;
-	struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
-	unsigned int packets = 0, bytes = 0;
-
-	while ((skb = virtqueue_get_buf(sq->vq, &len)) != NULL) {
-		pr_debug("Sent skb %p\n", skb);
-
-		bytes += skb->len;
-		packets++;
-
-		dev_kfree_skb_any(skb);
-	}
-
-	/* Avoid overhead when no packets have been processed
-	 * happens when called speculatively from start_xmit.
-	 */
-	if (!packets)
-		return;
-
-	u64_stats_update_begin(&stats->tx_syncp);
-	stats->tx_bytes += bytes;
-	stats->tx_packets += packets;
-	u64_stats_update_end(&stats->tx_syncp);
-}
-
 static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 {
 	struct virtio_net_hdr_mrg_rxbuf *hdr;
@@ -921,7 +961,9 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 		sg_set_buf(sq->sg, hdr, hdr_len);
 		num_sg = skb_to_sgvec(skb, sq->sg + 1, 0, skb->len) + 1;
 	}
-	return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb, GFP_ATOMIC);
+
+	return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb,
+				    GFP_ATOMIC);
 }
 
 static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
@@ -934,7 +976,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	bool kick = !skb->xmit_more;
 
 	/* Free up any pending old buffers before queueing new ones. */
-	free_old_xmit_skbs(sq);
+	free_old_xmit_skbs(txq, sq, virtqueue_get_vring_size(sq->vq));
 
 	/* timestamp packet in software */
 	skb_tx_timestamp(skb);
@@ -957,21 +999,13 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	skb_orphan(skb);
 	nf_reset(skb);
 
-	/* If running out of space, stop queue to avoid getting packets that we
-	 * are then unable to transmit.
-	 * An alternative would be to force queuing layer to requeue the skb by
-	 * returning NETDEV_TX_BUSY. However, NETDEV_TX_BUSY should not be
-	 * returned in a normal path of operation: it means that driver is not
-	 * maintaining the TX queue stop/start state properly, and causes
-	 * the stack to do a non-trivial amount of useless work.
-	 * Since most packets only take 1 or 2 ring slots, stopping the queue
-	 * early means 16 slots are typically wasted.
-	 */
+	/* Apparently nice girls don't return TX_BUSY; stop the queue
+	 * before it gets out of hand.  Naturally, this wastes entries. */
 	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
 		netif_stop_subqueue(dev, qnum);
 		if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
 			/* More just got used, free them then recheck. */
-			free_old_xmit_skbs(sq);
+			free_old_xmit_skbs(txq, sq, virtqueue_get_vring_size(sq->vq));
 			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
 				netif_start_subqueue(dev, qnum);
 				virtqueue_disable_cb(sq->vq);
@@ -985,6 +1019,50 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	return NETDEV_TX_OK;
 }
 
+static netdev_tx_t start_xmit_txintr(struct sk_buff *skb, struct net_device *dev)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+	int qnum = skb_get_queue_mapping(skb);
+	struct send_queue *sq = &vi->sq[qnum];
+	int err;
+	struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
+	bool kick = !skb->xmit_more;
+
+	virtqueue_disable_cb(sq->vq);
+
+	/* timestamp packet in software */
+	skb_tx_timestamp(skb);
+
+	/* Try to transmit */
+	err = xmit_skb(sq, skb);
+
+	/* This should not happen! */
+	if (unlikely(err)) {
+		dev->stats.tx_fifo_errors++;
+		if (net_ratelimit())
+			dev_warn(&dev->dev,
+				 "Unexpected TXQ (%d) queue failure: %d\n", qnum, err);
+		dev->stats.tx_dropped++;
+		dev_kfree_skb_any(skb);
+		return NETDEV_TX_OK;
+	}
+
+	/* Apparently nice girls don't return TX_BUSY; stop the queue
+	 * before it gets out of hand.  Naturally, this wastes entries. */
+	if (sq->vq->num_free < 2+MAX_SKB_FRAGS)
+		netif_stop_subqueue(dev, qnum);
+
+	if (kick || netif_xmit_stopped(txq)) {
+		virtqueue_kick(sq->vq);
+		if (!virtqueue_enable_cb_delayed(sq->vq) &&
+		    napi_schedule_prep(&sq->napi)) {
+			virtqueue_disable_cb(sq->vq);
+			__napi_schedule(&sq->napi);
+		}
+	}
+	return NETDEV_TX_OK;
+}
+
 /*
  * Send command via the control virtqueue and check status.  Commands
  * supported by the hypervisor, as indicated by feature bits, should
@@ -1159,8 +1237,10 @@ static int virtnet_close(struct net_device *dev)
 	/* Make sure refill_work doesn't re-enable napi! */
 	cancel_delayed_work_sync(&vi->refill);
 
-	for (i = 0; i < vi->max_queue_pairs; i++)
+	for (i = 0; i < vi->max_queue_pairs; i++) {
 		napi_disable(&vi->rq[i].napi);
+		napi_disable(&vi->sq[i].napi);
+	}
 
 	return 0;
 }
@@ -1485,6 +1565,25 @@ static const struct net_device_ops virtnet_netdev = {
 #endif
 };
 
+static const struct net_device_ops virtnet_netdev_txintr = {
+	.ndo_open            = virtnet_open,
+	.ndo_stop            = virtnet_close,
+	.ndo_start_xmit      = start_xmit_txintr,
+	.ndo_validate_addr   = eth_validate_addr,
+	.ndo_set_mac_address = virtnet_set_mac_address,
+	.ndo_set_rx_mode     = virtnet_set_rx_mode,
+	.ndo_change_mtu      = virtnet_change_mtu,
+	.ndo_get_stats64     = virtnet_stats,
+	.ndo_vlan_rx_add_vid = virtnet_vlan_rx_add_vid,
+	.ndo_vlan_rx_kill_vid = virtnet_vlan_rx_kill_vid,
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	.ndo_poll_controller = virtnet_netpoll,
+#endif
+#ifdef CONFIG_NET_RX_BUSY_POLL
+	.ndo_busy_poll       = virtnet_busy_poll,
+#endif
+};
+
 static void virtnet_config_changed_work(struct work_struct *work)
 {
 	struct virtnet_info *vi =
@@ -1531,6 +1630,7 @@ static void virtnet_free_queues(struct virtnet_info *vi)
 	for (i = 0; i < vi->max_queue_pairs; i++) {
 		napi_hash_del(&vi->rq[i].napi);
 		netif_napi_del(&vi->rq[i].napi);
+		netif_napi_del(&vi->sq[i].napi);
 	}
 
 	kfree(vi->rq);
@@ -1685,6 +1785,8 @@ static int virtnet_alloc_queues(struct virtnet_info *vi)
 		netif_napi_add(vi->dev, &vi->rq[i].napi, virtnet_poll,
 			       napi_weight);
 		napi_hash_add(&vi->rq[i].napi);
+		netif_napi_add(vi->dev, &vi->sq[i].napi, virtnet_poll_tx,
+			       napi_weight);
 
 		sg_init_table(vi->rq[i].sg, ARRAY_SIZE(vi->rq[i].sg));
 		ewma_init(&vi->rq[i].mrg_avg_pkt_len, 1, RECEIVE_AVG_WEIGHT);
@@ -1819,7 +1921,10 @@ static int virtnet_probe(struct virtio_device *vdev)
 
 	/* Set up network device as normal. */
 	dev->priv_flags |= IFF_UNICAST_FLT | IFF_LIVE_ADDR_CHANGE;
-	dev->netdev_ops = &virtnet_netdev;
+	if (virtio_has_feature(vdev, VIRTIO_RING_F_INTR_COALESCING))
+		dev->netdev_ops = &virtnet_netdev_txintr;
+	else
+		dev->netdev_ops = &virtnet_netdev;
 	dev->features = NETIF_F_HIGHDMA;
 
 	dev->ethtool_ops = &virtnet_ethtool_ops;
@@ -1906,6 +2011,9 @@ static int virtnet_probe(struct virtio_device *vdev)
 	if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ))
 		vi->has_cvq = true;
 
+	if (virtio_has_feature(vdev, VIRTIO_RING_F_INTR_COALESCING))
+		vi->intr_coalescing = true;
+
 	if (vi->any_header_sg)
 		dev->needed_headroom = vi->hdr_len;
 
@@ -1918,6 +2026,8 @@ static int virtnet_probe(struct virtio_device *vdev)
 	if (err)
 		goto free_stats;
 
+	vi->tx_work_limit = napi_weight;
+
 #ifdef CONFIG_SYSFS
 	if (vi->mergeable_rx_bufs)
 		dev->sysfs_rx_queue_group = &virtio_net_mrg_rx_group;
@@ -2030,8 +2140,10 @@ static int virtnet_freeze(struct virtio_device *vdev)
 	cancel_delayed_work_sync(&vi->refill);
 
 	if (netif_running(vi->dev)) {
-		for (i = 0; i < vi->max_queue_pairs; i++)
+		for (i = 0; i < vi->max_queue_pairs; i++) {
 			napi_disable(&vi->rq[i].napi);
+			napi_disable(&vi->sq[i].napi);
+		}
 	}
 
 	remove_vq_common(vi);
@@ -2055,8 +2167,10 @@ static int virtnet_restore(struct virtio_device *vdev)
 		if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
 			schedule_delayed_work(&vi->refill, 0);
 
-		for (i = 0; i < vi->max_queue_pairs; i++)
+		for (i = 0; i < vi->max_queue_pairs; i++) {
 			virtnet_napi_enable(&vi->rq[i]);
+			napi_enable(&vi->sq[i].napi);
+		}
 	}
 
 	netif_device_attach(vi->dev);
-- 
1.8.3.1
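The tail of virtnet_poll_tx() above guards against a classic
completion race: buffers may be consumed between draining the ring and
re-enabling callbacks, in which case no further interrupt would ever
arrive. Restated in simplified form (napi_reschedule() stands in for
the napi_schedule_prep()/__napi_schedule() pair the patch uses):

/* Simplified sketch of the race-free callback re-enable pattern. */
static void demo_tx_napi_complete(struct virtqueue *vq,
				  struct napi_struct *napi)
{
	/* Publish the event index before declaring ourselves done. */
	unsigned int last = virtqueue_enable_cb_prepare(vq);

	napi_complete(napi);
	/* Did more buffers complete before the event index was visible? */
	if (unlikely(virtqueue_poll(vq, last)))
		napi_reschedule(napi);	/* poll again rather than lose them */
}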
Jason Wang
2015-May-25 05:24 UTC
[RFC V7 PATCH 6/7] vhost: interrupt coalescing support
This patch implements basic interrupt coalescing support. It is done
by introducing two new per-virtqueue parameters:

- max_coalesced_buffers: maximum number of buffers before trying to
  issue an interrupt.
- coalesce_usecs: maximum number of microseconds waited, if at least
  one buffer is pending, before trying to issue an interrupt.

A new ioctl is also introduced for userspace to set or get the above
two values.

The number of coalesced buffers is increased in vhost_add_used_n(),
and vhost_signal() is modified so that it only tries to issue an
interrupt when:

- the number of coalesced buffers exceeds or is equal to
  max_coalesced_buffers, or
- the time since the last signal exceeds or is equal to
  coalesce_usecs.

When neither of the above two conditions is met, the interrupt is
delayed, and on exit from a round of processing, device specific code
calls vhost_check_coalesce_and_signal() to check the two conditions
again and schedule a timer for a delayed interrupt if they are still
not met.

Signed-off-by: Jason Wang <jasowang at redhat.com>
---
 drivers/vhost/vhost.c      | 88 ++++++++++++++++++++++++++++++++++++++++++++--
 drivers/vhost/vhost.h      | 20 +++++++++++
 include/uapi/linux/vhost.h | 13 ++++++-
 3 files changed, 117 insertions(+), 4 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 2ee2826..7739112 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -199,6 +199,11 @@ static void vhost_vq_reset(struct vhost_dev *dev,
 	vq->call = NULL;
 	vq->log_ctx = NULL;
 	vq->memory = NULL;
+	vq->coalesce_usecs = ktime_set(0, 0);
+	vq->max_coalesced_buffers = 0;
+	vq->coalesced = 0;
+	vq->last_signal = ktime_get();
+	hrtimer_cancel(&vq->ctimer);
 }
 
 static int vhost_worker(void *data)
@@ -291,6 +296,23 @@ static void vhost_dev_free_iovecs(struct vhost_dev *dev)
 		vhost_vq_free_iovecs(dev->vqs[i]);
 }
 
+void vhost_check_coalesce_and_signal(struct vhost_dev *dev,
+				     struct vhost_virtqueue *vq,
+				     bool timer);
+static enum hrtimer_restart vhost_ctimer_handler(struct hrtimer *timer)
+{
+	struct vhost_virtqueue *vq =
+		container_of(timer, struct vhost_virtqueue, ctimer);
+
+	if (mutex_trylock(&vq->mutex)) {
+		vhost_check_coalesce_and_signal(vq->dev, vq, false);
+		mutex_unlock(&vq->mutex);
+	} else
+		vhost_poll_queue(&vq->poll);
+
+	return HRTIMER_NORESTART;
+}
+
 void vhost_dev_init(struct vhost_dev *dev,
 		    struct vhost_virtqueue **vqs, int nvqs)
 {
@@ -315,6 +337,8 @@ void vhost_dev_init(struct vhost_dev *dev,
 		vq->heads = NULL;
 		vq->dev = dev;
 		mutex_init(&vq->mutex);
+		hrtimer_init(&vq->ctimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+		vq->ctimer.function = vhost_ctimer_handler;
 		vhost_vq_reset(dev, vq);
 		if (vq->handle_kick)
 			vhost_poll_init(&vq->poll, vq->handle_kick,
@@ -640,6 +664,7 @@ long vhost_vring_ioctl(struct vhost_dev *d, int ioctl, void __user *argp)
 	struct vhost_vring_state s;
 	struct vhost_vring_file f;
 	struct vhost_vring_addr a;
+	struct vhost_vring_coalesce c;
 	u32 idx;
 	long r;
 
@@ -696,6 +721,19 @@ long vhost_vring_ioctl(struct vhost_dev *d, int ioctl, void __user *argp)
 		if (copy_to_user(argp, &s, sizeof s))
 			r = -EFAULT;
 		break;
+	case VHOST_SET_VRING_COALESCE:
+		if (copy_from_user(&c, argp, sizeof c)) {
+			r = -EFAULT;
+			break;
+		}
+		vq->coalesce_usecs = ns_to_ktime(c.coalesce_usecs * NSEC_PER_USEC);
+		vq->max_coalesced_buffers = c.max_coalesced_buffers;
+		break;
+	case VHOST_GET_VRING_COALESCE:
+		s.index = idx;
+		if (copy_to_user(argp, &c, sizeof c))
+			r = -EFAULT;
+		break;
 	case VHOST_SET_VRING_ADDR:
 		if (copy_from_user(&a, argp, sizeof a)) {
 			r = -EFAULT;
@@ -1415,6 +1453,9 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads,
 {
 	int start, n, r;
 
+	if (vq->max_coalesced_buffers && ktime_to_ns(vq->coalesce_usecs))
+		vq->coalesced += count;
+
 	start = vq->last_used_idx % vq->num;
 	n = vq->num - start;
 	if (n < count) {
@@ -1440,6 +1481,7 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads,
 		if (vq->log_ctx)
 			eventfd_signal(vq->log_ctx, 1);
 	}
+
 	return r;
 }
 EXPORT_SYMBOL_GPL(vhost_add_used_n);
@@ -1481,15 +1523,55 @@ static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 	return vring_need_event(vhost16_to_cpu(vq, event), new, old);
 }
 
+static void __vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
+{
+	if (vq->call_ctx && vhost_notify(dev, vq)) {
+		eventfd_signal(vq->call_ctx, 1);
+	}
+
+	vq->coalesced = 0;
+	vq->last_signal = ktime_get();
+}
+
 /* This actually signals the guest, using eventfd. */
 void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 {
-	/* Signal the Guest tell them we used something up. */
-	if (vq->call_ctx && vhost_notify(dev, vq))
-		eventfd_signal(vq->call_ctx, 1);
+	bool can_coalesce = vq->max_coalesced_buffers &&
+			    ktime_to_ns(vq->coalesce_usecs);
+
+	if (can_coalesce) {
+		ktime_t passed = ktime_sub(ktime_get(), vq->last_signal);
+
+		if ((vq->coalesced >= vq->max_coalesced_buffers) ||
+		    !ktime_before(passed, vq->coalesce_usecs))
+			__vhost_signal(dev, vq);
+	} else {
+		__vhost_signal(dev, vq);
+	}
 }
 EXPORT_SYMBOL_GPL(vhost_signal);
 
+void vhost_check_coalesce_and_signal(struct vhost_dev *dev,
+				     struct vhost_virtqueue *vq,
+				     bool timer)
+{
+	bool can_coalesce = vq->max_coalesced_buffers &&
+			    ktime_to_ns(vq->coalesce_usecs);
+
+	hrtimer_try_to_cancel(&vq->ctimer);
+	if (can_coalesce && vq->coalesced) {
+		ktime_t passed = ktime_sub(ktime_get(), vq->last_signal);
+		ktime_t left = ktime_sub(vq->coalesce_usecs, passed);
+
+		if (ktime_to_ns(left) <= 0) {
+			__vhost_signal(dev, vq);
+		} else if (timer) {
+			hrtimer_start(&vq->ctimer, left, HRTIMER_MODE_REL);
+		}
+	}
+}
+EXPORT_SYMBOL_GPL(vhost_check_coalesce_and_signal);
+
 /* And here's the combo meal deal.  Supersize me! */
 void vhost_add_used_and_signal(struct vhost_dev *dev,
 			       struct vhost_virtqueue *vq,
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 8c1c792..2e6754d 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -92,6 +92,23 @@ struct vhost_virtqueue {
 	/* Last used index value we have signalled on */
 	bool signalled_used_valid;
 
+	/* Maximum microseconds waited after at least one buffer is
+	 * processed before generating an interrupt.
+	 */
+	ktime_t coalesce_usecs;
+
+	/* Maximum number of pending buffers before generating an interrupt. */
+	__u32 max_coalesced_buffers;
+
+	/* The number of buffers whose interrupts are coalesced */
+	__u32 coalesced;
+
+	/* Last time we signalled the guest. */
+	ktime_t last_signal;
+
+	/* Timer used to trigger a coalesced interrupt. */
+	struct hrtimer ctimer;
+
 	/* Log writes to used structure. */
 	bool log_used;
 	u64 log_addr;
@@ -149,6 +166,9 @@ void vhost_add_used_and_signal(struct vhost_dev *, struct vhost_virtqueue *,
 void vhost_add_used_and_signal_n(struct vhost_dev *, struct vhost_virtqueue *,
 			       struct vring_used_elem *heads, unsigned count);
 void vhost_signal(struct vhost_dev *, struct vhost_virtqueue *);
+void vhost_check_coalesce_and_signal(struct vhost_dev *dev,
+				     struct vhost_virtqueue *vq,
+				     bool timer);
 void vhost_disable_notify(struct vhost_dev *, struct vhost_virtqueue *);
 bool vhost_enable_notify(struct vhost_dev *, struct vhost_virtqueue *);
 
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index bb6a5b4..6362e6e 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -27,6 +27,12 @@ struct vhost_vring_file {
 
 };
 
+struct vhost_vring_coalesce {
+	unsigned int index;
+	__u32 coalesce_usecs;
+	__u32 max_coalesced_buffers;
+};
+
 struct vhost_vring_addr {
 	unsigned int index;
 	/* Option flags. */
@@ -102,7 +108,12 @@ struct vhost_memory {
 #define VHOST_SET_VRING_BASE _IOW(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
 /* Get accessor: reads index, writes value in num */
 #define VHOST_GET_VRING_BASE _IOWR(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
-
+/* Set coalescing parameters for the ring. */
+#define VHOST_SET_VRING_COALESCE _IOW(VHOST_VIRTIO, 0x13, \
+				      struct vhost_vring_coalesce)
+/* Get coalescing parameters for the ring. */
+#define VHOST_GET_VRING_COALESCE _IOW(VHOST_VIRTIO, 0x14, \
+				      struct vhost_vring_coalesce)
 /* The following ioctls use eventfd file descriptors to signal and poll
  * for events. */
 
-- 
1.8.3.1
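A userspace sketch of the new ioctl, assuming this series is applied
(the queue index, the 8/64 values, and the surrounding setup are
illustrative; the fd comes from the usual open("/dev/vhost-net") plus
VHOST_SET_OWNER sequence):

#include <sys/ioctl.h>
#include <linux/vhost.h>

/* Program tx queue 1 to coalesce up to 8 buffers or 64us. */
static int demo_set_tx_coalescing(int vhost_fd)
{
	struct vhost_vring_coalesce c = {
		.index = 1,			/* tx virtqueue */
		.coalesce_usecs = 64,		/* signal after 64us... */
		.max_coalesced_buffers = 8,	/* ...or after 8 buffers */
	};

	return ioctl(vhost_fd, VHOST_SET_VRING_COALESCE, &c);
}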
Jason Wang
2015-May-25 05:24 UTC
[RFC V7 PATCH 7/7] vhost_net: add interrupt coalescing support
Signed-off-by: Jason Wang <jasowang at redhat.com>
---
 drivers/vhost/net.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 7d137a4..5ee28b7 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -320,6 +320,9 @@ static void handle_tx(struct vhost_net *net)
 	hdr_size = nvq->vhost_hlen;
 	zcopy = nvq->ubufs;
 
+	/* Finish pending interrupts first */
+	vhost_check_coalesce_and_signal(vq->dev, vq, false);
+
 	for (;;) {
 		/* Release DMAs done buffers first */
 		if (zcopy)
@@ -415,6 +418,7 @@ static void handle_tx(struct vhost_net *net)
 		}
 	}
 out:
+	vhost_check_coalesce_and_signal(vq->dev, vq, true);
 	mutex_unlock(&vq->mutex);
 }
 
@@ -554,6 +558,9 @@ static void handle_rx(struct vhost_net *net)
 		vq->log : NULL;
 	mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
 
+	/* Finish pending interrupts first */
+	vhost_check_coalesce_and_signal(vq->dev, vq, false);
+
 	while ((sock_len = peek_head_len(sock->sk))) {
 		sock_len += sock_hlen;
 		vhost_len = sock_len + vhost_hlen;
@@ -638,6 +645,7 @@ static void handle_rx(struct vhost_net *net)
 		}
 	}
 out:
+	vhost_check_coalesce_and_signal(vq->dev, vq, true);
 	mutex_unlock(&vq->mutex);
 }
 
-- 
1.8.3.1
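The timer flag convention above, restated: false on entry (the worker
is about to service the ring anyway, so flush any overdue coalesced
interrupt but do not re-arm the timer), true on exit (signal now if
the thresholds are met, otherwise arm ctimer so leftovers are not
delayed indefinitely). A hypothetical handler skeleton:

static void demo_handle_vq(struct vhost_virtqueue *vq)
{
	mutex_lock(&vq->mutex);
	/* Entry: flush an overdue coalesced irq; no timer, we poll next. */
	vhost_check_coalesce_and_signal(vq->dev, vq, false);

	/* ... drain the ring; vhost_signal() coalesces along the way ... */

	/* Exit: signal if thresholds met, else arm ctimer for leftovers. */
	vhost_check_coalesce_and_signal(vq->dev, vq, true);
	mutex_unlock(&vq->mutex);
}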
Stephen Hemminger
2015-May-26 18:02 UTC
[RFC V7 PATCH 7/7] vhost_net: add interrupt coalescing support
On Mon, 25 May 2015 01:24:04 -0400
Jason Wang <jasowang at redhat.com> wrote:

> Signed-off-by: Jason Wang <jasowang at redhat.com>
> ---
>  drivers/vhost/net.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 7d137a4..5ee28b7 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -320,6 +320,9 @@ static void handle_tx(struct vhost_net *net)
>  	hdr_size = nvq->vhost_hlen;
>  	zcopy = nvq->ubufs;
>  
> +	/* Finish pending interrupts first */
> +	vhost_check_coalesce_and_signal(vq->dev, vq, false);
> +
>  	for (;;) {
>  		/* Release DMAs done buffers first */
>  		if (zcopy)
> @@ -415,6 +418,7 @@ static void handle_tx(struct vhost_net *net)
>  		}
>  	}
>  out:
> +	vhost_check_coalesce_and_signal(vq->dev, vq, true);
>  	mutex_unlock(&vq->mutex);
>  }
>  
> @@ -554,6 +558,9 @@ static void handle_rx(struct vhost_net *net)
>  		vq->log : NULL;
>  	mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
>  
> +	/* Finish pending interrupts first */
> +	vhost_check_coalesce_and_signal(vq->dev, vq, false);
> +
>  	while ((sock_len = peek_head_len(sock->sk))) {
>  		sock_len += sock_hlen;
>  		vhost_len = sock_len + vhost_hlen;
> @@ -638,6 +645,7 @@ static void handle_rx(struct vhost_net *net)
>  		}
>  	}
>  out:
> +	vhost_check_coalesce_and_signal(vq->dev, vq, true);
>  	mutex_unlock(&vq->mutex);
>  }
> 

Could you implement ethtool control of these coalescing parameters?