This is a rebase/resend of earlier patches. I skipped the pure cosmetic patches for now. Mostly this is consolidation earlier changes, removing dead code etc. The important part is the change for allowing a vmbus channel to get callback directly in interrupt mode; this is necessary for NAPI support. Stephen Hemminger (14): vmbus: use kernel bitops for traversing interrupt mask vmbus: drop no longer used kick_q argument vmbus: remove no longer used signal_policy vmbus: remove unused kickq argument to sendpacket netvsc: remove no longer needed receive staging buffers vmbus: remove per channel state vmbus: callback is in softirq not workqueue vmbus: put related per-cpu variable together vmbus: change to per channel tasklet vmbus: add direct isr callback mode vmbus: remove conditional locking of vmbus_write vmbus: expose hv_begin/end_read vmbus: constify parameters where possible vmbus: replace modulus operation with subtraction Starting point was top of current char-misc-next branch. drivers/hv/channel.c | 47 +++++-------- drivers/hv/channel_mgmt.c | 41 ++++++------ drivers/hv/connection.c | 134 +++++--------------------------------- drivers/hv/hv.c | 124 +++++++++++++++-------------------- drivers/hv/hv_util.c | 3 +- drivers/hv/hyperv_vmbus.h | 80 ++++++++++++----------- drivers/hv/ring_buffer.c | 66 ++++++------------- drivers/hv/vmbus_drv.c | 115 ++++++++++++++++++++++++++------ drivers/net/hyperv/hyperv_net.h | 5 -- drivers/net/hyperv/netvsc.c | 104 ++++------------------------- drivers/net/hyperv/rndis_filter.c | 11 ---- drivers/uio/uio_hv_generic.c | 2 +- include/linux/hyperv.h | 134 +++++++++++++++++--------------------- 13 files changed, 338 insertions(+), 528 deletions(-) -- 2.11.0
Stephen Hemminger
2017-Feb-01 16:28 UTC
[PATCH 01/14] vmbus: use kernel bitops for traversing interrupt mask
Use standard kernel operations for find first set bit to traverse the channel bit array. This has added benefit of speeding up lookup on 64 bit and because it uses find first set instruction. Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com> --- drivers/hv/channel.c | 8 ++----- drivers/hv/connection.c | 55 +++++++++++++++-------------------------------- drivers/hv/hyperv_vmbus.h | 16 ++++++++------ drivers/hv/vmbus_drv.c | 4 +--- 4 files changed, 29 insertions(+), 54 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index be34547cdb68..a016c5c0e472 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -47,12 +47,8 @@ void vmbus_setevent(struct vmbus_channel *channel) * For channels marked as in "low latency" mode * bypass the monitor page mechanism. */ - if ((channel->offermsg.monitor_allocated) && - (!channel->low_latency)) { - /* Each u32 represents 32 channels */ - sync_set_bit(channel->offermsg.child_relid & 31, - (unsigned long *) vmbus_connection.send_int_page + - (channel->offermsg.child_relid >> 5)); + if (channel->offermsg.monitor_allocated && !channel->low_latency) { + vmbus_send_interrupt(channel->offermsg.child_relid); /* Get the child to parent monitor page */ monitorpage = vmbus_connection.monitor_pages[1]; diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c index 307a5a8937f6..1766ef03e78d 100644 --- a/drivers/hv/connection.c +++ b/drivers/hv/connection.c @@ -379,17 +379,11 @@ static void process_chn_event(u32 relid) */ void vmbus_on_event(unsigned long data) { - u32 dword; - u32 maxdword; - int bit; - u32 relid; - u32 *recv_int_page = NULL; - void *page_addr; - int cpu = smp_processor_id(); - union hv_synic_event_flags *event; + unsigned long *recv_int_page; + u32 maxbits, relid; if (vmbus_proto_version < VERSION_WIN8) { - maxdword = MAX_NUM_CHANNELS_SUPPORTED >> 5; + maxbits = MAX_NUM_CHANNELS_SUPPORTED; recv_int_page = vmbus_connection.recv_int_page; } else { /* @@ -397,35 +391,24 @@ void vmbus_on_event(unsigned long data) * can be directly checked to get the id of the channel * that has the interrupt pending. */ - maxdword = HV_EVENT_FLAGS_DWORD_COUNT; - page_addr = hv_context.synic_event_page[cpu]; - event = (union hv_synic_event_flags *)page_addr + + int cpu = smp_processor_id(); + void *page_addr = hv_context.synic_event_page[cpu]; + union hv_synic_event_flags *event + = (union hv_synic_event_flags *)page_addr + VMBUS_MESSAGE_SINT; - recv_int_page = event->flags32; - } - + maxbits = HV_EVENT_FLAGS_COUNT; + recv_int_page = event->flags; + } - /* Check events */ - if (!recv_int_page) + if (unlikely(!recv_int_page)) return; - for (dword = 0; dword < maxdword; dword++) { - if (!recv_int_page[dword]) - continue; - for (bit = 0; bit < 32; bit++) { - if (sync_test_and_clear_bit(bit, - (unsigned long *)&recv_int_page[dword])) { - relid = (dword << 5) + bit; - - if (relid == 0) - /* - * Special case - vmbus - * channel protocol msg - */ - continue; + for_each_set_bit(relid, recv_int_page, maxbits) { + if (sync_test_and_clear_bit(relid, recv_int_page)) { + /* Special case - vmbus channel protocol msg */ + if (relid != 0) process_chn_event(relid); - } } } } @@ -491,12 +474,8 @@ void vmbus_set_event(struct vmbus_channel *channel) { u32 child_relid = channel->offermsg.child_relid; - if (!channel->is_dedicated_interrupt) { - /* Each u32 represents 32 channels */ - sync_set_bit(child_relid & 31, - (unsigned long *)vmbus_connection.send_int_page + - (child_relid >> 5)); - } + if (!channel->is_dedicated_interrupt) + vmbus_send_interrupt(child_relid); hv_do_hypercall(HVCALL_SIGNAL_EVENT, channel->sig_event, NULL); } diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 86b56b677dc3..2749a4142889 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -40,11 +40,9 @@ */ #define HV_UTIL_NEGO_TIMEOUT 55 - - - -#define HV_EVENT_FLAGS_BYTE_COUNT (256) -#define HV_EVENT_FLAGS_DWORD_COUNT (256 / sizeof(u32)) +/* Define synthetic interrupt controller flag constants. */ +#define HV_EVENT_FLAGS_COUNT (256 * 8) +#define HV_EVENT_FLAGS_LONG_COUNT (256 / sizeof(unsigned long)) /* * Timer configuration register. @@ -65,8 +63,7 @@ union hv_timer_config { /* Define the synthetic interrupt controller event flags format. */ union hv_synic_event_flags { - u8 flags8[HV_EVENT_FLAGS_BYTE_COUNT]; - u32 flags32[HV_EVENT_FLAGS_DWORD_COUNT]; + unsigned long flags[HV_EVENT_FLAGS_LONG_COUNT]; }; /* Define SynIC control register. */ @@ -358,6 +355,11 @@ struct vmbus_msginfo { extern struct vmbus_connection vmbus_connection; +static inline void vmbus_send_interrupt(u32 relid) +{ + sync_set_bit(relid, vmbus_connection.send_int_page); +} + enum vmbus_message_handler_type { /* The related handler can sleep. */ VMHT_BLOCKING = 0, diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index f8ebe13cf251..c1b27026f744 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -908,10 +908,8 @@ static void vmbus_isr(void) (vmbus_proto_version == VERSION_WIN7)) { /* Since we are a child, we only need to check bit 0 */ - if (sync_test_and_clear_bit(0, - (unsigned long *) &event->flags32[0])) { + if (sync_test_and_clear_bit(0, event->flags)) handled = true; - } } else { /* * Our host is win8 or above. The signaling mechanism -- 2.11.0
Stephen Hemminger
2017-Feb-01 16:28 UTC
[PATCH 02/14] vmbus: drop no longer used kick_q argument
The flag to cause notification of host is unused after commit a01a291a282f7c2e ("Drivers: hv: vmbus: Base host signaling strictly on the ring state"). Therefore remove it from the ring buffer internal API. Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com> --- drivers/hv/channel.c | 13 ++++--------- drivers/hv/hyperv_vmbus.h | 5 ++--- drivers/hv/ring_buffer.c | 8 +++----- 3 files changed, 9 insertions(+), 17 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index a016c5c0e472..e26285cde8e0 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -670,9 +670,7 @@ int vmbus_sendpacket_ctl(struct vmbus_channel *channel, void *buffer, bufferlist[2].iov_base = &aligned_data; bufferlist[2].iov_len = (packetlen_aligned - packetlen); - return hv_ringbuffer_write(channel, bufferlist, num_vecs, - lock, kick_q); - + return hv_ringbuffer_write(channel, bufferlist, num_vecs, lock); } EXPORT_SYMBOL(vmbus_sendpacket_ctl); @@ -757,8 +755,7 @@ int vmbus_sendpacket_pagebuffer_ctl(struct vmbus_channel *channel, bufferlist[2].iov_base = &aligned_data; bufferlist[2].iov_len = (packetlen_aligned - packetlen); - return hv_ringbuffer_write(channel, bufferlist, 3, - lock, kick_q); + return hv_ringbuffer_write(channel, bufferlist, 3, lock); } EXPORT_SYMBOL_GPL(vmbus_sendpacket_pagebuffer_ctl); @@ -813,8 +810,7 @@ int vmbus_sendpacket_mpb_desc(struct vmbus_channel *channel, bufferlist[2].iov_base = &aligned_data; bufferlist[2].iov_len = (packetlen_aligned - packetlen); - return hv_ringbuffer_write(channel, bufferlist, 3, - lock, true); + return hv_ringbuffer_write(channel, bufferlist, 3, lock); } EXPORT_SYMBOL_GPL(vmbus_sendpacket_mpb_desc); @@ -871,8 +867,7 @@ int vmbus_sendpacket_multipagebuffer(struct vmbus_channel *channel, bufferlist[2].iov_base = &aligned_data; bufferlist[2].iov_len = (packetlen_aligned - packetlen); - return hv_ringbuffer_write(channel, bufferlist, 3, - lock, true); + return hv_ringbuffer_write(channel, bufferlist, 3, lock); } EXPORT_SYMBOL_GPL(vmbus_sendpacket_multipagebuffer); diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 2749a4142889..c375ec89db6f 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -275,9 +275,8 @@ int hv_ringbuffer_init(struct hv_ring_buffer_info *ring_info, void hv_ringbuffer_cleanup(struct hv_ring_buffer_info *ring_info); int hv_ringbuffer_write(struct vmbus_channel *channel, - struct kvec *kv_list, - u32 kv_count, bool lock, - bool kick_q); + struct kvec *kv_list, + u32 kv_count, bool lock); int hv_ringbuffer_read(struct vmbus_channel *channel, void *buffer, u32 buflen, u32 *buffer_actual_len, diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 2cd402986858..30ca55aefd24 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -77,8 +77,7 @@ u32 hv_end_read(struct hv_ring_buffer_info *rbi) * host logic is fixed. */ -static void hv_signal_on_write(u32 old_write, struct vmbus_channel *channel, - bool kick_q) +static void hv_signal_on_write(u32 old_write, struct vmbus_channel *channel) { struct hv_ring_buffer_info *rbi = &channel->outbound; @@ -285,8 +284,7 @@ void hv_ringbuffer_cleanup(struct hv_ring_buffer_info *ring_info) /* Write to the ring buffer. */ int hv_ringbuffer_write(struct vmbus_channel *channel, - struct kvec *kv_list, u32 kv_count, bool lock, - bool kick_q) + struct kvec *kv_list, u32 kv_count, bool lock) { int i = 0; u32 bytes_avail_towrite; @@ -352,7 +350,7 @@ int hv_ringbuffer_write(struct vmbus_channel *channel, if (lock) spin_unlock_irqrestore(&outring_info->ring_lock, flags); - hv_signal_on_write(old_write, channel, kick_q); + hv_signal_on_write(old_write, channel); if (channel->rescind) return -ENODEV; -- 2.11.0
Stephen Hemminger
2017-Feb-01 16:28 UTC
[PATCH 03/14] vmbus: remove no longer used signal_policy
The explicit signal policy is no longer used. A different mechanism will be added later when xmit_more is supported. Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com> --- include/linux/hyperv.h | 18 ------------------ 1 file changed, 18 deletions(-) diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 85b26f06e172..423fc96cc26a 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -670,11 +670,6 @@ struct hv_input_signal_event_buffer { struct hv_input_signal_event event; }; -enum hv_signal_policy { - HV_SIGNAL_POLICY_DEFAULT = 0, - HV_SIGNAL_POLICY_EXPLICIT, -}; - enum hv_numa_policy { HV_BALANCED = 0, HV_LOCALIZED, @@ -837,13 +832,6 @@ struct vmbus_channel { */ struct list_head percpu_list; /* - * Host signaling policy: The default policy will be - * based on the ring buffer state. We will also support - * a policy where the client driver can have explicit - * signaling control. - */ - enum hv_signal_policy signal_policy; - /* * On the channel send side, many of the VMBUS * device drivers explicity serialize access to the * outgoing ring buffer. Give more control to the @@ -904,12 +892,6 @@ static inline bool is_hvsock_channel(const struct vmbus_channel *c) VMBUS_CHANNEL_TLNPI_PROVIDER_OFFER); } -static inline void set_channel_signal_state(struct vmbus_channel *c, - enum hv_signal_policy policy) -{ - c->signal_policy = policy; -} - static inline void set_channel_affinity_state(struct vmbus_channel *c, enum hv_numa_policy policy) { -- 2.11.0
Stephen Hemminger
2017-Feb-01 16:28 UTC
[PATCH 04/14] vmbus: remove unused kickq argument to sendpacket
Since sendpacket no longer uses kickq argument remove it. Remove it no longer used xmit_more in sendpacket in netvsc as well. Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com> --- drivers/hv/channel.c | 19 +++++++++---------- drivers/net/hyperv/netvsc.c | 21 +++------------------ include/linux/hyperv.h | 6 ++---- 3 files changed, 14 insertions(+), 32 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index e26285cde8e0..789c75f6df26 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -643,8 +643,8 @@ void vmbus_close(struct vmbus_channel *channel) EXPORT_SYMBOL_GPL(vmbus_close); int vmbus_sendpacket_ctl(struct vmbus_channel *channel, void *buffer, - u32 bufferlen, u64 requestid, - enum vmbus_packet_type type, u32 flags, bool kick_q) + u32 bufferlen, u64 requestid, + enum vmbus_packet_type type, u32 flags) { struct vmpacket_descriptor desc; u32 packetlen = sizeof(struct vmpacket_descriptor) + bufferlen; @@ -693,7 +693,7 @@ int vmbus_sendpacket(struct vmbus_channel *channel, void *buffer, enum vmbus_packet_type type, u32 flags) { return vmbus_sendpacket_ctl(channel, buffer, bufferlen, requestid, - type, flags, true); + type, flags); } EXPORT_SYMBOL(vmbus_sendpacket); @@ -705,11 +705,9 @@ EXPORT_SYMBOL(vmbus_sendpacket); * explicitly. */ int vmbus_sendpacket_pagebuffer_ctl(struct vmbus_channel *channel, - struct hv_page_buffer pagebuffers[], - u32 pagecount, void *buffer, u32 bufferlen, - u64 requestid, - u32 flags, - bool kick_q) + struct hv_page_buffer pagebuffers[], + u32 pagecount, void *buffer, u32 bufferlen, + u64 requestid, u32 flags) { int i; struct vmbus_channel_packet_page_buffer desc; @@ -769,9 +767,10 @@ int vmbus_sendpacket_pagebuffer(struct vmbus_channel *channel, u64 requestid) { u32 flags = VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED; + return vmbus_sendpacket_pagebuffer_ctl(channel, pagebuffers, pagecount, - buffer, bufferlen, requestid, - flags, true); + buffer, bufferlen, + requestid, flags); } EXPORT_SYMBOL_GPL(vmbus_sendpacket_pagebuffer); diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index 5a1cc089acb7..e326e68f9f6d 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -723,8 +723,6 @@ static u32 netvsc_copy_to_send_buf(struct netvsc_device *net_device, char *dest = start + (section_index * net_device->send_section_size) + pend_size; int i; - bool is_data_pkt = (skb != NULL) ? true : false; - bool xmit_more = (skb != NULL) ? skb->xmit_more : false; u32 msg_size = 0; u32 padding = 0; u32 remain = packet->total_data_buflen % net_device->pkt_align; @@ -732,7 +730,7 @@ static u32 netvsc_copy_to_send_buf(struct netvsc_device *net_device, packet->page_buf_cnt; /* Add padding */ - if (is_data_pkt && xmit_more && remain && + if (skb && skb->xmit_more && remain && !packet->cp_partial) { padding = net_device->pkt_align - remain; rndis_msg->msg_len += padding; @@ -772,7 +770,6 @@ static inline int netvsc_send_pkt( int ret; struct hv_page_buffer *pgbuf; u32 ring_avail = hv_ringbuf_avail_percent(&out_channel->outbound); - bool xmit_more = (skb != NULL) ? skb->xmit_more : false; nvmsg.hdr.msg_type = NVSP_MSG1_TYPE_SEND_RNDIS_PKT; if (skb != NULL) { @@ -796,16 +793,6 @@ static inline int netvsc_send_pkt( if (out_channel->rescind) return -ENODEV; - /* - * It is possible that once we successfully place this packet - * on the ringbuffer, we may stop the queue. In that case, we want - * to notify the host independent of the xmit_more flag. We don't - * need to be precise here; in the worst case we may signal the host - * unnecessarily. - */ - if (ring_avail < (RING_AVAIL_PERCENT_LOWATER + 1)) - xmit_more = false; - if (packet->page_buf_cnt) { pgbuf = packet->cp_partial ? (*pb) + packet->rmsg_pgcnt : (*pb); @@ -815,15 +802,13 @@ static inline int netvsc_send_pkt( &nvmsg, sizeof(struct nvsp_message), req_id, - VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED, - !xmit_more); + VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED); } else { ret = vmbus_sendpacket_ctl(out_channel, &nvmsg, sizeof(struct nvsp_message), req_id, VM_PKT_DATA_INBAND, - VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED, - !xmit_more); + VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED); } if (ret == 0) { diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 423fc96cc26a..8c6a1505b876 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1023,8 +1023,7 @@ extern int vmbus_sendpacket_ctl(struct vmbus_channel *channel, u32 bufferLen, u64 requestid, enum vmbus_packet_type type, - u32 flags, - bool kick_q); + u32 flags); extern int vmbus_sendpacket_pagebuffer(struct vmbus_channel *channel, struct hv_page_buffer pagebuffers[], @@ -1039,8 +1038,7 @@ extern int vmbus_sendpacket_pagebuffer_ctl(struct vmbus_channel *channel, void *buffer, u32 bufferlen, u64 requestid, - u32 flags, - bool kick_q); + u32 flags); extern int vmbus_sendpacket_multipagebuffer(struct vmbus_channel *channel, struct hv_multipage_buffer *mpb, -- 2.11.0
Stephen Hemminger
2017-Feb-01 16:28 UTC
[PATCH 05/14] netvsc: remove no longer needed receive staging buffers
Since commit aed8c164ca5199 ("Drivers: hv: ring_buffer: count on wrap around mappings") it is no longer necessary to handle ring wrapping by having a special receive buffer. Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com> --- drivers/net/hyperv/hyperv_net.h | 5 --- drivers/net/hyperv/netvsc.c | 83 ++++++--------------------------------- drivers/net/hyperv/rndis_filter.c | 11 ------ 3 files changed, 11 insertions(+), 88 deletions(-) diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h index 3958adade7eb..cce70ceba6d5 100644 --- a/drivers/net/hyperv/hyperv_net.h +++ b/drivers/net/hyperv/hyperv_net.h @@ -748,11 +748,6 @@ struct netvsc_device { int ring_size; - /* The primary channel callback buffer */ - unsigned char *cb_buffer; - /* The sub channel callback buffer */ - unsigned char *sub_cb_buf; - struct multi_send_data msd[VRSS_CHANNEL_MAX]; u32 max_pkt; /* max number of pkt in one send, e.g. 8 */ u32 pkt_align; /* alignment bytes, e.g. 8 */ diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index e326e68f9f6d..7487498b663c 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -67,12 +67,6 @@ static struct netvsc_device *alloc_net_device(void) if (!net_device) return NULL; - net_device->cb_buffer = kzalloc(NETVSC_PACKET_SIZE, GFP_KERNEL); - if (!net_device->cb_buffer) { - kfree(net_device); - return NULL; - } - net_device->mrc[0].buf = vzalloc(NETVSC_RECVSLOT_MAX * sizeof(struct recv_comp_data)); @@ -93,7 +87,6 @@ static void free_netvsc_device(struct netvsc_device *nvdev) for (i = 0; i < VRSS_CHANNEL_MAX; i++) vfree(nvdev->mrc[i].buf); - kfree(nvdev->cb_buffer); kfree(nvdev); } @@ -584,7 +577,6 @@ void netvsc_device_remove(struct hv_device *device) vmbus_close(device->channel); /* Release all resources */ - vfree(net_device->sub_cb_buf); free_netvsc_device(net_device); } @@ -1256,16 +1248,11 @@ static void netvsc_process_raw_pkt(struct hv_device *device, void netvsc_channel_cb(void *context) { - int ret; - struct vmbus_channel *channel = (struct vmbus_channel *)context; + struct vmbus_channel *channel = context; u16 q_idx = channel->offermsg.offer.sub_channel_index; struct hv_device *device; struct netvsc_device *net_device; - u32 bytes_recvd; - u64 request_id; struct vmpacket_descriptor *desc; - unsigned char *buffer; - int bufferlen = NETVSC_PACKET_SIZE; struct net_device *ndev; bool need_to_commit = false; @@ -1277,65 +1264,19 @@ void netvsc_channel_cb(void *context) net_device = get_inbound_net_device(device); if (!net_device) return; + ndev = hv_get_drvdata(device); - buffer = get_per_channel_state(channel); - - do { - desc = get_next_pkt_raw(channel); - if (desc != NULL) { - netvsc_process_raw_pkt(device, - channel, - net_device, - ndev, - desc->trans_id, - desc); - - put_pkt_raw(channel, desc); - need_to_commit = true; - continue; - } - if (need_to_commit) { - need_to_commit = false; - commit_rd_index(channel); - } - ret = vmbus_recvpacket_raw(channel, buffer, bufferlen, - &bytes_recvd, &request_id); - if (ret == 0) { - if (bytes_recvd > 0) { - desc = (struct vmpacket_descriptor *)buffer; - netvsc_process_raw_pkt(device, - channel, - net_device, - ndev, - request_id, - desc); - } else { - /* - * We are done for this pass. - */ - break; - } - - } else if (ret == -ENOBUFS) { - if (bufferlen > NETVSC_PACKET_SIZE) - kfree(buffer); - /* Handle large packet */ - buffer = kmalloc(bytes_recvd, GFP_ATOMIC); - if (buffer == NULL) { - /* Try again next time around */ - netdev_err(ndev, - "unable to allocate buffer of size " - "(%d)!!\n", bytes_recvd); - break; - } - - bufferlen = bytes_recvd; - } - } while (1); + while ((desc = get_next_pkt_raw(channel)) != NULL) { + netvsc_process_raw_pkt(device, channel, net_device, + ndev, desc->trans_id, desc); - if (bufferlen > NETVSC_PACKET_SIZE) - kfree(buffer); + put_pkt_raw(channel, desc); + need_to_commit = true; + } + + if (need_to_commit) + commit_rd_index(channel); netvsc_chk_recv_comp(net_device, channel, q_idx); } @@ -1359,8 +1300,6 @@ int netvsc_device_add(struct hv_device *device, void *additional_info) net_device->ring_size = ring_size; - set_per_channel_state(device->channel, net_device->cb_buffer); - /* Open the channel */ ret = vmbus_open(device->channel, ring_size * PAGE_SIZE, ring_size * PAGE_SIZE, NULL, 0, diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c index 8d90904e0e49..113c7f4d1590 100644 --- a/drivers/net/hyperv/rndis_filter.c +++ b/drivers/net/hyperv/rndis_filter.c @@ -948,9 +948,6 @@ static void netvsc_sc_open(struct vmbus_channel *new_sc) if (chn_index >= nvscdev->num_chn) return; - set_per_channel_state(new_sc, nvscdev->sub_cb_buf + (chn_index - 1) * - NETVSC_PACKET_SIZE); - nvscdev->mrc[chn_index].buf = vzalloc(NETVSC_RECVSLOT_MAX * sizeof(struct recv_comp_data)); @@ -1099,14 +1096,6 @@ int rndis_filter_device_add(struct hv_device *dev, if (net_device->num_chn == 1) goto out; - net_device->sub_cb_buf = vzalloc((net_device->num_chn - 1) * - NETVSC_PACKET_SIZE); - if (!net_device->sub_cb_buf) { - net_device->num_chn = 1; - dev_info(&dev->device, "No memory for subchannels.\n"); - goto out; - } - vmbus_set_sc_create_callback(dev->channel, netvsc_sc_open); init_packet = &net_device->channel_init_pkt; -- 2.11.0
The netvsc no longer needs per channel state hook to track receive buffer. Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com> --- include/linux/hyperv.h | 14 -------------- 1 file changed, 14 deletions(-) diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 8c6a1505b876..39d493ce550d 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -823,10 +823,6 @@ struct vmbus_channel { */ struct vmbus_channel *primary_channel; /* - * Support per-channel state for use by vmbus drivers. - */ - void *per_channel_state; - /* * To support per-cpu lookup mapping of relid to channel, * link up channels based on their CPU affinity. */ @@ -903,16 +899,6 @@ static inline void set_channel_read_state(struct vmbus_channel *c, bool state) c->batched_reading = state; } -static inline void set_per_channel_state(struct vmbus_channel *c, void *s) -{ - c->per_channel_state = s; -} - -static inline void *get_per_channel_state(struct vmbus_channel *c) -{ - return c->per_channel_state; -} - static inline void set_channel_pending_send_size(struct vmbus_channel *c, u32 size) { -- 2.11.0
Stephen Hemminger
2017-Feb-01 16:28 UTC
[PATCH 07/14] vmbus: callback is in softirq not workqueue
The callback is done via tasklet not workqueue. Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com> --- include/linux/hyperv.h | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 39d493ce550d..b30808f740f9 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -32,7 +32,6 @@ #include <linux/scatterlist.h> #include <linux/list.h> #include <linux/timer.h> -#include <linux/workqueue.h> #include <linux/completion.h> #include <linux/device.h> #include <linux/mod_devicetable.h> @@ -729,9 +728,7 @@ struct vmbus_channel { struct vmbus_close_msg close_msg; - /* Channel callback are invoked in this workqueue context */ - /* HANDLE dataWorkQueue; */ - + /* Channel callback's invoked in softirq context */ void (*onchannel_callback)(void *context); void *channel_callback_context; -- 2.11.0
Stephen Hemminger
2017-Feb-01 16:28 UTC
[PATCH 08/14] vmbus: put related per-cpu variable together
The hv_context structure had several arrays which were per-cpu and was allocating small structures (tasklet_struct). Instead use a single per-cpu array. Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com> --- drivers/hv/channel_mgmt.c | 35 ++++++++----- drivers/hv/connection.c | 20 ++++--- drivers/hv/hv.c | 130 ++++++++++++++++++++-------------------------- drivers/hv/hyperv_vmbus.h | 53 +++++++++++-------- drivers/hv/vmbus_drv.c | 39 ++++++++------ 5 files changed, 143 insertions(+), 134 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index de90a9900fee..579ad2560a39 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -353,9 +353,10 @@ static void free_channel(struct vmbus_channel *channel) static void percpu_channel_enq(void *arg) { struct vmbus_channel *channel = arg; - int cpu = smp_processor_id(); + struct hv_per_cpu_context *hv_cpu + = this_cpu_ptr(hv_context.cpu_context); - list_add_tail(&channel->percpu_list, &hv_context.percpu_list[cpu]); + list_add_tail(&channel->percpu_list, &hv_cpu->chan_list); } static void percpu_channel_deq(void *arg) @@ -379,19 +380,21 @@ static void vmbus_release_relid(u32 relid) void hv_event_tasklet_disable(struct vmbus_channel *channel) { - struct tasklet_struct *tasklet; - tasklet = hv_context.event_dpc[channel->target_cpu]; - tasklet_disable(tasklet); + struct hv_per_cpu_context *hv_cpu; + + hv_cpu = per_cpu_ptr(hv_context.cpu_context, channel->target_cpu); + tasklet_disable(&hv_cpu->event_dpc); } void hv_event_tasklet_enable(struct vmbus_channel *channel) { - struct tasklet_struct *tasklet; - tasklet = hv_context.event_dpc[channel->target_cpu]; - tasklet_enable(tasklet); + struct hv_per_cpu_context *hv_cpu; + + hv_cpu = per_cpu_ptr(hv_context.cpu_context, channel->target_cpu); + tasklet_enable(&hv_cpu->event_dpc); /* In case there is any pending event */ - tasklet_schedule(tasklet); + tasklet_schedule(&hv_cpu->event_dpc); } void hv_process_channel_removal(struct vmbus_channel *channel, u32 relid) @@ -726,9 +729,12 @@ static void vmbus_wait_for_unload(void) break; for_each_online_cpu(cpu) { - page_addr = hv_context.synic_message_page[cpu]; - msg = (struct hv_message *)page_addr + - VMBUS_MESSAGE_SINT; + struct hv_per_cpu_context *hv_cpu + = per_cpu_ptr(hv_context.cpu_context, cpu); + + page_addr = hv_cpu->synic_message_page; + msg = (struct hv_message *)page_addr + + VMBUS_MESSAGE_SINT; message_type = READ_ONCE(msg->header.message_type); if (message_type == HVMSG_NONE) @@ -752,7 +758,10 @@ static void vmbus_wait_for_unload(void) * messages after we reconnect. */ for_each_online_cpu(cpu) { - page_addr = hv_context.synic_message_page[cpu]; + struct hv_per_cpu_context *hv_cpu + = per_cpu_ptr(hv_context.cpu_context, cpu); + + page_addr = hv_cpu->synic_message_page; msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT; msg->header.message_type = HVMSG_NONE; } diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c index 1766ef03e78d..158f12823baf 100644 --- a/drivers/hv/connection.c +++ b/drivers/hv/connection.c @@ -93,12 +93,10 @@ static int vmbus_negotiate_version(struct vmbus_channel_msginfo *msginfo, * all the CPUs. This is needed for kexec to work correctly where * the CPU attempting to connect may not be CPU 0. */ - if (version >= VERSION_WIN8_1) { - msg->target_vcpu = hv_context.vp_index[get_cpu()]; - put_cpu(); - } else { + if (version >= VERSION_WIN8_1) + msg->target_vcpu = hv_context.vp_index[smp_processor_id()]; + else msg->target_vcpu = 0; - } /* * Add to list before we send the request since we may @@ -269,12 +267,12 @@ void vmbus_disconnect(void) */ static struct vmbus_channel *pcpu_relid2channel(u32 relid) { + struct hv_per_cpu_context *hv_cpu + = this_cpu_ptr(hv_context.cpu_context); + struct vmbus_channel *found_channel = NULL; struct vmbus_channel *channel; - struct vmbus_channel *found_channel = NULL; - int cpu = smp_processor_id(); - struct list_head *pcpu_head = &hv_context.percpu_list[cpu]; - list_for_each_entry(channel, pcpu_head, percpu_list) { + list_for_each_entry(channel, &hv_cpu->chan_list, percpu_list) { if (channel->offermsg.child_relid == relid) { found_channel = channel; break; @@ -379,6 +377,7 @@ static void process_chn_event(u32 relid) */ void vmbus_on_event(unsigned long data) { + struct hv_per_cpu_context *hv_cpu = (void *)data; unsigned long *recv_int_page; u32 maxbits, relid; @@ -391,8 +390,7 @@ void vmbus_on_event(unsigned long data) * can be directly checked to get the id of the channel * that has the interrupt pending. */ - int cpu = smp_processor_id(); - void *page_addr = hv_context.synic_event_page[cpu]; + void *page_addr = hv_cpu->synic_event_page; union hv_synic_event_flags *event = (union hv_synic_event_flags *)page_addr + VMBUS_MESSAGE_SINT; diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c index 0f73237bed0a..9fc2355a60ea 100644 --- a/drivers/hv/hv.c +++ b/drivers/hv/hv.c @@ -49,24 +49,13 @@ struct hv_context hv_context = { */ int hv_init(void) { - - memset(hv_context.synic_event_page, 0, sizeof(void *) * NR_CPUS); - memset(hv_context.synic_message_page, 0, - sizeof(void *) * NR_CPUS); - memset(hv_context.post_msg_page, 0, - sizeof(void *) * NR_CPUS); - memset(hv_context.vp_index, 0, - sizeof(int) * NR_CPUS); - memset(hv_context.event_dpc, 0, - sizeof(void *) * NR_CPUS); - memset(hv_context.msg_dpc, 0, - sizeof(void *) * NR_CPUS); - memset(hv_context.clk_evt, 0, - sizeof(void *) * NR_CPUS); - if (!hv_is_hypercall_page_setup()) return -ENOTSUPP; + hv_context.cpu_context = alloc_percpu(struct hv_per_cpu_context); + if (!hv_context.cpu_context) + return -ENOMEM; + return 0; } @@ -79,25 +68,24 @@ int hv_post_message(union hv_connection_id connection_id, enum hv_message_type message_type, void *payload, size_t payload_size) { - struct hv_input_post_message *aligned_msg; + struct hv_per_cpu_context *hv_cpu; u64 status; if (payload_size > HV_MESSAGE_PAYLOAD_BYTE_COUNT) return -EMSGSIZE; - aligned_msg = (struct hv_input_post_message *) - hv_context.post_msg_page[get_cpu()]; - + hv_cpu = get_cpu_ptr(hv_context.cpu_context); + aligned_msg = hv_cpu->post_msg_page; aligned_msg->connectionid = connection_id; aligned_msg->reserved = 0; aligned_msg->message_type = message_type; aligned_msg->payload_size = payload_size; memcpy((void *)aligned_msg->payload, payload, payload_size); + put_cpu_ptr(hv_cpu); status = hv_do_hypercall(HVCALL_POST_MESSAGE, aligned_msg, NULL); - put_cpu(); return status & 0xFFFF; } @@ -154,8 +142,6 @@ static void hv_init_clockevent_device(struct clock_event_device *dev, int cpu) int hv_synic_alloc(void) { - size_t size = sizeof(struct tasklet_struct); - size_t ced_size = sizeof(struct clock_event_device); int cpu; hv_context.hv_numa_map = kzalloc(sizeof(struct cpumask) * nr_node_ids, @@ -166,53 +152,43 @@ int hv_synic_alloc(void) } for_each_present_cpu(cpu) { - hv_context.event_dpc[cpu] = kmalloc(size, GFP_ATOMIC); - if (hv_context.event_dpc[cpu] == NULL) { - pr_err("Unable to allocate event dpc\n"); - goto err; - } - tasklet_init(hv_context.event_dpc[cpu], vmbus_on_event, cpu); - - hv_context.msg_dpc[cpu] = kmalloc(size, GFP_ATOMIC); - if (hv_context.msg_dpc[cpu] == NULL) { - pr_err("Unable to allocate event dpc\n"); - goto err; - } - tasklet_init(hv_context.msg_dpc[cpu], vmbus_on_msg_dpc, cpu); - - hv_context.clk_evt[cpu] = kzalloc(ced_size, GFP_ATOMIC); - if (hv_context.clk_evt[cpu] == NULL) { + struct hv_per_cpu_context *hv_cpu + = per_cpu_ptr(hv_context.cpu_context, cpu); + + memset(hv_cpu, 0, sizeof(*hv_cpu)); + tasklet_init(&hv_cpu->event_dpc, + vmbus_on_event, (unsigned long) hv_cpu); + tasklet_init(&hv_cpu->msg_dpc, + vmbus_on_msg_dpc, (unsigned long) hv_cpu); + + hv_cpu->clk_evt = kzalloc(sizeof(struct clock_event_device), + GFP_KERNEL); + if (hv_cpu->clk_evt == NULL) { pr_err("Unable to allocate clock event device\n"); goto err; } + hv_init_clockevent_device(hv_cpu->clk_evt, cpu); - hv_init_clockevent_device(hv_context.clk_evt[cpu], cpu); - - hv_context.synic_message_page[cpu] + hv_cpu->synic_message_page (void *)get_zeroed_page(GFP_ATOMIC); - - if (hv_context.synic_message_page[cpu] == NULL) { + if (hv_cpu->synic_message_page == NULL) { pr_err("Unable to allocate SYNIC message page\n"); goto err; } - hv_context.synic_event_page[cpu] - (void *)get_zeroed_page(GFP_ATOMIC); - - if (hv_context.synic_event_page[cpu] == NULL) { + hv_cpu->synic_event_page = (void *)get_zeroed_page(GFP_ATOMIC); + if (hv_cpu->synic_event_page == NULL) { pr_err("Unable to allocate SYNIC event page\n"); goto err; } - hv_context.post_msg_page[cpu] - (void *)get_zeroed_page(GFP_ATOMIC); - - if (hv_context.post_msg_page[cpu] == NULL) { + hv_cpu->post_msg_page = (void *)get_zeroed_page(GFP_ATOMIC); + if (hv_cpu->post_msg_page == NULL) { pr_err("Unable to allocate post msg page\n"); goto err; } - INIT_LIST_HEAD(&hv_context.percpu_list[cpu]); + INIT_LIST_HEAD(&hv_cpu->chan_list); } return 0; @@ -220,26 +196,24 @@ int hv_synic_alloc(void) return -ENOMEM; } -static void hv_synic_free_cpu(int cpu) -{ - kfree(hv_context.event_dpc[cpu]); - kfree(hv_context.msg_dpc[cpu]); - kfree(hv_context.clk_evt[cpu]); - if (hv_context.synic_event_page[cpu]) - free_page((unsigned long)hv_context.synic_event_page[cpu]); - if (hv_context.synic_message_page[cpu]) - free_page((unsigned long)hv_context.synic_message_page[cpu]); - if (hv_context.post_msg_page[cpu]) - free_page((unsigned long)hv_context.post_msg_page[cpu]); -} void hv_synic_free(void) { int cpu; + for_each_present_cpu(cpu) { + struct hv_per_cpu_context *hv_cpu + = per_cpu_ptr(hv_context.cpu_context, cpu); + + if (hv_cpu->synic_event_page) + free_page((unsigned long)hv_cpu->synic_event_page); + if (hv_cpu->synic_message_page) + free_page((unsigned long)hv_cpu->synic_message_page); + if (hv_cpu->post_msg_page) + free_page((unsigned long)hv_cpu->post_msg_page); + } + kfree(hv_context.hv_numa_map); - for_each_present_cpu(cpu) - hv_synic_free_cpu(cpu); } /* @@ -251,6 +225,8 @@ void hv_synic_free(void) */ int hv_synic_init(unsigned int cpu) { + struct hv_per_cpu_context *hv_cpu + = per_cpu_ptr(hv_context.cpu_context, cpu); union hv_synic_simp simp; union hv_synic_siefp siefp; union hv_synic_sint shared_sint; @@ -260,7 +236,7 @@ int hv_synic_init(unsigned int cpu) /* Setup the Synic's message page */ hv_get_simp(simp.as_uint64); simp.simp_enabled = 1; - simp.base_simp_gpa = virt_to_phys(hv_context.synic_message_page[cpu]) + simp.base_simp_gpa = virt_to_phys(hv_cpu->synic_message_page) >> PAGE_SHIFT; hv_set_simp(simp.as_uint64); @@ -268,7 +244,7 @@ int hv_synic_init(unsigned int cpu) /* Setup the Synic's event page */ hv_get_siefp(siefp.as_uint64); siefp.siefp_enabled = 1; - siefp.base_siefp_gpa = virt_to_phys(hv_context.synic_event_page[cpu]) + siefp.base_siefp_gpa = virt_to_phys(hv_cpu->synic_event_page) >> PAGE_SHIFT; hv_set_siefp(siefp.as_uint64); @@ -305,7 +281,7 @@ int hv_synic_init(unsigned int cpu) * Register the per-cpu clockevent source. */ if (ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE) - clockevents_config_and_register(hv_context.clk_evt[cpu], + clockevents_config_and_register(hv_cpu->clk_evt, HV_TIMER_FREQUENCY, HV_MIN_DELTA_TICKS, HV_MAX_MAX_DELTA_TICKS); @@ -322,8 +298,12 @@ void hv_synic_clockevents_cleanup(void) if (!(ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE)) return; - for_each_present_cpu(cpu) - clockevents_unbind_device(hv_context.clk_evt[cpu], cpu); + for_each_present_cpu(cpu) { + struct hv_per_cpu_context *hv_cpu + = per_cpu_ptr(hv_context.cpu_context, cpu); + + clockevents_unbind_device(hv_cpu->clk_evt, cpu); + } } /* @@ -372,8 +352,12 @@ int hv_synic_cleanup(unsigned int cpu) /* Turn off clockevent device */ if (ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE) { - clockevents_unbind_device(hv_context.clk_evt[cpu], cpu); - hv_ce_shutdown(hv_context.clk_evt[cpu]); + struct hv_per_cpu_context *hv_cpu + = this_cpu_ptr(hv_context.cpu_context); + + clockevents_unbind_device(hv_cpu->clk_evt, cpu); + hv_ce_shutdown(hv_cpu->clk_evt); + put_cpu_ptr(hv_cpu); } hv_get_synint_state(HV_X64_MSR_SINT0 + VMBUS_MESSAGE_SINT, diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index c375ec89db6f..c8ce9ab2e16a 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -29,6 +29,7 @@ #include <asm/sync_bitops.h> #include <linux/atomic.h> #include <linux/hyperv.h> +#include <linux/interrupt.h> /* * Timeout for services such as KVP and fcopy. @@ -189,6 +190,33 @@ enum { VMBUS_MESSAGE_SINT = 2, }; +/* + * Per cpu state for channel handling + */ +struct hv_per_cpu_context { + void *synic_message_page; + void *synic_event_page; + /* + * buffer to post messages to the host. + */ + void *post_msg_page; + + /* + * Starting with win8, we can take channel interrupts on any CPU; + * we will manage the tasklet that handles events messages on a per CPU + * basis. + */ + struct tasklet_struct event_dpc; + struct tasklet_struct msg_dpc; + + /* + * To optimize the mapping of relid to channel, maintain + * per-cpu list of the channels based on their CPU affinity. + */ + struct list_head chan_list; + struct clock_event_device *clk_evt; +}; + struct hv_context { /* We only support running on top of Hyper-V * So at this point this really can only contain the Hyper-V ID @@ -199,8 +227,8 @@ struct hv_context { bool synic_initialized; - void *synic_message_page[NR_CPUS]; - void *synic_event_page[NR_CPUS]; + struct hv_per_cpu_context __percpu *cpu_context; + /* * Hypervisor's notion of virtual processor ID is different from * Linux' notion of CPU ID. This information can only be retrieved @@ -211,26 +239,7 @@ struct hv_context { * Linux cpuid 'a'. */ u32 vp_index[NR_CPUS]; - /* - * Starting with win8, we can take channel interrupts on any CPU; - * we will manage the tasklet that handles events messages on a per CPU - * basis. - */ - struct tasklet_struct *event_dpc[NR_CPUS]; - struct tasklet_struct *msg_dpc[NR_CPUS]; - /* - * To optimize the mapping of relid to channel, maintain - * per-cpu list of the channels based on their CPU affinity. - */ - struct list_head percpu_list[NR_CPUS]; - /* - * buffer to post messages to the host. - */ - void *post_msg_page[NR_CPUS]; - /* - * Support PV clockevent device. - */ - struct clock_event_device *clk_evt[NR_CPUS]; + /* * To manage allocations in a NUMA node. * Array indexed by numa node ID. diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index c1b27026f744..cf8540c1df4a 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -835,9 +835,10 @@ static void vmbus_onmessage_work(struct work_struct *work) kfree(ctx); } -static void hv_process_timer_expiration(struct hv_message *msg, int cpu) +static void hv_process_timer_expiration(struct hv_message *msg, + struct hv_per_cpu_context *hv_cpu) { - struct clock_event_device *dev = hv_context.clk_evt[cpu]; + struct clock_event_device *dev = hv_cpu->clk_evt; if (dev->event_handler) dev->event_handler(dev); @@ -847,8 +848,8 @@ static void hv_process_timer_expiration(struct hv_message *msg, int cpu) void vmbus_on_msg_dpc(unsigned long data) { - int cpu = smp_processor_id(); - void *page_addr = hv_context.synic_message_page[cpu]; + struct hv_per_cpu_context *hv_cpu = (void *)data; + void *page_addr = hv_cpu->synic_message_page; struct hv_message *msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT; struct vmbus_channel_message_header *hdr; @@ -886,14 +887,14 @@ void vmbus_on_msg_dpc(unsigned long data) static void vmbus_isr(void) { - int cpu = smp_processor_id(); - void *page_addr; + struct hv_per_cpu_context *hv_cpu + = this_cpu_ptr(hv_context.cpu_context); + void *page_addr = hv_cpu->synic_event_page; struct hv_message *msg; union hv_synic_event_flags *event; bool handled = false; - page_addr = hv_context.synic_event_page[cpu]; - if (page_addr == NULL) + if (unlikely(page_addr == NULL)) return; event = (union hv_synic_event_flags *)page_addr + @@ -921,18 +922,18 @@ static void vmbus_isr(void) } if (handled) - tasklet_schedule(hv_context.event_dpc[cpu]); + tasklet_schedule(&hv_cpu->event_dpc); - page_addr = hv_context.synic_message_page[cpu]; + page_addr = hv_cpu->synic_message_page; msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT; /* Check if there are actual msgs to be processed */ if (msg->header.message_type != HVMSG_NONE) { if (msg->header.message_type == HVMSG_TIMER_EXPIRED) - hv_process_timer_expiration(msg, cpu); + hv_process_timer_expiration(msg, hv_cpu); else - tasklet_schedule(hv_context.msg_dpc[cpu]); + tasklet_schedule(&hv_cpu->msg_dpc); } add_interrupt_randomness(HYPERVISOR_CALLBACK_VECTOR, 0); @@ -1521,9 +1522,14 @@ static void __exit vmbus_exit(void) hv_synic_clockevents_cleanup(); vmbus_disconnect(); hv_remove_vmbus_irq(); - for_each_online_cpu(cpu) - tasklet_kill(hv_context.msg_dpc[cpu]); + for_each_online_cpu(cpu) { + struct hv_per_cpu_context *hv_cpu + = per_cpu_ptr(hv_context.cpu_context, cpu); + + tasklet_kill(&hv_cpu->msg_dpc); + } vmbus_free_channels(); + if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) { unregister_die_notifier(&hyperv_die_block); atomic_notifier_chain_unregister(&panic_notifier_list, @@ -1531,7 +1537,10 @@ static void __exit vmbus_exit(void) } bus_unregister(&hv_bus); for_each_online_cpu(cpu) { - tasklet_kill(hv_context.event_dpc[cpu]); + struct hv_per_cpu_context *hv_cpu + = per_cpu_ptr(hv_context.cpu_context, cpu); + + tasklet_kill(&hv_cpu->event_dpc); } cpuhp_remove_state(hyperv_cpuhp_online); hv_synic_free(); -- 2.11.0
Stephen Hemminger
2017-Feb-01 16:28 UTC
[PATCH 09/14] vmbus: change to per channel tasklet
Make the event handling tasklet per channel rather than per-cpu. This allows for better fairness when getting lots of data on the same cpu. Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com> --- drivers/hv/channel.c | 2 +- drivers/hv/channel_mgmt.c | 16 +++++----- drivers/hv/connection.c | 78 ++--------------------------------------------- drivers/hv/hv.c | 2 -- drivers/hv/hyperv_vmbus.h | 1 - drivers/hv/vmbus_drv.c | 58 ++++++++++++++++++++++++++++++----- include/linux/hyperv.h | 3 +- 7 files changed, 64 insertions(+), 96 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 789c75f6df26..18cc1c78260d 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -530,7 +530,7 @@ static int vmbus_close_internal(struct vmbus_channel *channel) int ret; /* - * process_chn_event(), running in the tasklet, can race + * vmbus_on_event(), running in the tasklet, can race * with vmbus_close_internal() in the case of SMP guest, e.g., when * the former is accessing channel->inbound.ring_buffer, the latter * could be freeing the ring_buffer pages. diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 579ad2560a39..2f6270d76b79 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -339,6 +339,9 @@ static struct vmbus_channel *alloc_channel(void) INIT_LIST_HEAD(&channel->sc_list); INIT_LIST_HEAD(&channel->percpu_list); + tasklet_init(&channel->callback_event, + vmbus_on_event, (unsigned long)channel); + return channel; } @@ -347,6 +350,7 @@ static struct vmbus_channel *alloc_channel(void) */ static void free_channel(struct vmbus_channel *channel) { + tasklet_kill(&channel->callback_event); kfree(channel); } @@ -380,21 +384,15 @@ static void vmbus_release_relid(u32 relid) void hv_event_tasklet_disable(struct vmbus_channel *channel) { - struct hv_per_cpu_context *hv_cpu; - - hv_cpu = per_cpu_ptr(hv_context.cpu_context, channel->target_cpu); - tasklet_disable(&hv_cpu->event_dpc); + tasklet_disable(&channel->callback_event); } void hv_event_tasklet_enable(struct vmbus_channel *channel) { - struct hv_per_cpu_context *hv_cpu; - - hv_cpu = per_cpu_ptr(hv_context.cpu_context, channel->target_cpu); - tasklet_enable(&hv_cpu->event_dpc); + tasklet_enable(&channel->callback_event); /* In case there is any pending event */ - tasklet_schedule(&hv_cpu->event_dpc); + tasklet_schedule(&channel->callback_event); } void hv_process_channel_removal(struct vmbus_channel *channel, u32 relid) diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c index 158f12823baf..27e72dc07e12 100644 --- a/drivers/hv/connection.c +++ b/drivers/hv/connection.c @@ -260,29 +260,6 @@ void vmbus_disconnect(void) } /* - * Map the given relid to the corresponding channel based on the - * per-cpu list of channels that have been affinitized to this CPU. - * This will be used in the channel callback path as we can do this - * mapping in a lock-free fashion. - */ -static struct vmbus_channel *pcpu_relid2channel(u32 relid) -{ - struct hv_per_cpu_context *hv_cpu - = this_cpu_ptr(hv_context.cpu_context); - struct vmbus_channel *found_channel = NULL; - struct vmbus_channel *channel; - - list_for_each_entry(channel, &hv_cpu->chan_list, percpu_list) { - if (channel->offermsg.child_relid == relid) { - found_channel = channel; - break; - } - } - - return found_channel; -} - -/* * relid2channel - Get the channel object given its * child relative id (ie channel id) */ @@ -318,25 +295,16 @@ struct vmbus_channel *relid2channel(u32 relid) } /* - * process_chn_event - Process a channel event notification + * vmbus_on_event - Process a channel event notification */ -static void process_chn_event(u32 relid) +void vmbus_on_event(unsigned long data) { - struct vmbus_channel *channel; + struct vmbus_channel *channel = (void *) data; void *arg; bool read_state; u32 bytes_to_read; /* - * Find the channel based on this relid and invokes the - * channel callback to process the event - */ - channel = pcpu_relid2channel(relid); - - if (!channel) - return; - - /* * A channel once created is persistent even when there * is no driver handling the device. An unloading driver * sets the onchannel_callback to NULL on the same CPU @@ -344,7 +312,6 @@ static void process_chn_event(u32 relid) * Thus, checking and invoking the driver specific callback takes * care of orderly unloading of the driver. */ - if (channel->onchannel_callback != NULL) { arg = channel->channel_callback_context; read_state = channel->batched_reading; @@ -373,45 +340,6 @@ static void process_chn_event(u32 relid) } /* - * vmbus_on_event - Handler for events - */ -void vmbus_on_event(unsigned long data) -{ - struct hv_per_cpu_context *hv_cpu = (void *)data; - unsigned long *recv_int_page; - u32 maxbits, relid; - - if (vmbus_proto_version < VERSION_WIN8) { - maxbits = MAX_NUM_CHANNELS_SUPPORTED; - recv_int_page = vmbus_connection.recv_int_page; - } else { - /* - * When the host is win8 and beyond, the event page - * can be directly checked to get the id of the channel - * that has the interrupt pending. - */ - void *page_addr = hv_cpu->synic_event_page; - union hv_synic_event_flags *event - = (union hv_synic_event_flags *)page_addr + - VMBUS_MESSAGE_SINT; - - maxbits = HV_EVENT_FLAGS_COUNT; - recv_int_page = event->flags; - } - - if (unlikely(!recv_int_page)) - return; - - for_each_set_bit(relid, recv_int_page, maxbits) { - if (sync_test_and_clear_bit(relid, recv_int_page)) { - /* Special case - vmbus channel protocol msg */ - if (relid != 0) - process_chn_event(relid); - } - } -} - -/* * vmbus_post_msg - Send a msg on the vmbus's message connection */ int vmbus_post_msg(void *buffer, size_t buflen, bool can_sleep) diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c index 9fc2355a60ea..665a64f1611e 100644 --- a/drivers/hv/hv.c +++ b/drivers/hv/hv.c @@ -156,8 +156,6 @@ int hv_synic_alloc(void) = per_cpu_ptr(hv_context.cpu_context, cpu); memset(hv_cpu, 0, sizeof(*hv_cpu)); - tasklet_init(&hv_cpu->event_dpc, - vmbus_on_event, (unsigned long) hv_cpu); tasklet_init(&hv_cpu->msg_dpc, vmbus_on_msg_dpc, (unsigned long) hv_cpu); diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index c8ce9ab2e16a..558a798c407c 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -206,7 +206,6 @@ struct hv_per_cpu_context { * we will manage the tasklet that handles events messages on a per CPU * basis. */ - struct tasklet_struct event_dpc; struct tasklet_struct msg_dpc; /* diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index cf8540c1df4a..eaf1a10b0245 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -885,6 +885,56 @@ void vmbus_on_msg_dpc(unsigned long data) vmbus_signal_eom(msg, message_type); } + +/* + * Schedule all channels with events pending + */ +static void vmbus_chan_sched(struct hv_per_cpu_context *hv_cpu) +{ + unsigned long *recv_int_page; + u32 maxbits, relid; + + if (vmbus_proto_version < VERSION_WIN8) { + maxbits = MAX_NUM_CHANNELS_SUPPORTED; + recv_int_page = vmbus_connection.recv_int_page; + } else { + /* + * When the host is win8 and beyond, the event page + * can be directly checked to get the id of the channel + * that has the interrupt pending. + */ + void *page_addr = hv_cpu->synic_event_page; + union hv_synic_event_flags *event + = (union hv_synic_event_flags *)page_addr + + VMBUS_MESSAGE_SINT; + + maxbits = HV_EVENT_FLAGS_COUNT; + recv_int_page = event->flags; + } + + if (unlikely(!recv_int_page)) + return; + + for_each_set_bit(relid, recv_int_page, maxbits) { + struct vmbus_channel *channel; + + if (!sync_test_and_clear_bit(relid, recv_int_page)) + continue; + + /* Special case - vmbus channel protocol msg */ + if (relid == 0) + continue; + + /* Find channel based on relid */ + list_for_each_entry(channel, &hv_cpu->chan_list, percpu_list) { + if (channel->offermsg.child_relid == relid) { + tasklet_schedule(&channel->callback_event); + break; + } + } + } +} + static void vmbus_isr(void) { struct hv_per_cpu_context *hv_cpu @@ -922,8 +972,7 @@ static void vmbus_isr(void) } if (handled) - tasklet_schedule(&hv_cpu->event_dpc); - + vmbus_chan_sched(hv_cpu); page_addr = hv_cpu->synic_message_page; msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT; @@ -1536,12 +1585,7 @@ static void __exit vmbus_exit(void) &hyperv_panic_block); } bus_unregister(&hv_bus); - for_each_online_cpu(cpu) { - struct hv_per_cpu_context *hv_cpu - = per_cpu_ptr(hv_context.cpu_context, cpu); - tasklet_kill(&hv_cpu->event_dpc); - } cpuhp_remove_state(hyperv_cpuhp_online); hv_synic_free(); acpi_bus_unregister_driver(&vmbus_acpi_driver); diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index b30808f740f9..dce5aa6751b4 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -35,7 +35,7 @@ #include <linux/completion.h> #include <linux/device.h> #include <linux/mod_devicetable.h> - +#include <linux/interrupt.h> #define MAX_PAGE_BUFFER_COUNT 32 #define MAX_MULTIPAGE_BUFFER_COUNT 32 /* 128K */ @@ -729,6 +729,7 @@ struct vmbus_channel { struct vmbus_close_msg close_msg; /* Channel callback's invoked in softirq context */ + struct tasklet_struct callback_event; void (*onchannel_callback)(void *context); void *channel_callback_context; -- 2.11.0
Change the simple boolean batched_reading into a tri-value. For future NAPI support in netvsc driver, the callback needs to occur directly in interrupt handler. Batched mode is also changed to disable host interrupts immediately in interrupt routine (to avoid unnecessary host signals), and the tasklet is rescheduled if more data is detected. Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com> --- drivers/hv/channel_mgmt.c | 7 ------- drivers/hv/connection.c | 27 ++++++++++++--------------- drivers/hv/hv_util.c | 3 +-- drivers/hv/vmbus_drv.c | 26 ++++++++++++++++++++++++-- drivers/uio/uio_hv_generic.c | 2 +- include/linux/hyperv.h | 31 +++++++++++++++++-------------- 6 files changed, 55 insertions(+), 41 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 2f6270d76b79..b2bb5aafaa2f 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -820,13 +820,6 @@ static void vmbus_onoffer(struct vmbus_channel_message_header *hdr) } /* - * By default we setup state to enable batched - * reading. A specific service can choose to - * disable this prior to opening the channel. - */ - newchannel->batched_reading = true; - - /* * Setup state for signalling the host. */ newchannel->sig_event = (struct hv_input_signal_event *) diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c index 27e72dc07e12..a8366fec1458 100644 --- a/drivers/hv/connection.c +++ b/drivers/hv/connection.c @@ -300,9 +300,7 @@ struct vmbus_channel *relid2channel(u32 relid) void vmbus_on_event(unsigned long data) { struct vmbus_channel *channel = (void *) data; - void *arg; - bool read_state; - u32 bytes_to_read; + void (*callback_fn)(void *); /* * A channel once created is persistent even when there @@ -312,9 +310,13 @@ void vmbus_on_event(unsigned long data) * Thus, checking and invoking the driver specific callback takes * care of orderly unloading of the driver. */ - if (channel->onchannel_callback != NULL) { - arg = channel->channel_callback_context; - read_state = channel->batched_reading; + callback_fn = READ_ONCE(channel->onchannel_callback); + if (unlikely(callback_fn == NULL)) + return; + + (*callback_fn)(channel->channel_callback_context); + + if (channel->callback_mode == HV_CALL_BATCHED) { /* * This callback reads the messages sent by the host. * We can optimize host to guest signaling by ensuring: @@ -326,16 +328,11 @@ void vmbus_on_event(unsigned long data) * state is set we check to see if additional packets are * available to read. In this case we repeat the process. */ + if (hv_end_read(&channel->inbound) != 0) { + hv_begin_read(&channel->inbound); - do { - if (read_state) - hv_begin_read(&channel->inbound); - channel->onchannel_callback(arg); - if (read_state) - bytes_to_read = hv_end_read(&channel->inbound); - else - bytes_to_read = 0; - } while (read_state && (bytes_to_read != 0)); + tasklet_schedule(&channel->callback_event); + } } } diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c index d42ede78a9dd..8410191b4992 100644 --- a/drivers/hv/hv_util.c +++ b/drivers/hv/hv_util.c @@ -409,8 +409,7 @@ static int util_probe(struct hv_device *dev, * Turn off batched reading for all util drivers before we open the * channel. */ - - set_channel_read_state(dev->channel, false); + set_channel_read_mode(dev->channel, HV_CALL_DIRECT); hv_set_drvdata(dev, srv); diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index eaf1a10b0245..f7f6b9144b07 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -887,6 +887,18 @@ void vmbus_on_msg_dpc(unsigned long data) /* + * Direct callback for channels using other deferred processing + */ +static void vmbus_channel_isr(struct vmbus_channel *channel) +{ + void (*callback_fn)(void *); + + callback_fn = READ_ONCE(channel->onchannel_callback); + if (likely(callback_fn != NULL)) + (*callback_fn)(channel->channel_callback_context); +} + +/* * Schedule all channels with events pending */ static void vmbus_chan_sched(struct hv_per_cpu_context *hv_cpu) @@ -927,9 +939,19 @@ static void vmbus_chan_sched(struct hv_per_cpu_context *hv_cpu) /* Find channel based on relid */ list_for_each_entry(channel, &hv_cpu->chan_list, percpu_list) { - if (channel->offermsg.child_relid == relid) { - tasklet_schedule(&channel->callback_event); + if (channel->offermsg.child_relid != relid) + continue; + + switch (channel->callback_mode) { + case HV_CALL_ISR: + vmbus_channel_isr(channel); break; + + case HV_CALL_BATCHED: + hv_begin_read(&channel->inbound); + /* fallthrough */ + case HV_CALL_DIRECT: + tasklet_schedule(&channel->callback_event); } } } diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c index 50958f167305..48d5327d38d4 100644 --- a/drivers/uio/uio_hv_generic.c +++ b/drivers/uio/uio_hv_generic.c @@ -125,7 +125,7 @@ hv_uio_probe(struct hv_device *dev, goto fail; dev->channel->inbound.ring_buffer->interrupt_mask = 1; - dev->channel->batched_reading = false; + set_channel_read_mode(dev->channel, HV_CALL_DIRECT); /* Fill general uio info */ pdata->info.name = "uio_hv_generic"; diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index dce5aa6751b4..936e6b9d1fe4 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -734,19 +734,21 @@ struct vmbus_channel { void *channel_callback_context; /* - * A channel can be marked for efficient (batched) - * reading: - * If batched_reading is set to "true", we read until the - * channel is empty and hold off interrupts from the host - * during the entire read process. - * If batched_reading is set to "false", the client is not - * going to perform batched reading. - * - * By default we will enable batched reading; specific - * drivers that don't want this behavior can turn it off. + * A channel can be marked for one of three modes of reading: + * BUFFERED - callback called from taslket and should read + * channel until empty. Interrupts from the host + * are masked while read is in process (default). + * DIRECT - callback called from tasklet (softirq). + * ISR - callback called in interrupt context and must + * invoke its own deferred processing. + * Host interrupts are disabled and must be re-enabled + * when ring is empty. */ - - bool batched_reading; + enum hv_callback_mode { + HV_CALL_BATCHED, + HV_CALL_DIRECT, + HV_CALL_ISR + } callback_mode; bool is_dedicated_interrupt; struct hv_input_signal_event_buffer sig_buf; @@ -892,9 +894,10 @@ static inline void set_channel_affinity_state(struct vmbus_channel *c, c->affinity_policy = policy; } -static inline void set_channel_read_state(struct vmbus_channel *c, bool state) +static inline void set_channel_read_mode(struct vmbus_channel *c, + enum hv_callback_mode mode) { - c->batched_reading = state; + c->callback_mode = mode; } static inline void set_channel_pending_send_size(struct vmbus_channel *c, -- 2.11.0
Stephen Hemminger
2017-Feb-01 16:28 UTC
[PATCH 11/14] vmbus: remove conditional locking of vmbus_write
All current usage of vmbus write uses the acquire_lock flag, therefore having it be optional is unnecessary. This also fixes a sparse warning since sparse doesn't like when a function has conditional locking. Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com> --- drivers/hv/channel.c | 13 ++++--------- drivers/hv/channel_mgmt.c | 1 - drivers/hv/hyperv_vmbus.h | 3 +-- drivers/hv/ring_buffer.c | 11 ++++------- include/linux/hyperv.h | 15 --------------- 5 files changed, 9 insertions(+), 34 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 18cc1c78260d..81a80c82f1bd 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -651,7 +651,6 @@ int vmbus_sendpacket_ctl(struct vmbus_channel *channel, void *buffer, u32 packetlen_aligned = ALIGN(packetlen, sizeof(u64)); struct kvec bufferlist[3]; u64 aligned_data = 0; - bool lock = channel->acquire_ring_lock; int num_vecs = ((bufferlen != 0) ? 3 : 1); @@ -670,7 +669,7 @@ int vmbus_sendpacket_ctl(struct vmbus_channel *channel, void *buffer, bufferlist[2].iov_base = &aligned_data; bufferlist[2].iov_len = (packetlen_aligned - packetlen); - return hv_ringbuffer_write(channel, bufferlist, num_vecs, lock); + return hv_ringbuffer_write(channel, bufferlist, num_vecs); } EXPORT_SYMBOL(vmbus_sendpacket_ctl); @@ -716,12 +715,10 @@ int vmbus_sendpacket_pagebuffer_ctl(struct vmbus_channel *channel, u32 packetlen_aligned; struct kvec bufferlist[3]; u64 aligned_data = 0; - bool lock = channel->acquire_ring_lock; if (pagecount > MAX_PAGE_BUFFER_COUNT) return -EINVAL; - /* * Adjust the size down since vmbus_channel_packet_page_buffer is the * largest size we support @@ -753,7 +750,7 @@ int vmbus_sendpacket_pagebuffer_ctl(struct vmbus_channel *channel, bufferlist[2].iov_base = &aligned_data; bufferlist[2].iov_len = (packetlen_aligned - packetlen); - return hv_ringbuffer_write(channel, bufferlist, 3, lock); + return hv_ringbuffer_write(channel, bufferlist, 3); } EXPORT_SYMBOL_GPL(vmbus_sendpacket_pagebuffer_ctl); @@ -789,7 +786,6 @@ int vmbus_sendpacket_mpb_desc(struct vmbus_channel *channel, u32 packetlen_aligned; struct kvec bufferlist[3]; u64 aligned_data = 0; - bool lock = channel->acquire_ring_lock; packetlen = desc_size + bufferlen; packetlen_aligned = ALIGN(packetlen, sizeof(u64)); @@ -809,7 +805,7 @@ int vmbus_sendpacket_mpb_desc(struct vmbus_channel *channel, bufferlist[2].iov_base = &aligned_data; bufferlist[2].iov_len = (packetlen_aligned - packetlen); - return hv_ringbuffer_write(channel, bufferlist, 3, lock); + return hv_ringbuffer_write(channel, bufferlist, 3); } EXPORT_SYMBOL_GPL(vmbus_sendpacket_mpb_desc); @@ -827,7 +823,6 @@ int vmbus_sendpacket_multipagebuffer(struct vmbus_channel *channel, u32 packetlen_aligned; struct kvec bufferlist[3]; u64 aligned_data = 0; - bool lock = channel->acquire_ring_lock; u32 pfncount = NUM_PAGES_SPANNED(multi_pagebuffer->offset, multi_pagebuffer->len); @@ -866,7 +861,7 @@ int vmbus_sendpacket_multipagebuffer(struct vmbus_channel *channel, bufferlist[2].iov_base = &aligned_data; bufferlist[2].iov_len = (packetlen_aligned - packetlen); - return hv_ringbuffer_write(channel, bufferlist, 3, lock); + return hv_ringbuffer_write(channel, bufferlist, 3); } EXPORT_SYMBOL_GPL(vmbus_sendpacket_multipagebuffer); diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index b2bb5aafaa2f..f33465d78a02 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -332,7 +332,6 @@ static struct vmbus_channel *alloc_channel(void) if (!channel) return NULL; - channel->acquire_ring_lock = true; spin_lock_init(&channel->inbound_lock); spin_lock_init(&channel->lock); diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 558a798c407c..6a9b54677218 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -283,8 +283,7 @@ int hv_ringbuffer_init(struct hv_ring_buffer_info *ring_info, void hv_ringbuffer_cleanup(struct hv_ring_buffer_info *ring_info); int hv_ringbuffer_write(struct vmbus_channel *channel, - struct kvec *kv_list, - u32 kv_count, bool lock); + struct kvec *kv_list, u32 kv_count); int hv_ringbuffer_read(struct vmbus_channel *channel, void *buffer, u32 buflen, u32 *buffer_actual_len, diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 30ca55aefd24..146fd8ab2a2a 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -284,7 +284,7 @@ void hv_ringbuffer_cleanup(struct hv_ring_buffer_info *ring_info) /* Write to the ring buffer. */ int hv_ringbuffer_write(struct vmbus_channel *channel, - struct kvec *kv_list, u32 kv_count, bool lock) + struct kvec *kv_list, u32 kv_count) { int i = 0; u32 bytes_avail_towrite; @@ -304,8 +304,7 @@ int hv_ringbuffer_write(struct vmbus_channel *channel, totalbytes_towrite += sizeof(u64); - if (lock) - spin_lock_irqsave(&outring_info->ring_lock, flags); + spin_lock_irqsave(&outring_info->ring_lock, flags); bytes_avail_towrite = hv_get_bytes_to_write(outring_info); @@ -315,8 +314,7 @@ int hv_ringbuffer_write(struct vmbus_channel *channel, * is empty since the read index == write index. */ if (bytes_avail_towrite <= totalbytes_towrite) { - if (lock) - spin_unlock_irqrestore(&outring_info->ring_lock, flags); + spin_unlock_irqrestore(&outring_info->ring_lock, flags); return -EAGAIN; } @@ -347,8 +345,7 @@ int hv_ringbuffer_write(struct vmbus_channel *channel, hv_set_next_write_location(outring_info, next_write_location); - if (lock) - spin_unlock_irqrestore(&outring_info->ring_lock, flags); + spin_unlock_irqrestore(&outring_info->ring_lock, flags); hv_signal_on_write(old_write, channel); diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 936e6b9d1fe4..9b0165b11c5c 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -828,16 +828,6 @@ struct vmbus_channel { */ struct list_head percpu_list; /* - * On the channel send side, many of the VMBUS - * device drivers explicity serialize access to the - * outgoing ring buffer. Give more control to the - * VMBUS device drivers in terms how to serialize - * accesss to the outgoing ring buffer. - * The default behavior will be to aquire the - * ring lock to preserve the current behavior. - */ - bool acquire_ring_lock; - /* * For performance critical channels (storage, networking * etc,), Hyper-V has a mechanism to enhance the throughput * at the expense of latency: @@ -877,11 +867,6 @@ struct vmbus_channel { }; -static inline void set_channel_lock_state(struct vmbus_channel *c, bool state) -{ - c->acquire_ring_lock = state; -} - static inline bool is_hvsock_channel(const struct vmbus_channel *c) { return !!(c->offermsg.offer.chn_flags & -- 2.11.0
In order to implement NAPI in netvsc, the driver needs access to control host interrupt mask. Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com> --- drivers/hv/hyperv_vmbus.h | 4 ---- drivers/hv/ring_buffer.c | 20 -------------------- include/linux/hyperv.h | 30 ++++++++++++++++++++++++++++++ 3 files changed, 30 insertions(+), 24 deletions(-) diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 6a9b54677218..e15a130de3c9 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -292,10 +292,6 @@ int hv_ringbuffer_read(struct vmbus_channel *channel, void hv_ringbuffer_get_debuginfo(struct hv_ring_buffer_info *ring_info, struct hv_ring_buffer_debug_info *debug_info); -void hv_begin_read(struct hv_ring_buffer_info *rbi); - -u32 hv_end_read(struct hv_ring_buffer_info *rbi); - /* * Maximum channels is determined by the size of the interrupt page * which is PAGE_SIZE. 1/2 of PAGE_SIZE is for send endpoint interrupt diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 146fd8ab2a2a..47ab69089115 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -32,26 +32,6 @@ #include "hyperv_vmbus.h" -void hv_begin_read(struct hv_ring_buffer_info *rbi) -{ - rbi->ring_buffer->interrupt_mask = 1; - virt_mb(); -} - -u32 hv_end_read(struct hv_ring_buffer_info *rbi) -{ - - rbi->ring_buffer->interrupt_mask = 0; - virt_mb(); - - /* - * Now check to see if the ring buffer is still empty. - * If it is not, we raced and we need to process new - * incoming messages. - */ - return hv_get_bytes_to_read(rbi); -} - /* * When we write to the ring buffer, check if the host needs to * be signaled. Here is the details of this protocol: diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 9b0165b11c5c..dc50997a3fba 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1473,6 +1473,36 @@ static inline void hv_signal_on_read(struct vmbus_channel *channel) } /* + * Mask off host interrupt callback notifications + */ +static inline void hv_begin_read(struct hv_ring_buffer_info *rbi) +{ + rbi->ring_buffer->interrupt_mask = 1; + + /* make sure mask update is not reordered */ + virt_mb(); +} + +/* + * Re-enable host callback and return number of outstanding bytes + */ +static inline u32 hv_end_read(struct hv_ring_buffer_info *rbi) +{ + + rbi->ring_buffer->interrupt_mask = 0; + + /* make sure mask update is not reordered */ + virt_mb(); + + /* + * Now check to see if the ring buffer is still empty. + * If it is not, we raced and we need to process new + * incoming messages. + */ + return hv_get_bytes_to_read(rbi); +} + +/* * An API to support in-place processing of incoming VMBUS packets. */ #define VMBUS_PKT_TRAILER 8 -- 2.11.0
Stephen Hemminger
2017-Feb-01 16:29 UTC
[PATCH 13/14] vmbus: constify parameters where possible
Functions that just query state of ring buffer can have parameters marked const. Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com> --- drivers/hv/hyperv_vmbus.h | 6 +++--- drivers/hv/ring_buffer.c | 22 ++++++++++------------ include/linux/hyperv.h | 12 ++++++------ 3 files changed, 19 insertions(+), 21 deletions(-) diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index e15a130de3c9..884f83bba1ab 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -283,14 +283,14 @@ int hv_ringbuffer_init(struct hv_ring_buffer_info *ring_info, void hv_ringbuffer_cleanup(struct hv_ring_buffer_info *ring_info); int hv_ringbuffer_write(struct vmbus_channel *channel, - struct kvec *kv_list, u32 kv_count); + const struct kvec *kv_list, u32 kv_count); int hv_ringbuffer_read(struct vmbus_channel *channel, void *buffer, u32 buflen, u32 *buffer_actual_len, u64 *requestid, bool raw); -void hv_ringbuffer_get_debuginfo(struct hv_ring_buffer_info *ring_info, - struct hv_ring_buffer_debug_info *debug_info); +void hv_ringbuffer_get_debuginfo(const struct hv_ring_buffer_info *ring_info, + struct hv_ring_buffer_debug_info *debug_info); /* * Maximum channels is determined by the size of the interrupt page diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 47ab69089115..ee3e488d9dee 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -96,11 +96,9 @@ hv_set_next_write_location(struct hv_ring_buffer_info *ring_info, /* Get the next read location for the specified ring buffer. */ static inline u32 -hv_get_next_read_location(struct hv_ring_buffer_info *ring_info) +hv_get_next_read_location(const struct hv_ring_buffer_info *ring_info) { - u32 next = ring_info->ring_buffer->read_index; - - return next; + return ring_info->ring_buffer->read_index; } /* @@ -108,8 +106,8 @@ hv_get_next_read_location(struct hv_ring_buffer_info *ring_info) * This allows the caller to skip. */ static inline u32 -hv_get_next_readlocation_withoffset(struct hv_ring_buffer_info *ring_info, - u32 offset) +hv_get_next_readlocation_withoffset(const struct hv_ring_buffer_info *ring_info, + u32 offset) { u32 next = ring_info->ring_buffer->read_index; @@ -130,7 +128,7 @@ hv_set_next_read_location(struct hv_ring_buffer_info *ring_info, /* Get the size of the ring buffer. */ static inline u32 -hv_get_ring_buffersize(struct hv_ring_buffer_info *ring_info) +hv_get_ring_buffersize(const struct hv_ring_buffer_info *ring_info) { return ring_info->ring_datasize; } @@ -147,7 +145,7 @@ hv_get_ring_bufferindices(struct hv_ring_buffer_info *ring_info) * Assume there is enough room. Handles wrap-around in src case only!! */ static u32 hv_copyfrom_ringbuffer( - struct hv_ring_buffer_info *ring_info, + const struct hv_ring_buffer_info *ring_info, void *dest, u32 destlen, u32 start_read_offset) @@ -171,7 +169,7 @@ static u32 hv_copyfrom_ringbuffer( static u32 hv_copyto_ringbuffer( struct hv_ring_buffer_info *ring_info, u32 start_write_offset, - void *src, + const void *src, u32 srclen) { void *ring_buffer = hv_get_ring_buffer(ring_info); @@ -186,8 +184,8 @@ static u32 hv_copyto_ringbuffer( } /* Get various debug metrics for the specified ring buffer. */ -void hv_ringbuffer_get_debuginfo(struct hv_ring_buffer_info *ring_info, - struct hv_ring_buffer_debug_info *debug_info) +void hv_ringbuffer_get_debuginfo(const struct hv_ring_buffer_info *ring_info, + struct hv_ring_buffer_debug_info *debug_info) { u32 bytes_avail_towrite; u32 bytes_avail_toread; @@ -264,7 +262,7 @@ void hv_ringbuffer_cleanup(struct hv_ring_buffer_info *ring_info) /* Write to the ring buffer. */ int hv_ringbuffer_write(struct vmbus_channel *channel, - struct kvec *kv_list, u32 kv_count) + const struct kvec *kv_list, u32 kv_count) { int i = 0; u32 bytes_avail_towrite; diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index dc50997a3fba..32a9cbc66b65 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -137,8 +137,8 @@ struct hv_ring_buffer_info { * for the specified ring buffer */ static inline void -hv_get_ringbuffer_availbytes(struct hv_ring_buffer_info *rbi, - u32 *read, u32 *write) +hv_get_ringbuffer_availbytes(const struct hv_ring_buffer_info *rbi, + u32 *read, u32 *write) { u32 read_loc, write_loc, dsize; @@ -152,7 +152,7 @@ hv_get_ringbuffer_availbytes(struct hv_ring_buffer_info *rbi, *read = dsize - *write; } -static inline u32 hv_get_bytes_to_read(struct hv_ring_buffer_info *rbi) +static inline u32 hv_get_bytes_to_read(const struct hv_ring_buffer_info *rbi) { u32 read_loc, write_loc, dsize, read; @@ -166,7 +166,7 @@ static inline u32 hv_get_bytes_to_read(struct hv_ring_buffer_info *rbi) return read; } -static inline u32 hv_get_bytes_to_write(struct hv_ring_buffer_info *rbi) +static inline u32 hv_get_bytes_to_write(const struct hv_ring_buffer_info *rbi) { u32 read_loc, write_loc, dsize, write; @@ -1420,9 +1420,9 @@ void vmbus_set_event(struct vmbus_channel *channel); /* Get the start of the ring buffer. */ static inline void * -hv_get_ring_buffer(struct hv_ring_buffer_info *ring_info) +hv_get_ring_buffer(const struct hv_ring_buffer_info *ring_info) { - return (void *)ring_info->ring_buffer->buffer; + return ring_info->ring_buffer->buffer; } /* -- 2.11.0
Stephen Hemminger
2017-Feb-01 16:29 UTC
[PATCH 14/14] vmbus: replace modulus operation with subtraction
Takes less clock cycles to check for ring wrap and subtract than to do a modulus instruction. Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com> --- drivers/hv/ring_buffer.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index ee3e488d9dee..8ab6298fd5ae 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -112,7 +112,8 @@ hv_get_next_readlocation_withoffset(const struct hv_ring_buffer_info *ring_info, u32 next = ring_info->ring_buffer->read_index; next += offset; - next %= ring_info->ring_datasize; + if (next >= ring_info->ring_datasize) + next -= ring_info->ring_datasize; return next; } @@ -156,7 +157,8 @@ static u32 hv_copyfrom_ringbuffer( memcpy(dest, ring_buffer + start_read_offset, destlen); start_read_offset += destlen; - start_read_offset %= ring_buffer_size; + if (start_read_offset >= ring_buffer_size) + start_read_offset -= ring_buffer_size; return start_read_offset; } @@ -178,7 +180,8 @@ static u32 hv_copyto_ringbuffer( memcpy(ring_buffer + start_write_offset, src, srclen); start_write_offset += srclen; - start_write_offset %= ring_buffer_size; + if (start_write_offset >= ring_buffer_size) + start_write_offset -= ring_buffer_size; return start_write_offset; } -- 2.11.0
KY Srinivasan
2017-Feb-05 23:40 UTC
[PATCH 05/14] netvsc: remove no longer needed receive staging buffers
> -----Original Message----- > From: Stephen Hemminger [mailto:stephen at networkplumber.org] > Sent: Wednesday, February 1, 2017 8:29 AM > To: KY Srinivasan <kys at microsoft.com>; gregkh at linuxfoundation.org > Cc: devel at linuxdriverproject.org; virtualization at lists.linux-foundation.org; > Stephen Hemminger <sthemmin at microsoft.com> > Subject: [PATCH 05/14] netvsc: remove no longer needed receive staging > buffers > > Since commit aed8c164ca5199 ("Drivers: hv: ring_buffer: count on wrap > around mappings") it is no longer necessary to handle ring wrapping > by having a special receive buffer. > > Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com> > --- > drivers/net/hyperv/hyperv_net.h | 5 --- > drivers/net/hyperv/netvsc.c | 83 ++++++--------------------------------- > drivers/net/hyperv/rndis_filter.c | 11 ------ > 3 files changed, 11 insertions(+), 88 deletions(-) > > diff --git a/drivers/net/hyperv/hyperv_net.h > b/drivers/net/hyperv/hyperv_net.h > index 3958adade7eb..cce70ceba6d5 100644 > --- a/drivers/net/hyperv/hyperv_net.h > +++ b/drivers/net/hyperv/hyperv_net.h > @@ -748,11 +748,6 @@ struct netvsc_device { > > int ring_size; > > - /* The primary channel callback buffer */ > - unsigned char *cb_buffer; > - /* The sub channel callback buffer */ > - unsigned char *sub_cb_buf; > - > struct multi_send_data msd[VRSS_CHANNEL_MAX]; > u32 max_pkt; /* max number of pkt in one send, e.g. 8 */ > u32 pkt_align; /* alignment bytes, e.g. 8 */ > diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c > index e326e68f9f6d..7487498b663c 100644 > --- a/drivers/net/hyperv/netvsc.c > +++ b/drivers/net/hyperv/netvsc.c > @@ -67,12 +67,6 @@ static struct netvsc_device *alloc_net_device(void) > if (!net_device) > return NULL; > > - net_device->cb_buffer = kzalloc(NETVSC_PACKET_SIZE, > GFP_KERNEL); > - if (!net_device->cb_buffer) { > - kfree(net_device); > - return NULL; > - } > - > net_device->mrc[0].buf = vzalloc(NETVSC_RECVSLOT_MAX * > sizeof(struct recv_comp_data)); > > @@ -93,7 +87,6 @@ static void free_netvsc_device(struct netvsc_device > *nvdev) > for (i = 0; i < VRSS_CHANNEL_MAX; i++) > vfree(nvdev->mrc[i].buf); > > - kfree(nvdev->cb_buffer); > kfree(nvdev); > } > > @@ -584,7 +577,6 @@ void netvsc_device_remove(struct hv_device > *device) > vmbus_close(device->channel); > > /* Release all resources */ > - vfree(net_device->sub_cb_buf); > free_netvsc_device(net_device); > } > > @@ -1256,16 +1248,11 @@ static void netvsc_process_raw_pkt(struct > hv_device *device, > > void netvsc_channel_cb(void *context) > { > - int ret; > - struct vmbus_channel *channel = (struct vmbus_channel *)context; > + struct vmbus_channel *channel = context; > u16 q_idx = channel->offermsg.offer.sub_channel_index; > struct hv_device *device; > struct netvsc_device *net_device; > - u32 bytes_recvd; > - u64 request_id; > struct vmpacket_descriptor *desc; > - unsigned char *buffer; > - int bufferlen = NETVSC_PACKET_SIZE; > struct net_device *ndev; > bool need_to_commit = false; > > @@ -1277,65 +1264,19 @@ void netvsc_channel_cb(void *context) > net_device = get_inbound_net_device(device); > if (!net_device) > return; > + > ndev = hv_get_drvdata(device); > - buffer = get_per_channel_state(channel); > - > - do { > - desc = get_next_pkt_raw(channel); > - if (desc != NULL) { > - netvsc_process_raw_pkt(device, > - channel, > - net_device, > - ndev, > - desc->trans_id, > - desc); > - > - put_pkt_raw(channel, desc); > - need_to_commit = true; > - continue; > - } > - if (need_to_commit) { > - need_to_commit = false; > - commit_rd_index(channel); > - } > > - ret = vmbus_recvpacket_raw(channel, buffer, bufferlen, > - &bytes_recvd, &request_id); > - if (ret == 0) { > - if (bytes_recvd > 0) { > - desc = (struct vmpacket_descriptor *)buffer; > - netvsc_process_raw_pkt(device, > - channel, > - net_device, > - ndev, > - request_id, > - desc); > - } else { > - /* > - * We are done for this pass. > - */ > - break; > - } > - > - } else if (ret == -ENOBUFS) { > - if (bufferlen > NETVSC_PACKET_SIZE) > - kfree(buffer); > - /* Handle large packet */ > - buffer = kmalloc(bytes_recvd, GFP_ATOMIC); > - if (buffer == NULL) { > - /* Try again next time around */ > - netdev_err(ndev, > - "unable to allocate buffer of size " > - "(%d)!!\n", bytes_recvd); > - break; > - } > - > - bufferlen = bytes_recvd; > - } > - } while (1); > + while ((desc = get_next_pkt_raw(channel)) != NULL) { > + netvsc_process_raw_pkt(device, channel, net_device, > + ndev, desc->trans_id, desc); > > - if (bufferlen > NETVSC_PACKET_SIZE) > - kfree(buffer); > + put_pkt_raw(channel, desc); > + need_to_commit = true; > + } > + > + if (need_to_commit) > + commit_rd_index(channel); > > netvsc_chk_recv_comp(net_device, channel, q_idx); > } > @@ -1359,8 +1300,6 @@ int netvsc_device_add(struct hv_device *device, > void *additional_info) > > net_device->ring_size = ring_size; > > - set_per_channel_state(device->channel, net_device->cb_buffer); > - > /* Open the channel */ > ret = vmbus_open(device->channel, ring_size * PAGE_SIZE, > ring_size * PAGE_SIZE, NULL, 0, > diff --git a/drivers/net/hyperv/rndis_filter.c > b/drivers/net/hyperv/rndis_filter.c > index 8d90904e0e49..113c7f4d1590 100644 > --- a/drivers/net/hyperv/rndis_filter.c > +++ b/drivers/net/hyperv/rndis_filter.c > @@ -948,9 +948,6 @@ static void netvsc_sc_open(struct vmbus_channel > *new_sc) > if (chn_index >= nvscdev->num_chn) > return; > > - set_per_channel_state(new_sc, nvscdev->sub_cb_buf + (chn_index > - 1) * > - NETVSC_PACKET_SIZE); > - > nvscdev->mrc[chn_index].buf = vzalloc(NETVSC_RECVSLOT_MAX * > sizeof(struct recv_comp_data)); > > @@ -1099,14 +1096,6 @@ int rndis_filter_device_add(struct hv_device > *dev, > if (net_device->num_chn == 1) > goto out; > > - net_device->sub_cb_buf = vzalloc((net_device->num_chn - 1) * > - NETVSC_PACKET_SIZE); > - if (!net_device->sub_cb_buf) { > - net_device->num_chn = 1; > - dev_info(&dev->device, "No memory for subchannels.\n"); > - goto out; > - } > - > vmbus_set_sc_create_callback(dev->channel, netvsc_sc_open); > > init_packet = &net_device->channel_init_pkt;Stephen, This patch can go to the net-next tree and does not need to be part of this patch set. Thanks, K. Y> -- > 2.11.0
Reasonably Related Threads
- [PATCH 05/14] netvsc: remove no longer needed receive staging buffers
- [PATCH 05/14] netvsc: remove no longer needed receive staging buffers
- [PATCH 00/14] hyperv: vmbus related patches
- [PATCH 00/14] hyperv: vmbus related patches
- [PATCH 1/2] staging: hv: Remove dead code from netvsc.c