Hi,

Does anybody know the purpose of this function (xen_evtchn_do_upcall)? When I run a user-level application that receives TCP traffic and the softirq for eth0 on the same CPU core, everything is OK. But if I run them on 2 different cores, xen_evtchn_do_upcall() shows up (maybe when local_bh_disable() or local_bh_enable() is called) inside the __inet_lookup_established() routine, which then takes longer than in the first scenario. Is this due to synchronization between process context and softirq context? Thanks for any reply.

 1)               |  __inet_lookup_established() {
 1)               |    xen_evtchn_do_upcall() {
 1)   0.054 us    |      exit_idle();
 1)               |      irq_enter() {
 1)               |        rcu_irq_enter() {
 1)   0.102 us    |          rcu_exit_nohz();
 1)   0.431 us    |        }
 1)   0.064 us    |        idle_cpu();
 1)   1.152 us    |      }
 1)               |      __xen_evtchn_do_upcall() {
 1)   0.119 us    |        irq_to_desc();
 1)               |        handle_edge_irq() {
 1)   0.107 us    |          _raw_spin_lock();
 1)               |          ack_dynirq() {
 1)               |            evtchn_from_irq() {
 1)               |              info_for_irq() {
 1)               |                irq_get_irq_data() {
 1)   0.052 us    |                  irq_to_desc();
 1)   0.418 us    |                }
 1)   0.782 us    |              }
 1)   1.135 us    |            }
 1)   0.049 us    |            irq_move_irq();
 1)   1.800 us    |          }
 1)               |          handle_irq_event() {
 1)   0.161 us    |            _raw_spin_unlock();
 1)               |            handle_irq_event_percpu() {
 1)               |              xennet_interrupt() {
 1)   0.125 us    |                _raw_spin_lock_irqsave();
 1)               |                xennet_tx_buf_gc() {
 1)   0.079 us    |                  gnttab_query_foreign_access();
 1)   0.050 us    |                  gnttab_end_foreign_access_ref();
 1)   0.069 us    |                  gnttab_release_grant_reference();
 1)               |                  dev_kfree_skb_irq() {
 1)   0.055 us    |                    raise_softirq_irqoff();
 1)   0.472 us    |                  }
 1)   0.049 us    |                  gnttab_query_foreign_access();
 1)   0.058 us    |                  gnttab_end_foreign_access_ref();
 1)   0.058 us    |                  gnttab_release_grant_reference();
 1)               |                  dev_kfree_skb_irq() {
 1)   0.050 us    |                    raise_softirq_irqoff();
 1)   0.456 us    |                  }
 1)   3.714 us    |                }
 1)   0.102 us    |                _raw_spin_unlock_irqrestore();
 1)   4.857 us    |              }
 1)   0.061 us    |              note_interrupt();
 1)   5.571 us    |            }
 1)   0.054 us    |            _raw_spin_lock();
 1)   6.707 us    |          }
 1)   0.083 us    |          _raw_spin_unlock();
 1) + 10.083 us   |        }
 1) + 10.985 us   |      }
 1)               |      irq_exit() {
 1)               |        rcu_irq_exit() {
 1)   0.087 us    |          rcu_enter_nohz();
 1)   0.429 us    |        }
 1)   0.049 us    |        idle_cpu();
 1)   1.088 us    |      }
 1) + 14.551 us   |    }
 1)   0.191 us    |  } /* __inet_lookup_established */

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
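To put a number on how much time the upcall adds, the function_graph output above can be summed per function. The following is only a minimal sketch: the helper name call_durations and the shortened SAMPLE text are assumptions made for illustration, and the parser only handles the line layout shown in this thread (a duration column, a "|" separator, then the call).

```python
import re
from collections import defaultdict

# Matches the duration column, e.g. "0.054 us" or "+ 14.551 us".
DURATION = re.compile(r"([\d.]+)\s+us")

# Shortened excerpt of the trace in this thread (illustration only).
SAMPLE = """\
 1)               |  __inet_lookup_established() {
 1)               |    xen_evtchn_do_upcall() {
 1)   0.054 us    |      exit_idle();
 1) + 14.551 us   |    }
 1)   0.191 us    |  } /* __inet_lookup_established */
"""


def call_durations(trace_text):
    """Map each function name to the durations (in us) reported for its calls."""
    durations = defaultdict(list)
    stack = []  # functions whose closing brace has not been seen yet
    for line in trace_text.splitlines():
        if "|" not in line:
            continue
        left, right = line.split("|", 1)
        body = right.strip()
        match = DURATION.search(left)
        elapsed = float(match.group(1)) if match else None
        if body.endswith("{"):
            # Function entry: remember the name until its "}" line arrives.
            stack.append(body[:-1].strip().rstrip("()"))
        elif body.startswith("}"):
            # Function exit: the duration on this line belongs to the open call.
            if stack:
                name = stack.pop()
                if elapsed is not None:
                    durations[name].append(elapsed)
        elif body.endswith(";") and elapsed is not None:
            # Leaf call: entry and exit are reported on a single line.
            durations[body[:-1].strip().rstrip("()")].append(elapsed)
    return durations


if __name__ == "__main__":
    for name, values in call_durations(SAMPLE).items():
        print(f"{name}: {len(values)} call(s), {sum(values):.3f} us total")
```

Running it over a full capture from both setups (same core and different cores) would show how often xen_evtchn_do_upcall appears and how much time it accounts for in each case.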
On Mon, 2012-10-22 at 02:51 +0100, David Xu wrote:
> Hi,
>
> Is anybody know the purpose of this method (xen_evtchn_do_upcall)?

It is the callback used to inject event channel events (i.e. IRQs) into the guest. You would expect to see it at the base of any stack trace taken from interrupt context.

> When I run a user level application involved in TCP receiving and the
> SoftIRQ for eth0 on the same CPU core, everything is OK. But if I run
> them on 2 different cores, there will be xen_evtchn_do_upcall()
> existing (maybe when the local_bh_disable() or local_bh_enable() is
> called)

It would not be unusual to get an interrupt immediately after re-enabling interrupts.

> in __inet_lookup_established() routine which costs longer time than
> the first scenario. Is it due to the synchronization issue between
> process context and softirq context? Thanks for any reply.
Hi Ian,

Thanks for your reply. I did an experiment as follows. I assigned 2 vCPUs to a VM: one vCPU (vCPU0) was pinned to a physical CPU shared with several other VMs, and the other (vCPU1) was pinned to an idle physical CPU used by this VM only. Then in the guest OS I ran an iperf server to measure its TCP receive throughput. To reduce the receive delay caused by vCPU scheduling, I pinned the NIC's IRQ context to vCPU1 and ran the iperf server on vCPU0. This method works well for UDP but does not work for TCP. I traced the functions involved with ftrace and got the following results, which contain lots of xen_evtchn_do_upcall routines. What is the meaning of this sequence (xen_evtchn_do_upcall => handle_irq_event => xennet_tx_buf_gc => gnttab_query_foreign_access)?

 1)               |  tcp_v4_rcv() {
 1)   0.087 us    |    __inet_lookup_established();
 1)               |    sk_filter() {
 1)               |      security_sock_rcv_skb() {
 1)   0.049 us    |        cap_socket_sock_rcv_skb();
 1)   0.374 us    |      }
 1)   0.708 us    |    }
 1)               |    _raw_spin_lock() {
 1)               |      xen_evtchn_do_upcall() {
 1)   0.051 us    |        exit_idle();
 1)               |        irq_enter() {
 1)               |          rcu_irq_enter() {
 1)   0.094 us    |            rcu_exit_nohz();
 1)   0.432 us    |          }
 1)   0.055 us    |          idle_cpu();
 1)   1.166 us    |        }
 1)               |        __xen_evtchn_do_upcall() {
 1)   0.120 us    |          irq_to_desc();
 1)               |          handle_edge_irq() {
 1)   0.103 us    |            _raw_spin_lock();
 1)               |            ack_dynirq() {
 1)               |              evtchn_from_irq() {
 1)               |                info_for_irq() {
 1)               |                  irq_get_irq_data() {
 1)   0.051 us    |                    irq_to_desc();
 1)   0.400 us    |                  }
 1)   0.746 us    |                }
 1)   1.074 us    |              }
 1)   0.050 us    |              irq_move_irq();
 1)   1.767 us    |            }
 1)               |            handle_irq_event() {
 1)   0.164 us    |              _raw_spin_unlock();
 1)               |              handle_irq_event_percpu() {
 1)               |                xennet_interrupt() {
 1)   0.125 us    |                  _raw_spin_lock_irqsave();
 1)               |                  xennet_tx_buf_gc() {
 1)   0.082 us    |                    gnttab_query_foreign_access();
 1)   0.050 us    |                    gnttab_end_foreign_access_ref();
 1)   0.070 us    |                    gnttab_release_grant_reference();
 1)               |                    dev_kfree_skb_irq() {
 1)   0.061 us    |                      raise_softirq_irqoff();
 1)   0.460 us    |                    }
 1)   0.058 us    |                    gnttab_query_foreign_access();
 1)   0.050 us    |                    gnttab_end_foreign_access_ref();
 1)   0.050 us    |                    gnttab_release_grant_reference();
 1)               |                    dev_kfree_skb_irq() {
 1)   0.059 us    |                      raise_softirq_irqoff();
 1)   0.440 us    |                    }
 1)   3.710 us    |                  }
 1)   0.092 us    |                  _raw_spin_unlock_irqrestore();
 1)   4.845 us    |                }
 1)   0.075 us    |                note_interrupt();
 1)   5.567 us    |              }
 1)   0.055 us    |              _raw_spin_lock();
 1)   6.889 us    |            }
 1)   0.080 us    |            _raw_spin_unlock();
 1) + 10.081 us   |          }
 1) + 10.965 us   |        }
 1)               |        irq_exit() {
 1)               |          rcu_irq_exit() {
 1)   0.086 us    |            rcu_enter_nohz();
 1)   0.424 us    |          }
 1)   0.049 us    |          idle_cpu();
 1)   1.094 us    |        }
 1) + 14.555 us   |      }
 1)   0.120 us    |    } /* _raw_spin_lock */
 1)               |    __wake_up_sync_key() {
 1)   0.099 us    |      _raw_spin_lock_irqsave();
 1)               |      __wake_up_common() {
 1)               |        autoremove_wake_function() {
 1)               |          default_wake_function() {
 1)               |            try_to_wake_up() {
 1)   0.103 us    |              _raw_spin_lock_irqsave();
 1)   0.078 us    |              task_waking_fair();
 1)   0.102 us    |              select_task_rq_fair();
 1)               |              xen_smp_send_reschedule() {
 1)               |                xen_send_IPI_one() {
 1)               |                  notify_remote_via_irq() {
 1)               |                    evtchn_from_irq() {
 1)               |                      info_for_irq() {
 1)               |                        irq_get_irq_data() {
 1)   0.067 us    |                          irq_to_desc();
 1)   0.396 us    |                        }
 1)   0.727 us    |                      }
 1)   1.055 us    |                    }
 1)   1.699 us    |                  }
 1)   2.048 us    |                }
 1)   2.407 us    |              }
 1)   0.066 us    |              ttwu_stat();
 1)   0.114 us    |              _raw_spin_unlock_irqrestore();
 1)   4.941 us    |            }
 1)   5.294 us    |          }
 1)   5.645 us    |        }
 1)   6.023 us    |      }
 1)   0.094 us    |      _raw_spin_unlock_irqrestore();
 1)   7.156 us    |    }
 1)   0.058 us    |    dst_metric();
 1)               |    inet_csk_reset_xmit_timer.constprop.34() {
 1)               |      sk_reset_timer() {
 1)               |        mod_timer() {
 1)               |          lock_timer_base.isra.30() {
 1)   0.099 us    |            _raw_spin_lock_irqsave();
 1)   0.436 us    |          }
 1)   0.049 us    |          idle_cpu();
 1)   0.116 us    |          _raw_spin_unlock();
 1)   0.074 us    |          _raw_spin_lock();
 1)   0.072 us    |          internal_add_timer();
 1)   0.103 us    |          _raw_spin_unlock_irqrestore();
 1)   2.673 us    |        }
 1)   3.039 us    |      }
 1)   3.397 us    |    }
 1)   0.082 us    |    _raw_spin_unlock();
 1)   0.061 us    |    sock_put();
 1) + 48.704 us   |  }

When I run both the process context of the application (e.g. the iperf server) and the IRQ context on vCPU1, which is the "fast" core, no xen_evtchn_do_upcall routine is found:

 1)               |  tcp_v4_rcv() {
 1)   0.081 us    |    __inet_lookup_established();
 1)               |    sk_filter() {
 1)               |      security_sock_rcv_skb() {
 1)   0.059 us    |        cap_socket_sock_rcv_skb();
 1)   0.542 us    |      }
 1)   0.875 us    |    }
 1)   0.060 us    |    _raw_spin_lock();
 1)   0.117 us    |    _raw_spin_unlock();
 1)   0.053 us    |    sock_put();
 1)   2.703 us    |  }

Do you think these xen_evtchn_do_upcall routines are due to the synchronization between process context and softirq context? Thanks.

Regards,
Cong
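For readers who want to reproduce the pinning described in this message, here is a minimal sketch of one way to do it from inside a Linux guest. The helper names (cpu_mask, pin_irq, pin_current_process) are made up for illustration. The IRQ number is not given anywhere in the thread and would have to be looked up in /proc/interrupts. Writing the affinity file needs root, and whether a given event-channel IRQ honours the setting is an assumption here.

```python
import os


def cpu_mask(cpus):
    """Return the hex bitmask string used by /proc/irq/<N>/smp_affinity.

    Bit i set means CPU i may handle the interrupt. This simple form is only
    meant for CPU numbers below 32 (larger masks are written in
    comma-separated 32-bit groups, which is not handled here).
    """
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def pin_irq(irq, cpus):
    """Restrict IRQ `irq` to the given CPUs (requires root)."""
    with open(f"/proc/irq/{irq}/smp_affinity", "w") as f:
        f.write(cpu_mask(cpus))


def pin_current_process(cpus):
    """Restrict the calling process to the given CPUs (Linux only)."""
    os.sched_setaffinity(0, set(cpus))


# Layout from the experiment above, with a placeholder IRQ number:
#   pin_irq(EAT0_IRQ_PLACEHOLDER, [1])   # NIC interrupt on vCPU1
#   pin_current_process([0])             # receiver process on vCPU0
```

The mask for vCPU1 alone is "2" and for vCPU0 alone is "1", so swapping the two arguments gives the reverse placement for comparison.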
On Wed, 2012-10-24 at 14:39 +0100, David Xu wrote:
> Hi Ian,
>
> Thanks for your reply. I did a experiment as follows,
> I assigned 2 vCPU to a VM, one vCPU (vCPU0) was pinned to a physical
> CPU shared with several other VMs and the other vCPU (vCPU1) was
> pinned to an idle physical CPU occupied by this VM only. Then in Guest
> OS I run iperf server to measure its TCP receiving throughput. In
> order to shrink the receiving delay caused by vCPU scheduling, I pin
> the IRQ context of NIC to vCPU1 and run iperf server on the vCPU0.
> This method works well for UDP, but does not work for TCP. I track the
> involved function by ftrace and get the following results which
> contains lots of xen_evtchn_do_upcall routine. What's the meaning of
> this process (xen_evtchn_do_upcall => handle_irq_event =>
> xennet_tx_buf_gc => gnttab_query_foreign_access)?

Have you looked at the code for any of those functions?

If you had done you'd find it is pretty obviously an interrupt being delivered to the network device and the associated work to satisfy that interrupt.

It doesn't seem that surprising that an iperf test should involve lots of network interrupts.

It's not entirely clear to me what you are expecting to find and/or what you are trying to prove.

> When I run both process context of application ( e.g. iperf server )
> and IRQ context on vCPU1 which is the "fast" core, no
> any xen_evtchn_do_upcall routine found.

Perhaps on the fast core NAPI is able to kick in and therefore the NIC becomes polled instead of interrupt driven?

Ian.
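One way to check the NAPI suggestion is to compare per-CPU interrupt counts from /proc/interrupts before and after an iperf run in each configuration: if the NIC is mostly polled, its count should grow far more slowly. The sketch below assumes the usual layout of that file (a header row of CPU names, then one row per IRQ). The function names and the sample snapshots, including the IRQ number 74 and the counts, are invented for illustration and are not measurements from this thread.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {irq_label: [count per CPU]}."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        per_cpu = rest.split()[:ncpus]
        # Skip summary rows such as "ERR:" that lack one count per CPU.
        if len(per_cpu) == ncpus and all(f.isdigit() for f in per_cpu):
            counts[label.strip()] = [int(f) for f in per_cpu]
    return counts


def delta(before, after):
    """Per-CPU increase of every IRQ present in both snapshots."""
    return {
        irq: [a - b for a, b in zip(after[irq], before[irq])]
        for irq in after
        if irq in before
    }


# Hypothetical snapshots taken around a test run (values are made up).
BEFORE = """\
           CPU0       CPU1
 74:       1000          0   xen-dyn-event     eth0
"""
AFTER = """\
           CPU0       CPU1
 74:       1000       4200   xen-dyn-event     eth0
"""

if __name__ == "__main__":
    print(delta(parse_interrupts(BEFORE), parse_interrupts(AFTER)))
```

On a real system the two snapshots would come from reading /proc/interrupts just before and just after the iperf run.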
Hi Ian,

I am trying to improve TCP/UDP performance in a VM. Due to vCPU scheduling on a platform where pCPUs are shared, the TCP receive delay is significant, which hurts TCP/UDP throughput. So I want to offload the softirq context to another, idle pCPU, which I call the fast-tick CPU.

The packet receive process works like this: the IRQ routine keeps picking packets from the ring buffer and putting them into the TCP receive buffers in the kernel (receive_queue, prequeue, backlog_queue), whether or not the user process, which sits on another CPU shared with other VMs, is running. Once the vCPU holding the user process gets scheduled, the user process fetches all packets from the kernel receive buffers, which can improve throughput. This works well for UDP but unfortunately does not currently work for TCP.

I found that these xen_evtchn_do_upcall routines occur when the IRQ context tries to take the spinlock on the socket (of course, they may happen in other paths too). If this spinlock is held by process context, the IRQ context has to spin on it and cannot put any packet into the receive buffer in time. So I suspect these xen_evtchn_do_upcall routines are due to synchronization between process context and IRQ context. Since I run process context and IRQ context on 2 different vCPUs, when they try to take the spinlock on the same socket there will be interrupts between the 2 vCPUs, which are implemented by events in Xen.

If there is any error in my description, please correct me. Thanks.

Regards,
Cong
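The interaction between the softirq receive path and the process that owns the socket can be illustrated with a toy model. It is deliberately simplified and is not kernel code. It reflects my understanding that in Linux the softirq side takes the socket spinlock only briefly, and if the process currently owns the socket (the sock_owned_by_user() check) it appends the packet to the backlog instead of spinning until the process is done. The backlog is then drained when the process releases the socket. The class and method names below are hypothetical and only loosely mirror the kernel's lock_sock() and release_sock(), and the prequeue is left out.

```python
from collections import deque


class ToySocket:
    """Simplified model of the receive_queue/backlog hand-off (no prequeue)."""

    def __init__(self):
        self.owned_by_user = False    # True while process context owns the socket
        self.receive_queue = deque()  # packets ready for the reader
        self.backlog = deque()        # packets parked while the socket is owned

    def softirq_rcv(self, skb):
        """Softirq side: never waits for the owner, parks the packet instead."""
        if self.owned_by_user:
            self.backlog.append(skb)
        else:
            self.receive_queue.append(skb)

    def lock_sock(self):
        """Process side: mark the socket as owned."""
        self.owned_by_user = True

    def release_sock(self):
        """Process side: drain the backlog in arrival order, then give up ownership."""
        while self.backlog:
            self.receive_queue.append(self.backlog.popleft())
        self.owned_by_user = False


if __name__ == "__main__":
    sk = ToySocket()
    sk.softirq_rcv("pkt1")   # socket not owned: goes straight to receive_queue
    sk.lock_sock()
    sk.softirq_rcv("pkt2")   # socket owned: parked on the backlog
    sk.release_sock()        # backlog drained into receive_queue
    print(list(sk.receive_queue))
```

If this picture is right, long spinning on the socket lock would be unexpected, and the upcalls seen under _raw_spin_lock() in the trace may simply be interrupts that happened to arrive at that point.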