Li, Xin B
2005-Dec-09 07:03 UTC
[Xen-devel] event delay issue on SMP machine when xen0 is SMP enabled
Hi Ian/Keir,

We found an event delay issue on SMP machines when xen0 is SMP enabled. In the worst case the event even seems to be lost. The phenomenon is: when we start a VMX domain on 64-bit SMP xen0 on an SMP system with 16 processors, in most cases the QEMU device model window pops up with a black screen and stops there, and "xm list" shows the VMX domain in the blocked state. If we use "info vmxiopage" on the QEMU command line, it reports that the IO request state is 1, i.e. STATE_IOREQ_READY. I added a printf just after select() in the QEMU DM main loop and found that the evtchn fd never becomes readable; that is to say, QEMU DM is never notified of the IO request from the VMX domain.

The root cause is:

1) QEMU DM does an evtchn interdomain bind, and on SMP xen0 on a big SMP machine a port number greater than 63 is allocated -- here it's 65. In the xen HV, this will notify vcpu0 of xen0:

    /*
     * We may have lost notifications on the remote unbound port. Fix that up
     * here by conservatively always setting a notification on the local port.
     */
    evtchn_set_pending(ld->vcpu[lchn->notify_vcpu_id], lport);

Then in evtchn_set_pending:

    /* These four operations must happen in strict order. */
    if ( !test_and_set_bit(port, &s->evtchn_pending[0]) &&
         !test_bit        (port, &s->evtchn_mask[0])    &&
         !test_and_set_bit(port / BITS_PER_LONG,
                           &v->vcpu_info->evtchn_pending_sel) &&
         !test_and_set_bit(0, &v->vcpu_info->evtchn_upcall_pending) )
    {
        evtchn_notify(v);
    }

If the port is not masked, bit 1 in evtchn_pending_sel of vcpu0 will be set; this is the typical case. But while the interdomain evtchn binding is being set up, this port is masked, so bit 1 in evtchn_pending_sel of vcpu0 is not set.

2) Just after returning from the xen HV, this port is unmasked in the xen0 kernel:

    static inline void unmask_evtchn(int port)
    {
        shared_info_t *s = HYPERVISOR_shared_info;
        vcpu_info_t *vcpu_info = &s->vcpu_info[smp_processor_id()];

        synch_clear_bit(port, &s->evtchn_mask[0]);

        /*
         * The following is basically the equivalent of 'hw_resend_irq'. Just
         * like a real IO-APIC we 'lose the interrupt edge' if the channel is
         * masked.
         */
        if (synch_test_bit(port, &s->evtchn_pending[0]) &&
            !synch_test_and_set_bit(port / BITS_PER_LONG,
                                    &vcpu_info->evtchn_pending_sel)) {
            vcpu_info->evtchn_upcall_pending = 1;
            if (!vcpu_info->evtchn_upcall_mask)
                force_evtchn_callback();
        }
    }

But this is done on the current vcpu, which in most cases is not vcpu0. So the event is notified on the current vcpu, not vcpu0, and bit 1 in evtchn_pending_sel of the current vcpu is set.

3) However, this event won't be handled on the current vcpu:

    asmlinkage void evtchn_do_upcall(struct pt_regs *regs)
    {
        unsigned long  l1, l2;
        unsigned int   l1i, l2i, port;
        int            irq, cpu = smp_processor_id();
        shared_info_t *s = HYPERVISOR_shared_info;
        vcpu_info_t   *vcpu_info = &s->vcpu_info[cpu];

        vcpu_info->evtchn_upcall_pending = 0;

        /* NB. No need for a barrier here -- XCHG is a barrier on x86. */
        l1 = xchg(&vcpu_info->evtchn_pending_sel, 0);
        while (l1 != 0) {
            l1i = __ffs(l1);
            l1 &= ~(1UL << l1i);

            while ((l2 = active_evtchns(cpu, s, l1i)) != 0) {
                l2i = __ffs(l2);
                l2 &= ~(1UL << l2i);

                port = (l1i * BITS_PER_LONG) + l2i;
                if ((irq = evtchn_to_irq[port]) != -1)
                    do_IRQ(irq, regs);
                else
                    evtchn_device_upcall(port);
            }
        }
    }

This is because on an SMP kernel active_evtchns is defined as:

    #define active_evtchns(cpu,sh,idx)          \
        ((sh)->evtchn_pending[idx] &            \
         cpu_evtchn_mask[cpu][idx] &            \
         ~(sh)->evtchn_mask[idx])

while cpu_evtchn_mask is initialized as:

    static void init_evtchn_cpu_bindings(void)
    {
        /* By default all event channels notify CPU#0. */
        memset(cpu_evtchn, 0, sizeof(cpu_evtchn));
        memset(cpu_evtchn_mask[0], ~0, sizeof(cpu_evtchn_mask[0]));
    }

So a vcpu other than vcpu0 won't handle the event, even though it sees the event pending, and it won't be delivered to the evtchn device.
Only in some cases, when bit 1 in evtchn_pending_sel of vcpu0 happens to be set because some other port in the same selector word is notified, will this event be delivered. If we are unlucky and no other port notification sets bit 1 in evtchn_pending_sel of vcpu0 by chance, we get stuck there and the event seems to be lost, though it may still be delivered to the evtchn device at some unknown time :-(

The reason we didn't meet this problem before is that on most machines we got an event port smaller than 64, and bit 0 in evtchn_pending_sel of vcpu0 is very likely to be set anyway, since that is a hot event port area.

We do need to fix this issue because it is also a common issue in complex environments. It is actually a performance issue as well, even when the event channel port is < 64, since delivery depends on other events to get notified.

BTW, why won't a vcpu other than vcpu0 handle events by default?

thanks
-Xin
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2005-Dec-09 08:28 UTC
[Xen-devel] Re: event delay issue on SMP machine when xen0 is SMP enabled
On 9 Dec 2005, at 07:03, Li, Xin B wrote:

> We do need fix this issue because this is also a common issue in
> complex environment. Actually that will also be a performance issue,
> even when event channel port is <64 since it depends on other events
> to get notified.

Thanks for working out what's going on. A patch should be quite easy.

> BTW, why vcpu other than vcpu0 won't handle event by default?

We could allow any vcpu to process any pending notifications. It would
mean vcpus outside an irq's affinity set could end up processing an
interrupt and I'm not sure if that's a good thing. It would actually
slightly simplify evtchn.c though (no need to apply a per-cpu mask to
the event-channel port array).

 -- Keir
Keir Fraser
2005-Dec-09 08:31 UTC
Re: [Xen-devel] Re: event delay issue on SMP machine when xen0 is SMP enabled
On 9 Dec 2005, at 08:28, Keir Fraser wrote:

>> BTW, why vcpu other than vcpu0 won't handle event by default?
>
> We could allow any vcpu to process any pending notifications. It would
> mean vcpus outside an irq's affinity set could end up processing an
> interrupt and I'm not sure if that's a good thing. It would actually
> slightly simplify evtchn.c though (no need to apply a per-cpu mask to
> the event-channel port array).

You always remember a good reason not to do something after sending the
email. :-)

IPIs and VIRQs must be processed by their home vcpu. Hence we do need
to limit evtchn processing to the current bound vcpu, and that leads
(currently) to the problem you see.

So I think the right fix is just to fix unmask_evtchn(). Maybe you guys
want to send a patch to move it into evtchn.c and fix it to send to
cpu_from_evtchn(evtchn)?

 -- Keir
Li, Xin B
2005-Dec-09 08:49 UTC
[Xen-devel] RE: event delay issue on SMP machine when xen0 is SMP enabled
>> BTW, why vcpu other than vcpu0 won't handle event by default?
>
> We could allow any vcpu to process any pending notifications. It would
> mean vcpus outside an irq's affinity set could end up processing an
> interrupt and I'm not sure if that's a good thing. It would actually
> slightly simplify evtchn.c though (no need to apply a per-cpu mask to
> the event-channel port array).

I have a question on the following function: why is l1 updated only
once, while l2 is re-read in each loop iteration? I think they should be
handled in the same way. In any case, in the current code,
"l2 &= ~(1UL << l2i);" is not needed.

Thanks
-Xin

    /* NB. Interrupts are disabled on entry. */
    asmlinkage void evtchn_do_upcall(struct pt_regs *regs)
    {
        unsigned long  l1, l2;
        unsigned int   l1i, l2i, port;
        int            irq, cpu = smp_processor_id();
        shared_info_t *s = HYPERVISOR_shared_info;
        vcpu_info_t   *vcpu_info = &s->vcpu_info[cpu];

        vcpu_info->evtchn_upcall_pending = 0;

        /* NB. No need for a barrier here -- XCHG is a barrier on x86. */
        l1 = xchg(&vcpu_info->evtchn_pending_sel, 0);
        while (l1 != 0) {
            l1i = __ffs(l1);
            l1 &= ~(1UL << l1i);

            while ((l2 = active_evtchns(cpu, s, l1i)) != 0) {
                l2i = __ffs(l2);
                l2 &= ~(1UL << l2i);

                port = (l1i * BITS_PER_LONG) + l2i;
                if ((irq = evtchn_to_irq[port]) != -1)
                    do_IRQ(irq, regs);
                else
                    evtchn_device_upcall(port);
            }
        }
    }
Li, Xin B
2005-Dec-09 09:08 UTC
RE: [Xen-devel] Re: event delay issue on SMP machine when xen0 is SMP enabled
>>> BTW, why vcpu other than vcpu0 won''t handle event by default?
>>
>> We could allow any vcpu to process any pending notifications. It
>> would mean vcpus outside an irq's affinity set could end up
>> processing an interrupt and I'm not sure if that's a good thing. It
>> would actually slightly simplify evtchn.c though (no need to apply a
>> per-cpu mask to the event-channel port array).
>
> You always remember a good reason not to do something after sending
> the email. :-)
>
> IPIs and VIRQs must be processed by their home vcpu. Hence we do need
> to limit evtchn processing to the current bound vcpu, and that leads
> (currently) to the problem you see.
>
> So I think the right fix is just to fix unmask_evtchn(). Maybe you
> guys want to send a patch to move it into evtchn.c and fix it to send
> to cpu_from_evtchn(evtchn)?

Since the evtchn code is critical to xen, we are being very careful
about any possible fixes :-)

Why not turn on cpu_evtchn_mask for all cpus by default, and turn it
off on the other cpus when bind_evtchn_to_cpu is called?

Thanks
-Xin