hj lee
2009-Apr-08 05:41 UTC
[Xen-devel] Deadload between sched_adjust() in schedule.c and compat_failsafe_callback in entry.S
Hi,

We have a deadlock in dom0 running x86_64 CentOS 5.2 when dom0 runs libvirtd and xentop together. The deadlock is easily reproducible. The dom0 has four vcpus assigned; libvirtd is running on vcpu#0 and xentop is running on vcpu#3.

vcpu#0 is processing XEN_DOMCTL_scheduler_op in domctl.c, which calls sched_adjust(). sched_adjust() calls vcpu_pause(v) for each vcpu in the domain, and vcpu_pause(v) calls vcpu_sleep_sync(v), where it waits for vcpu#3 to pause.

Meanwhile, vcpu#3 is executing vcpu_runstate_get() in schedule.c, called from XEN_SYSCTL_getdomaininfolist in sysctl.c. At the time of the deadlock, vcpu#3's exception RIP somehow points to [compat_failsafe_callback+86], which is cmpb $0x0,87987(%rip) # 0xffff828c8019ef00 <domctl_lock.10183>. I am not sure how vcpu#3 gets into this code, but I believe it is trying to take the domctl_lock spinlock. However, vcpu#0 already took domctl_lock when it entered do_domctl(), so the two vcpus are deadlocked.

Can anybody explain how and when compat_failsafe_callback in entry.S gets called? Why does it try to take domctl_lock?

Thanks in advance

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
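The wait cycle described above can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions: `model_lock`, `model_trylock`, `model_unlock`, and `pause_completes` are hypothetical names invented for illustration, not Xen functions, and the "back off and retry" variant shows one general way to break such a cycle, not necessarily what the eventual fix does. The model assumes that a vcpu spinning inside the hypervisor cannot be paused, while a vcpu that gives up and returns can.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of the scenario in this thread.
 * None of these names are Xen APIs. */

#define NO_OWNER (-1)

typedef struct {
    int owner;                 /* id of the vcpu holding the lock, or NO_OWNER */
} model_lock;

/* Take the lock only if it is free; never waits. */
static bool model_trylock(model_lock *l, int vcpu)
{
    if (l->owner != NO_OWNER)
        return false;
    l->owner = vcpu;
    return true;
}

static void model_unlock(model_lock *l)
{
    l->owner = NO_OWNER;
}

/*
 * Returns true if vcpu#0 manages to pause vcpu#3 within max_steps.
 *
 * spin_on_lock == true : vcpu#3 busy-waits for the lock inside the
 *                        hypervisor, so it can never be paused.
 * spin_on_lock == false: vcpu#3 backs off when the lock is busy and
 *                        leaves the hypervisor, so the pause can finish.
 */
bool pause_completes(bool spin_on_lock, int max_steps)
{
    model_lock domctl = { .owner = NO_OWNER };
    bool vcpu3_in_hypervisor = true;
    bool got;

    assert(max_steps > 0);

    /* vcpu#0 takes the lock first, as in the report. */
    got = model_trylock(&domctl, 0);
    assert(got);
    (void)got;

    for (int step = 0; step < max_steps; step++) {
        /* vcpu#3: wants the same lock. */
        if (vcpu3_in_hypervisor) {
            if (model_trylock(&domctl, 3)) {
                model_unlock(&domctl);
                vcpu3_in_hypervisor = false;
            } else if (!spin_on_lock) {
                vcpu3_in_hypervisor = false;   /* back off, retry later */
            }
            /* otherwise keep spinning while still in the hypervisor */
        }

        /* vcpu#0: the pause succeeds once vcpu#3 has left the hypervisor. */
        if (!vcpu3_in_hypervisor) {
            model_unlock(&domctl);
            return true;
        }
    }

    return false;   /* no progress within the step budget */
}
```

In this model, `pause_completes(true, n)` never succeeds for any step budget, which matches the reported hang, while `pause_completes(false, n)` succeeds on the first step because vcpu#3 no longer waits while holding up the pause.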
Keir Fraser
2009-Apr-08 06:03 UTC
Re: [Xen-devel] Deadload between sched_adjust() in schedule.c and compat_failsafe_callback in entry.S
This should be easily fixable. I'll look into it.

-- Keir
hj lee
2009-Apr-10 00:19 UTC
Re: [Xen-devel] Deadload between sched_adjust() in schedule.c and compat_failsafe_callback in entry.S
I have now figured out that this is not a C code error; it is a linker error!

Thanks
hj lee
2009-May-05 21:19 UTC
Re: [Xen-devel] Deadload between sched_adjust() in schedule.c and compat_failsafe_callback in entry.S
I was wrong about the linker error; there is no linker error. Changeset 19519 by Keir was committed to xen-unstable.hg to fix this deadlock.

Thank you very much

--
Dream with longterm vision!
kerdosa