The coredumps that I'm getting from a domain that crashes without calling domain_crash are not terribly useful. These are the register contents:

(gdb) i r
eax            0x6000       0x6000
ecx            0xfbc06000   0xfbc06000
edx            0xfbc06040   0xfbc06040
ebx            0x4          0x4
esp            0xc02a9008   0xc02a9008
ebp            0xc02aed78   0xc02aed78
esi            0x4          0x4
edi            0x2          0x2
eip            0x0          0x0
eflags         0x10216      0x10216
cs             0x819        0x819
ss             0x821        0x821
ds             0x821        0x821
es             0x821        0x821
fs             0x821        0x821
gs             0x821        0x821

Is it really necessary for xen to overwrite part of a domain's context on a fatal trap?

-Kip

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2005-May-07 09:15 UTC
Re: [Xen-devel] garbage registers when domain killed by xen
On 7 May 2005, at 02:26, Kip Macy wrote:

> Is it really necessary for xen to overwrite part of a domain's context
> on a fatal trap?

Which ones are overwritten -- ecx/edx? I'm not sure how that could happen but it's clearly a bug rather than intentional.

 -- Keir
Kip Macy
2005-May-07 14:10 UTC
Re: [Xen-devel] garbage registers when domain killed by xen
I'm not sure about ecx/edx but I know eip is bad. There are legitimate cases of trying to call a null function pointer, but I know from the context that that isn't the case. It appears to be a page fault - but I don't have trap handlers installed yet.

-Kip
Kip Macy
2005-May-07 14:23 UTC
Re: [Xen-devel] garbage registers when domain killed by xen
There is an odd relationship between hitting the send button and epiphanies.

show_guest_stack says the eip is 0 and the stack has eflags, eip, and CS over and over and over again. That would indicate that I'm running off my stack by trapping over and over again. However, I don't have traps or callback handlers installed. What may be happening is xen setting up a trapframe and then jumping to the failsafe callback - over and over again, because jumping to the failsafe callback itself causes a page fault. In this case the eip is legitimately 0 - not because of me, but because xen isn't checking that I've actually set my failsafe_callback. I'll go look at FLT14 again to see if I'm on the right track.

-Kip
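The symptom described above (the same eflags, eip and CS words repeating down the stack) can be checked mechanically on a stack dump. The following is only a rough sketch: count_repeated_frames is a hypothetical helper, not part of Xen or GDB, and the three-word frame size is an assumption about what the bounce frame looks like when no ring change or error code is involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of Xen or GDB.
 *
 * Treat the first frame_words words of a stack dump as one trap frame
 * and count how many times that exact frame repeats back to back,
 * starting at the lowest address. A large count suggests the guest
 * kept faulting at the same address until it ran off its stack.
 */
size_t count_repeated_frames(const uint32_t *stack, size_t nwords,
                             size_t frame_words)
{
    assert(stack != NULL && frame_words > 0);

    /* Not even one complete frame in the dump. */
    if (nwords < frame_words)
        return 0;

    size_t frames = 1; /* the reference frame itself */
    for (size_t off = frame_words; off + frame_words <= nwords;
         off += frame_words) {
        for (size_t i = 0; i < frame_words; i++) {
            if (stack[off + i] != stack[i])
                return frames; /* first word that breaks the pattern */
        }
        frames++;
    }
    return frames;
}
```

With the register values from the first message, a repeating frame would presumably hold eip 0x0, cs 0x819 and eflags 0x10216; the actual stack contents were not posted, so that layout is illustrative only.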
Keir Fraser
2005-May-07 14:56 UTC
Re: [Xen-devel] garbage registers when domain killed by xen
It's probably repeatedly reentering your p-f handler at address 0. This will not cause the iret in Xen to fault (the fault will appear to occur in ring 1, address 0), and so the failsafe handler will not be triggered.

Yes, we should just domain_crash() if we see a callback to address 0. Even more helpful would be some extra crash context with an explanation (some way of stating it was a virtual 'double fault' of some kind), but I don't know how you would represent that in a standard core dump file.

 -- Keir
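The check suggested above (crash the domain rather than bounce to a handler at address 0) might look roughly like the sketch below. Everything here is a hypothetical stand-in: struct trap_entry, enum bounce_action and check_trap_bounce are invented for illustration and do not match the real Xen structures or code paths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a guest-registered handler (selector:offset).
 * The real Xen data structures differ. */
struct trap_entry {
    uint16_t cs;
    uint32_t address;
};

/* What the hypervisor should do with a pending trap. */
enum bounce_action {
    BOUNCE_TO_GUEST, /* build the frame and enter the guest handler */
    CRASH_DOMAIN     /* no usable handler: call domain_crash() instead */
};

/*
 * Decide whether a trap can be delivered to the guest. A handler at
 * address 0 is treated as "never installed", so the domain is crashed
 * with its original context intact instead of faulting at eip 0 again
 * and again.
 */
enum bounce_action check_trap_bounce(const struct trap_entry *handler)
{
    assert(handler != NULL);

    if (handler->address == 0)
        return CRASH_DOMAIN;

    return BOUNCE_TO_GUEST;
}
```

The point of doing the check before the frame is built is that the registers saved in the core dump would then still describe the original fault, not the secondary one at address 0.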
Kip Macy
2005-May-07 16:02 UTC
Re: [Xen-devel] garbage registers when domain killed by xen
On 5/7/05, Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote:
> It's probably repeatedly reentering your p-f handler at address 0.

Sounds about right.

> Yes, we should just domain_crash() if we see a callback to address 0.

Your patch or mine? ;-)

> Even more helpful would be some extra crash context with an explanation
> (some way of stating it was a virtual 'double fault' of some kind), but
> I don't know how you would represent that in a standard core dump file.

One could add a set of flags to the dump. They wouldn't be visible to GDB, but we could have a core reading utility that could see them and spit out some basic info about the crash. GDB wouldn't need them per se, as the crash would look just like a SIGSEGV crash in an application.

-Kip
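As a rough illustration of the flags idea, a core reading utility could decode a flags word stored alongside the dump. The flag names, bit values and print_crash_flags below are all assumptions made up for this sketch; the thread does not define any such format, and where the word would live in the core file is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical crash-reason bits; not an existing Xen dump format. */
#define CRASH_FLAG_FATAL_TRAP        (1u << 0)
#define CRASH_FLAG_NULL_CALLBACK     (1u << 1)
#define CRASH_FLAG_VIRT_DOUBLE_FAULT (1u << 2)

static const struct {
    uint32_t flag;
    const char *text;
} crash_flag_names[] = {
    { CRASH_FLAG_FATAL_TRAP,        "domain killed on a fatal trap" },
    { CRASH_FLAG_NULL_CALLBACK,     "callback or trap handler at address 0" },
    { CRASH_FLAG_VIRT_DOUBLE_FAULT, "virtual double fault" },
};

/*
 * Print one line per known flag set in flags and return how many
 * were printed. Unknown bits are ignored.
 */
int print_crash_flags(uint32_t flags, FILE *out)
{
    assert(out != NULL);

    int printed = 0;
    size_t n = sizeof(crash_flag_names) / sizeof(crash_flag_names[0]);
    for (size_t i = 0; i < n; i++) {
        if (flags & crash_flag_names[i].flag) {
            fprintf(out, "crash flag: %s\n", crash_flag_names[i].text);
            printed++;
        }
    }
    return printed;
}
```

GDB would ignore the extra word and show the crash like any other segfault, while the separate utility could call print_crash_flags(flags, stdout) to explain why the domain was killed.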