Dan Smith
2005-Sep-30 16:54 UTC
[Xen-devel] [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}'
This is a resend of my stale state fix, which is yet unapplied. If there are issues, please let me know.

Note that this fixes the issue poked by xm-test, as shown in the following snippet of David's latest FC3pae.report:

> FAIL: 01_shutdown_basic_pos
> I had to run an xm list to update xend state!

Signed-off-by: Dan Smith <danms@us.ibm.com>

--
Dan Smith
IBM Linux Technology Center
Open Hypervisor Team
email: danms@us.ibm.com

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Ewan Mellor
2005-Oct-01 10:54 UTC
Re: [Xen-devel] [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}'
On Fri, Sep 30, 2005 at 09:54:55AM -0700, Dan Smith wrote:

> This is a resend of my stale state fix, which is yet unapplied. If
> there are issues, please let me know.
>
> Note that this fixes the issue poked by xm-test, as shown in the
> following snippet of David's latest FC3pae.report:
>
> > FAIL: 01_shutdown_basic_pos
> > I had to run an xm list to update xend state!

Sorry Dan, I didn't mean to sit on this patch. The thing is, it solves the problem by making sure that SrvDomain can cope with stale domains being returned by XendDomain, but I _really_ don't want XendDomain to be returning stale information in the first place.

I've been trying to decide how easy it would be to fix the underlying problem -- if it's going to take a long time, then I'll apply your patch as a workaround, but I hope to solve the problem more definitively.

If I haven't fixed it by Monday, I'll apply your patch.

Thanks,

Ewan.
Dan Smith
2005-Oct-01 15:33 UTC
Re: [Xen-devel] [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}'
EM> The thing is, it solves the problem by making sure that SrvDomain
EM> can cope with stale domains being returned by XendDomain, but I
EM> _really_ don't want XendDomain to be returning stale information
EM> in the first place.

I agree. I recently submitted a patch that would cause XendDomainInfo to destroy itself whenever it realized that it was stale. Christian didn't like the idea of random places modifying the Xend state. So, in this patch, I just handled the stale state instead of returning the bogus information, and without modifying the state itself.

EM> I've been trying to decide how easy it would be to fix the
EM> underlying problem -- if it's going to take a long time, then I'll
EM> apply your patch as a workaround, but I hope to solve the problem
EM> more definitively.

I think the solution (or best fix) is to standardize on the fact that we always update domain information right before we return it, and purge that information if needed. I think that "xm list" triggers this somewhere deep inside xend, as I can always purge the stale data by running an "xm list". This tells me that some async signals aren't always being sent to clean up, which means "xm list" has to trigger it. I *think* Anthony had a comment about polling being necessary in this case for some reason, so perhaps he can chime in and explain.

Thanks Ewan!
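A minimal sketch of the "update right before we return it, and purge if needed" idea from the message above. This is not xend code: `DomainCache`, `list_live_domids`, `name_for`, and `id_for` are hypothetical names, and the callable standing in for the hypervisor's domain list is an assumption made only for illustration.

```python
class DomainCache:
    """Toy stand-in for a cached domain table (hypothetical, not xend code)."""

    def __init__(self, list_live_domids):
        # list_live_domids: callable returning the domain IDs that are
        # currently alive (an assumption standing in for a hypervisor query).
        self._list_live_domids = list_live_domids
        self._domains = {}  # domid -> name

    def add(self, domid, name):
        self._domains[domid] = name

    def _refresh(self):
        # Purge every cached entry whose domain no longer exists.
        live = set(self._list_live_domids())
        for domid in list(self._domains):
            if domid not in live:
                del self._domains[domid]

    def name_for(self, domid):
        # Refresh first, so a stale entry is never returned.
        self._refresh()
        return self._domains.get(domid)

    def id_for(self, name):
        self._refresh()
        for domid, cached_name in self._domains.items():
            if cached_name == name:
                return domid
        return None


live = {0, 1}
cache = DomainCache(lambda: live)
cache.add(0, "Domain-0")
cache.add(1, "guest")

live.discard(1)           # the guest shuts down behind the cache's back
print(cache.name_for(1))  # None
print(cache.id_for("guest"))  # None
```

With this shape, every lookup does the purge that "xm list" was observed to trigger, at the cost of one liveness query per call.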
Anthony Liguori
2005-Oct-01 19:40 UTC
Re: [Xen-devel] [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}'
Dan Smith wrote:

>EM> I've been trying to decide how easy it would be to fix the
>EM> underlying problem -- if it's going to take a long time, then I'll
>EM> apply your patch as a workaround, but I hope to solve the problem
>EM> more definitively.
>
>I think the solution (or best fix) is to standardize on the fact that
>we always update domain information right before we return it, and
>purge that information if needed. I think that "xm list" triggers
>this somewhere deep inside xend, as I can always purge the stale data
>by running an "xm list". This tells me that some async signals aren't
>always being sent to clean up, which means "xm list" has to trigger
>it. I *think* Anthony had a comment about polling being necessary in
>this case for some reason, so perhaps he can chime in and explain.

Actually, Keir made a recent change that will cause the @releaseDomain notification to go out when the domain disappears, which was the thing that necessitated polling before.

As long as Xend updates its state on every @introduceDomain and @releaseDomain watch, it should always be up-to-date (barring the obvious scheduling race between Xend and XenStore, but that only allows a stale domain state window of a few tens of milliseconds at worst).

What would be really ideal is to do away completely with the Xend internal state and just always pull things from the store. This is probably too big of a change for 3.0 though.

Regards,

Anthony Liguori

>Thanks Ewan!
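The watch-driven approach described above might look roughly like the following. Only the watch names `@introduceDomain` and `@releaseDomain` come from the thread; `WatchedRegistry`, `on_watch`, and the `list_live_domids` callable are hypothetical, and the sketch assumes the handler resynchronises against the live domain list on either event rather than relying on the event to name a domain.

```python
INTRODUCE = "@introduceDomain"
RELEASE = "@releaseDomain"


class WatchedRegistry:
    """Keeps a set of known domain IDs in step with watch events (sketch)."""

    def __init__(self, list_live_domids):
        # list_live_domids: callable returning the currently live domain IDs
        # (an assumption; it stands in for asking the hypervisor or store).
        self._list_live_domids = list_live_domids
        self.known = set(list_live_domids())

    def on_watch(self, path):
        # Ignore watches that are not domain lifecycle events.
        if path not in (INTRODUCE, RELEASE):
            return
        # Resync on every lifecycle event; between the real change and this
        # call there is a short window in which `known` is stale.
        self.known = set(self._list_live_domids())


live = {0}
registry = WatchedRegistry(lambda: live)

live.add(3)
registry.on_watch(INTRODUCE)
print(sorted(registry.known))  # [0, 3]

live.discard(3)
registry.on_watch(RELEASE)
print(sorted(registry.known))  # [0]
```

No polling is needed in this shape, provided a release event is delivered for every domain that disappears.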
Ewan Mellor
2005-Oct-04 07:36 UTC
Re: [Xen-devel] [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}'
On Fri, Sep 30, 2005 at 09:54:55AM -0700, Dan Smith wrote:

> This is a resend of my stale state fix, which is yet unapplied. If
> there are issues, please let me know.
>
> Note that this fixes the issue poked by xm-test, as shown in the
> following snippet of David's latest FC3pae.report:
>
> > FAIL: 01_shutdown_basic_pos
> > I had to run an xm list to update xend state!

Hi Dan,

I made a big change yesterday to XendDomain to make it thread-safe. As far as I can tell, most of the problems that you've been seeing were caused by watches firing and modifying XendDomain internal state at the same time as each other and as the xm commands. This meant that it was pretty easy to confuse Xend into thinking that domains existed when they didn't and vice versa.

I would be grateful if you could re-run xm-test and let me know how it looks. There might still be some bugs to iron out, but hopefully you will find that the behaviour under xm-test is much improved.

We've got someone working right now on integrating xm-test with our automated test/build infrastructure here, so I expect to be able to run all your tests myself soon, but I would also appreciate your feedback on this.

Thanks,

Ewan.
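The kind of race described above, with watch callbacks and xm commands touching one table concurrently, is commonly closed by guarding the table with a single lock. The sketch below illustrates that general technique only; it is not the actual XendDomain change, and `LockedDomainTable`, `purge`, and `churn` are hypothetical names.

```python
import threading


class LockedDomainTable:
    """Domain table whose every access goes through one lock (sketch)."""

    def __init__(self):
        # RLock so a method holding the lock may call another locked method.
        self._lock = threading.RLock()
        self._domains = {}  # domid -> name

    def add(self, domid, name):
        with self._lock:
            self._domains[domid] = name

    def remove(self, domid):
        with self._lock:
            self._domains.pop(domid, None)

    def purge(self, live):
        # Multi-step update: without the lock, another thread could change
        # the table between collecting the dead IDs and deleting them.
        with self._lock:
            dead = [d for d in self._domains if d not in live]
            for domid in dead:
                del self._domains[domid]

    def snapshot(self):
        with self._lock:
            return dict(self._domains)


def churn(table, domid, rounds):
    # Simulates watch callbacks repeatedly introducing and releasing a domain.
    for _ in range(rounds):
        table.add(domid, "guest-%d" % domid)
        table.remove(domid)


table = LockedDomainTable()
table.add(0, "Domain-0")

threads = [threading.Thread(target=churn, args=(table, d, 1000))
           for d in range(1, 5)]
threads.append(threading.Thread(target=table.purge, args=({0},)))
for t in threads:
    t.start()
for t in threads:
    t.join()

print(table.snapshot())  # {0: 'Domain-0'}
```

A single coarse lock is the simplest option; it serialises all readers and writers, which is usually acceptable for a small management table.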
Dan Smith
2005-Oct-04 13:41 UTC
Re: [Xen-devel] [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}'
EM> I made a big change yesterday to XendDomain to make it
EM> thread-safe. As far as I can tell, most of the problems that
EM> you've been seeing were caused by watches firing and modifying
EM> XendDomain internal state at the same time as each other and as
EM> the xm commands. This meant that it was pretty easy to confuse
EM> Xend into thinking that domains existed when they didn't and vice
EM> versa.

Sounds about right :)

EM> I would be grateful if you could re-run xm-test and let me know
EM> how it looks. There might still be some bugs to iron out, but
EM> hopefully you will find that the behaviour under xm-test is much
EM> improved.

Absolutely. I will run it today and post my findings. It would be great if others "out there" could do the same to help verify that the problem is better or fixed. Since threads are involved, many tests across varying platforms will be more convincing :)

EM> We've got someone working right now on integrating xm-test with
EM> our automated test/build infrastructure here, so I expect to be
EM> able to run all your tests myself soon, but I would also
EM> appreciate your feedback on this.

That is excellent news. The latest version generates an XML file of results that it automatically submits to us for review. It would be great if you could use that as your data source for merging the data into your own test infrastructure.
Dan Smith
2005-Oct-04 17:35 UTC
Re: [Xen-devel] [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}'
EM> I would be grateful if you could re-run xm-test and let me know
EM> how it looks. There might still be some bugs to iron out, but
EM> hopefully you will find that the behaviour under xm-test is much
EM> improved.

After two runs of xm-test, I'm not seeing failures on either of the tests that poke specific stale-state issues. That's good news :)

I'll see about writing some more tests targeted at stale-state detection, just to make sure ;)
Ewan Mellor
2005-Oct-04 23:14 UTC
Re: [Xen-devel] [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}'
On Tue, Oct 04, 2005 at 10:35:04AM -0700, Dan Smith wrote:

> EM> I would be grateful if you could re-run xm-test and let me know
> EM> how it looks. There might still be some bugs to iron out, but
> EM> hopefully you will find that the behaviour under xm-test is much
> EM> improved.
>
> After two runs of xm-test, I'm not seeing failures on either of the
> tests that poke specific stale-state issues. That's good news :)
>
> I'll see about writing some more tests targeted at stale-state
> detection, just to make sure ;)

Great, thanks a lot Dan, I appreciate it.

Ewan.