John Levon
2008-Sep-06 13:47 UTC
Reviews wanted for 6717224 HVM domUs stop doing IO, peg CPU
Nevada fix:

http://cr.opensolaris.org/~johnlev/hvm-evtchn/

Hypervisor workaround:

http://cr.opensolaris.org/~johnlev/hvm-evtchn-workaround/

The first webrev: we must only ever use CPU0's evtchn structures. This is part of the HVM PV ABI and it's always been broken.

Sadly, this is broken in our S10 backport too. The fix should get backported, but luckily there's a fairly simple workaround available in the hypervisor, which is already in place for pit0 interrupts. This is the second webrev.

Note that this means that if an HVM domU offlines CPU#0, it will still get the HVM callback IRQ on CPU#0. This goes against the typical expectation for what "offlining" means, but I've verified that this works OK on Nevada and S10. Additionally, since offlining CPU#0 is not something that makes sense in HVM anyway, it's not something we expect people to do, beyond PIT's test suite.

As well as fixing the bug listed above, this now allows Solaris HVM to boot with > 2 VCPUs. Previously this was re-distributing the callback IRQ onto VCPU!=0, triggering the same issue. I've booted 8-way HVM on both S10 and Nevada w/o problems.

thanks
john
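[Illustration, not taken from either webrev.] The rule stated above, that an HVM guest must only ever consult CPU0's evtchn structures, can be sketched roughly as follows. Everything here is a simplified stand-in: the `sketch_*` types and functions are hypothetical, the port count is far smaller than a real guest's, and real code would use atomic operations on the shared page. The field names loosely mirror `evtchn_upcall_pending` and `evtchn_pending_sel` from Xen's public headers. The point is only that the handler reads `vcpu_info[0]` no matter which CPU took the callback IRQ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_VCPUS     8
#define SKETCH_BITS_PER_WORD 64
#define SKETCH_NR_WORDS      4   /* hypothetical: only 256 ports */
#define SKETCH_NR_PORTS      (SKETCH_NR_WORDS * SKETCH_BITS_PER_WORD)

/* Simplified stand-in for the per-VCPU part of the shared info page. */
typedef struct {
    uint8_t  upcall_pending;  /* an upcall is waiting to be serviced */
    uint64_t pending_sel;     /* bit w set: pending[w] may have work */
} sketch_vcpu_info_t;

/* Simplified stand-in for the shared info page. */
typedef struct {
    sketch_vcpu_info_t vcpu_info[SKETCH_MAX_VCPUS];
    uint64_t pending[SKETCH_NR_WORDS];  /* one bit per event channel port */
    uint64_t mask[SKETCH_NR_WORDS];     /* set bit: port is masked */
} sketch_shared_info_t;

/*
 * Mimics the hypervisor side for an HVM guest: the event is always
 * flagged in VCPU0's selector, never in another VCPU's.
 */
static void
sketch_raise_event(sketch_shared_info_t *si, unsigned port)
{
    assert(port < SKETCH_NR_PORTS);
    si->pending[port / SKETCH_BITS_PER_WORD] |=
        1ULL << (port % SKETCH_BITS_PER_WORD);
    si->vcpu_info[0].pending_sel |= 1ULL << (port / SKETCH_BITS_PER_WORD);
    si->vcpu_info[0].upcall_pending = 1;
}

/*
 * Body of the callback IRQ handler. `cpu` is the CPU the interrupt
 * landed on; it is deliberately ignored. Writes the ports that fired
 * into out[] (which must hold SKETCH_NR_PORTS entries) and returns
 * how many there were.
 */
static size_t
sketch_hvm_callback(sketch_shared_info_t *si, unsigned cpu,
    unsigned out[SKETCH_NR_PORTS])
{
    /* Always VCPU0's structure, NOT vcpu_info[cpu]. */
    sketch_vcpu_info_t *vci = &si->vcpu_info[0];
    size_t n = 0;

    (void)cpu;

    vci->upcall_pending = 0;
    uint64_t sel = vci->pending_sel;
    vci->pending_sel = 0;

    for (unsigned w = 0; w < SKETCH_NR_WORDS; w++) {
        if ((sel & (1ULL << w)) == 0)
            continue;
        /* Unmasked pending ports in this word; clear them as we go. */
        uint64_t ready = si->pending[w] & ~si->mask[w];
        si->pending[w] &= ~ready;
        for (unsigned b = 0; b < SKETCH_BITS_PER_WORD; b++) {
            if (ready & (1ULL << b))
                out[n++] = w * SKETCH_BITS_PER_WORD + b;
        }
    }
    return n;
}
```

In this sketch, a handler that indexed `vcpu_info[cpu]` instead would find an empty selector whenever the IRQ was delivered to a VCPU other than 0 and would never drain the pending events, which is one plausible way to end up with the symptom in the subject line.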
Russ Blaine
2008-Sep-06 19:03 UTC
Re: Reviews wanted for 6717224 HVM domUs stop doing IO, peg CPU
Nice catch. Fixes look good.

--
Russ Blaine | Solaris Kernel | russell.blaine@sun.com
John Levon
2008-Sep-07 00:05 UTC
Re: Reviews wanted for 6717224 HVM domUs stop doing IO, peg CPU
On Sat, Sep 06, 2008 at 12:03:51PM -0700, Russ Blaine wrote:
> Nice catch. Fixes look good.

Thanks for the review. After some comments from Gavin, I think I'm going to extend the ON fix so that it also permanently binds the callback IRQ to CPU0. I think I can do this by adding xpv_intpt_bind_cpus=0 to xpv.conf, but I need to test this.

I'll have a webrev out with this additional change this week.

regards
john
John Levon
2008-Sep-08 22:10 UTC
Re: Reviews wanted for 6717224 HVM domUs stop doing IO, peg CPU
On Sat, Sep 06, 2008 at 12:03:51PM -0700, Russ Blaine wrote:
> John Levon wrote:
> > Nevada fix:
> >
> > http://cr.opensolaris.org/~johnlev/hvm-evtchn/

I've updated this as Gavin and Stu suggested so the interrupt is always bound to (V)CPU0. Russ, can you take another look?

thanks
john