In a previous mail, Jeremy Fitzhardinge wrote:
> The softlockup watchdog is currently a nuisance in a virtual machine,
> since the whole system could have the CPU stolen from it for a long
> period of time. While it would be unlikely for a guest domain to be
> denied timer interrupts for over 10s, it could happen and any
> softlockup message would be completely spurious.

I wonder how the guest domain can be denied timer interrupts for such a
long time? The only reason I see is that the guest domain is not
scheduled at all (host domain or another higher priority guest running).

Now in SMP host and guest, what happens if a guest CPU is not scheduled
for a while?

An example: in kernel/pid.c:alloc_pid(), if one of the guest CPUs is
descheduled when holding the pidmap_lock, what happens to the other
guest CPUs who want to alloc/free pids? Are they blocked too?

-- 
Cyprien Laplace
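
The spurious-report concern quoted above can be illustrated with a small check that discounts stolen time before comparing against the watchdog threshold. This is only a sketch under stated assumptions, not the kernel's softlockup code: the function name, the nanosecond units, and the idea that the guest can read an accumulated "stolen" figure are all hypothetical; the 10s threshold is the figure mentioned in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 10 seconds, the figure mentioned in the thread. */
#define SOFTLOCKUP_THRESHOLD_NS (10ULL * 1000000000ULL)

/*
 * Hypothetical helper: decide whether a soft lockup is plausible.
 * now_ns        - current wall-clock time
 * last_touch_ns - when the watchdog was last touched
 * stolen_ns     - time the hypervisor ran something else since then
 *                 (assumed to be available to the guest)
 */
bool softlockup_suspected(uint64_t now_ns, uint64_t last_touch_ns,
                          uint64_t stolen_ns)
{
    assert(now_ns >= last_touch_ns);

    uint64_t elapsed = now_ns - last_touch_ns;
    /* Time the vcpu actually ran; clamp so the subtraction cannot wrap. */
    uint64_t ran = (stolen_ns < elapsed) ? elapsed - stolen_ns : 0;

    return ran > SOFTLOCKUP_THRESHOLD_NS;
}
```

With 12s of wall-clock time elapsed, the check reports a lockup if none of it was stolen, and stays quiet if 11s of it was stolen.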

LAPLACE Cyprien wrote:
> I wonder how the guest domain can be denied timer interrupts for such a
> long time? The only reason I see is that the guest domain is not
> scheduled at all (host domain or another higher priority guest running).
>
> Now in SMP host and guest, what happens if a guest CPU is not scheduled
> for a while?
>

I think this mostly happens when you're doing an inherently
time-stealing thing, like pausing vcpus or suspend/resume. But I guess
it could happen with a low-prio domain/vcpu on a busy system.

> An example: in kernel/pid.c:alloc_pid(), if one of the guest CPUs is
> descheduled when holding the pidmap_lock, what happens to the other
> guest CPUs who want to alloc/free pids? Are they blocked too?
>

Yep; it's a problem. If a vcpu is holding locks and not running, then
everyone else will be prevented from taking those locks. If you have a
very busy system, then presumably all the vcpus will be similarly loaded
and the fact that it takes a long time to get locks just means you're
trying to run too many vcpus for your hardware's capacity.

The other problem is if you've got two vcpus sharing one real cpu, and
vcpu A spins while waiting for vcpu B to release a lock, and vcpu B is
waiting for A to get off the physical CPU. I don't know how often this
is a real problem in practice.

    J
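
The last case (vcpu A spinning while vcpu B, the lock holder, waits for the physical CPU) can be put in numbers with a toy model. This is a minimal sketch, not kernel or hypervisor code: it assumes strict round-robin between exactly two vcpus with a fixed timeslice, with A scheduled first, and the function name and "tick" unit are made up for illustration.

```c
#include <assert.h>

/*
 * Toy model of lock-holder preemption.
 * Two vcpus share one physical CPU, round-robin, `timeslice` ticks each.
 * vcpu B holds the lock and needs `remaining` more ticks of CPU time
 * before it releases it. vcpu A runs first and spins on the lock.
 * Returns the number of ticks A burns spinning before B releases.
 */
unsigned long wasted_spin_ticks(unsigned long timeslice,
                                unsigned long remaining)
{
    unsigned long wasted = 0;

    assert(timeslice > 0);

    while (remaining > 0) {
        /* A gets the CPU and spins for its whole slice. */
        wasted += timeslice;
        /* B runs next and makes progress on its critical section. */
        remaining -= (remaining < timeslice) ? remaining : timeslice;
    }
    return wasted;
}
```

For example, with a 10-tick slice and a holder that only needs 3 more ticks, A still wastes a full 10 ticks; the cost is set by the timeslice, not by the length of the critical section.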

LAPLACE Cyprien wrote:
> An example: in kernel/pid.c:alloc_pid(), if one of the guest CPUs is
> descheduled when holding the pidmap_lock, what happens to the other
> guest CPUs who want to alloc/free pids? Are they blocked too?

Yup. This is where it's really nice to have directed yields, where you
tell the hypervisor to give your physical CPU time to the vcpu that's
holding the lock you're blocking on. I know s390 can do this. Perhaps
it's something worth generalizing in paravirt_ops?

-- 
Chris
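
A directed yield of the kind Chris describes could look roughly like the sketch below. Everything here is an assumption for illustration: the lock layout, the `yield_to_fn` hook, the `SPIN_LIMIT` value, and the `demo_*` stand-ins are hypothetical and are not the s390 mechanism or any existing paravirt_ops interface. The lock records which vcpu owns it, so a waiter that has spun for a while knows whom to donate its timeslice to.

```c
#include <assert.h>
#include <stdatomic.h>

#define NO_OWNER   (-1)
#define SPIN_LIMIT 1000   /* arbitrary illustrative threshold */

struct owner_lock {
    atomic_int owner;     /* vcpu id of the holder, or NO_OWNER */
};

/* Hypothetical hypervisor hook: give our physical CPU time to `vcpu`. */
typedef void (*yield_to_fn)(int vcpu);

void owner_lock_init(struct owner_lock *l)
{
    atomic_init(&l->owner, NO_OWNER);
}

void owner_lock_release(struct owner_lock *l)
{
    atomic_store(&l->owner, NO_OWNER);
}

void owner_lock_acquire(struct owner_lock *l, int me, yield_to_fn yield_to)
{
    assert(me != NO_OWNER);

    for (;;) {
        /* Spin briefly in case the holder is running on another CPU. */
        for (int i = 0; i < SPIN_LIMIT; i++) {
            int expected = NO_OWNER;
            if (atomic_compare_exchange_weak(&l->owner, &expected, me))
                return;
        }
        /* Still held: ask the hypervisor to run the holder instead of us. */
        int holder = atomic_load(&l->owner);
        if (holder != NO_OWNER)
            yield_to(holder);
    }
}

/* Stand-ins so the sketch can be exercised without a hypervisor. */
struct owner_lock demo_lock;
int demo_yields;

/* Pretends the holder ran and finished its critical section. */
void demo_yield_to(int vcpu)
{
    (void)vcpu;
    demo_yields++;
    owner_lock_release(&demo_lock);
}
```

In the demo, a second `owner_lock_acquire()` on a held lock spins up to `SPIN_LIMIT`, calls `demo_yield_to()` once with the holder's id, and then takes the lock. Tracking the owner costs a word per lock, which is the price of letting the waiter name the vcpu it wants scheduled.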