Andres Lagar-Cavilla
2012-Feb-01 20:49 UTC
Domain relinquish resources racing with p2m access
So we've run into this interesting (race?) condition while doing stress-testing. We pummel the domain with paging, sharing and mmap operations from dom0, and concurrently we launch a domain destruction. Often we get in the logs something along these lines:

(XEN) mm.c:958:d0 Error getting mfn 859b1a (pfn ffffffffffffffff) from L1 entry 8000000859b1a625 for l1e_owner=0, pg_owner=1

We're using the synchronized p2m patches just posted, so my analysis is as follows:

- The domain destroy domctl kicks in. It calls relinquish resources. This disowns and puts most domain pages, resulting in invalid (0xff...ff) m2p entries.

- In parallel, a do_mmu_update is making progress; it has no issues performing a p2m lookup because the p2m has not been torn down yet, since we haven't gotten to the RCU callback. Eventually, the mapping fails in page_get_owner in get_page_from_l1e.

The mapping attempt fails, as expected, but what makes me uneasy is that there is still an active p2m lurking around, with seemingly valid translations to valid mfns, while all the domain pages are gone.

Is this a race condition? Can this lead to trouble?

Thanks!
Andres
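A minimal sketch of the window being described, under stated assumptions: the function sketch_map_foreign_gfn is invented for illustration, and while get_gfn/put_gfn/get_page/mfn_to_page are real Xen interfaces, their exact signatures vary across trees, so treat this as pseudocode in C clothing rather than the actual mm.c path.

    /* Illustrative sketch only; not the actual mm.c code. */
    #include <xen/sched.h>
    #include <asm/p2m.h>
    #include <asm/mm.h>

    static int sketch_map_foreign_gfn(struct domain *pg_owner,
                                      unsigned long gfn)
    {
        p2m_type_t t;
        /* The p2m is still live even though relinquish_resources() has
         * already disowned and put the domain's pages, so this lookup
         * can still return a seemingly valid mfn. */
        unsigned long mfn = mfn_x(get_gfn(pg_owner, gfn, &t));
        struct page_info *page = mfn_valid(mfn) ? mfn_to_page(mfn) : NULL;

        /* The page no longer belongs to pg_owner, so taking a reference
         * against it fails; this is where the "Error getting mfn ..."
         * log message comes from. */
        if ( page == NULL || !get_page(page, pg_owner) )
        {
            put_gfn(pg_owner, gfn);
            return -EINVAL;
        }

        /* ... install the l1e, then drop the references ... */
        put_page(page);
        put_gfn(pg_owner, gfn);
        return 0;
    }

The point of the sketch is that the failure is only caught at the get_page() step; nothing earlier in the lookup tells the caller that the translation is stale.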
At 12:49 -0800 on 01 Feb (1328100564), Andres Lagar-Cavilla wrote:
> So we've run into this interesting (race?) condition while doing
> stress-testing. We pummel the domain with paging, sharing and mmap
> operations from dom0, and concurrently we launch a domain destruction.
> Often we get in the logs something along these lines:
>
> (XEN) mm.c:958:d0 Error getting mfn 859b1a (pfn ffffffffffffffff) from L1
> entry 8000000859b1a625 for l1e_owner=0, pg_owner=1
>
> We're using the synchronized p2m patches just posted, so my analysis is
> as follows:
>
> - The domain destroy domctl kicks in. It calls relinquish resources. This
> disowns and puts most domain pages, resulting in invalid (0xff...ff) m2p
> entries.
>
> - In parallel, a do_mmu_update is making progress; it has no issues
> performing a p2m lookup because the p2m has not been torn down yet, since
> we haven't gotten to the RCU callback. Eventually, the mapping fails in
> page_get_owner in get_page_from_l1e.
>
> The mapping attempt fails, as expected, but what makes me uneasy is that
> there is still an active p2m lurking around, with seemingly valid
> translations to valid mfns, while all the domain pages are gone.

Yes. That's OK as long as we know that any user of that page will fail, but I'm not sure that we do.

At one point we talked about get_gfn() taking a refcount on the underlying MFN, which would fix this more cleanly. ISTR the problem was how to make sure the refcount was moved when the gfn->mfn mapping changed.

Can you stick a WARN() in mm.c to get the actual path that leads to the failure?

Tim.
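For context, a rough sketch of the "get_gfn() takes a refcount" idea being discussed; this is hypothetical code that never went into the tree, and the helper name get_gfn_and_ref is invented for illustration. The hard part mentioned above, moving the reference when the gfn->mfn mapping changes underneath a holder, is exactly what this sketch does not solve.

    /* Hypothetical only: a get_gfn() variant that also takes a page
     * reference, so a racing relinquish_resources() cannot free the
     * page while the caller holds the gfn. */
    #include <xen/sched.h>
    #include <asm/p2m.h>
    #include <asm/mm.h>

    static struct page_info *get_gfn_and_ref(struct domain *d,
                                             unsigned long gfn,
                                             p2m_type_t *t)
    {
        unsigned long mfn = mfn_x(get_gfn(d, gfn, t));
        struct page_info *page = NULL;

        if ( mfn_valid(mfn) )
        {
            page = mfn_to_page(mfn);
            /* Fails if the page has already been disowned, e.g. the
             * domain is being torn down; the caller then sees NULL. */
            if ( !get_page(page, d) )
                page = NULL;
        }

        if ( page == NULL )
            put_gfn(d, gfn);

        /* Unsolved: if the gfn->mfn mapping changes while the caller
         * holds this reference (paging out, sharing, ...), the
         * reference would have to migrate to the new mfn. */
        return page;
    }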
Andres Lagar-Cavilla
2012-Feb-10 18:05 UTC
Re: Domain relinquish resources racing with p2m access
> At 12:49 -0800 on 01 Feb (1328100564), Andres Lagar-Cavilla wrote:
>> So we've run into this interesting (race?) condition while doing
>> stress-testing. We pummel the domain with paging, sharing and mmap
>> operations from dom0, and concurrently we launch a domain destruction.
>> Often we get in the logs something along these lines:
>>
>> (XEN) mm.c:958:d0 Error getting mfn 859b1a (pfn ffffffffffffffff) from
>> L1 entry 8000000859b1a625 for l1e_owner=0, pg_owner=1
>>
>> We're using the synchronized p2m patches just posted, so my analysis is
>> as follows:
>>
>> - The domain destroy domctl kicks in. It calls relinquish resources.
>> This disowns and puts most domain pages, resulting in invalid
>> (0xff...ff) m2p entries.
>>
>> - In parallel, a do_mmu_update is making progress; it has no issues
>> performing a p2m lookup because the p2m has not been torn down yet,
>> since we haven't gotten to the RCU callback. Eventually, the mapping
>> fails in page_get_owner in get_page_from_l1e.
>>
>> The mapping attempt fails, as expected, but what makes me uneasy is
>> that there is still an active p2m lurking around, with seemingly valid
>> translations to valid mfns, while all the domain pages are gone.
>
> Yes. That's OK as long as we know that any user of that page will
> fail, but I'm not sure that we do.
>
> At one point we talked about get_gfn() taking a refcount on the
> underlying MFN, which would fix this more cleanly. ISTR the problem was
> how to make sure the refcount was moved when the gfn->mfn mapping
> changed.

Oh, I ditched that because it's too hairy and error-prone. There are plenty of nested get_gfn's with the n>1 call changing the mfn. So unless we make a point of remembering the mfn at the point of get_gfn, it's just impossible to make this work. And then "remembering the mfn" means a serious uglification of existing code.

> Can you stick a WARN() in mm.c to get the actual path that leads to the
> failure?

As a debug aid, or as actual code to make it into the tree? This typically happens in batches of a few dozen, so a WARN is going to massively spam the console with stack traces. Guess how I found out ...

The moral is that the code is reasonably defensive, so this gets caught, albeit in a rather verbose way. But this might eventually bite someone who does a get_gfn and doesn't either check that the domain is dying or ensure that a get_page succeeds.

Andres
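A sketch of the caller-side discipline being described, again with approximate signatures (use_gfn_safely is an invented name, not a patch against the tree): a stale but valid-looking translation is only tolerable if the caller either takes a page reference or treats a dying domain as the expected, quiet failure case.

    /* Sketch of the defensive get_gfn caller pattern; illustrative only. */
    #include <xen/sched.h>
    #include <xen/lib.h>
    #include <asm/p2m.h>
    #include <asm/mm.h>

    static int use_gfn_safely(struct domain *d, unsigned long gfn)
    {
        p2m_type_t t;
        unsigned long mfn = mfn_x(get_gfn(d, gfn, &t));
        struct page_info *page = mfn_valid(mfn) ? mfn_to_page(mfn) : NULL;

        if ( page == NULL || !get_page(page, d) )
        {
            /* A dying domain is the expected case; only complain (and
             * avoid spamming the console) when the domain is supposed
             * to be alive and the translation is nonetheless stale. */
            if ( !d->is_dying )
                gdprintk(XENLOG_WARNING,
                         "stale p2m entry for d%d gfn %lx\n",
                         d->domain_id, gfn);
            put_gfn(d, gfn);
            return -EINVAL;
        }

        /* Holding the page reference means relinquish_resources()
         * cannot free the page underneath us while we use it. */
        /* ... do the work ... */

        put_page(page);
        put_gfn(d, gfn);
        return 0;
    }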