Andres Lagar-Cavilla
2013-Aug-01 13:30 UTC
Re: [PATCH] Xen: Fix retry calls into PRIVCMD_MMAPBATCH*.
On Aug 1, 2013, at 8:04 AM, David Vrabel <david.vrabel@citrix.com> wrote:

> On 01/08/13 12:49, Andres Lagar-Cavilla wrote:
>> On Aug 1, 2013, at 7:23 AM, David Vrabel <david.vrabel@citrix.com> wrote:
>>
>>> On 01/08/13 04:30, Andres Lagar-Cavilla wrote:
>>>> -- Resend as I haven't seen this hit the lists. Maybe some smtp
>>>> misconfig. Apologies. Also expanded cc --
>>>>
>>>> When a foreign mapper attempts to map guest frames that are paged out,
>>>> the mapper receives an ENOENT response and will have to try again
>>>> while a helper process pages the target frame back in.
>>>>
>>>> Gating checks on PRIVCMD_MMAPBATCH* ioctl args were preventing retries
>>>> of mapping calls.
>>>
>>> This breaks the auto_translated_physmap case as it will allocate another
>>> set of empty pages and leak the previous set.
>>
>> David,
>> not able to follow you here. Under what circumstances will another
>> set of empty pages be allocated? And where? Are we talking page table pages?
>
> ....
>         vma = find_vma(mm, m.addr);
>         if (!vma ||
>             vma->vm_ops != &privcmd_vm_ops ||
>             (m.addr != vma->vm_start) ||
>             ((m.addr + (nr_pages << PAGE_SHIFT)) != vma->vm_end) ||
>             !privcmd_enforce_singleshot_mapping(vma)) {
>                 up_write(&mm->mmap_sem);
>                 ret = -EINVAL;
>                 goto out;
>         }
>         if (xen_feature(XENFEAT_auto_translated_physmap)) {
>                 ret = alloc_empty_pages(vma, m.num);
>
> Here.

Right right right. Excellent observation, thanks. I fwd ported from 3.4
and this slipped through the cracks.

Ok, V2 coming.

>
>                 if (ret < 0) {
>                         up_write(&mm->mmap_sem);
>                         goto out;
>                 }
>         }
>
>
>>> This privcmd_enforce_singleshot_mapping() stuff seems very odd anyway.
>>> Does anyone know what it was for originally? It would be preferable if
>>> we could update the mappings with a new set of foreign MFNs without
>>> having to tear down the VMA and recreate a new VMA.
>>
>> I believe it's mostly historical. I agree with you on principle, but
>> recreating VMAs is super-cheap.
>
> Tearing them down is not cheap as each page requires a trap-and-emulate
> to clear the PTE (see ptep_get_and_clear_full() in zap_pte_range()).

You need to tell the hypervisor to drop the ref on the mapped page. So
you'd need a hypercall (arguably a multi-call) to do that, which is not
free. Then you'd need privcmd and libxc to collude on agreeing to reuse
the vma -- which has very low value in itself, just a piece of metadata.
And you still need to deal with cleaning up the mapped refs when the
mapping process crashes. So a whole lot of new complexity for small
value, imho.

Probably that's the whole point of the singleshot: don't forget you have
something mapped in there. Because if you do you might leak the ref
forever.

Andres

>
> David
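
The leak David points out, and the retry-on-ENOENT behaviour the patch is after, can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions: it is not kernel or libxc code, and every name in it (fake_vma, alloc_empty_pages_once, map_batch_once, map_batch_with_retry) is hypothetical. It assumes a retried call should reuse the backing pages allocated by the first call rather than allocating a second set.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-VMA state; NOT the kernel's vm_area_struct. */
struct fake_vma {
    void **pages;     /* backing pages, NULL until the first mapping call */
    size_t nr_pages;  /* size of the backing set */
    int allocations;  /* how many times a backing set was allocated */
};

/* Allocate the backing set only once, so a retried call cannot leak it. */
static int alloc_empty_pages_once(struct fake_vma *vma, size_t n)
{
    assert(vma != NULL);
    if (vma->pages != NULL)
        return (vma->nr_pages == n) ? 0 : -EINVAL; /* retry: reuse */
    vma->pages = calloc(n, sizeof(*vma->pages));
    if (vma->pages == NULL)
        return -ENOMEM;
    vma->nr_pages = n;
    vma->allocations++;
    return 0;
}

/*
 * One mock mapping attempt. paged_out[i] > 0 means frame i is still paged
 * out: it reports -ENOENT and counts down, modelling a helper process that
 * pages the frame back in between attempts.
 */
static int map_batch_once(struct fake_vma *vma, int *paged_out,
                          int *err, size_t n)
{
    int pending = 0;
    int ret = alloc_empty_pages_once(vma, n);

    if (ret < 0)
        return ret;
    for (size_t i = 0; i < n; i++) {
        if (paged_out[i] > 0) {
            paged_out[i]--;
            err[i] = -ENOENT;
            pending++;
        } else {
            err[i] = 0;
        }
    }
    return pending ? -ENOENT : 0;
}

/* Retry while any frame reports -ENOENT, up to max_tries attempts. */
static int map_batch_with_retry(struct fake_vma *vma, int *paged_out,
                                int *err, size_t n, int max_tries)
{
    int ret = -ENOENT;

    for (int t = 0; t < max_tries && ret == -ENOENT; t++)
        ret = map_batch_once(vma, paged_out, err, n);
    return ret;
}
```

In this model, calling map_batch_with_retry() on a zero-initialised fake_vma goes through several attempts when some frames start out paged out, while the allocations counter stays at 1. Without the early return in alloc_empty_pages_once(), each retry would allocate a fresh set and drop the previous one, which is the leak described above.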