Daniel Stodden
2009-Nov-23  22:43 UTC
[Xen-devel] Question: dom0 electrocuted by implicitly unmapped grantrefs
Hi.
Can someone explain a piece of code to me?
I'm looking at a dom0 crash dump involving the following piece in
grant_table.c:__gnttab_unmap_common:
    if ( unlikely((op->rd = rd = rcu_lock_domain_by_id(dom)) == NULL) )
    {
        /* This can happen when a grant is implicitly unmapped. */
        gdprintk(XENLOG_INFO, "Could not find domain %d\n", dom);
        domain_crash(ld); /* naughty... */
        return;
    }
I assume 'implicitly unmapped' therein refers to a case where rd is gone
because ld, in one way or another, already managed to tear down a
mapping without an explicit gnttab call? Is this correct? Otherwise killing
ld would seem a bit rough to me :}
Either way: is domain_crash(ld) the appropriate response? Why not just
fail the op and let the caller live and learn?
Thanks.
Daniel
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2009-Nov-23  22:52 UTC
Re: [Xen-devel] Question: dom0 electrocuted by implicitly unmapped grantrefs
On 23/11/2009 22:43, "Daniel Stodden" <daniel.stodden@citrix.com> wrote:

> I assume 'implicitly unmapped' therein refers to a case where rd is gone
> because ld in some or the other way already managed to tear down a
> mapping without an explicit gnttab call? This correct? Otherwise killing
> ld would seem a bit rough to me :}

You are correct.

> Either way: is domain_crash(ld) the appropriate response? Why not just
> fail the op and let the caller live and learn?

It's arguable I suppose. An implicitly unmapped grant leaves a grant entry
which cannot be released until the mapping domain dies. It's a nasty kind of
leak, and I made the hypervisor's response to it suitably abrupt.

 -- Keir
Daniel Stodden
2009-Nov-23  23:07 UTC
Re: [Xen-devel] Question: dom0 electrocuted by implicitly unmapped grantrefs
On Mon, 2009-11-23 at 17:52 -0500, Keir Fraser wrote:
> On 23/11/2009 22:43, "Daniel Stodden" <daniel.stodden@citrix.com> wrote:
>
> > I assume 'implicitly unmapped' therein refers to a case where rd is gone
> > because ld in some or the other way already managed to tear down a
> > mapping without an explicit gnttab call? This correct? Otherwise killing
> > ld would seem a bit rough to me :}
>
> You are correct.
>
> > Either way: is domain_crash(ld) the appropriate response? Why not just
> > fail the op and let the caller live and learn?
>
> It's arguable I suppose. An implicitly unmapped grant leaves a grant entry
> which cannot be released until the mapping domain dies. It's a nasty kind of
> leak, and I made the hypervisor's response to it suitably abrupt.

Forgive my ignorance: Why can't it be released any more? To me it looks
as if the mapping is already gone, so the entry is stale, and the caller
just pointed at it, more or less asking for just that.

Daniel
Keir Fraser
2009-Nov-24  08:32 UTC
Re: [Xen-devel] Question: dom0 electrocuted by implicitly unmapped grantrefs
On 23/11/2009 23:07, "Daniel Stodden" <daniel.stodden@citrix.com> wrote:

>> It's arguable I suppose. An implicitly unmapped grant leaves a grant entry
>> which cannot be released until the mapping domain dies. It's a nasty kind of
>> leak, and I made the hypervisor's response to it suitably abrupt.
>
> Forgive my ignorance: Why can't it be released any more? To me it looks
> as if the mapping is already gone, so the entry is stale, and the caller
> just pointed at it somewhat asking for just that.

We can't usually reliably tell. In most cases the granting domain would
still be hanging around. It's just on that one unlikely path we happen to be
able to tell.

 -- Keir
Daniel Stodden
2009-Nov-24  19:28 UTC
Re: [Xen-devel] Question: dom0 electrocuted by implicitly unmapped grantrefs
On Tue, 2009-11-24 at 03:32 -0500, Keir Fraser wrote:
> On 23/11/2009 23:07, "Daniel Stodden" <daniel.stodden@citrix.com> wrote:
>
> >> It's arguable I suppose. An implicitly unmapped grant leaves a grant entry
> >> which cannot be released until the mapping domain dies. It's a nasty kind of
> >> leak, and I made the hypervisor's response to it suitably abrupt.
> >
> > Forgive my ignorance: Why can't it be released any more? To me it looks
> > as if the mapping is already gone, so the entry is stale, and the caller
> > just pointed at it somewhat asking for just that.
>
> We can't usually reliably tell. In most cases the granting domain would
> still be hanging around. It's just on that one unlikely path we happen to be
> able to tell.

Yes. Sorry, I figured out only later that you were referring to the general
case.

The domain struct would stay around until all pages have been released,
right? Certainly the ld crash is due to what remains to be filed as a
bug in ld.

But killing the host? Until then it was a resource leak and a zombie
domain, bad enough to not let the issue go unnoticed.

I think part of what bugs me is that, the way this works right now, the
only case where Xen won't let ld get away with it is actually the one
where the problem happens to be resolved already.

Also I wonder: if rd happens to remain pinned, couldn't the buggy ld be
identified more reliably as anyone failing to present a valid pte
together with the unmap request?

Or am I missing something?

Daniel
Keir Fraser
2009-Nov-24  20:01 UTC
Re: [Xen-devel] Question: dom0 electrocuted by implicitly unmapped grantrefs
On 24/11/2009 19:28, "Daniel Stodden" <Daniel.Stodden@citrix.com> wrote:

> The domain struct would stay around until all pages have been released,
> right? Certainly the ld crash is due to what remains to be filed as a
> bug in ld.
>
> But killing the host? Until then it was a resource leak and a zombie
> domain, bad enough to not let the issue go unnoticed.

Where does 'killing the host' come from? The host won't be killed. Well,
unless dom0 is the offending domain. Bad dom0. :-)

Well, I don't really care about this very much. We can replace with a
gdprintk() and not crash the domain, just as well.

 -- Keir
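
For readers following the thread, here is a minimal, self-contained sketch
of the two responses under discussion: crashing the mapping domain (ld)
versus logging and failing the op. It is not Xen code. The struct layouts,
lookup_domain(), unmap_common(), the policy enum and the locally defined
status constants are all hypothetical stand-ins, and the sketch sets a
status on both paths, which the quoted hypervisor snippet does not show.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Status codes defined locally for this sketch (assumption: not taken
 * from Xen's headers). */
#define GNTST_okay        0
#define GNTST_bad_domain (-2)

/* Hypothetical stand-in for Xen's struct domain. */
struct domain {
    int  domain_id;
    bool crashed;
};

/* Hypothetical stand-in for the unmap operation's result field. */
struct unmap_op {
    int status;
};

/* The two policies discussed in the thread. */
enum policy {
    POLICY_CRASH,   /* current behaviour: crash ld */
    POLICY_FAIL_OP  /* proposal: log and fail the op only */
};

/* Models rcu_lock_domain_by_id(): returns NULL if the domain is gone. */
struct domain *lookup_domain(struct domain *table, size_t n, int id)
{
    for (size_t i = 0; i < n; i++)
        if (table[i].domain_id == id)
            return &table[i];
    return NULL;
}

/* Models the rd lookup at the top of the unmap path. */
void unmap_common(struct domain *ld, struct domain *table, size_t n,
                  int dom, struct unmap_op *op, enum policy policy)
{
    struct domain *rd = lookup_domain(table, n, dom);

    if (rd == NULL) {
        /* This can happen when a grant is implicitly unmapped. */
        fprintf(stderr, "Could not find domain %d\n", dom);
        if (policy == POLICY_CRASH)
            ld->crashed = true;        /* models domain_crash(ld) */
        op->status = GNTST_bad_domain; /* the caller sees a failed op */
        return;
    }

    op->status = GNTST_okay;
}
```

Under POLICY_FAIL_OP the caller gets an error status back and keeps
running, so a misbehaving dom0 would not take the host down with it. The
cost is that the stale grant entry is only reported, not punished.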