Daniel Stodden
2009-Nov-23 22:43 UTC
[Xen-devel] Question: dom0 electrocuted by implicitly unmapped grantrefs
Hi.

Can someone explain a piece of code to me? I'm looking at a dom0 crash
dump involving the following piece in
grant_table.c:__gnttab_unmap_common

    if ( unlikely((op->rd = rd = rcu_lock_domain_by_id(dom)) == NULL) )
    {
        /* This can happen when a grant is implicitly unmapped. */
        gdprintk(XENLOG_INFO, "Could not find domain %d\n", dom);
        domain_crash(ld); /* naughty... */
        return;
    }

I assume 'implicitly unmapped' therein refers to a case where rd is gone
because ld in one way or another already managed to tear down a
mapping without an explicit gnttab call? Is this correct? Otherwise killing
ld would seem a bit rough to me :}

Either way: is domain_crash(ld) the appropriate response? Why not just
fail the op and let the caller live and learn?

Thanks.
Daniel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2009-Nov-23 22:52 UTC
Re: [Xen-devel] Question: dom0 electrocuted by implicitly unmapped grantrefs
On 23/11/2009 22:43, "Daniel Stodden" <daniel.stodden@citrix.com> wrote:

> I assume 'implicitly unmapped' therein refers to a case where rd is gone
> because ld in some or the other way already managed to tear down a
> mapping without an explicit gnttab call? This correct? Otherwise killing
> ld would seem a bit rough to me :}

You are correct.

> Either way: is domain_crash(ld) the appropriate response? Why not just
> fail the op and let the caller live and learn?

It's arguable I suppose. An implicitly unmapped grant leaves a grant entry
which cannot be released until the mapping domain dies. It's a nasty kind of
leak, and I made the hypervisor's response to it suitably abrupt.

 -- Keir
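The leak described above can be illustrated with a toy model. This is only a sketch and not Xen code: every name below (toy_grant, toy_mapper, the toy_* functions) is hypothetical, and the single pin counter is an assumption standing in for whatever bookkeeping the hypervisor really keeps per grant entry. It shows why an entry stays pinned when the mapping domain drops its page-table entry without an explicit unmap call, and why only the death of the mapping domain clears it.

```c
#include <stdbool.h>

/* Toy model, NOT Xen code. All names here are hypothetical. */

/* A grant entry as seen from the granting domain (rd). */
struct toy_grant {
    unsigned int pin;   /* mappings the hypervisor still has on record */
};

/* State of the mapping domain (ld) for one mapping. */
struct toy_mapper {
    bool pte_present;   /* ld's page-table entry for the granted page */
    bool handle_live;   /* hypervisor-side record of the mapping */
};

/* Explicit map: the hypervisor records the mapping and pins the entry. */
void toy_map(struct toy_grant *g, struct toy_mapper *m)
{
    g->pin++;
    m->pte_present = true;
    m->handle_live = true;
}

/* Explicit unmap: PTE and hypervisor record are dropped together. */
void toy_unmap_explicit(struct toy_grant *g, struct toy_mapper *m)
{
    if (!m->handle_live)
        return;
    m->pte_present = false;
    m->handle_live = false;
    g->pin--;
}

/* Implicit unmap: ld clears its PTE by other means; the record stays. */
void toy_unmap_implicit(struct toy_mapper *m)
{
    m->pte_present = false;
}

/* The granting domain can reclaim the entry only once it is unpinned. */
bool toy_can_release(const struct toy_grant *g)
{
    return g->pin == 0;
}

/* When ld dies, all of its remaining records are torn down. */
void toy_mapper_dies(struct toy_grant *g, struct toy_mapper *m)
{
    if (m->handle_live) {
        m->handle_live = false;
        g->pin--;
    }
    m->pte_present = false;
}
```

In this model, calling toy_map() followed by toy_unmap_implicit() leaves toy_can_release() false until toy_mapper_dies() runs, which mirrors the "cannot be released until the mapping domain dies" behaviour.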
Daniel Stodden
2009-Nov-23 23:07 UTC
Re: [Xen-devel] Question: dom0 electrocuted by implicitly unmapped grantrefs
On Mon, 2009-11-23 at 17:52 -0500, Keir Fraser wrote:
> On 23/11/2009 22:43, "Daniel Stodden" <daniel.stodden@citrix.com> wrote:
>
> > I assume 'implicitly unmapped' therein refers to a case where rd is gone
> > because ld in some or the other way already managed to tear down a
> > mapping without an explicit gnttab call? This correct? Otherwise killing
> > ld would seem a bit rough to me :}
>
> You are correct.
>
> > Either way: is domain_crash(ld) the appropriate response? Why not just
> > fail the op and let the caller live and learn?
>
> It's arguable I suppose. An implicitly unmapped grant leaves a grant entry
> which cannot be released until the mapping domain dies. It's a nasty kind of
> leak, and I made the hypervisor's response to it suitably abrupt.

Forgive my ignorance: Why can't it be released any more? To me it looks
as if the mapping is already gone, so the entry is stale, and the caller
just pointed at it, more or less asking for exactly that.

Daniel
Keir Fraser
2009-Nov-24 08:32 UTC
Re: [Xen-devel] Question: dom0 electrocuted by implicitly unmapped grantrefs
On 23/11/2009 23:07, "Daniel Stodden" <daniel.stodden@citrix.com> wrote:

>> It's arguable I suppose. An implicitly unmapped grant leaves a grant entry
>> which cannot be released until the mapping domain dies. It's a nasty kind of
>> leak, and I made the hypervisor's response to it suitably abrupt.
>
> Forgive my ignorance: Why can't it be released any more? To me it looks
> as if the mapping is already gone, so the entry is stale, and the caller
> just pointed at it somewhat asking for just that.

We can't usually reliably tell. In most cases the granting domain would
still be hanging around. It's just on that one unlikely path we happen to be
able to tell.

 -- Keir
Daniel Stodden
2009-Nov-24 19:28 UTC
Re: [Xen-devel] Question: dom0 electrocuted by implicitly unmapped grantrefs
On Tue, 2009-11-24 at 03:32 -0500, Keir Fraser wrote:
> On 23/11/2009 23:07, "Daniel Stodden" <daniel.stodden@citrix.com> wrote:
>
> >> It's arguable I suppose. An implicitly unmapped grant leaves a grant entry
> >> which cannot be released until the mapping domain dies. It's a nasty kind of
> >> leak, and I made the hypervisor's response to it suitably abrupt.
> >
> > Forgive my ignorance: Why can't it be released any more? To me it looks
> > as if the mapping is already gone, so the entry is stale, and the caller
> > just pointed at it somewhat asking for just that.
>
> We can't usually reliably tell. In most cases the granting domain would
> still be hanging around. It's just on that one unlikely path we happen to be
> able to tell.

Yes. Sorry, I figured only later that you were referring to the general
case.

The domain struct would stay around until all pages have been released,
right? Certainly the ld crash is due to what remains to be filed as a
bug in ld.

But killing the host? Until then it was a resource leak and a zombie
domain, bad enough to not let the issue go unnoticed.

I think part of what bugs me is that, the way this works right now, the
only case where Xen won't let ld get away with it is actually the one
where the problem happens to be resolved already.

Also I wonder: if rd happens to remain pinned, couldn't the buggy ld be
identified more reliably as anyone failing to present a valid pte
together with the unmap request? Or am I missing something?

Daniel
Keir Fraser
2009-Nov-24 20:01 UTC
Re: [Xen-devel] Question: dom0 electrocuted by implicitly unmapped grantrefs
On 24/11/2009 19:28, "Daniel Stodden" <Daniel.Stodden@citrix.com> wrote:

> The domain struct would stay around until all pages have been released,
> right? Certainly the ld crash is due to what remains to be filed as a
> bug in ld.
>
> But killing the host? Until then it was a resource leak and a zombie
> domain, bad enough to not let the issue go unnoticed.

Where does 'killing the host' come from? The host won't be killed. Well,
unless dom0 is the offending domain. Bad dom0. :-)

Well, I don't really care about this very much. We can replace the
domain_crash() with just a gdprintk() and not crash the domain, just as well.

 -- Keir
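For illustration, the "log it and fail the op" alternative discussed in this thread might look roughly like the following. This is a self-contained sketch under stated assumptions, not the hypervisor's code and not an actual patch: the toy_* types, the TOY_* status values, and the table lookup are all hypothetical stand-ins, and fprintf() stands in for gdprintk().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical status codes; not the real grant-table status values. */
enum toy_status { TOY_OKAY = 0, TOY_BAD_DOMAIN = -1 };

/* Hypothetical domain record; 'crashed' would be set by a crash path. */
struct toy_domain {
    int id;
    int crashed;
};

/* Hypothetical unmap request: remote domain id in, status out. */
struct toy_unmap_op {
    int remote_id;
    enum toy_status status;
};

/* Stand-in for the domain lookup: NULL if the domain no longer exists. */
static struct toy_domain *toy_find_domain(struct toy_domain *table,
                                          size_t n, int id)
{
    for (size_t i = 0; i < n; i++)
        if (table[i].id == id)
            return &table[i];
    return NULL;
}

/*
 * Sketch of the unmap path: when the remote domain (rd) is gone, log the
 * event and report failure through op->status. The local domain (ld) is
 * deliberately left untouched instead of being crashed.
 */
void toy_unmap_common(struct toy_domain *ld, struct toy_domain *table,
                      size_t n, struct toy_unmap_op *op)
{
    struct toy_domain *rd = toy_find_domain(table, n, op->remote_id);

    (void)ld; /* no crash is issued against ld on this path */

    if (rd == NULL) {
        fprintf(stderr, "Could not find domain %d\n", op->remote_id);
        op->status = TOY_BAD_DOMAIN;
        return;
    }

    /* The real unmap work would go here; the model just reports success. */
    op->status = TOY_OKAY;
}
```

The design trade-off is the one debated above: the caller learns about the stale entry through the status field and keeps running, at the cost of making the underlying bug in ld less conspicuous than a crash would.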