David Edmondson
2007-Nov-27 09:26 UTC
[Xen-devel] spurious warnings from get_page() via gnttab_copy() during frontend shutdown
In testing our implementation of the hypervisor copy based backend->frontend networking changes, we see what I believe are spurious warning messages during the shutdown of the frontend domain:

(XEN) /export/build/schuster/xvm-gate/xen.hg/xen/include/asm/mm.h:189:d0 Error pfn 30e290: rd=ffff830000fcf100, od=ffff830000fcf100, caf=00000000, taf=0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff83000010f240>] get_page+0x107/0x1b4
(XEN)    [<ffff83000010f10a>] get_page_and_type+0x21/0x50
(XEN)    [<ffff8300001116c4>] __gnttab_copy+0x3f5/0x5b4
(XEN)    [<ffff830000111971>] gnttab_copy+0xee/0x1c4
(XEN)    [<ffff830000111dbd>] do_grant_table_op+0x376/0x3bc
(XEN)    [<ffff8300001b83e2>] syscall_enter+0xa2/0xfc
(XEN)
(XEN) Guest stack trace from rbp=ffff830000ff3cf8:
(XEN)   ???????????????? <G><2>grant_table.c:990:d0 do_grant_table_op: domain 0, cmd 5, count 1
(XEN) /export/build/schuster/xvm-gate/xen.hg/xen/include/asm/mm.h:189:d0 Error pfn 30fc2a: rd=ffff830000fcf100, od=0000000000000000, caf=00000000, taf=0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff83000010f240>] get_page+0x107/0x1b4
(XEN)    [<ffff83000010f10a>] get_page_and_type+0x21/0x50
(XEN)    [<ffff8300001116c4>] __gnttab_copy+0x3f5/0x5b4
(XEN)    [<ffff830000111971>] gnttab_copy+0xee/0x1c4
(XEN)    [<ffff830000111dbd>] do_grant_table_op+0x376/0x3bc
(XEN)    [<ffff8300001b83e2>] syscall_enter+0xa2/0xfc
(XEN)

What we think is happening is that the frontend dies and most of its pages are freed (those that are not referenced by another domain). The backend doesn't know that the frontend died yet, so it's still trying to pass packets along to it. It has the rx ring mapped (meaning that it can't be freed) and reads previously advertised grant references from it. Those grants now refer to pages that are no longer valid, so get_page() complains (the pages are no longer valid as only the frontend had references to them and they were freed).
__gnttab_copy() itself seems prepared for this situation, as failures to grab the target page due to a dying domain are correctly handled:

    if ( !get_page_and_type(mfn_to_page(d_frame), dd, PGT_writable_page) )
    {
        if ( !test_bit(_DOMF_dying, &dd->domain_flags) )
            gdprintk(XENLOG_WARNING, "Could not get dst frame %lx\n", d_frame);
        rc = GNTST_general_error;
        goto error_out;
    }

In our testing we believe that we're following this path (_DOMF_dying is set and rc == GNTST_general_error) and that we handle the failure correctly.

The corresponding failure mode in the page flip code path doesn't result in any INFO warnings. Should they exist in this case?

dme.
-- 
David Edmondson, Solaris Engineering, http://dme.org
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
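To make the reported behaviour concrete, here is a minimal user-space sketch of the two layers involved. It is not hypervisor code: every name (model_domain, model_page, model_get_page, model_copy_to, the MODEL_* codes and the two counters) is hypothetical, and the page model is reduced to a reference count and an owner. It only illustrates the asymmetry described above: the lower-level reference grab logs on any failure, while the copy path stays quiet when the destination domain is dying.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for hypervisor structures. */
struct model_domain {
    int id;
    bool dying;                  /* plays the role of _DOMF_dying */
};

struct model_page {
    unsigned int refcount;       /* 0 means the page has been freed */
    struct model_domain *owner;  /* NULL once ownership has been dropped */
};

enum { MODEL_OKAY = 0, MODEL_GENERAL_ERROR = -1 };

static int callee_warnings;      /* messages printed by model_get_page() */
static int caller_warnings;      /* messages printed by model_copy_to() */

/* Take a reference for domain d; fail if the page is freed or not d's. */
static bool model_get_page(struct model_page *page, struct model_domain *d)
{
    assert(page != NULL && d != NULL);
    if (page->refcount == 0 || page->owner != d) {
        /* Logged without checking d->dying: the message under discussion. */
        fprintf(stderr, "Error: rd=%d, od=%d, count=%u\n",
                d->id, page->owner ? page->owner->id : -1, page->refcount);
        callee_warnings++;
        return false;
    }
    page->refcount++;
    return true;
}

/* Copy into a page of domain dd, mirroring the error path quoted above. */
static int model_copy_to(struct model_page *dst, struct model_domain *dd)
{
    if (!model_get_page(dst, dd)) {
        if (!dd->dying) {        /* the caller suppresses its own warning */
            fprintf(stderr, "Could not get dst frame\n");
            caller_warnings++;
        }
        return MODEL_GENERAL_ERROR;
    }
    /* ... the copy itself would happen here ... */
    dst->refcount--;             /* drop the reference taken above */
    return MODEL_OKAY;
}
```

With a dying domain and a freed page, model_copy_to() returns MODEL_GENERAL_ERROR, the caller prints nothing, and the only output comes from model_get_page(), which matches the shape of the log excerpt.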
Keir Fraser
2007-Nov-27 09:41 UTC
Re: [Xen-devel] spurious warnings from get_page() via gnttab_copy() during frontend shutdown
I don't think you should get a backtrace. Did you add that?

 -- Keir
David Edmondson
2007-Nov-27 09:42 UTC
Re: [Xen-devel] spurious warnings from get_page() via gnttab_copy() during frontend shutdown
On 27 Nov 2007, at 9:41am, Keir Fraser wrote:
> I don't think you should get a backtrace. Did you add that?

Yes, to aid in debugging. Sorry for the confusion.

dme.
David Edmondson
2007-Nov-27 09:53 UTC
Re: [Xen-devel] spurious warnings from get_page() via gnttab_copy() during frontend shutdown
On 27 Nov 2007, at 9:42am, David Edmondson wrote:
>> I don't think you should get a backtrace. Did you add that?
>
> Yes, to aid in debugging. Sorry for the confusion.

This part:

> (XEN) /export/build/schuster/xvm-gate/xen.hg/xen/include/asm/mm.h:189:d0 Error pfn 30e290: rd=ffff830000fcf100, od=ffff830000fcf100, caf=00000000, taf=0000000000000000

is what I'm concerned about. It still seems spurious.

dme.
-- 
David Edmondson, Solaris Engineering, http://dme.org
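For anyone triaging these lines in a console log, a small parser can pull out the fields and give a rough reason for the failure. This is a sketch under assumptions the thread does not state: that rd is the domain requesting the reference, od is the page's current owner, and caf is the page's count/flags word. The struct and function names (pfn_error, parse_pfn_error, classify_pfn_error) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Fields of one "Error pfn" line; the names mirror the log labels. */
struct pfn_error {
    unsigned long long pfn, rd, od, taf;
    unsigned int caf;
};

/* Parse one log line. Returns 1 if all five fields were read, else 0. */
static int parse_pfn_error(const char *line, struct pfn_error *out)
{
    assert(line != NULL && out != NULL);
    const char *p = strstr(line, "Error pfn ");
    if (p == NULL)
        return 0;
    return sscanf(p, "Error pfn %llx: rd=%llx, od=%llx, caf=%x, taf=%llx",
                  &out->pfn, &out->rd, &out->od, &out->caf, &out->taf) == 5;
}

/* Rough classification, based on the assumed field meanings above. */
static const char *classify_pfn_error(const struct pfn_error *e)
{
    if (e->od == 0)
        return "page has no owner";
    if (e->od != e->rd)
        return "page owned by another domain";
    if (e->caf == 0)
        return "page count word is zero";
    return "unclassified";
}
```

Under those assumptions, the line for pfn 30e290 (rd equal to od, caf zero) comes out as "page count word is zero", and the line for pfn 30fc2a (od zero) as "page has no owner". Both are consistent with pages that were freed while the frontend was being torn down.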
Keir Fraser
2007-Nov-27 12:29 UTC
Re: [Xen-devel] spurious warnings from get_page() via gnttab_copy() during frontend shutdown
Okay, as for the issue of that line coming out in the first place at XENLOG_INFO level, I'll take a look and maybe downgrade it to XENLOG_DEBUG.

 -- Keir

On 27/11/07 09:42, "David Edmondson" <dme@sun.com> wrote:
> On 27 Nov 2007, at 9:41am, Keir Fraser wrote:
>
>> I don't think you should get a backtrace. Did you add that?
>
> Yes, to aid in debugging. Sorry for the confusion.
>
> dme.
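The effect of such a downgrade can be illustrated with a generic threshold-based logger. This is only a model of the idea, not Xen's logging code: the enum values, the model_threshold variable and model_log() are hypothetical, and the sketch assumes that a numerically higher level means a less important message.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical severity levels; a larger value is less severe. */
enum model_level {
    MODEL_ERR = 0,
    MODEL_WARNING = 1,
    MODEL_INFO = 2,
    MODEL_DEBUG = 3
};

/* Messages less severe than this threshold are dropped. */
static enum model_level model_threshold = MODEL_INFO;

/* Print the message if its level passes the threshold.
 * Returns 1 if it was printed, 0 if it was filtered out. */
static int model_log(enum model_level level, const char *fmt, ...)
{
    va_list ap;

    assert(fmt != NULL);
    if (level > model_threshold)
        return 0;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    return 1;
}
```

With the threshold at MODEL_INFO, the same message is printed when logged at MODEL_INFO and dropped when logged at MODEL_DEBUG, so moving a message down one level hides it from consoles that do not ask for debug output while keeping it available to those that do.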