Rik van Riel
2005-Apr-25 22:29 UTC
[Xen-devel] BUG: xend oopses on munmap of /proc/xen/privcmd
This is with last night's Xen snapshot (apr 24th), on kernel
2.6.12-rc3 - but the mess is so horrid that I'm not quite sure
how to fix it...

This oops prevents xen from starting xenU domains.

Basically xend does the following:
1) mmap /proc/xen/privcmd
2) call an ioctl to populate the mmap
3) munmap the mapping created in (1)

During the munmap, the dom0 kernel oopses, as follows:

CPU:    0
EIP:    0061:[<c01505ed>]    Not tainted VLI
EFLAGS: 00010282   (2.6.11-1.1261_FC4.rielxen0)
EIP is at set_page_dirty+0x1d/0x60
eax: 8b04ec83   ebx: c13da920   ecx: c13da920   edx: c025e1d0
esi: d4e0f730   edi: 3dd49067   ebp: b79cc000   esp: da503ebc
ds: 007b   es: 007b   ss: 0069
Process python (pid: 2662, threadinfo=da502000 task=dc8fd550)
Stack: db2b41c0 c015a487 00000000 00040004 da4d6b78 b79cd000 b79cd000 b79ccfff
       c015a5e5 c13eef00 da4d6b78 b79cc000 b79cd000 00000000 00001000 b79cd000
       d1d5b134 b79cd000 c015a7e3 c13eef00 d1d5b134 b79cc000 b79cd000 00000000
Call Trace:
 [<c015a487>] zap_pte_range+0x1a7/0x280
 [<c015a5e5>] unmap_page_range+0x85/0xc0
 [<c015a7e3>] unmap_vmas+0x1c3/0x290
 [<c015f8f5>] unmap_region+0xb5/0x170
 [<c015fc87>] do_munmap+0x107/0x150
 [<c015fd2a>] sys_munmap+0x5a/0x80
 [<c0109493>] syscall_call+0x7/0xb

I suspect the oops in set_page_dirty is because of either a junk
page->mapping pointer, or a junk mapping->a_ops pointer, since
neither is touched by the code that maps the page into the VMA:

int fastcall set_page_dirty(struct page *page)
{
        struct address_space *mapping = page_mapping(page);

        if (likely(mapping)) {
                int (*spd)(struct page *) = mapping->a_ops->set_page_dirty;
                if (spd)
                        return (*spd)(page);

I'm not quite sure in what way to fix this bug, since none of
the functions involved seem to have access to the "right" data
structures.

The most obvious workaround would be for zap_pte_range() to not
call set_page_dirty() on pages inside a VM_IO or VM_RESERVED VMA,
but I don't know if the VMA is guaranteed to still exist when
zap_pte_range() is called...
--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
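The workaround floated at the end of the message above (skip set_page_dirty() for pages in a VM_IO or VM_RESERVED VMA) can be sketched as a small user-space model. This is only an illustration of the decision, not kernel code: struct model_vma, the MODEL_ flag values and model_should_set_page_dirty() are hypothetical names, and the sketch simply assumes the VMA is still available at the point of the check, which is exactly the open question raised in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values only; the real ones are defined by the kernel. */
#define MODEL_VM_IO       0x00004000UL
#define MODEL_VM_RESERVED 0x00080000UL

/* Hypothetical, stripped-down stand-in for struct vm_area_struct. */
struct model_vma {
    unsigned long vm_flags;
};

/*
 * Decide whether the unmap path should call set_page_dirty() for a PTE.
 * Assumption: the caller still holds a valid VMA pointer at this point.
 */
static bool model_should_set_page_dirty(const struct model_vma *vma,
                                        bool pte_dirty)
{
    assert(vma != NULL);

    /* Pages in I/O or reserved mappings are never handed to set_page_dirty(). */
    if (vma->vm_flags & (MODEL_VM_IO | MODEL_VM_RESERVED))
        return false;

    /* Ordinary mappings: only dirty PTEs need the page marked dirty. */
    return pte_dirty;
}
```

With this check, a dirty PTE in a mapping flagged MODEL_VM_IO is skipped, while a dirty PTE in an ordinary mapping is still passed on.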
Keir Fraser
2005-Apr-25 23:00 UTC
Re: [Xen-devel] BUG: xend oopses on munmap of /proc/xen/privcmd
On 25 Apr 2005, at 23:29, Rik van Riel wrote:

> This is with last night's Xen snapshot (apr 24th), on kernel
> 2.6.12-rc3 - but the mess is so horrid that I'm not quite sure
> how to fix it...
>
> This oops prevents xen from starting xenU domains.
>
> Basically xend does the following:
> 1) mmap /proc/xen/privcmd
> 2) call an ioctl to populate the mmap
> 3) munmap the mapping created in (1)

Various people have been seeing this, although I haven't reproduced it
on my own test box. The problem is not the munmap itself (although it
obviously needs robustifying) but that something (presumably the ioctl)
has mmapped bogus pages. The ioctl is only supposed to mmap the new
domain's pages -- these will have pfn_valid() as false and so will not
take the set_page_dirty() path in zap_pte_range.

It sounds related to my patch from Friday that removed redundant
dom0_ops and changed xc_domain_create() to also call DOM0_SETMAXMEM and
use dom_mem_op(MEMOP_increase_reservation) to actually reserve memory
for the new domain. You might want to confirm this by 'bk cset -x'ing
the changeset -- also, working from that end (the mmap end rather than
the munmap end) you may have better luck making sense of what is going
wrong. :-)

 -- Keir
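The behaviour Keir describes above (a PTE whose pfn fails pfn_valid() never reaches set_page_dirty() during unmap) can be sketched as a user-space model. The names model_max_mapnr, model_pfn_valid() and model_count_dirty_calls() are hypothetical, and the bounds check mirrors the pfn_valid() definition Rik quotes later in the thread; this is a simplified illustration, not the actual zap_pte_range() logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's max_mapnr (entries in mem_map). */
static unsigned long model_max_mapnr = 1024;

/* Same shape as the pfn_valid() macro quoted in the thread. */
static bool model_pfn_valid(unsigned long pfn)
{
    return pfn < model_max_mapnr;
}

/*
 * Walk a range of PTEs, given as parallel arrays of pfns and dirty bits,
 * and count how many would be passed to set_page_dirty().
 */
static size_t model_count_dirty_calls(const unsigned long *pfns,
                                      const bool *dirty, size_t n)
{
    size_t calls = 0;

    assert(pfns != NULL && dirty != NULL);

    for (size_t i = 0; i < n; i++) {
        /* Foreign page: no local struct page, so nothing to mark dirty. */
        if (!model_pfn_valid(pfns[i]))
            continue;
        if (dirty[i])
            calls++;
    }
    return calls;
}
```

In this model the oops corresponds to a mapped pfn that passes the bounds check even though the page behind it was never meant to be mapped this way, which is why Keir points at the mmap side rather than the munmap side.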
Rik van Riel
2005-Apr-26 00:43 UTC
Re: [Xen-devel] BUG: xend oopses on munmap of /proc/xen/privcmd
On Tue, 26 Apr 2005, Keir Fraser wrote:

> Various people have been seeing this, although I haven't reproduced it
> on my own test box. The problem is not the munmap itself (although it
> obviously needs robustifying) but that something (presumably the ioctl)
> has mmaped bogus pages. The ioctl is only supposed to mmap the new
> domain's pages -- these will have pfn_valid() as false and so will not
> take the set_page_dirty() path in zap_pte_range.

How do these two go together with ballooning?

#define pfn_to_page(pfn)        (mem_map + (pfn))
#define pfn_valid(pfn)          ((pfn) < max_mapnr)

I'll comb through the changeset you mentioned. Maybe I'll
find something ;)
Keir Fraser
2005-Apr-26 08:00 UTC
Re: [Xen-devel] BUG: xend oopses on munmap of /proc/xen/privcmd
On 26 Apr 2005, at 01:43, Rik van Riel wrote:

> How do these two go together with ballooning ?
>
> #define pfn_to_page(pfn)        (mem_map + (pfn))
> #define pfn_valid(pfn)          ((pfn) < max_mapnr)
>
> I'll comb through the changeset you mentioned. Maybe I'll
> find something ;)

Thanks. I'm pretty sure this issue is not at all connected with
ballooning. When a given pfn/page is 'ballooned out' then from the
p.o.v. of the rest of the OS that pfn/page is allocated to the balloon
driver. Thus no one else will be using the above macros at the same time.

 -- Keir
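Keir's answer can be illustrated with a toy model of the two macros. The sketch assumes that ballooning a page out leaves max_mapnr untouched, so the pfn still passes the bounds check; what changes is only who owns the page. The model_ names, the owner enum and the 8-entry mem_map are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tiny memory map; stands in for mem_map and max_mapnr. */
#define MODEL_MAX_MAPNR 8UL

enum model_owner {
    MODEL_OWNER_FREE = 0,   /* available to the rest of the OS */
    MODEL_OWNER_KERNEL,     /* in ordinary use */
    MODEL_OWNER_BALLOON     /* held by the balloon driver */
};

struct model_page {
    enum model_owner owner;
};

static struct model_page model_mem_map[MODEL_MAX_MAPNR];

/* Same shape as the quoted pfn_valid(): a pure bounds check. */
static bool model_pfn_valid(unsigned long pfn)
{
    return pfn < MODEL_MAX_MAPNR;
}

/* Same shape as the quoted pfn_to_page(): index into the memory map. */
static struct model_page *model_pfn_to_page(unsigned long pfn)
{
    assert(model_pfn_valid(pfn));
    return model_mem_map + pfn;
}

/* Ballooning out: the page is simply recorded as owned by the balloon. */
static void model_balloon_out(unsigned long pfn)
{
    model_pfn_to_page(pfn)->owner = MODEL_OWNER_BALLOON;
}
```

After model_balloon_out(3), model_pfn_valid(3) is still true, but the page is marked as belonging to the balloon driver, so in this model no other user would look it up through these helpers.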