Today I found that I cannot ping some of my domUs. I logged in to dom0 and ran "xm list"; all domUs are there. Then I ran "ifconfig" and was surprised to find that some of the vifs had disappeared. For example, "xm list" shows domain 13 is alive, but "brctl show" shows no vif13.0.

If I run "xm console 13", I see these messages:

WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!

I found the following in /var/log/messages:

------------------------------------------------------------------------
Jan 4 21:06:31 m7 kernel: xenbr0: port 3(vif1.0) entering disabled state
Jan 4 21:06:31 m7 kernel: device vif1.0 left promiscuous mode
Jan 4 21:06:31 m7 kernel: xenbr0: port 3(vif1.0) entering disabled state
Jan 4 21:06:32 m7 kernel: xenbr0: port 12(vif10.0) entering disabled state
Jan 4 21:06:32 m7 kernel: device vif10.0 left promiscuous mode
Jan 4 21:06:32 m7 kernel: xenbr0: port 12(vif10.0) entering disabled state
Jan 4 21:06:32 m7 kernel: xenbr0: port 8(vif13.0) entering disabled state
Jan 4 21:06:32 m7 kernel: device vif13.0 left promiscuous mode
Jan 4 21:06:32 m7 kernel: xenbr0: port 8(vif13.0) entering disabled state
Jan 4 21:06:33 m7 kernel: device vif26.0 entered promiscuous mode
Jan 4 21:06:33 m7 kernel: xenbr0: port 4(vif14.0) entering disabled state
Jan 4 21:06:33 m7 kernel: device vif14.0 left promiscuous mode
Jan 4 21:06:33 m7 kernel: xenbr0: port 4(vif14.0) entering disabled state
Jan 4 21:06:33 m7 kernel: xenbr0: port 3(vif26.0) entering learning state
Jan 4 21:06:33 m7 kernel: xenbr0: port 13(vif16.0) entering disabled state
Jan 4 21:06:33 m7 kernel: xenbr0: topology change detected, propagating
Jan 4 21:06:33 m7 kernel: xenbr0: port 3(vif26.0) entering forwarding state
Jan 4 21:06:34 m7 kernel: device vif16.0 left promiscuous mode
Jan 4 21:06:34 m7 kernel: xenbr0: port 13(vif16.0) entering disabled state
------------------------------------------------------------------------

xend.log shows that at around 21:06:31 dom 1 rebooted, and that no other domains were shut down or rebooted. So it appears that this dom 1 reboot knocked out the vifs of some other domains.

I can reproduce this, and found that only a dom 1 reboot disables other vifs. Rebooting domains other than dom 1 causes no problem.

I use xen3.0-testing (changeset 8259:5baa96bedc13). The dom0 OS is Fedora Core 4, with the default bridge configuration.

Bug?

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
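The symptom in the first report (a domain that "xm list" still shows, but whose vif is no longer present) can be checked for mechanically. Below is a minimal sketch, not part of the original report: missing_vifs is a hypothetical helper name, and it assumes every domU has exactly one interface named vif<ID>.0. The sample data is modelled on the report, where domain 13 is alive but vif13.0 is gone.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): print every domain ID
# that has no interface named vif<ID>.0 in the given interface list.
#   $1: whitespace-separated domU IDs
#   $2: whitespace-separated interface names
missing_vifs() {
    for id in $1; do
        found=no
        for ifname in $2; do
            # Compare the full interface name, not just a prefix.
            if [ "$ifname" = "vif$id.0" ]; then
                found=yes
                break
            fi
        done
        [ "$found" = yes ] || echo "$id"
    done
}

# Sample data: domains 1, 13 and 26 are listed, but vif13.0 is absent.
missing_vifs "1 13 26" "vif0.0 vif1.0 vif26.0"
```

On a live dom0 the two lists might be gathered with something like `xm list | awk 'NR > 1 && $2 > 0 { print $2 }'` and `brctl show | awk 'NR > 1 { print $NF }'`. Both one-liners assume the usual column layout of those tools (ID in the second column of "xm list", interface name as the last field of each "brctl show" line) and should be checked against the actual output before being relied on.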
So nobody has experienced this bug? I'd like to add some more info I just found. I can now reliably reproduce it with only 2 domUs.

Rebooting dom 1 will disable vif1?.0 (vif11.0, vif12.0, vif13.0, ...); rebooting dom 2 will disable vif2?.0; rebooting dom 3 will disable vif3?.0.

To reproduce, prepare 2 LVM volumes for guest domains, for example /dev/vg0/breezy and /dev/vg0/sarge. Their config files are /etc/xen/auto/breezy and /etc/xen/auto/sarge.

Bring one guest domain up first; it will have domain id 1:

#xm create /etc/xen/auto/breezy

Mount the other volume on /mnt and try to create its guest domain. Because the volume is mounted, domain creation will not succeed and the domain will be left in the paused state. This is purely to increment the domain id counter.

#mount /dev/vg0/sarge /mnt
#for i in `seq 2 10`;do xm create /etc/xen/auto/sarge || xm destroy $i;done

Unmount the volume and create the domain; it will have id 11:

#umount /mnt
#xm create /etc/xen/auto/sarge

Now 2 domUs are alive, and "brctl show" shows vif0.0, vif1.0 and vif11.0:

#xm list
#brctl show

Now reboot domain 1:

#xm shutdown -R 1

What should happen is that afterwards there are 2 domains, with ids 11 and 12, using vif11.0 and vif12.0.

What actually happens:
1. There is only one domain: domain 11.
2. vif11.0 has disappeared.
3. There is a dangling vif12.0.
4. Domain 11 is dead.

Here is what I get if I run "xm console 11":

----------------------------------------------------------------------
[root@m5 ~]# xm console 11
Unable to handle kernel paging request at virtual address f578ee01
printing eip:
c0110220
00560000 -> *pde = 00000001:5d409001
0059e000 -> *pme = 00000001:6c190067
00040000 -> *pte = 00000000:00000000
Oops: 0000 [#1]
SMP
Modules linked in:
CPU: 0
EIP: 0061:[<c0110220>] Not tainted VLI
EFLAGS: 00010086 (2.6.12.6-xenU)
EIP is at send_IPI_mask_bitmask+0x20/0x120
eax: f578ee00 ebx: 00000000 ecx: 00000000 edx: 00000000
esi: 00000000 edi: f578c000 ebp: 00000000 esp: c0586020
ds: 007b es: 007b ss: 0069
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c01162c6
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c01161af
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c01161af
................
----------------------------------------------------------------------
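The pattern reported above (rebooting dom 1 takes down vif1?.0, dom 2 takes down vif2?.0, and so on) is what one would see if some cleanup step selected interfaces by the prefix "vif<ID>" instead of the full name "vif<ID>.0". That is only one possible reading and is not confirmed anywhere in this thread. The sketch below just illustrates the difference between the two kinds of match on the interface names from the reproduction; match_prefix and match_exact are hypothetical names and are not taken from any Xen script.

```shell
#!/bin/sh
# Interface names from the reproduction: dom0, domain 1 and domain 11.
VIFS="vif0.0 vif1.0 vif11.0"

# Hypothetical prefix match: selects every name that starts with "vif<ID>".
match_prefix() {
    for ifname in $VIFS; do
        case "$ifname" in
            "vif$1"*) echo "$ifname" ;;
        esac
    done
}

# Exact match: selects only the name "vif<ID>.0".
match_exact() {
    for ifname in $VIFS; do
        case "$ifname" in
            "vif$1.0") echo "$ifname" ;;
        esac
    done
}

echo "prefix match for domain 1:"
match_prefix 1
echo "exact match for domain 1:"
match_exact 1
```

With these names, the prefix match for domain 1 also selects vif11.0, while the exact match selects only vif1.0. If something similar happens during the teardown of domain 1, it would account for domain 11 losing its interface, but that would need to be verified against the actual hotplug and xend code.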