Dan Gora
2010-Mar-16 01:09 UTC
[Xen-devel] PCI BAR register space written with garbage in HVM guest.
Hi All, I''m having a problem where if I pass through two instances of my device to a HVM domU, one of the board instances is having it''s PCI BAR registers overwritten with garbage by some unknown actor 30 seconds to a minute after I load my driver. I cannnot for the life of me find what might possibly be overwriting the BAR registers. I''ve added a debugging printf to XEN in xen/arch/x86/pci.c:pci_conf_write() and I can see the entire PCI BAR address space being overwritten with garbage: (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080000 offset=0x0 bytes=4 value=0xffffffff (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080004 offset=0x0 bytes=4 value=0x1600ffff (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080008 offset=0x0 bytes=4 value=0x64d5323e (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008000c offset=0x0 bytes=4 value=0x450008 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080010 offset=0x0 bytes=4 value=0xa7e54002 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080014 offset=0x0 bytes=4 value=0x11400000 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080018 offset=0x0 bytes=4 value=0x693 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008001c offset=0x0 bytes=4 value=0xffff0000 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080020 offset=0x0 bytes=4 value=0x4400ffff (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080024 offset=0x0 bytes=4 value=0x2c024300 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080028 offset=0x0 bytes=4 value=0x1012dac (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008002c offset=0x0 bytes=4 value=0xa1c30006 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080030 offset=0x0 bytes=4 value=0xa00040d (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080034 offset=0x0 bytes=4 value=0x0 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080038 offset=0x0 bytes=4 value=0x0 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008003c offset=0x0 bytes=4 value=0x0 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080040 offset=0x0 bytes=4 value=0x0 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080044 offset=0x0 bytes=4 value=0x16000000 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080048 offset=0x0 bytes=4 value=0x64d5323e (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008004c offset=0x0 bytes=4 value=0x0 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080050 offset=0x0 bytes=4 value=0x0 <snipped, rest of PCI BAR registers written with 0x0...> I''ve added printks to the dom0 and domU kernels in the pci_bus_write_config_##size() macros in drivers/pci/access.c and in arch/x86/pci/direct.c to print every time the kernel accesses PCI configuration space, but I only see these printfs when my driver access my board''s PCI configuration space or some other driver accesses PCI configuration space, but I do NOT see them when this PCI BAR register space trashing happens. So I noticed also that lspci does not cause these kernel printfs to occur and upon reading the pciutils source code I learned that pretty much anything which can do an outl() to 0xcf8/0xcfc can mess with PCI configuration space. So now I figure it must be some user space thing unless I''m just missing some other way which the kernel or XEN can access PCI configuration space, but what could it possibly be? This problem only occurs in HVM guests and only seems to occur when I pass two instances of my device to the domU and only occurs many many seconds after I load my driver (30-60 seconds). I''m absolutely sure that it''s not my driver because the kernel printfs show up when my driver accesses PCI configuration space. I''m really pretty much at a loss as even how to debug this. There doesn''t appear to be any dump_stack() in XEN so that I can see what called pci_conf_write() in XEN, but even then it appears that it only gets called as a trap from the dom0 or domU. It''s not clear to me if you can even see what process/stack actually caused the trap back in the dom0 or domU. Is that possible? Is there anything else that I should look at? qemu? pciback? pcifront? Am I missing some access method to PCI configuration space down in the kernel or is pci_confl_read/write pretty much it? Any ideas what would possibly be trying to overwrite all of PCI configuration space like this? _any_ ideas are most welcome.. thanks dan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Mar-16 01:48 UTC
Re: [Xen-devel] PCI BAR register space written with garbage in HVM guest.
On Mon, Mar 15, 2010 at 10:09:28PM -0300, Dan Gora wrote:> Hi All, > > I''m having a problem where if I pass through two instances of my > device to a HVM domU, one of the board instances is having it''s PCI > BAR registers overwritten with garbage by some unknown actor 30 > seconds to a minute after I load my driver. I cannnot for the life of > me find what might possibly be overwriting the BAR registers. > > I''ve added a debugging printf to XEN in > xen/arch/x86/pci.c:pci_conf_write() and I can see the entire PCI BAR > address space being overwritten with garbage: > > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080000 offset=0x0 > bytes=4 value=0xffffffff > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080004 offset=0x0 > bytes=4 value=0x1600ffff > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080008 offset=0x0 > bytes=4 value=0x64d5323e > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008000c offset=0x0 > bytes=4 value=0x450008 > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080010 offset=0x0 > bytes=4 value=0xa7e54002 > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080014 offset=0x0 > bytes=4 value=0x11400000Wow.. That is impressive. Are the values always the same? Or are they truly random?> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080018 offset=0x0 > bytes=4 value=0x693 > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008001c offset=0x0 > bytes=4 value=0xffff0000 > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080020 offset=0x0 > bytes=4 value=0x4400ffff > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080024 offset=0x0 > bytes=4 value=0x2c024300 > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080028 offset=0x0 > bytes=4 value=0x1012dac > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008002c offset=0x0 > bytes=4 value=0xa1c30006 > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080030 offset=0x0 > bytes=4 value=0xa00040d > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080034 offset=0x0 > bytes=4 value=0x0 > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080038 offset=0x0 > bytes=4 value=0x0 > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008003c offset=0x0 > bytes=4 value=0x0 > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080040 offset=0x0 > bytes=4 value=0x0 > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080044 offset=0x0 > bytes=4 value=0x16000000 > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080048 offset=0x0 > bytes=4 value=0x64d5323e > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008004c offset=0x0 > bytes=4 value=0x0 > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080050 offset=0x0 > bytes=4 value=0x0 > <snipped, rest of PCI BAR registers written with 0x0...> > > I''ve added printks to the dom0 and domU kernels in the > pci_bus_write_config_##size() macros in drivers/pci/access.c and in > arch/x86/pci/direct.c to print every time the kernel accesses PCI > configuration space, but I only see these printfs when my driver > access my board''s PCI configuration space or some other driver > accesses PCI configuration space, but I do NOT see them when this PCI > BAR register space trashing happens. > > So I noticed also that lspci does not cause these kernel printfs to > occur and upon reading the pciutils source code I learned that pretty > much anything which can do an outl() to 0xcf8/0xcfc can mess with PCI > configuration space. > > So now I figure it must be some user space thing unless I''m just > missing some other way which the kernel or XEN can access PCI > configuration space, but what could it possibly be? > > This problem only occurs in HVM guests and only seems to occur when I > pass two instances of my device to the domU and only occurs many many > seconds after I load my driver (30-60 seconds). I''m absolutely sure > that it''s not my driver because the kernel printfs show up when my > driver accesses PCI configuration space. > > I''m really pretty much at a loss as even how to debug this. There > doesn''t appear to be any dump_stack() in XEN so that I can see what > called pci_conf_write() in XEN, but even then it appears that it only > gets called as a trap from the dom0 or domU. It''s not clear to me if > you can even see what process/stack actually caused the trap back in > the dom0 or domU. Is that possible?You could instrument the code (Xen) to crash the DomU domain when you detect garbage. Then you can pick at with xenctx to look at its stack..etc> > Is there anything else that I should look at? qemu? pciback? > pcifront? Am I missing some access method to PCI configuration space > down in the kernel or is pci_confl_read/write pretty much it? AnyQEMU uses libpci, which is the same as lspci, and that looks to work. You can crank up the verbosity of pciback and pcifront with its parameters to see if they are the ones doing this. But your domain is HVM DomU so the pcifront/pciback is not utilized. That narrows it down to QEMU or the Dom0 kernel.> ideas what would possibly be trying to overwrite all of PCI > configuration space like this? > > _any_ ideas are most welcome.. > > thanks > dan > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Dan Gora
2010-Mar-16 02:55 UTC
Re: [Xen-devel] PCI BAR register space written with garbage in HVM guest.
On Mon, Mar 15, 2010 at 10:48 PM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:>> I''ve added a debugging printf to XEN in >> xen/arch/x86/pci.c:pci_conf_write() and I can see the entire PCI BAR >> address space being overwritten with garbage: >> >> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080000 offset=0x0 >> bytes=4 value=0xffffffff >> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080004 offset=0x0 >> bytes=4 value=0x1600ffff >> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080008 offset=0x0 >> bytes=4 value=0x64d5323e >> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008000c offset=0x0 >> bytes=4 value=0x450008 >> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080010 offset=0x0 >> bytes=4 value=0xa7e54002 >> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080014 offset=0x0 >> bytes=4 value=0x11400000 > > Wow.. That is impressive. Are the values always the same? Or are they > truly random?I''m pretty sure that they are always the same...>> I''m really pretty much at a loss as even how to debug this. There >> doesn''t appear to be any dump_stack() in XEN so that I can see what >> called pci_conf_write() in XEN, but even then it appears that it only >> gets called as a trap from the dom0 or domU. It''s not clear to me if >> you can even see what process/stack actually caused the trap back in >> the dom0 or domU. Is that possible? > > You could instrument the code (Xen) to crash the DomU domain when you > detect garbage. Then you can pick at with xenctx to look at its stack..etcThis is where I could really use some help... Is it possible crash only the domU from with xen? What would it leave behind to analyze with xenctx?>> >> Is there anything else that I should look at? qemu? pciback? >> pcifront? Am I missing some access method to PCI configuration space >> down in the kernel or is pci_confl_read/write pretty much it? Any > > QEMU uses libpci, which is the same as lspci, and that looks to work. > > You can crank up the verbosity of pciback and pcifront with its > parameters to see if they are the ones doing this. But your domain is > HVM DomU so the pcifront/pciback is not utilized. >Yeah I tried that and there doesn''t appear to be anything going on there. <aside> What is pcifront/pciback''s role for HVM guests exactly? I understand that you "hide" the devices from the dom0 with pciback and it definately loads and does *something* when the HVM guest comes up, but accesses from the domU don''t appear to go through it at all (I understand that it works with qemu somehow, but that channel too is not at all clear how it works...) I''ve looked through qemu enough to kind of understand that it''s responsible for abstracting the PCI configuration space for the domU, but I don''t really understand how accesses get channeled through to it from the domU. Does it use hypercalls somehow? Can someone explain how this whole flow is supposed to work for PCI configuration space accesses? </aside>> That narrows it down to QEMU or the Dom0 kernel.Yeah that''s what I figured.. I instrumented qemu a bit and also did not see anything going on in there, but I''m not at the office anymore so I cannot tell you exactly what functions I instrumented. I''m going to look into this more tomorrow. thanks a lot for your help... dan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2010-Mar-16 10:24 UTC
Re: [Xen-devel] PCI BAR register space written with garbage in HVM guest.
>>> Dan Gora <dan.gora@gmail.com> 16.03.10 02:09 >>> >I''m really pretty much at a loss as even how to debug this. There >doesn''t appear to be any dump_stack() in XEN so that I can see what >called pci_conf_write() in XEN, but even then it appears that it only >gets called as a trap from the dom0 or domU. It''s not clear to me if >you can even see what process/stack actually caused the trap back in >the dom0 or domU. Is that possible?show_execution_state(guest_cpu_user_regs()); Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2010-Mar-16 10:27 UTC
Re: [Xen-devel] PCI BAR register space written with garbage in HVM guest.
>>> Dan Gora <dan.gora@gmail.com> 16.03.10 03:55 >>> >This is where I could really use some help... Is it possible crash >only the domU from with xen? What would it leave behind to analyze >with xenctx?domain_crash(current->domain); assuming the access happens in the context of the DomU. If it happens in Dom0''s context (e.g. coming from qemu) you''d still kill the whole system with this, so be careful. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Mar-16 15:20 UTC
Re: [Xen-devel] PCI BAR register space written with garbage in HVM guest.
> <aside> > What is pcifront/pciback''s role for HVM guests exactly? I understandNone functionaly. The purpose is to "bind" the PCI devices to pciback (or pcistub) so that no other kernel module usurps it and starts utilizing it.> that you "hide" the devices from the dom0 with pciback and it > definately loads and does *something* when the HVM guest comes up, but > accesses from the domU don''t appear to go through it at all (I > understand that it works with qemu somehow, but that channel too is > not at all clear how it works...)With HVM pciback is not used. You need an virtualization aware IOMMU to pass through PCI devices to the guest. PCIback/PCI front is for PV guests and on machine where you don''t neccessarily have this fancy IOMMU.> I''ve looked through qemu enough to kind of understand that it''s > responsible for abstracting the PCI configuration space for the domU, > but I don''t really understand how accesses get channeled through to it > from the domU. Does it use hypercalls somehow? Can someone explainThe Hypervisor gets "trapped" when an outb is made (look for emulate_privileged_op function and specifically out). Then it somehow injects the fault in QEMU which does the rest. I don''t remember the details of how it does thought :-(> how this whole flow is supposed to work for PCI configuration space > accesses? > </aside>_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2010-Mar-16 15:54 UTC
Re: [Xen-devel] PCI BAR register space written with garbage in HVM guest.
>>> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> 16.03.10 16:20 >>> >The Hypervisor gets "trapped" when an outb is made (look for >emulate_privileged_op function and specifically out).No, that''s only the path taken for pv guests. HVM guests use special exit events for trapping I/O instructions. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Dan Gora
2010-Mar-17 00:14 UTC
[Xen-devel] Re: PCI BAR register space written with garbage in HVM guest.
Hi all, So I have definately tracked the writes as coming from qemu-dm, but I''m still having trouble with finding out what is causing it. I can now see the writes occuring down in the kernel (I only had pci_read_bus_XX traced, I didn''t have pci_read_user_XX traced before.. sigh...) and I can also see the problem occuring with strace as well, but I still cannot find where in qemu corresponds to this behavior... So here is what I see: Recall that from Xen HV I can see the bad writes occuring:> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080000 offset=0x0 > bytes=4 value=0xffffffff > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080004 offset=0x0 > bytes=4 value=0x1600ffff > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080008 offset=0x0 > bytes=4 value=0x64d5323e > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008000c offset=0x0 > bytes=4 value=0x450008I also traced the dom0 kernel better and can now also see these same writes occuring: Mar 16 21:35:49 sunblade1 kernel: pci_conf1_write: writing 8:0 reg 0x0 len=4 value=0xffffffff Mar 16 21:35:49 sunblade1 kernel: Pid: 5904, comm: qemu-dm Tainted: G 2.6.27.19-5-xen.dan #7 Mar 16 21:35:49 sunblade1 kernel: Mar 16 21:35:49 sunblade1 kernel: Call Trace: Mar 16 21:35:49 sunblade1 kernel: [<ffffffff8020c55a>] dump_trace+0xba/0x379 Mar 16 21:35:49 sunblade1 kernel: [<ffffffff8020c867>] show_trace_log_lvl+0x4e/0x69 Mar 16 21:35:49 sunblade1 kernel: [<ffffffff8020d5d8>] show_trace+0x15/0x19 Mar 16 21:35:49 sunblade1 kernel: [<ffffffff8048d873>] dump_stack+0x77/0x80 Mar 16 21:35:49 sunblade1 kernel: [<ffffffff80407a92>] pci_conf1_write+0x86/0x113 Mar 16 21:35:49 sunblade1 kernel: [<ffffffff8040940f>] raw_pci_write+0x24/0x3d Mar 16 21:35:49 sunblade1 kernel: [<ffffffff80409482>] pci_write+0x2c/0x2e Mar 16 21:35:49 sunblade1 kernel: [<ffffffff8035ec0f>] pci_user_write_config_dword+0x62/0xba Mar 16 21:35:49 sunblade1 kernel: [<ffffffff80364e49>] pci_write_config+0x110/0x192 Mar 16 21:35:49 sunblade1 kernel: [<ffffffff802ff3eb>] write+0xf5/0x12f Mar 16 21:35:49 sunblade1 kernel: [<ffffffff802a9bea>] vfs_write+0xb3/0x15c Mar 16 21:35:49 sunblade1 kernel: [<ffffffff802a9d61>] sys_write+0x4c/0x75 Mar 16 21:35:49 sunblade1 kernel: [<ffffffff8020b7f2>] tracesys+0xaf/0xb4 Mar 16 21:35:49 sunblade1 kernel: [<00007f05d53f97fb>] 0x7f05d53f97fb Mar 16 21:35:49 sunblade1 kernel: Mar 16 21:35:49 sunblade1 kernel: pci_conf1_write: writing 8:0 reg 0x4 len=4 value=0x1600ffff Mar 16 21:35:49 sunblade1 kernel: Pid: 5904, comm: qemu-dm Tainted: G 2.6.27.19-5-xen.dan #7 Mar 16 21:35:49 sunblade1 kernel: Mar 16 21:35:49 sunblade1 kernel: Call Trace: Mar 16 21:35:49 sunblade1 kernel: [<ffffffff8020c55a>] dump_trace+0xba/0x379 Mar 16 21:35:49 sunblade1 kernel: [<ffffffff8020c867>] show_trace_log_lvl+0x4e/0x69 Mar 16 21:35:49 sunblade1 kernel: [<ffffffff8020d5d8>] show_trace+0x15/0x19 Mar 16 21:35:49 sunblade1 kernel: [<ffffffff8048d873>] dump_stack+0x77/0x80 Mar 16 21:35:49 sunblade1 kernel: [<ffffffff80407a92>] pci_conf1_write+0x86/0x113 Mar 16 21:35:49 sunblade1 kernel: [<ffffffff8040940f>] raw_pci_write+0x24/0x3d Mar 16 21:35:49 sunblade1 kernel: [<ffffffff80409482>] pci_write+0x2c/0x2e Mar 16 21:35:49 sunblade1 kernel: [<ffffffff8035ec0f>] pci_user_write_config_dword+0x62/0xba Mar 16 21:35:49 sunblade1 kernel: [<ffffffff80364e49>] pci_write_config+0x110/0x192 Mar 16 21:35:49 sunblade1 kernel: [<ffffffff802ff3eb>] write+0xf5/0x12f Mar 16 21:35:49 sunblade1 kernel: [<ffffffff802a9bea>] vfs_write+0xb3/0x15c Mar 16 21:35:49 sunblade1 kernel: [<ffffffff802a9d61>] sys_write+0x4c/0x75 Mar 16 21:35:49 sunblade1 kernel: [<ffffffff8020b7f2>] tracesys+0xaf/0xb4 Mar 16 21:35:50 sunblade1 kernel: [<00007f05d53f97fb>] 0x7f05d53f97fb Mar 16 21:35:50 sunblade1 kernel: I also traced qemu-dm with strace and found the culprit: 5904 21:35:47 [ 7f05d3b68b2b] open("/sys/bus/pci/devices/0000:08:00.0/config", O_RDWR) = 6 5904 21:35:47 [ 7f05d53fa3c8] pwrite(6, "\6\1", 2, 4) = 2 5904 21:35:47 [ 7f05d3b6eb77] ioctl(16, EVIOCGKEYCODE, 0x7fffdde98890) = 0 5904 21:35:47 [ 7f05d53f987b] read(4, 0x7fffdde98870, 16) = -1 EAGAIN (Resource temporarily unavailable) 5904 21:35:47 [ 7fffddf437dc] clock_gettime(CLOCK_MONOTONIC, {1724, 868201462}) = 0 5904 21:35:47 [ 7fffddf437dc] clock_gettime(CLOCK_MONOTONIC, {1724, 868259075}) = 0 <snip> 5904 21:35:48 [ 7f05d53f987b] read(16, "o\0\0\0", 4) = 4 5904 21:35:48 [ 7f05d53f97fb] write(16, "o\0\0\0", 4) = 4 5904 21:35:48 [ 7f05d53f97fb] write(6, "\377\377\377\377\377\377\0\26>2\325d\10\0E\0\2@\0\354\0\0@\21w\302\0\0\0\0\377\377"..., 590) = 256 Notice the length of 590 bytes! The EIP also matches what the kernel thinks was the syscall which caused the write. Anyone have any idea where this could be happening in qemu? I''m pretty sure that it''s not originating with the domU because I traced the domU kenrel''s pci config access routines pci_conf1_write/read the same way as the dom0''s and there are no printfs in the domU kernel when this occurs. Again, thanks for all the help, dan On Mon, Mar 15, 2010 at 10:09 PM, Dan Gora <dan.gora@gmail.com> wrote:> Hi All, > > I''m having a problem where if I pass through two instances of my > device to a HVM domU, one of the board instances is having it''s PCI > BAR registers overwritten with garbage by some unknown actor 30 > seconds to a minute after I load my driver. I cannnot for the life of > me find what might possibly be overwriting the BAR registers. > > I''ve added a debugging printf to XEN in > xen/arch/x86/pci.c:pci_conf_write() and I can see the entire PCI BAR > address space being overwritten with garbage: > > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080000 offset=0x0 > bytes=4 value=0xffffffff > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080004 offset=0x0 > bytes=4 value=0x1600ffff > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080008 offset=0x0 > bytes=4 value=0x64d5323e > (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008000c offset=0x0 > bytes=4 value=0x450008_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Mar-17 02:46 UTC
Re: [Xen-devel] Re: PCI BAR register space written with garbage in HVM guest.
> open("/sys/bus/pci/devices/0000:08:00.0/config", O_RDWR) = 6 > 5904 21:35:47 [ 7f05d53fa3c8] pwrite(6, "\6\1", 2, 4) = 2 > 5904 21:35:47 [ 7f05d3b6eb77] ioctl(16, EVIOCGKEYCODE, 0x7fffdde98890) = 0 > 5904 21:35:47 [ 7f05d53f987b] read(4, 0x7fffdde98870, 16) = -1 > EAGAIN (Resource temporarily unavailable) > 5904 21:35:47 [ 7fffddf437dc] clock_gettime(CLOCK_MONOTONIC, > {1724, 868201462}) = 0 > 5904 21:35:47 [ 7fffddf437dc] clock_gettime(CLOCK_MONOTONIC, > {1724, 868259075}) = 0There is a snippet of QEMU that has that (hw/pass-through.c), function pt_pci_write_config: .. snip .. ret = pci_write_block(pci_dev, address, (uint8_t *)&val, len); if (!ret) PT_LOG("Error: pci_write_block failed. return value[%d].\n", ret); } if (pm_state != NULL && pm_state->flags & PT_FLAG_TRANSITING) /* set QEMUTimer */ qemu_mod_timer(pm_state->pm_timer, (qemu_get_clock(rt_clock) + pm_state->pm_delay));> > <snip> > > 5904 21:35:48 [ 7f05d53f987b] read(16, "o\0\0\0", 4) = 4 > 5904 21:35:48 [ 7f05d53f97fb] write(16, "o\0\0\0", 4) = 4 > 5904 21:35:48 [ 7f05d53f97fb] write(6, > "\377\377\377\377\377\377\0\26>2\325d\10\0E\0\2@\0\354\0\0@\21w\302\0\0\0\0\377\377"..., > 590) = 256I would turn on all of those debug options and see if anything is happening. Actually I would instrument all of the ''pci_write_block'' calls.> > Notice the length of 590 bytes! The EIP also matches what the kernel > thinks was the syscall which caused the write. > > Anyone have any idea where this could be happening in qemu? I''mCan you attach debug to it? Or maybe run ''objdump'' on the qemu-dm and see if the address correspond to some code in it? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Dan Gora
2010-Mar-17 03:31 UTC
Re: [Xen-devel] Re: PCI BAR register space written with garbage in HVM guest.
On Tue, Mar 16, 2010 at 11:46 PM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:> There is a snippet of QEMU that has that (hw/pass-through.c), > function pt_pci_write_config: > > .. snip .. > ret = pci_write_block(pci_dev, address, (uint8_t *)&val, len); > > if (!ret) > PT_LOG("Error: pci_write_block failed. return value[%d].\n", > ret); > } > > if (pm_state != NULL && pm_state->flags & PT_FLAG_TRANSITING) > /* set QEMUTimer */ > qemu_mod_timer(pm_state->pm_timer, > (qemu_get_clock(rt_clock) + pm_state->pm_delay));I turned on all the debugging in there and I see messages, but none when the bad writes occur, which made me think that they were not coming from there...>> <snip> >> >> 5904 21:35:48 [ 7f05d53f987b] read(16, "o\0\0\0", 4) = 4 >> 5904 21:35:48 [ 7f05d53f97fb] write(16, "o\0\0\0", 4) = 4 >> 5904 21:35:48 [ 7f05d53f97fb] write(6, >> "\377\377\377\377\377\377\0\26>2\325d\10\0E\0\2@\0\354\0\0@\21w\302\0\0\0\0\377\377"..., >> 590) = 256 > > I would turn on all of those debug options and see if anything is > happening.done.. nothing :( At least not during the error.> Actually I would instrument all of the ''pci_write_block'' calls.I''ll try that tomorrow...> Can you attach debug to it? Or maybe run ''objdump'' on the qemu-dm and > see if the address correspond to some code in it?Yes I can debug it, but really the only thing that I have to hang my hat on is write(). So I figure that it must be coming from pciutils since it''s messing with /sys/bus/pci, but I don''t really know much about pciutils so wasn''t sure what to grep around for. I''ll try and trace the pci_write_block calls and put gdb on it tomorrow and see if anything pops up. thanks, dan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Dan Gora
2010-Mar-18 02:14 UTC
[Xen-devel] Re: PCI BAR register space written with garbage in HVM guest.
Hi all, I apologize for the length of this message, but I figured that it was probably better to supply too much info rather than too little... I''ve made some good progress debugging this today. The problem looks pretty bizarre... I finally figured out how to trap the problem in gdb with the help of the conditional break point. Since I know that it''s a write to fd 6 of length 590 bytes I just make gdb break on that: (gdb) disass write Dump of assembler code for function write: 0x00007f2c3afe07b0 <write+0>: cmpl $0x0,0x20db59(%rip) # 0x7f2c3b1ee310 <__pthread_multiple_threads> 0x00007f2c3afe07b7 <write+7>: jne 0x7f2c3afe07c9 <write+25> 0x00007f2c3afe07b9 <__write_nocancel+0>: mov $0x1,%eax 0x00007f2c3afe07be <__write_nocancel+5>: syscall 0x00007f2c3afe07c0 <__write_nocancel+7>: cmp $0xfffffffffffff001,%rax 0x00007f2c3afe07c6 <__write_nocancel+13>: jae 0x7f2c3afe081b <write+107> 0x00007f2c3afe07c8 <__write_nocancel+15>: retq 0x00007f2c3afe07c9 <write+25>: sub $0x28,%rsp 0x00007f2c3afe07cd <write+29>: mov %rdi,0x8(%rsp) 0x00007f2c3afe07d2 <write+34>: mov %rsi,0x10(%rsp) 0x00007f2c3afe07d7 <write+39>: mov %rdx,0x18(%rsp) 0x00007f2c3afe07dc <write+44>: callq 0x7f2c3afe0380 <__pthread_enable_asynccancel> 0x00007f2c3afe07e1 <write+49>: mov 0x8(%rsp),%rdi 0x00007f2c3afe07e6 <write+54>: mov 0x10(%rsp),%rsi 0x00007f2c3afe07eb <write+59>: mov 0x18(%rsp),%rdx 0x00007f2c3afe07f0 <write+64>: mov %rax,(%rsp) 0x00007f2c3afe07f4 <write+68>: mov $0x1,%eax 0x00007f2c3afe07f9 <write+73>: syscall 0x00007f2c3afe07fb <write+75>: mov (%rsp),%rdi 0x00007f2c3afe07ff <write+79>: mov %rax,0x8(%rsp) 0x00007f2c3afe0804 <write+84>: callq 0x7f2c3afe0350 <__pthread_disable_asynccancel> 0x00007f2c3afe0809 <write+89>: mov 0x8(%rsp),%rax 0x00007f2c3afe080e <write+94>: add $0x28,%rsp 0x00007f2c3afe0812 <write+98>: cmp $0xfffffffffffff001,%rax 0x00007f2c3afe0818 <write+104>: jae 0x7f2c3afe081b <write+107> 0x00007f2c3afe081a <write+106>: retq 0x00007f2c3afe081b <write+107>: mov 0x20977e(%rip),%rcx # 0x7f2c3b1e9fa0 0x00007f2c3afe0822 <write+114>: xor %edx,%edx 0x00007f2c3afe0824 <write+116>: sub %rax,%rdx 0x00007f2c3afe0827 <write+119>: mov %edx,%fs:(%rcx) 0x00007f2c3afe082a <write+122>: or $0xffffffffffffffff,%rax 0x00007f2c3afe082e <write+126>: jmp 0x7f2c3afe081a <write+106> End of assembler dump. (gdb) break *0x00007f2c3afe07f4 if $rdi == 6 && $rdx == 590 Breakpoint 1 at 0x7f2c3afe07f4 (gdb) info break Num Type Disp Enb Address What 1 breakpoint keep y 0x00007f2c3afe07f4 <write+68> stop only if $rdi == 6 && $rdx == 590 ==================Here the breakpoint fires, but the bad writes do not occur when we perform the syscall. ================== Breakpoint 1, 0x00007f2c3afe07f4 in write () from /lib64/libpthread.so.0 (gdb) bt #0 0x00007f2c3afe07f4 in write () from /lib64/libpthread.so.0 #1 0x000000000040746d in tap_receive (opaque=0xa940f0, buf=0x7fff43a7f080 "������", size=<value optimized out>) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:4079 #2 0x0000000000407081 in qemu_send_packet (vc1=0xaf99d0, buf=0x7fff43a7f080 "������", size=590) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:3865 #3 0x00000000004235a0 in rtl8139_transmit_one (s=0xaf9780, descriptor=<value optimized out>) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/hw/rtl8139.c:1810 #4 0x0000000000424ae7 in rtl8139_io_writel (opaque=0x6, addr=<value optimized out>, val=524878) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/hw/rtl8139.c:2367 #5 0x00000000004726a3 in cpu_physical_memory_rw (_addr=<value optimized out>, buf=0x7f2c3ba7e018 <Address 0x7f2c3ba7e018 out of bounds>, _len=<value optimized out>, is_write=1) at exec-dm.c:510 #6 0x00000000004a4703 in cpu_ioreq_move (env=<value optimized out>, req=0x7f2c3ba7e000) at helper2.c:319 #7 0x00000000004a4fb1 in cpu_handle_ioreq (opaque=<value optimized out>) at helper2.c:508 #8 0x000000000040aeb4 in main_loop_wait (timeout=10) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:7175 #9 0x00000000004a4dbd in main_loop () at helper2.c:571 #10 0x000000000040feb3 in main (argc=20, argv=0x7fff43a82968) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:8950 =======================Here is the data buffer that is being written by rtl8139_transmit(): Notice that it looks like an ethernet frame! But it is being written down the sysfs pci configuration space descriptor! ======================= (gdb) x/590xb 0x7fff43a7f080 0x7fff43a7f080: 0xff 0xff 0xff 0xff 0xff 0xff 0x00 0x16 0x7fff43a7f088: 0x3e 0x32 0xd5 0x64 0x08 0x00 0x45 0x00 0x7fff43a7f090: 0x02 0x40 0x91 0x0d 0x00 0x00 0x40 0x11 0x7fff43a7f098: 0xe7 0xa0 0x00 0x00 0x00 0x00 0xff 0xff 0x7fff43a7f0a0: 0xff 0xff 0x00 0x44 0x00 0x43 0x02 0x2c 0x7fff43a7f0a8: 0xc4 0xb7 0x01 0x01 0x06 0x00 0xae 0xb8 0x7fff43a7f0b0: 0x09 0x63 0x00 0x0a 0x00 0x00 0x00 0x00 0x7fff43a7f0b8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x7fff43a7f0c0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x16 0x7fff43a7f0c8: 0x3e 0x32 0xd5 0x64 0x00 0x00 0x00 0x00 0x7fff43a7f0d0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x7fff43a7f0d8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x7fff43a7f0e0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x7fff43a7f0e8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x7fff43a7f0f0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x7fff43a7f0f8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 ================================So we load and unload my driver in the domU. After several seconds the breakpoint hits again and the bad writes actually happen when we do the write() system call =============================== Breakpoint 1, 0x00007f2c3afe07f4 in write () from /lib64/libpthread.so.0 (gdb) info regi rax 0x0 0 rbx 0x24e 590 rcx 0x0 0 rdx 0x24e 590 rsi 0x7fff43a7f080 140734328467584 rdi 0x6 6 rbp 0x7fff43a7f080 0x7fff43a7f080 rsp 0x7fff43a7f000 0x7fff43a7f000 r8 0x6 6 r9 0x43e 1086 r10 0x7fff43a811d0 140734328476112 r11 0x202 514 r12 0xa940f0 11092208 r13 0x7fff43a7f080 140734328467584 r14 0x7f2c3ba7e018 139827956146200 r15 0x1 1 rip 0x7f2c3afe07f4 0x7f2c3afe07f4 <write+68> eflags 0x202 [ IF ] cs 0x33 51 ss 0xe02b 57387 ds 0x0 0 es 0x0 0 fs 0x0 0 gs 0x0 0 fctrl 0x37f 895 fstat 0x0 0 ftag 0xffff 65535 fiseg 0x0 0 fioff 0x0 0 foseg 0x0 0 fooff 0x0 0 fop 0x0 0 mxcsr 0x1f80 [ IM DM ZM OM UM PM ] (gdb) x/10i $rip 0x7f2c3afe07f4 <write+68>: mov $0x1,%eax 0x7f2c3afe07f9 <write+73>: syscall 0x7f2c3afe07fb <write+75>: mov (%rsp),%rdi 0x7f2c3afe07ff <write+79>: mov %rax,0x8(%rsp) 0x7f2c3afe0804 <write+84>: callq 0x7f2c3afe0350 <__pthread_disable_asynccancel> 0x7f2c3afe0809 <write+89>: mov 0x8(%rsp),%rax 0x7f2c3afe080e <write+94>: add $0x28,%rsp 0x7f2c3afe0812 <write+98>: cmp $0xfffffffffffff001,%rax 0x7f2c3afe0818 <write+104>: jae 0x7f2c3afe081b <write+107> 0x7f2c3afe081a <write+106>: retq (gdb) x/20xb 0x7fff43a7f080 0x7fff43a7f080: 0xff 0xff 0xff 0xff 0xff 0xff 0x00 0x16 0x7fff43a7f088: 0x3e 0x32 0xd5 0x64 0x08 0x00 0x45 0x00 0x7fff43a7f090: 0x02 0x40 0x91 0x0f ========================Now do the system call... ========================(gdb) si Breakpoint 2, 0x00007f2c3afe07f9 in write () from /lib64/libpthread.so.0 (gdb) si 0x00007f2c3afe07fb in write () from /lib64/libpthread.so.0 ===================================Here the problem occurs.. I see the PCI BAR space being overwritten on the XEN console. The bad BAR registers exactly match the data. =================================== (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080000 offset=0x0 bytes=4 val ue=0xffffffff (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080004 offset=0x0 bytes=4 val ue=0x1600ffff (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080008 offset=0x0 bytes=4 val ue=0x64d5323e (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008000c offset=0x0 bytes=4 val ue=0x450008 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080010 offset=0x0 bytes=4 val ue=0xf914002 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080014 offset=0x0 bytes=4 val ue=0x11400000 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080018 offset=0x0 bytes=4 val ue=0x9ee7 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008001c offset=0x0 bytes=4 val ue=0xffff0000 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080020 offset=0x0 bytes=4 val ue=0x4400ffff (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080024 offset=0x0 bytes=4 val ue=0x2c024300 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080028 offset=0x0 bytes=4 val ue=0x101b7c4 (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008002c offset=0x0 bytes=4 val ue=0xb8ae0006 =======================so let''s go see what we can see.... =======================(gdb) bt #0 0x00007f2c3afe07fb in write () from /lib64/libpthread.so.0 #1 0x000000000040746d in tap_receive (opaque=0xa940f0, buf=0x7fff43a7f080 "������", size=<value optimized out>) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:4079 #2 0x0000000000407081 in qemu_send_packet (vc1=0xaf99d0, buf=0x7fff43a7f080 "������", size=590) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:3865 #3 0x00000000004235a0 in rtl8139_transmit_one (s=0xaf9780, descriptor=<value optimized out>) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/hw/rtl8139.c:1810 #4 0x0000000000424ae7 in rtl8139_io_writel (opaque=0x6, addr=<value optimized out>, val=524878) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/hw/rtl8139.c:2367 #5 0x00000000004726a3 in cpu_physical_memory_rw (_addr=<value optimized out>, buf=0x7f2c3ba7e018 <Address 0x7f2c3ba7e018 out of bounds>, _len=<value optimized out>, is_write=1) at exec-dm.c:510 #6 0x00000000004a4703 in cpu_ioreq_move (env=<value optimized out>, req=0x7f2c3ba7e000) at helper2.c:319 #7 0x00000000004a4fb1 in cpu_handle_ioreq (opaque=<value optimized out>) at helper2.c:508 #8 0x000000000040aeb4 in main_loop_wait (timeout=10) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:7175 #9 0x00000000004a4dbd in main_loop () at helper2.c:571 #10 0x000000000040feb3 in main (argc=20, argv=0x7fff43a82968) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:8950 (gdb) (gdb) up #1 0x000000000040746d in tap_receive (opaque=0xa940f0, buf=0x7fff43a7f080 "������", size=<value optimized out>) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:4079 4079 ret = write(s->fd, buf, size); /* No error handling ?! */ (gdb) print *((TAPState*) opaque) $2 = {vc = 0xa94910, fd = 6, down_script = "/etc/xen/qemu-ifdown", ''\0'' <repeats 1003 times>, script_arg = "br0", ''\0'' <repeats 1020 times>, next = 0x0} (gdb) print *(VLANClientState*) 0xa94910 $3 = {fd_read = 0x407450 <tap_receive>, fd_can_read = 0, opaque = 0xa940f0, next = 0xaf99d0, vlan = 0xa940a0, info_str = "tap: ifname=tap1.0 setup_script=/etc/xen/qemu-ifup", ''\0'' <repeats 205 times>} (gdb) print *(VLANState*)0xa940a0 $4 = {id = 1, first_client = 0xa94910, next = 0x0, nb_guest_devs = 1, nb_host_devs = 1} (gdb) print *(VLANClientState*) 0xaf99d0 $5 = {fd_read = 0x4235e0 <rtl8139_receive>, fd_can_read = 0x422770 <rtl8139_can_receive>, opaque = 0xaf9780, next = 0x0, vlan = 0xa940a0, info_str = "rtl8139 pci macaddr=00:16:3e:32:d5:64", ''\0'' <repeats 218 times>} (gdb) up #4 0x0000000000424ae7 in rtl8139_io_writel (opaque=0x6, addr=<value optimized out>, val=524878) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/hw/rtl8139.c:2367 2367 if (rtl8139_transmit_one(s, descriptor)) (gdb) up #5 0x00000000004726a3 in cpu_physical_memory_rw (_addr=<value optimized out>, buf=0x7f2c3ba7e018 <Address 0x7f2c3ba7e018 out of bounds>, _len=<value optimized out>, is_write=1) at exec-dm.c:510 510 io_mem_write[io_index][2](io_mem_opaque[io_index], addr, val); ===============================================Here are the open files at the time.. Notice that fd 6 is open read only.. I don''t understand how this write() is working.... =============================================== sunblade1:root:bash 2047 => lsof -p 5820 <snip> qemu-dm 5820 root 0r CHR 1,3 0t0 3278 /dev/null qemu-dm 5820 root 1w REG 253,5 335301 606514 /var/log/xen/qemu-dm-node1.log qemu-dm 5820 root 2w REG 253,5 335301 606514 /var/log/xen/qemu-dm-node1.log qemu-dm 5820 root 3r DIR 253,5 4096 196635 /etc/init.d/rc3.d qemu-dm 5820 root 4r FIFO 0,6 0t0 13302 pipe qemu-dm 5820 root 5w FIFO 0,6 0t0 13302 pipe qemu-dm 5820 root 6r REG 0,0 256 4132 /sys/devices/pci0000:00/0000:00:03.0/0000:07:00.0/0000:08:00.0/config qemu-dm 5820 root 7u REG 0,3 0 4026538410 /proc/xen/privcmd qemu-dm 5820 root 8u unix 0xffff8803cd8e6e80 0t0 13325 socket qemu-dm 5820 root 9u BLK 8,6 0t512 1226 /dev/sda6 qemu-dm 5820 root 10u IPv4 13327 0t0 TCP localhost:5901 (LISTEN) qemu-dm 5820 root 11u CHR 5,2 0t0 3280 /dev/ptmx qemu-dm 5820 root 12u CHR 136,2 0t0 4 /dev/pts/2 qemu-dm 5820 root 13w REG 253,5 881440 606516 /var/log/xen/xend.log qemu-dm 5820 root 14u FIFO 253,5 0t0 461914 /var/run/tap/qemu-read-1 qemu-dm 5820 root 15u FIFO 253,5 0t0 461935 /var/run/tap/qemu-write-1 qemu-dm 5820 root 16u CHR 10,63 0t0 4375 /dev/xen/evtchn qemu-dm 5820 root 18r REG 0,0 256 4080 /sys/devices/pci0000:00/0000:00:14.3/config qemu-dm 5820 root 19r REG 0,0 256 4080 /sys/devices/pci0000:00/0000:00:14.3/config qemu-dm 5820 root 20r REG 0,0 256 4123 /sys/devices/pci0000:00/0000:00:1f.3/config qemu-dm 5820 root 21r REG 0,0 256 4123 /sys/devices/pci0000:00/0000:00:1f.3/config qemu-dm 5820 root 22r FIFO 0,6 0t0 13336 pipe qemu-dm 5820 root 23w FIFO 0,6 0t0 13336 pipe qemu-dm 5820 root 28u FIFO 253,5 0t0 461937 /var/run/xend/dm-1-1268876464.fifo =====================================So, I don''t get it... There seem to be several questions here: 1) Why is the rtl driver trying to write into my board''s sysfs configuration space file? Is it possible that it''s just getting the wrong path in sysfs somehow? It seems like it should probably be opening a socket or something. 2) If the file descriptor is open read only, how is the write working? So I changed the configuration file to only pass through one board: Here is the configuration file: ======================================sunblade1:root:bash 2061 => cat node2.conf import os, re arch = os.uname()[4] if re.search(''64'', arch): arch_libdir = ''lib64'' else: arch_libdir = ''lib'' kernel = "/usr/lib/xen/boot/hvmloader" builder = ''hvm'' #memory = 512 memory = 2048 name = "node2" vcpus = 8 pae = 1 acpi = 1 apic = 1 vif =[''mac=00:16:3e:32:d5:64, bridge=br0, model=rtl8139''] disk =[''phy:sda6,hda,w''] #disk =[''phy:sda6,hda,w'', ''phy:/dev/loop0,hdc,r''] on_poweroff = ''destroy'' on_reboot = ''restart'' on_crash = ''restart'' device_model = ''/usr/'' + arch_libdir + ''/xen/bin/qemu-dm'' boot = "cd" snapshot = 0 sdl = 0 opengl = 1 vnc = 1 vnclisten = "127.0.0.1" vncdisplay = 1 vncunused = 0 vncpasswd = '''' nographic = 0 stdvga = 0 serial = ''pty'' #pci = [''08:00.0''] pci = [''0e:00.0''] #pci = [''08:00.0'', ''0e:00.0''] #pci = [''0e:00.0'', ''08:00.0''] #usb = 1 #usbdevice = ''mouse'' ===================================Which only differs from the original configuration file in the devices that it exports: ===================================sunblade1:root:bash 2050 => diff node1.conf node2.conf 13c13 < name = "node1" ---> name = "node2"38c38 < #pci = [''0e:00.0''] ---> pci = [''0e:00.0'']40c40 < pci = [''0e:00.0'', ''08:00.0''] ---> #pci = [''0e:00.0'', ''08:00.0'']===================================After starting this domU, qemu doesn''t have fd 6 open any more... =================================== sunblade1:root:bash 2047 => lsof -p 5843 <snip> qemu-dm 5843 root 0r CHR 1,3 0t0 3278 /dev/null qemu-dm 5843 root 1w REG 253,5 304274 606587 /var/log/xen/qemu-dm-node2.log qemu-dm 5843 root 2w REG 253,5 304274 606587 /var/log/xen/qemu-dm-node2.log qemu-dm 5843 root 3r DIR 253,5 4096 196635 /etc/init.d/rc3.d qemu-dm 5843 root 4r FIFO 0,6 0t0 13281 pipe qemu-dm 5843 root 5w FIFO 0,6 0t0 13281 pipe qemu-dm 5843 root 7u REG 0,3 0 4026538410 /proc/xen/privcmd qemu-dm 5843 root 8u unix 0xffff8803cc809400 0t0 13313 socket qemu-dm 5843 root 9u BLK 8,6 0t512 1281 /dev/sda6 qemu-dm 5843 root 10u IPv4 13315 0t0 TCP localhost:5901 (LISTEN) qemu-dm 5843 root 11u CHR 5,2 0t0 3280 /dev/ptmx qemu-dm 5843 root 12u CHR 136,2 0t0 4 /dev/pts/2 qemu-dm 5843 root 13w REG 253,5 906611 606516 /var/log/xen/xend.log qemu-dm 5843 root 14u FIFO 253,5 0t0 461914 /var/run/tap/qemu-read-1 qemu-dm 5843 root 15u FIFO 253,5 0t0 461935 /var/run/tap/qemu-write-1 qemu-dm 5843 root 16u CHR 10,63 0t0 4370 /dev/xen/evtchn qemu-dm 5843 root 17u REG 0,0 256 4139 /sys/devices/pci0000:00/0000:00:05.0/0000:0d:00.0/0000:0e:00.0/config qemu-dm 5843 root 18r REG 0,0 256 4080 /sys/devices/pci0000:00/0000:00:14.3/config qemu-dm 5843 root 19r REG 0,0 256 4080 /sys/devices/pci0000:00/0000:00:14.3/config qemu-dm 5843 root 20r REG 0,0 256 4123 /sys/devices/pci0000:00/0000:00:1f.3/config qemu-dm 5843 root 21r REG 0,0 256 4123 /sys/devices/pci0000:00/0000:00:1f.3/config qemu-dm 5843 root 22r FIFO 0,6 0t0 13324 pipe qemu-dm 5843 root 23w FIFO 0,6 0t0 13324 pipe qemu-dm 5843 root 28u FIFO 253,5 0t0 461937 /var/run/xend/dm-1-1268880243.fifo =====================================I then started the domU and redebugged qemu-dm with the same break point: ===================================== sunblade1:root:bash 2055 => gdb -p 5843 GNU gdb (GDB; SUSE Linux Enterprise 11) 6.8.50.20081120-cvs Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-suse-linux". For bug reporting instructions, please see: <http://bugs.opensuse.org/>. Attaching to process 5843 <snip> (gdb) b *0x00007fbce8b0c7f4 if $rdi == 6 && $rdx == 590 Breakpoint 2 at 0x7fbce8b0c7f4 (gdb) c Continuing. Breakpoint 2, 0x00007fbce8b0c7f4 in write () from /lib64/libpthread.so.0 (gdb) bt #0 0x00007fbce8b0c7f4 in write () from /lib64/libpthread.so.0 #1 0x000000000040746d in tap_receive (opaque=0xa940f0, buf=0x7ffff15a98b0 "������", size=<value optimized out>) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:4079 #2 0x0000000000407081 in qemu_send_packet (vc1=0xaf93b0, buf=0x7ffff15a98b0 "������", size=590) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:3865 #3 0x00000000004235a0 in rtl8139_transmit_one (s=0xaf9190, descriptor=<value optimized out>) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/hw/rtl8139.c:1810 #4 0x0000000000424ae7 in rtl8139_io_writel (opaque=0x6, addr=<value optimized out>, val=524878) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/hw/rtl8139.c:2367 #5 0x00000000004726a3 in cpu_physical_memory_rw (_addr=<value optimized out>, buf=0x7fbce95aa018 <Address 0x7fbce95aa018 out of bounds>, _len=<value optimized out>, is_write=1) at exec-dm.c:510 #6 0x00000000004a4703 in cpu_ioreq_move (env=<value optimized out>, req=0x7fbce95aa000) at helper2.c:319 #7 0x00000000004a4fb1 in cpu_handle_ioreq (opaque=<value optimized out>) at helper2.c:508 #8 0x000000000040aeb4 in main_loop_wait (timeout=10) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:7175 #9 0x00000000004a4dbd in main_loop () at helper2.c:571 #10 0x000000000040feb3 in main (argc=20, argv=0x7ffff15ad198) at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:8950 (gdb) info regi rax 0x0 0 rbx 0x24e 590 rcx 0x0 0 rdx 0x24e 590 rsi 0x7ffff15a98b0 140737242634416 rdi 0x6 6 rbp 0x7ffff15a98b0 0x7ffff15a98b0 rsp 0x7ffff15a9830 0x7ffff15a9830 r8 0x6 6 r9 0x1d6 470 r10 0x7ffff15aba00 140737242642944 r11 0x206 518 r12 0xa940f0 11092208 r13 0x7ffff15a98b0 140737242634416 r14 0x7fbce95aa018 140449345609752 r15 0x1 1 rip 0x7fbce8b0c7f4 0x7fbce8b0c7f4 <write+68> eflags 0x206 [ PF IF ] cs 0xe033 57395 ss 0xe02b 57387 ds 0x0 0 es 0x0 0 fs 0x0 0 gs 0x0 0 fctrl 0x37f 895 fstat 0x0 0 ftag 0xffff 65535 fiseg 0x0 0 fioff 0x0 0 foseg 0x0 0 fooff 0x0 0 fop 0x0 0 mxcsr 0x1f80 [ IM DM ZM OM UM PM ] ======================================Notice that it''s writing to fd 6! And fd 6 is not even open... Is this fd just uninitialized? That''s about as far as I got. Anyone have any clue as to what is going wrong here? Should I be posting this on the qemu mailing list? thanks, dan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Mar-18 18:10 UTC
Re: [Xen-devel] Re: PCI BAR register space written with garbage in HVM guest.
On Thu, Mar 18, 2010 at 03:27:30PM -0300, Dan Gora wrote:> On Thu, Mar 18, 2010 at 11:32 AM, Konrad Rzeszutek Wilk > <konrad.wilk@oracle.com> wrote: > > On Wed, Mar 17, 2010 at 11:14:13PM -0300, Dan Gora wrote: > >> Hi all, > >> > >> I apologize for the length of this message, but I figured that it was > >> probably better to supply too much info rather than too little... > >> I''ve made some good progress debugging this today. The problem looks > >> pretty bizarre... > > > > I just skimmed through your e-mail and didn''t dive in the r8169 code to > > see if there is something there, but I thought you might want to try > > using e1000 instead of r8169 driver and see if that makes it work? > > Sure enough that fixed the problem! No more writes on fd 6, no more > PCI BAR regsiter corruption! > That''s good enough for me as a workaround. I guess someone else can > have a look at this rtl8139 driver.That is not what I expected, but I am glad to hear that it makes it possible for you to continue with your project. It is good to know that the failure path is contained in the r8169 implementation. If it wouldn''t be too much trouble, could you send an e-mail to the QEMU mailing list this thread and your solution in case there is somebody there itching to work on this. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Dan Gora
2010-Mar-18 18:27 UTC
Re: [Xen-devel] Re: PCI BAR register space written with garbage in HVM guest.
On Thu, Mar 18, 2010 at 11:32 AM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:> On Wed, Mar 17, 2010 at 11:14:13PM -0300, Dan Gora wrote: >> Hi all, >> >> I apologize for the length of this message, but I figured that it was >> probably better to supply too much info rather than too little... >> I''ve made some good progress debugging this today. The problem looks >> pretty bizarre... > > I just skimmed through your e-mail and didn''t dive in the r8169 code to > see if there is something there, but I thought you might want to try > using e1000 instead of r8169 driver and see if that makes it work?Sure enough that fixed the problem! No more writes on fd 6, no more PCI BAR regsiter corruption! That''s good enough for me as a workaround. I guess someone else can have a look at this rtl8139 driver. thanks! dan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Dan Gora
2010-Mar-18 19:56 UTC
Re: [Xen-devel] Re: PCI BAR register space written with garbage in HVM guest.
On Thu, Mar 18, 2010 at 3:10 PM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:> That is not what I expected, but I am glad to hear that it > makes it possible for you to continue with your project. >yeah me either.. :)> It is good to know that the failure path is contained in the r8169 > implementation. If it wouldn''t be too much trouble, could you send an > e-mail to the QEMU mailing list this thread and your solution in > case there is somebody there itching to work on this.done, but there doesn''t seem to be much uptake there.. thanks dan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel