I'm experiencing total grant table corruption on a system and I'm hoping my symptoms will ring a bell with a member of the Xen developer community. The setup is Xen 4.0.0-RC2 (OpenSUSE 11.2 package) on a Nehalem system. The sole guest instance is 64bit FreeBSD running in HVM mode, a single vcpu, and a PCI passed-through LSI Logic 1068e SAS controller. FreeBSD is running netfront and blockfront PV drivers. After a few hours of operation, FreeBSD's entire grant table (3 pages) is spammed with the pattern 0x5a5a5a5a. This problem has been replicated on multiple machines.

The first assumption was a bug in the FreeBSD PV drivers or other Xen support. To rule this out, we modified FreeBSD's grant table functions to unmap the grant table from the kernel virtual address space between operations. This was on a single vcpu setup, but to rule out corruption by interrupt handlers, interrupts were also disabled while the mapping was valid during grant table updates. The corruption still occurs without the FreeBSD kernel faulting on unmapped pages. I believe this leaves a VT-D HW problem or a bug in the hypervisor as the remaining possibilities. I'm working now to further isolate the error by changing our test load so I can remove VT-D from the configuration.

Are there any Xen or QEMU components that use a 0x5a5a5a5a initialization pattern? Are there any tools in the hypervisor I can use to trap rogue access to guest grant table pages?

Thanks,
Justin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
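The symptom described above (every word of the 3 grant table pages overwritten with 0x5a5a5a5a) can be detected with a simple scan. The following is only a minimal sketch in C, not code from FreeBSD or Xen: the function names `count_junk_words` and `table_fully_junked` are hypothetical, and a 4 KiB page size is assumed. A check like this could be run periodically against a copy of the table to narrow down when the corruption happens.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BYTES 4096u          /* assumption: 4 KiB pages */
#define JUNK_WORD  0x5a5a5a5au    /* pattern reported in this thread */

/* Hypothetical helper: count the 32-bit words in one page that
 * equal the junk pattern. */
size_t count_junk_words(const void *page)
{
    size_t hits = 0;

    assert(page != NULL);
    for (size_t off = 0; off < PAGE_BYTES; off += sizeof(uint32_t)) {
        uint32_t word;

        /* memcpy avoids alignment and aliasing problems. */
        memcpy(&word, (const unsigned char *)page + off, sizeof(word));
        if (word == JUNK_WORD)
            hits++;
    }
    return hits;
}

/* Hypothetical helper: return 1 if every word of every page in the
 * table holds the junk pattern, 0 otherwise. */
int table_fully_junked(const void *table, size_t npages)
{
    for (size_t i = 0; i < npages; i++) {
        const unsigned char *page =
            (const unsigned char *)table + i * PAGE_BYTES;

        if (count_junk_words(page) != PAGE_BYTES / sizeof(uint32_t))
            return 0;
    }
    return 1;
}
```

Counting per page, rather than returning a single yes/no, also shows whether the overwrite covers whole pages or only part of one, which may help distinguish a page-granular remapping problem from a stray write.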
Pasi Kärkkäinen
2010-Mar-14 19:15 UTC
Re: [Xen-devel] Grant table corruption with HVM guest
On Sun, Mar 14, 2010 at 01:10:38PM -0600, Justin T. Gibbs wrote:
> I'm experiencing total grant table corruption on a system and I'm
> hoping my symptoms will ring a bell with a member of the Xen developer
> community. The setup is Xen 4.0.0-RC2 (OpenSUSE 11.2 package) on
> a Nehalem system.
>

You should definitely upgrade to -rc6 or newer, since rc6 fixes a dom0 memory corruption bug. I'm not sure whether it's the same bug you're seeing.

-- Pasi

> The sole guest instance is 64bit FreeBSD running
> in HVM mode, a single vcpu, and a PCI passed-through LSI Logic 1068e
> SAS controller. FreeBSD is running netfront and blockfront PV
> drivers. After a few hours of operation, FreeBSD's entire grant
> table (3 pages) is spammed with the pattern 0x5a5a5a5a. This problem
> has been replicated on multiple machines.
>
> The first assumption was a bug in the FreeBSD PV drivers or other Xen
> support. To rule this out, we modified FreeBSD's grant table functions
> to unmap the grant table from the kernel virtual address space between
> operations. This was on a single vcpu setup, but to rule out corruption
> by interrupt handlers, interrupts were also disabled while the mapping was
> valid during grant table updates. The corruption still occurs without the
> FreeBSD kernel faulting on unmapped pages. I believe this leaves a VT-D
> HW problem or a bug in the hypervisor as the remaining possibilities.
> I'm working now to further isolate the error by changing our test load
> so I can remove VT-D from the configuration.
>
> Are there any Xen or QEMU components that use a 0x5a5a5a5a initialization
> pattern? Are there any tools in the hypervisor I can use to trap rogue
> access to guest grant table pages?