If this is not the right list, I apologize in advance, but I am a kernel developer, and I'm not afraid of getting my hands dirty, so...

Like a number of others, I want to play with VT-d and graphics passthrough. But first I need to get Xen up and running with a dom0 and an operational X server. I've been trying various combinations, looking for some way to get a working dom0. The hardware is an ASUS P6T7 with two Nvidia GTX 295 graphics cards. Each card occupies two PCI slots and has two GPUs plus quite a bit of memory.

I keep getting the following X server crash when I boot Xen and a dom0 kernel.

> [54149.822514] X: Corrupted page table at address b61a1000
> [54149.822519] *pdpt = 0000000027690001
> [54149.822525] Bad pagetable: 000f [#1] SMP
> [54149.822531] last sysfs file: /sys/devices/pci0000:00/0000:00:1c.5/0000:10:00.0/resource
> [54149.822535] Modules linked in: nvidia(P) [last unloaded: nvidia]
> [54149.822544]
> [54149.822548] Pid: 7148, comm: X Tainted: P (2.6.30-rc3-tip #2) System Product Name
> [54149.822552] EIP: 0073:[<b622da45>] EFLAGS: 00210202 CPU: 0
> [54149.822555] EIP is at 0xb622da45
> [54149.822558] EAX: b61a1000 EBX: b6580950 ECX: 0979a410 EDX: 00000001
> [54149.822561] ESI: 00000020 EDI: 097961e0 EBP: 0979a410 ESP: bfd3cec8
> [54149.822565] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
> [54149.822568] Process X (pid: 7148, ti=e0c38000 task=e1615d60 task.ti=e0c38000)
> [54149.822571]
> [54149.822574] EIP: [<b622da45>] 0xb622da45 SS:ESP 007b:bfd3cec8
> [54149.822583] ---[ end trace 410a0a71e695876c ]---

Running strace on X, I see that it's crashing while doing a series of mmap2 calls. The other odd thing is that if I cat the card files (/proc/driver/nvidia/cards/0), the video BIOS is reported as ??:??:??:??. It looks like either the Nvidia driver is poking around the system using non-standard interfaces, or the PCI interface presented for the cards is somehow broken. Turning VT-d support on or off in the BIOS makes no difference, and if I boot the system without the Xen hypervisor, the video BIOS is reported correctly and the driver works just fine.

I've compared the lspci output with and without the Xen hypervisor, and there are a number of discrepancies, notably in the capabilities listed for the graphics cards. Capability 78 is misreported as Express (v1) instead of Express (v2). I'm not sure what to make of this.

Unfortunately, I don't have any older Nvidia cards on hand to see if they behave differently.

I've tried 3.4.1 and the latest xen-unstable. I've tried the pv-ops git tree, as I really need a 2.6.31 dom0 kernel for other reasons, but for the moment I'd just like to get to the point where Xen, dom0, and Nvidia play nicely with one another, so I can move on to working on VT-d and graphics passthrough.

Any suggestions?

---Michael J Coss
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
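One way to check the Express (v1) versus Express (v2) discrepancy without going through lspci is to walk the capability list in the raw PCI configuration space and read the version field directly. The following is a minimal Python sketch, not a definitive tool. It assumes the standard config-space layout (status register at 0x06, capabilities pointer at 0x34, PCI Express capability ID 0x10, version in bits 3:0 of the register at capability offset + 2). The function names and the `read_config` helper are hypothetical, introduced only for illustration.

```python
PCI_STATUS = 0x06              # status register (16 bits)
PCI_STATUS_CAP_LIST = 0x10     # bit 4: device implements a capability list
PCI_CAPABILITY_POINTER = 0x34  # offset of the first capability
PCI_CAP_ID_EXP = 0x10          # capability ID for PCI Express


def walk_capabilities(config):
    """Yield (offset, cap_id) for each entry in the standard capability list."""
    if len(config) < 64:
        return
    status = int.from_bytes(config[PCI_STATUS:PCI_STATUS + 2], "little")
    if not status & PCI_STATUS_CAP_LIST:
        return
    offset = config[PCI_CAPABILITY_POINTER] & 0xFC
    seen = set()  # guard against a corrupted, looping list
    while offset >= 0x40 and offset + 1 < len(config) and offset not in seen:
        seen.add(offset)
        yield offset, config[offset]
        offset = config[offset + 1] & 0xFC


def pcie_capability_version(config):
    """Return (offset, version) of the PCI Express capability, or None."""
    for offset, cap_id in walk_capabilities(config):
        if cap_id == PCI_CAP_ID_EXP and offset + 3 < len(config):
            flags = int.from_bytes(config[offset + 2:offset + 4], "little")
            return offset, flags & 0xF  # bits 3:0 hold the capability version
    return None


def read_config(bdf):
    """Hypothetical helper: read raw config space for a device from sysfs."""
    with open("/sys/bus/pci/devices/%s/config" % bdf, "rb") as f:
        return f.read()
```

Running `pcie_capability_version(read_config("0000:10:00.0"))` once under bare-metal Linux and once under Xen dom0 would show whether the raw register differs or only its presentation does. The address here is the one named in the oops and is assumed, not confirmed, to be one of the GPUs. Note that sysfs normally exposes only the first 64 bytes of config space to unprivileged users, so a capability at 0x78 will likely require root to read.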
On 09/09/2009 09:47, "Michael J Coss" <mjcoss@alcatel-lucent.com> wrote:

> I've tried 3.4.1 and the latest xen-unstable. I've tried the pv-ops
> git tree, as I really need a 2.6.31 dom0 kernel for other reasons, but
> for the moment I'd just like to get to the point where Xen, dom0, and
> Nvidia play nicely with one another, so I can move on to working on
> VT-d and graphics passthrough.
>
> Any suggestions?

Unless you really must have 2.6.30+, I'd recommend the 2.6.27 tree and patchqueue from http://xenbits.xensource.com/XCI. Otherwise you are likely to have to get your hands fairly dirty with pv_ops. For example, afaik starting an X server on pv_ops is still pretty ambitious on some systems.

-- Keir
On Wed, Sep 09, 2009 at 04:47:39AM -0400, Michael J Coss wrote:

> If this is not the right list, I apologize in advance, but I am a
> kernel developer, and I'm not afraid of getting my hands dirty, so...
>
> Like a number of others, I want to play with VT-d and graphics
> passthrough. But first I need to get Xen up and running with a dom0
> and an operational X server. I've been trying various combinations,
> looking for some way to get a working dom0. The hardware is an ASUS
> P6T7 with two Nvidia GTX 295 graphics cards. Each card occupies two
> PCI slots and has two GPUs plus quite a bit of memory.
>
> I keep getting the following X server crash when I boot Xen and a
> dom0 kernel.
>
> > [54149.822514] X: Corrupted page table at address b61a1000
> > [54149.822519] *pdpt = 0000000027690001
> > [54149.822525] Bad pagetable: 000f [#1] SMP
> > [54149.822531] last sysfs file: /sys/devices/pci0000:00/0000:00:1c.5/0000:10:00.0/resource
> > [54149.822535] Modules linked in: nvidia(P) [last unloaded: nvidia]
> > [54149.822544]
> > [54149.822548] Pid: 7148, comm: X Tainted: P (2.6.30-rc3-tip #2) System Product Name
> > [54149.822552] EIP: 0073:[<b622da45>] EFLAGS: 00210202 CPU: 0
> > [54149.822555] EIP is at 0xb622da45
> > [54149.822558] EAX: b61a1000 EBX: b6580950 ECX: 0979a410 EDX: 00000001
> > [54149.822561] ESI: 00000020 EDI: 097961e0 EBP: 0979a410 ESP: bfd3cec8
> > [54149.822565] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
> > [54149.822568] Process X (pid: 7148, ti=e0c38000 task=e1615d60 task.ti=e0c38000)
> > [54149.822571]
> > [54149.822574] EIP: [<b622da45>] 0xb622da45 SS:ESP 007b:bfd3cec8
> > [54149.822583] ---[ end trace 410a0a71e695876c ]---
>
> Running strace on X, I see that it's crashing while doing a series of
> mmap2 calls. The other odd thing is that if I cat the card files
> (/proc/driver/nvidia/cards/0), the video BIOS is reported as
> ??:??:??:??. It looks like either the Nvidia driver is poking around
> the system using non-standard interfaces, or the PCI interface
> presented for the cards is somehow broken. Turning VT-d support on or
> off in the BIOS makes no difference, and if I boot the system without
> the Xen hypervisor, the video BIOS is reported correctly and the
> driver works just fine.
>
> I've compared the lspci output with and without the Xen hypervisor,
> and there are a number of discrepancies, notably in the capabilities
> listed for the graphics cards. Capability 78 is misreported as
> Express (v1) instead of Express (v2). I'm not sure what to make of
> this.
>
> Unfortunately, I don't have any older Nvidia cards on hand to see if
> they behave differently.
>
> I've tried 3.4.1 and the latest xen-unstable. I've tried the pv-ops
> git tree, as I really need a 2.6.31 dom0 kernel for other reasons, but
> for the moment I'd just like to get to the point where Xen, dom0, and
> Nvidia play nicely with one another, so I can move on to working on
> VT-d and graphics passthrough.
>
> Any suggestions?

So I take it you're using the binary/proprietary Nvidia driver?

It is known to have problems with Xen dom0 kernels. I'm not sure of the latest status. I think some people have gotten it to work.

-- Pasi
Pasi Kärkkäinen wrote:

> So I take it you're using the binary/proprietary Nvidia driver?
>
> It is known to have problems with Xen dom0 kernels. I'm not sure of
> the latest status. I think some people have gotten it to work.
>
> -- Pasi

Yes, I'm trying to get the binary driver working. On another machine, at another location, I've gotten this to work with OpenSUSE, but there I was running an older kernel and an older version of Xen, on older hardware. So I know that it's doable under some set of conditions. To me, it appears that the GTX 295 card is the issue, as I suspect it uses a bit of a hack to straddle two PCI slots. If I had some other Nvidia cards at hand, I'd try one to see whether that is in fact the problem, or whether it's Nvidia specific.

---Michael J Coss
On 09/09/09 01:56, Keir Fraser wrote:

> On 09/09/2009 09:47, "Michael J Coss" <mjcoss@alcatel-lucent.com> wrote:
>
>> I've tried 3.4.1 and the latest xen-unstable. I've tried the pv-ops
>> git tree, as I really need a 2.6.31 dom0 kernel for other reasons, but
>> for the moment I'd just like to get to the point where Xen, dom0, and
>> Nvidia play nicely with one another, so I can move on to working on
>> VT-d and graphics passthrough.
>>
>> Any suggestions?
>
> Unless you really must have 2.6.30+, I'd recommend the 2.6.27 tree and
> patchqueue from http://xenbits.xensource.com/XCI. Otherwise you are likely
> to have to get your hands fairly dirty with pv_ops. For example, afaik
> starting an X server on pv_ops is still pretty ambitious on some systems.

Starting X in dom0 seems to work OK for Intel and ATI systems, at least; I expect most DRM drivers would work OK if they're well-behaved because we're hooking AGP memory accesses, etc. However, the proprietary Nvidia drivers are problematic, though I gather there are some patches floating around for them.

Unfortunately the AGP hooks are being removed (some years after Keir first added them, and just as they have a user according to their original intent) in favour of making each driver use the DMA API to do the appropriate phys<->bus conversions. So far, only the Intel driver has been converted, and only when Intel IOMMU is enabled. However, I didn't get any objection from the DRM folks about making it unconditional or adding it to new drivers as needed.

J
Jeremy Fitzhardinge wrote:

> On 09/09/09 01:56, Keir Fraser wrote:
>
>> On 09/09/2009 09:47, "Michael J Coss" <mjcoss@alcatel-lucent.com> wrote:
>>
>>> I've tried 3.4.1 and the latest xen-unstable. I've tried the pv-ops
>>> git tree, as I really need a 2.6.31 dom0 kernel for other reasons, but
>>> for the moment I'd just like to get to the point where Xen, dom0, and
>>> Nvidia play nicely with one another, so I can move on to working on
>>> VT-d and graphics passthrough.
>>>
>>> Any suggestions?
>>
>> Unless you really must have 2.6.30+, I'd recommend the 2.6.27 tree and
>> patchqueue from http://xenbits.xensource.com/XCI. Otherwise you are likely
>> to have to get your hands fairly dirty with pv_ops. For example, afaik
>> starting an X server on pv_ops is still pretty ambitious on some systems.
>
> Starting X in dom0 seems to work OK for Intel and ATI systems, at least;
> I expect most DRM drivers would work OK if they're well-behaved because
> we're hooking AGP memory accesses, etc. However, the proprietary Nvidia
> drivers are problematic, though I gather there are some patches floating
> around for them.
>
> Unfortunately the AGP hooks are being removed (some years after Keir
> first added them, and just as they have a user according to their
> original intent) in favour of making each driver use the DMA API to do
> the appropriate phys<->bus conversions. So far, only the Intel driver
> has been converted, and only when Intel IOMMU is enabled. However, I
> didn't get any objection from the DRM folks about making it
> unconditional or adding it to new drivers as needed.
>
> J

I suspected as much, although I don't understand the origin of the lspci discrepancies between booting with and without the hypervisor. It seems to me that there is some problem with Xen's view of the PCI bus, in addition to the Nvidia driver trying to access something outside of the hooked APIs.

The graphics cards in the system are the dual-GPU, dual-slot cards, and maybe this is contributing to the problem. I'm going to see about getting some other single-slot Nvidia card to see if the same issue happens. I may pick up some ATI cards as well.

---Michael J Coss
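The with/without-hypervisor lspci comparison discussed above can be narrowed to just the differing lines, which makes discrepancies such as the Express version easier to spot. This is a rough Python sketch using the standard `difflib` module. It assumes the two `lspci -vv` dumps have already been captured as text (one per boot), and `diff_lspci` is a hypothetical name chosen for illustration.

```python
import difflib


def diff_lspci(native, xen):
    """Return only the lines that differ between two lspci text dumps.

    Lines prefixed with '-' appear only in the native (no hypervisor) dump,
    lines prefixed with '+' appear only in the dump taken under Xen.
    """
    lines = list(difflib.unified_diff(
        native.splitlines(), xen.splitlines(),
        fromfile="native", tofile="xen", lineterm="", n=0))
    # The first two lines are the '---'/'+++' file headers; skip them,
    # then drop the '@@' hunk markers and keep the changed lines.
    return [line for line in lines[2:] if line.startswith(("+", "-"))]
```

Since lspci output also changes once a driver has bound to a device, it is probably worth capturing both dumps at the same point in the boot sequence (for example, before X starts) so that driver state is not mistaken for a hypervisor effect.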
Hi Michael,

I have found that the lspci output for a device changes once you have successfully loaded a driver against it.

In both cases (binding the card to pci-stub/pciback for use by a DomU, or starting X on Dom0), the NV card appears differently in the lspci output.

I would try the 2.6.18 kernel with the shipped "nv" X driver; I have found this combination to be the most compatible. In the xen-unstable tree, run "make linux-2.6-xen0-build" (from memory).

So far I have only had success with passing through the primary display adapter; I can't get my secondary to work. (I don't think it has anything to do with the models of card, more that secondary passthrough doesn't work.)

Pri: GTX260 (512MB + shared mem = 864MB)
Sec: 9500 GT (512MB)

The main complication that needs to be overcome is support for FLR. Without this PCIe capability, the GPU cannot be reset after the DomU has initialised it. I have to perform a hard reset of Dom0 for GPU passthrough to work a second time.

Tim

-----Original Message-----
From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Michael J Coss
Sent: 10 September 2009 10:46
To: xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] Nvidia, Xen, and Vt-d

Jeremy Fitzhardinge wrote:

> On 09/09/09 01:56, Keir Fraser wrote:
>
>> On 09/09/2009 09:47, "Michael J Coss" <mjcoss@alcatel-lucent.com> wrote:
>>
>>> I've tried 3.4.1 and the latest xen-unstable. I've tried the pv-ops
>>> git tree, as I really need a 2.6.31 dom0 kernel for other reasons, but
>>> for the moment I'd just like to get to the point where Xen, dom0, and
>>> Nvidia play nicely with one another, so I can move on to working on
>>> VT-d and graphics passthrough.
>>>
>>> Any suggestions?
>>
>> Unless you really must have 2.6.30+, I'd recommend the 2.6.27 tree and
>> patchqueue from http://xenbits.xensource.com/XCI. Otherwise you are likely
>> to have to get your hands fairly dirty with pv_ops. For example, afaik
>> starting an X server on pv_ops is still pretty ambitious on some systems.
>
> Starting X in dom0 seems to work OK for Intel and ATI systems, at least;
> I expect most DRM drivers would work OK if they're well-behaved because
> we're hooking AGP memory accesses, etc. However, the proprietary Nvidia
> drivers are problematic, though I gather there are some patches floating
> around for them.
>
> Unfortunately the AGP hooks are being removed (some years after Keir
> first added them, and just as they have a user according to their
> original intent) in favour of making each driver use the DMA API to do
> the appropriate phys<->bus conversions. So far, only the Intel driver
> has been converted, and only when Intel IOMMU is enabled. However, I
> didn't get any objection from the DRM folks about making it
> unconditional or adding it to new drivers as needed.
>
> J

I suspected as much, although I don't understand the origin of the lspci discrepancies between booting with and without the hypervisor. It seems to me that there is some problem with Xen's view of the PCI bus, in addition to the Nvidia driver trying to access something outside of the hooked APIs.

The graphics cards in the system are the dual-GPU, dual-slot cards, and maybe this is contributing to the problem. I'm going to see about getting some other single-slot Nvidia card to see if the same issue happens. I may pick up some ATI cards as well.

---Michael J Coss
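Whether a given GPU advertises FLR (Function Level Reset), the capability Tim describes as the main obstacle, can be read from the PCI Express Device Capabilities register. The sketch below is a minimal Python illustration under stated assumptions: it takes raw config-space bytes (for example, read as root from the device's sysfs `config` file), locates the PCI Express capability (ID 0x10), and tests bit 28 of the Device Capabilities register at capability offset + 4. The function names are hypothetical, and the sketch does not look at the separate PCI Advanced Features capability, which can also advertise FLR on some devices.

```python
PCI_CAP_ID_EXP = 0x10          # capability ID for PCI Express
PCI_EXP_DEVCAP = 0x04          # Device Capabilities, relative to the capability
PCI_EXP_DEVCAP_FLR = 1 << 28   # Function Level Reset Capability bit


def find_capability(config, cap_id):
    """Return the config-space offset of capability cap_id, or None."""
    # Bit 4 of the status register (offset 0x06) signals a capability list.
    if len(config) < 64 or not config[0x06] & 0x10:
        return None
    offset = config[0x34] & 0xFC   # capabilities pointer
    seen = set()                   # guard against a looping list
    while offset >= 0x40 and offset + 1 < len(config) and offset not in seen:
        if config[offset] == cap_id:
            return offset
        seen.add(offset)
        offset = config[offset + 1] & 0xFC
    return None


def supports_flr(config):
    """True if the PCIe Device Capabilities register advertises FLR."""
    cap = find_capability(config, PCI_CAP_ID_EXP)
    if cap is None or cap + PCI_EXP_DEVCAP + 4 > len(config):
        return False
    start = cap + PCI_EXP_DEVCAP
    devcap = int.from_bytes(config[start:start + 4], "little")
    return bool(devcap & PCI_EXP_DEVCAP_FLR)
```

A `False` result for a passed-through GPU would be consistent with the behaviour described above, where the card cannot be reset after the DomU has initialised it and a hard reset of Dom0 is needed before passthrough works again.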
Tim Moore wrote:

> Hi Michael,
>
> I have found that the lspci output for a device changes once you have successfully loaded a driver against it.
>
> In both cases (binding the card to pci-stub/pciback for use by a DomU, or starting X on Dom0), the NV card appears differently in the lspci output.
>
> I would try the 2.6.18 kernel with the shipped "nv" X driver; I have found this combination to be the most compatible. In the xen-unstable tree, run "make linux-2.6-xen0-build" (from memory).
>
> So far I have only had success with passing through the primary display adapter; I can't get my secondary to work. (I don't think it has anything to do with the models of card, more that secondary passthrough doesn't work.)
>
> Pri: GTX260 (512MB + shared mem = 864MB)
> Sec: 9500 GT (512MB)
>
> The main complication that needs to be overcome is support for FLR. Without this PCIe capability, the GPU cannot be reset after the DomU has initialised it. I have to perform a hard reset of Dom0 for GPU passthrough to work a second time.
>
> Tim

Just as a follow-up to this: before going back to 2.6.18, I decided to try the open source "nv" driver instead of the proprietary one. It started up the X server just fine, although performance for glxgears dropped through the floor: <100 fps on a GTX 295 with a 3.3GHz i7 processor. Ouch. Thanks for all the suggestions.

I'm going to try working with Nvidia to see if I can get them to help identify the problem, but I don't expect to get very far with that. Clearly, there is a problem with how they are accessing the card.

---Michael J Coss
On Mon, Sep 14, 2009 at 05:31:59PM -0400, Michael J Coss wrote:

> Tim Moore wrote:
> > Hi Michael,
> >
> > I have found that the lspci output for a device changes once you have
> > successfully loaded a driver against it.
> >
> > In both cases (binding the card to pci-stub/pciback for use by a DomU,
> > or starting X on Dom0), the NV card appears differently in the lspci
> > output.
> >
> > I would try the 2.6.18 kernel with the shipped "nv" X driver; I have
> > found this combination to be the most compatible. In the xen-unstable
> > tree, run "make linux-2.6-xen0-build" (from memory).
> >
> > So far I have only had success with passing through the primary display
> > adapter; I can't get my secondary to work. (I don't think it has
> > anything to do with the models of card, more that secondary passthrough
> > doesn't work.)
> >
> > Pri: GTX260 (512MB + shared mem = 864MB)
> > Sec: 9500 GT (512MB)
> >
> > The main complication that needs to be overcome is support for FLR.
> > Without this PCIe capability, the GPU cannot be reset after the DomU
> > has initialised it. I have to perform a hard reset of Dom0 for GPU
> > passthrough to work a second time.
> >
> > Tim
>
> Just as a follow-up to this: before going back to 2.6.18, I decided to
> try the open source "nv" driver instead of the proprietary one. It
> started up the X server just fine, although performance for glxgears
> dropped through the floor: <100 fps on a GTX 295 with a 3.3GHz i7
> processor. Ouch. Thanks for all the suggestions.

I think the 'nv' driver doesn't have any 3D acceleration. The open source 'nouveau' driver has (some) 3D acceleration for some cards.

> I'm going to try working with Nvidia to see if I can get them to help
> identify the problem, but I don't expect to get very far with that.
> Clearly, there is a problem with how they are accessing the card.

Did you google for Xen patches for the binary/proprietary Nvidia driver? Some people got it to work with some extra patches for the kernel module.

-- Pasi