Hi all,

I merged the drm-tree from 2.6.33-rc8 into jeremy's 2.6.31.6 master and got X working on dom0 - but only with option "ShadowFB" set. Using Xorg-7.5, mesa-git, libdrm-git, xf86-video-nouveau-git, xen-testing-git and qemu-dm-git. Ensured dependencies by deb-packaging everything. Kernel built without drm, and the nouveau driver built as a separate out-of-tree modules package to ensure correct ttm modules. Tried WinXP and Debian etch domUs - which worked fine.

On bare-metal boot, everything works - even 3D accelerated rendering. But when booted on Xen, X works ONLY - as mentioned - with ShadowFB set, which in turn turns off even 2D acceleration. The only difference between the boots is that the bare-metal boot has 2GB RAM whereas dom0 has 512M. The graphics card is an nVidia GeForce 9400GT, and the distro is basically Debian lenny.

Turned debug on in the nouveau driver, patched some debug logging into libdrm, and compared the outputs on bare-metal and Xen boot. Identical output up to the problem point - the only differing fields were time-stamp, process pid, and grobj allocation addresses.

Problem Point:
libdrm has an inlined function OUT_RING, defined in nouveau/nouveau_pushbuf.h.

    static __inline__ void
    OUT_RING(struct nouveau_channel *chan, unsigned data)
    {
            *(chan->cur++) = (data);
    }

- chan->cur is a uint32_t *

The function is entered by X through ScrnInit in the DDX driver. The patched log-message on entry is written to syslog, and then X seems to get suspended. chan->cur can be read on entry, so the (assumed) suspension is on the write. The system loses its consoles, but can be ssh'ed into - no killed processes, no segfault.

The area pointed to is the pushbuf - which is apparently the PRAMIN area on the graphics card. Modern graphics is not my forte - so I am seeking some pointers to resolve this from anyone. I think that if this is solved, Xen would have open-source 3D-acceleration support! Am game for testing, patching, etc.

I am basically interested in having a development domU and another testing domU without devel-packages.

Arvind R.
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
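The store inside OUT_RING can be tried in isolation. The following is a minimal sketch, assuming the push buffer is ordinary process memory; the toy_ names and the bounds check are hypothetical and are not part of libdrm. It only shows that the function is a plain 32-bit store followed by a pointer increment, so a stall at that point says more about what chan->cur is mapped to than about the C code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for libdrm's struct nouveau_channel: only the
 * write cursor is modelled, plus an end pointer that exists only in
 * this toy so the store can be bounds-checked. */
struct toy_channel {
    uint32_t *cur;  /* next free slot in the push buffer */
    uint32_t *end;  /* one past the last slot (toy-only) */
};

/* Same store-and-advance as the OUT_RING quoted in the message above. */
static inline void toy_out_ring(struct toy_channel *chan, uint32_t data)
{
    assert(chan->cur < chan->end);  /* toy-only guard */
    *(chan->cur++) = data;
}

/* Emit n words and return how many slots the cursor advanced. */
static size_t toy_emit(struct toy_channel *chan, const uint32_t *words, size_t n)
{
    uint32_t *start = chan->cur;
    for (size_t i = 0; i < n; i++)
        toy_out_ring(chan, words[i]);
    return (size_t)(chan->cur - start);
}
```

On plain memory the write cannot block; in the real driver the cursor points into a kernel-provided mapping, which is where the rest of this thread looks.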
On Thu, Feb 25, 2010 at 02:16:07PM +0530, Arvind R wrote:
> Hi all,
> I merged the drm-tree from 2.6.33-rc8 into jeremy's 2.6.31.6 master and
> got X working on dom0 - but only with option "ShadowFB" set. Using Xorg-7.5,
> mesa-git, libdrm-git xf86-video-nouveau-git, xen-testing-git and
> qemu-dm-git. Ensured dependencies by deb-packaging everything. Kernel
> built without drm and the nouveau driver built as a separate
> out-of-tree modules package- to ensure
> correct ttm modules. Tried WinXP and debianetch domUs - which worked fine.
>
> On bare-metal boot, everything works - even 3D accelerated rendering. But
> when booted on Xen, X works ONLY - as mentioned - with ShadowFB set,
> which in turn, turns off even 2D acceleration. The only difference in the boots
> is that bare-metal boot has 2GB RAM whereas dom0 has 512M. The graphics
> card is a nVidia GeForce 9400GT, and the distro is basically debian lenny.
>
> Turned debug on in the nouveau driver and patched some into libdrm and
> compared the outputs on bare-metal and xen boot. Identical output up to
> problem point - only differing fields were time-stamp, process pid, and
> grobj allocation addresses.
>
> Problem Point:
> libdrm has an inlined function OUT_RING, defined in
> nouveau/nouveau_pushbuf.h.
> static __inline__ void
> OUT_RING(struct nouveau_channel *chan, unsigned data)
> {
>         *(chan->cur++) = (data);
> }
> - chan->cur is a uint32_t *
>
> The function is entered by X through ScrnInit in the DDX driver.
> Patched log-message on entry is written to syslog, and then -
> X seems to get suspended. chan->cur can be read on entry,
> so (assumed) suspension is on write. System loses consoles,
> but can be ssh'ed into - no killed processes, no segfault.
>
> The area pointed to be the pushbuf - which is apparently the
> PRAMIN area on the graphics card. Modern graphics is not
> my forte - so I am seeking some pointers to resolve this from

So this looks to assume that the ring is contiguous, which it probably is not. Would it be possible to trace down who allocates that *chan? You say it is 'PRAMIN' - is that allocated via pci_alloc_* call?

Or is the address retrieved from an ioctl call made in user-space?

> anyone. I think that if this is solved, Xen would have open-source
> 3D-acceleration support! Am game for testing, patching, etc.

Neat!

>
> I am basically interested in having a development domU and
> another testing domU without devel-packages.

You lost me here. Don't you mean Dom0?
On Thu, Feb 25, 2010 at 6:25 PM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> On Thu, Feb 25, 2010 at 02:16:07PM +0530, Arvind R wrote:
>> Hi all,
>> I merged the drm-tree from 2.6.33-rc8 into jeremy's 2.6.31.6 master and
======= snip ======
> is not. Would it be possible to trace down who allocates that *chan? You
> say it is 'PRAMIN' - is that allocated via pci_alloc_* call?
>
> Or is the address retrieved from an ioctl call made in user-space?

Both true, I guess.

chan is GFP_KERNEL allocated. My current understanding is that chan->cur, at the end of a lot of initialization, points to specific areas of card memory which form a command ring. What gets written is 32 bits which encode pointers to contexts and methods already associated with that specific channel. Each of possibly many channels has its own independent command FIFO (RING) and associations.

So, there must be a mmap call somewhere to map the area to user-space for that problem write to work on non-Xen boots. Will try to track down some more and post. With mmaps and PCIGARTs - it will be some hunt!

>> another testing domU without devel-packages.
>
> You lost me here. Don't you mean Dom0?
>

Let's say virtual appliances - for which one needs dom0!
On Thu, Feb 25, 2010 at 09:01:48AM -0800, Arvind R wrote:
> On Thu, Feb 25, 2010 at 6:25 PM, Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com> wrote:
> > On Thu, Feb 25, 2010 at 02:16:07PM +0530, Arvind R wrote:
> >> Hi all,
> >> I merged the drm-tree from 2.6.33-rc8 into jeremy's 2.6.31.6 master and
> ======= snip ======
> > is not. Would it be possible to trace down who allocates that *chan? You
> > say it is 'PRAMIN' - is that allocated via pci_alloc_* call?
> >
> > Or is the address retrieved from an ioctl call made in user-space?
> Both true, I guess.
>
> chan is GFP_KERNEL allocated. My current understanding is that
> chan->cur, at the end of a lot of initialization, points to specific
> areas of card memory which forms a command ring. What gets written is 32-bits which
> encode pointers to contexts and methods already associated with that
> specific channel. Each of possibly many channels have their own independent
> Command FIFOs (RINGS) and associations.
>
> So, there must be a mmap call somewhere to map the area to user-space
> for that problem write to work on non-Xen boots. Will try track down some more
> and post. With mmaps and PCIGARTs - it will be some hunt!

You might want to look also at the source code of the nouveau X driver. I remember looking at the radeon one, where it made a drmScatterMap call, saved it, and then later submitted that address via an ioctl call to the drm_radeon driver, which used it as a ring buffer. Took a bit of hopping around to find who allocated it in the first place.

> >> another testing domU without devel-packages.
> >
> > You lost me here. Don't you mean Dom0?
> >
> Let's say virtual appliances - for which one needs dom0!

Ah yes.
On Thu, Feb 25, 2010 at 11:14 PM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> On Thu, Feb 25, 2010 at 09:01:48AM -0800, Arvind R wrote:
>> On Thu, Feb 25, 2010 at 6:25 PM, Konrad Rzeszutek Wilk
>> <konrad.wilk@oracle.com> wrote:
>> > On Thu, Feb 25, 2010 at 02:16:07PM +0530, Arvind R wrote:
>> >> Hi all,
>> >> I merged the drm-tree from 2.6.33-rc8 into jeremy's 2.6.31.6 master and
>> ======= snip ======
>> > is not. Would it be possible to trace down who allocates that *chan? You
>> > say it is 'PRAMIN' - is that allocated via pci_alloc_* call?
======= snip ======
>> So, there must be a mmap call somewhere to map the area to user-space
>> for that problem write to work on non-Xen boots. Will try track down some more
>> and post. With mmaps and PCIGARTs - it will be some hunt!
======= snip ======
> to the drm_radeon driver which used it as a ring buffer. Took a bit of
> hopping around to find who allocated it in the first place.
>

After a lot of reboots and log viewing:

The pushbuf (FIFO/RING) is the only means of programming the card's DMA activity. It is exposed to user-space by mmap of the drm_device (PCI) handle, with a different offset for each channel. Parameters are associated with the DMA command using ioctls to bind channels/sub-channels/contexts. This mmap is in the libdrm2 library. Libdrm does the channel/accelerator initialization and setup chores, and the DDX driver (xf86-video-nouveau) more-or-less acts through libdrm.

My suspicion is that Xen has some problems with mmap of PCI(E) device memory. How is iomem handled in a mmap?

As of now, the accelerator on Xen stops right at the initialisation stage - when libdrm tries to set up the accelerator-engine in the course of ScreenInit. And to do that, it cannot write the command to set up the basic 2D engine.

Suggestions?
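The "one handle, one offset per channel" arrangement described above can be imitated with an ordinary file. This is a rough sketch only, assuming a POSIX system and a made-up layout of one page per channel; it uses a temporary file in place of the DRM device node and does not touch libdrm or the real nouveau offsets. All toy_ names are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map the page belonging to `channel` from fd; NULL on failure.
 * One page per channel is an assumption made for this toy. */
static uint32_t *toy_map_channel(int fd, unsigned channel, long page)
{
    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED,
                   fd, (off_t)channel * page);
    return p == MAP_FAILED ? NULL : (uint32_t *)p;
}

/* Store a word through one mapping of the channel window and read it
 * back through a second mapping of the same offset. Returns 1 if the
 * second mapping sees the word, 0 on any failure. */
static int toy_roundtrip(unsigned channel, uint32_t word)
{
    long page = sysconf(_SC_PAGESIZE);
    FILE *f = tmpfile();
    if (f == NULL)
        return 0;
    int fd = fileno(f);
    int ok = 0;
    if (ftruncate(fd, (off_t)(channel + 1) * page) == 0) {
        uint32_t *w = toy_map_channel(fd, channel, page);
        uint32_t *r = toy_map_channel(fd, channel, page);
        if (w != NULL && r != NULL) {
            w[0] = word;            /* user-space store into the window */
            ok = (r[0] == word);    /* shared mappings see the same page */
        }
        if (w != NULL)
            munmap(w, (size_t)page);
        if (r != NULL)
            munmap(r, (size_t)page);
    }
    fclose(f);
    return ok;
}
```

With a regular file the first store faults the page in and then completes; the report above is that the equivalent store into the nouveau mapping never completes under Xen.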
On Fri, Feb 26, 2010 at 09:04:33PM +0530, Arvind R wrote:
> On Thu, Feb 25, 2010 at 11:14 PM, Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com> wrote:
> > On Thu, Feb 25, 2010 at 09:01:48AM -0800, Arvind R wrote:
> >> On Thu, Feb 25, 2010 at 6:25 PM, Konrad Rzeszutek Wilk
> >> <konrad.wilk@oracle.com> wrote:
> >> > On Thu, Feb 25, 2010 at 02:16:07PM +0530, Arvind R wrote:
> >> >> Hi all,
> >> >> I merged the drm-tree from 2.6.33-rc8 into jeremy's 2.6.31.6 master and
> >> ======= snip ======
> >> > is not. Would it be possible to trace down who allocates that *chan? You
> >> > say it is 'PRAMIN' - is that allocated via pci_alloc_* call?
> ======= snip ======
> >> So, there must be a mmap call somewhere to map the area to user-space
> >> for that problem write to work on non-Xen boots. Will try track down some more
> >> and post. With mmaps and PCIGARTs - it will be some hunt!
> ======= snip ======
> > to the drm_radeon driver which used it as a ring buffer. Took a bit of
> > hopping around to find who allocated it in the first place.
> >
> After a lot of reboots and log viewing:
> The pushbuf (FIFO/RING) is the only means of programming the card DMA
> activity. It is exposed to user-space by mmap of the drm_device (PCI) handle
> with different offsets for each channel. Parameters are associated to the DMA
> command using ioctls to bind channels/sub-channels/contexts. This mmap is
> in the libdrm2 library. Libdrm channel/accelerator initialization and
> setup chores
> and the DDX driver (xf86-video-nouveau) more-or-less acts thro' libdrm.

Ok, that is the DRM_NOUVEAU_CHANNEL_ALLOC ioctl, which ends up calling 'ttm_bo_init'. I remember Pasi having an issue with this on Radeon and I provided a hack to see if it would work. Take a look at this e-mail:

http://lists.xensource.com/archives/cgi-bin/extract-mesg.cgi?a=xen-devel&m=2010-01&i=20100115071856.GD17978%40reaktio.net

>
> My suspicion is that Xen has some problems with mmap of PCI(E) device
> memory. How is iomem handled in a mmap?

It looks to be using 'ioremap', which is Xen safe. Unless your card has an AGP bridge on it, at which point it would end up using dma_alloc_coherent in all likelihood.

>
> As of now, accelerator on Xen stops right at the initialisation stage - when
> libdrm tries to set up the accelerator-engine in the course of ScreenInit. And
> to do that, it cannot write the command to setup the basic 2D engine.

I think that the ttm_bo calls set up pages in the 4KB size, but the initial channel requests a 64KB one. I think it also sets up a page-table directory so that when the GPU accesses the addresses, it gets the real bus address. I wonder if it fails at that, though - meaning that the addresses that are written to the page table are actually the guest page numbers (gpfn) instead of the machine page numbers (mfn).

The other issue might be that your back-port broke the AGP allocation. It needs to be:

    #define alloc_gatt_pages(order) ({ \
            char *_t; dma_addr_t _d; \
            _t = dma_alloc_coherent(NULL, PAGE_SIZE<<(order), &_d, GFP_KERNEL); \
            _t; })

But that is less likely.
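The size arithmetic behind the 4KB/64KB remark and the PAGE_SIZE<<(order) expression can be checked in user-space C. This is a minimal sketch assuming 4KB pages; toy_get_order is a hypothetical helper written for illustration and is not the kernel's get_order().

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SHIFT 12u                          /* assumes 4KB pages */
#define TOY_PAGE_SIZE  ((size_t)1 << TOY_PAGE_SHIFT)

/* Smallest order such that (TOY_PAGE_SIZE << order) >= size.
 * size must be non-zero. */
static unsigned toy_get_order(size_t size)
{
    assert(size > 0);
    unsigned order = 0;
    while ((TOY_PAGE_SIZE << order) < size)
        order++;
    return order;
}
```

Under these assumptions a 64KB request is order 4, that is 16 pages of 4KB each, which is the gap between page-at-a-time allocation and a single 64KB channel buffer that the message above points at.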
On Mon, Mar 1, 2010 at 9:31 PM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> On Fri, Feb 26, 2010 at 09:04:33PM +0530, Arvind R wrote:
>> On Thu, Feb 25, 2010 at 11:14 PM, Konrad Rzeszutek Wilk
>> <konrad.wilk@oracle.com> wrote:
>> > On Thu, Feb 25, 2010 at 09:01:48AM -0800, Arvind R wrote:
>> >> On Thu, Feb 25, 2010 at 6:25 PM, Konrad Rzeszutek Wilk
>> >> <konrad.wilk@oracle.com> wrote:
>> >> > On Thu, Feb 25, 2010 at 02:16:07PM +0530, Arvind R wrote:
>> >> >> Hi all,
>> >> >> I merged the drm-tree from 2.6.33-rc8 into jeremy's 2.6.31.6 master and
>> >> ======= snip ======
>> >> > is not. Would it be possible to trace down who allocates that *chan? You
>> >> > say it is 'PRAMIN' - is that allocated via pci_alloc_* call?
>> ======= snip ======
>> >> So, there must be a mmap call somewhere to map the area to user-space
>> >> for that problem write to work on non-Xen boots. Will try track down some more
>> >> and post. With mmaps and PCIGARTs - it will be some hunt!
>> ======= snip ======
>> > to the drm_radeon driver which used it as a ring buffer. Took a bit of
>> > hopping around to find who allocated it in the first place.
>> >
>> After a lot of reboots and log viewing:
>> The pushbuf (FIFO/RING) is the only means of programming the card DMA
>> activity. It is exposed to user-space by mmap of the drm_device (PCI) handle
>> with different offsets for each channel. Parameters are associated to the DMA
>> command using ioctls to bind channels/sub-channels/contexts. This mmap is
>> in the libdrm2 library. Libdrm channel/accelerator initialization and
>> setup chores
>> and the DDX driver (xf86-video-nouveau) more-or-less acts thro' libdrm.
>
> Ok, that is the DRM_NOUVEAU_CHANNEL_ALLOC ioctl, which ends up calling
> the 'ttm_bo_init'. I remember Pasi having an issue with this on Radeon
> and I provided a hack to see if it would work. Take a look at this
> e-mail:
>
> http://lists.xensource.com/archives/cgi-bin/extract-mesg.cgi?a=xen-devel&m=2010-01&i=20100115071856.GD17978%40reaktio.net
>
>>
>> My suspicion is that Xen has some problems with mmap of PCI(E) device
>> memory. How is iomem handled in a mmap?
>
> It looks to be using 'ioremap' which is Xen safe. Unless your card has
> an AGP bridge on it, at which point it would end up using
> dma_alloc_coherent in all likelihood.
>
>>
>> As of now, accelerator on Xen stops right at the initialisation stage - when
>> libdrm tries to set up the accelerator-engine in the course of ScreenInit. And
>> to do that, it cannot write the command to setup the basic 2D engine.
>
> I think that the ttm_bo calls set up pages in the 4KB size, but the
> initial channel requests a 64KB one. I think it also sets up

Got that far; tried some dirty patches of mine, which broke the framebuffer. Your ttm patch using dma_alloc_coherent instead of alloc_page resulted in the same problem as with the Radeon report - leaking pages, erroneous page count.

> page-table directory so that when the GPU accesses the addresses, it
> gets the real bus address. I wonder if it fails at that thought -
> meaning that the addresses that are written to the page table are
> actually the guest page numbers (gpfn) instead of the machine page numbers (mfn).

No, I don't think that's how it works. The user-space write triggers an aio-write - I got that in a trace that my patch caused - which page_faults and leads to the ttm_bo_fault. I tried to alloc_pages in ttm_bo_vm_fault, but I think I got the remap_pfn_range address parameter wrong. This patch crashed the same way under bare boot as on xen with_or_without the patch! So it is clearly the mmap of the pushbuf that's the block. ttm_bo_vm_fault is the pivot for the pushbuf_bo allocation.

My patch in ttm_bo_vm_fault:

    if (io_mem) {
            /* retain the orig. speculative pre-fault code */
            ...
    } else {
            /* ttm_bo_get_pages is modified __ttm_tt_get_page using alloc_pages
               Irrespective of where fault occurs, fault-in the whole buffer */
            pages = ttm_bo_get_pages(ttm, get_order(bo->num_pages));
            pfn = page_to_pfn(page);
            remap_pfn_range(vma, bo->buffer_start, pfn,
                            bo->num_pages << PAGE_SHIFT, vma->vm_page_prot);
            /* Triggers Kernel BUG invalid opcode */
    }

BTW, ttm_bo_vm_fault is the ONLY user of vm_insert_mixed in the kernel tree! Tried to use split_page() - resulted in an undefined symbol!

> The other issue might be that your back-port broke the AGP allocation.
>

Nope - untouched and same.
On Wed, Mar 3, 2010 at 3:04 AM, Arvind R <arvino55@gmail.com> wrote:
> On Mon, Mar 1, 2010 at 9:31 PM, Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com> wrote:
>> On Fri, Feb 26, 2010 at 09:04:33PM +0530, Arvind R wrote:
>>> On Thu, Feb 25, 2010 at 11:14 PM, Konrad Rzeszutek Wilk
>>> <konrad.wilk@oracle.com> wrote:
>>> > On Thu, Feb 25, 2010 at 09:01:48AM -0800, Arvind R wrote:
>>> >> On Thu, Feb 25, 2010 at 6:25 PM, Konrad Rzeszutek Wilk
>>> >> <konrad.wilk@oracle.com> wrote:
>>> >> > On Thu, Feb 25, 2010 at 02:16:07PM +0530, Arvind R wrote:
>>> >> >> I merged the drm-tree from 2.6.33-rc8 into jeremy's 2.6.31.6 master and
>>> >> ======= snip ======
>>> >> > is not. Would it be possible to trace down who allocates that *chan? You
>>> >> > say it is 'PRAMIN' - is that allocated via pci_alloc_* call?
>>> ======= snip ======
>>> >> So, there must be a mmap call somewhere to map the area to user-space
>>> >> for that problem write to work on non-Xen boots. Will try track down some more
>>> >> and post. With mmaps and PCIGARTs - it will be some hunt!
>>> ======= snip ======
>>> > to the drm_radeon driver which used it as a ring buffer. Took a bit of
>>> > hopping around to find who allocated it in the first place.
>>> >
>>> The pushbuf (FIFO/RING) is the only means of programming the card DMA
>> the 'ttm_bo_init'. I remember Pasi having an issue with this on Radeon
>> and I provided a hack to see if it would work. Take a look at this
>> e-mail:
>>
>> http://lists.xensource.com/archives/cgi-bin/extract-mesg.cgi?a=xen-devel&m=2010-01&i=20100115071856.GD17978%40reaktio.net
>>
>>>
>> It looks to be using 'ioremap' which is Xen safe. Unless your card has
>> an AGP bridge on it, at which point it would end up using
>> dma_alloc_coherent in all likelihood.

Can't do that - some later allocations are huge.

>>>
>>> As of now, accelerator on Xen stops right at the initialisation stage - when
>> I think that the ttm_bo calls set up pages in the 4KB size, but the
>> initial channel requests a 64KB one. I think it also sets up
> Your ttm patch using dma_alloc_coherent instead of alloc_page resulted in
> the same problem as with the Radeon report - leaking pages, erroneous page count
>> page-table directory so that when the GPU accesses the addresses, it
>> gets the real bus address. I wonder if it fails at that thought -
>> meaning that the addresses that are written to the page table are
>> actually the guest page numbers (gpfn) instead of the machine page numbers (mfn).
>
> No, I don't think thats how it works. The user-space write triggers an
> aio-write -

which triggers do_page_fault, handle_mm_fault, do_linear_fault, __do_fault and finally ttm_bo_vm_fault.

ttm_bo_fault returns VM_FAULT_NOPAGE - but xen-boot keeps on re-triggering the same fault. When vm_fault calls ttm_tt_get_page, the page is already there, and the handler does another vm_insert_page (I changed vm_insert_mixed to vm_insert_page/pfn based on io_mem; that is now the only patch, and it works on bare machine), on and on and on.

What can possibly cause the fault-handler to repeat endlessly? If a wrong page is backed at the user-address, it should create bad_access or some other subsequent events - but the system is running fine minus all local consoles! If the insertion is to a wrong place, this can happen; but the top-level trap is the only provider of the address - and the fault address and vma address match, and the same code works fine on bare-boot.

ttm_tt_get_page calls alloc in a loop - so it may allocate multiple pages from start/end depending on Highmem memory or not - implying asynchronous allocation and mapping.

All I want now is *ptr = (uint32_t)data to work!
> >> page-table directory so that when the GPU accesses the addresses, it
> >> gets the real bus address. I wonder if it fails at that thought -
> >> meaning that the addresses that are written to the page table are
> >> actually the guest page numbers (gpfn) instead of the machine page numbers (mfn).
> >
> > No, I don't think thats how it works. The user-space write triggers an
> > aio-write -
>
> which triggers do_page_fault, handle_mm_fault, do_linear_fault, __do_fault
> and finally ttm_bo_vm_fault.
> ttm_bo_fault returns VM_FAULT_NOPAGE

VM_FAULT_NOPAGE means "retry the fault". In other words: I've fixed the PTE to point to the right PFN.

>
> - but xen-boot keeps on re-triggering the same fault.

Which probably means that something is not OK with the PTE. What is the vma->vm_page_prot value before the vm_insert_mixed (and maybe even after)? Try also reading the true value of the PTE and seeing what it shows before and after the vm_insert_mixed.

I've attached a simple patch I wrote some time ago to get the real MFNs and their page protection. I think you can adapt it (the print_data function, to be exact) to peek at the PTE and its protection values.

There is an extra flag that the PTE can have when running under Xen: _PAGE_IOMAP. This signifies that the PFN is actually the MFN. In this case, though, it shouldn't be enabled b/c the memory is actually gathered from alloc_page. But if it is, it might be the culprit.

> when vm_fault calls ttm_tt_get_page, the page is already there, and
> the handler does another vm_insert_page (i changed vm_insert_mixed
> vm_insert_page/pfn based on io_mem, now the only patch, and it works on
> bare machine) on and on and on.
>
> What can possibly cause the fault-handler to repeat endlessly?

The VM_FAULT_NOPAGE short-circuits most of the fault-handler and makes it return. The application is resumed and retries the operation that caused the fault - in this case an attempt to write to an address that was not present. Obviously the second attempt at writing to the address should have worked without problems.

> If a wrong page is backed at the user-address, it should create bad_access or
> some other subsequent events - but the system is running fine minus all local
> consoles! If the insertion is to a wrong place, this can happen; but
> the top-level
> trap is the only provider of the address - and the fault addres and
> vma address match,
> and the same code works fine on bare-boot.

So you see this fault handler being called endlessly while the machine is still running and other pieces of code work just fine, right?

>
> ttm_tt_get_page calls alloc in a loop - so it may allocate multiple pages from
> start/end depending on Highmem memory or not - implying asynchronous allocation
> and mapping.

I thought it had some logic to figure out that it had already handled this page and would return an already allocated page?

>
> All I want now is *ptr = (uint32_t)data to work as of now!

You are doing a great job at this head-spinning detective work. Much appreciated!
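The retry behaviour described above can be modelled in a few lines. This is a toy model in user-space C, not kernel code: the struct, the function names and the `broken` switch are all hypothetical, and a real PTE carries far more state than a present bit. It only illustrates why a handler that keeps answering "retry" without producing a usable mapping makes the same store fault again and again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy page-table entry: a present bit and a frame number only. */
struct toy_pte {
    bool present;
    uint64_t pfn;
};

/* Toy fault handler. When `broken` is set it records the frame but
 * leaves the entry unusable, while still reporting "retry". */
static void toy_fault(struct toy_pte *pte, uint64_t pfn, bool broken)
{
    pte->pfn = pfn;
    pte->present = !broken;
}

/* Retry a store until the entry is usable or max_tries faults have
 * been taken. Returns the number of faults taken. */
static unsigned toy_store(struct toy_pte *pte, uint64_t pfn, bool broken,
                          unsigned max_tries)
{
    unsigned faults = 0;
    while (!pte->present && faults < max_tries) {
        toy_fault(pte, pfn, broken);  /* handler says: retry the access */
        faults++;
    }
    return faults;
}
```

In the healthy case one fault is enough and the second attempt succeeds; in the broken case the loop runs until the cap, which mirrors the endless re-fault with an otherwise responsive system.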
On Wed, Mar 3, 2010 at 11:43 PM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>> > aio-write -
>>
>> which triggers do_page_fault, handle_mm_fault, do_linear_fault, __do_fault
>> and finally ttm_bo_vm_fault.

> I've attached a simple patch I wrote some time ago to get the real MFNs
> and its page protection. I think you can adapt it (print_data function to be exact)
> to peek at the PTE and its protection values.

Have patched - did not apply clean. Will compile and get some info.

> There is an extra flag that the PTE can have when running under Xen: _PAGE_IOMAP.
> This signifies that the PFN is actually the MFN. In this case though
> it shouldn't be enabled b/c the memory is actually gathered from
> alloc_page. But if it is, it might be the culprit.

>> What can possibly cause the fault-handler to repeat endlessly?

FYI: about 2000 times a second - slowed by printk

>> If a wrong page is backed at the user-address, it should create bad_access or
>> some other subsequent events - but the system is running fine minus all local
> So you see this fault handler being called endlessly while the machine
> is still running and other pieces of code work just fine, right?

Right. Can ssh in - but no local console.

>> ttm_tt_get_page calls alloc in a loop - so it may allocate multiple pages from
>> start/end depending on Highmem memory or not - implying asynchronous allocation
>> and mapping.
>
> I thought it had some logic to figure out that it already handled this
> page and would return an already allocated page?

Right.

I think the problem lies in the vm_insert_pfn/page/mixed family of functions. These are used in only a few places (grep'ed kernel tree), and invariably for mmaping: Scsi-tgt, mspec, some media/video, poch, android in staging, and ttm - and, surprise - xen/blktap/ring.c and device.c - which both check XENFEAT_auto_translated_physmap.

Pls. look at xen/blktap/ring.c - it looks to be what we need.
On Thu, Mar 04, 2010 at 02:47:58PM +0530, Arvind R wrote:
> On Wed, Mar 3, 2010 at 11:43 PM, Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com> wrote:
> >> > aio-write -
> >>
> >> which triggers do_page_fault, handle_mm_fault, do_linear_fault, __do_fault
> >> and finally ttm_bo_vm_fault.
>
> > I've attached a simple patch I wrote some time ago to get the real MFNs
> > and its page protection. I think you can adapt it (print_data function to be exact)
> > to peek at the PTE and its protection values.
> Have patched - did not apply clean. Will compile and get some info.

Right. I don't think it would help you immediately - I was thinking you could take the print_data function and just jam it in the tt_bo_vm_fault code and use it to print the PTE data.

>
> > There is an extra flag that the PTE can have when running under Xen: _PAGE_IOMAP.
> > This signifies that the PFN is actually the MFN. In this case though
> > it shouldn't be enabled b/c the memory is actually gathered from
> > alloc_page. But if it is, it might be the culprit.
>
> >> What can possibly cause the fault-handler to repeat endlessly?
>
> FYI: about 2000 times a second - slowed by printk
>
> >> If a wrong page is backed at the user-address, it should create bad_access or
> >> some other subsequent events - but the system is running fine minus all local
> > So you see this fault handler being called endlessly while the machine
> > is still running and other pieces of code work just fine, right?
> Right. Can ssh in - but no local console
>
> >> ttm_tt_get_page calls alloc in a loop - so it may allocate multiple pages from
> >> start/end depending on Highmem memory or not - implying asynchronous allocation
> >> and mapping.
> >
> > I thought it had some logic to figure out that it already handled this
> > page and would return an already allocated page?
> Right.
>
> I think the problem lies in the vm_insert_pfn/page/mixed family of functions.
> These are only used (grep'ed kernel tree) and invariably for mmaping.
> Scsi-tgt, mspec, some media/video, poch,android in staging and ttm
> - and, surprise - xen/blktap/ring.c and device.c
> - which both check XENFEAT_auto_translated_physmap
>
> Pls. look at xen/blktap/ring.c - it looks to be what we need

Let me take a look at it tomorrow. Bit swamped.
On Thu, Mar 4, 2010 at 11:55 PM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> On Thu, Mar 04, 2010 at 02:47:58PM +0530, Arvind R wrote:
>> On Wed, Mar 3, 2010 at 11:43 PM, Konrad Rzeszutek Wilk
>> <konrad.wilk@oracle.com> wrote:
>> >> > aio-write -
>> >>
>> >> which triggers do_page_fault, handle_mm_fault, do_linear_fault, __do_fault
>> >> and finally ttm_bo_vm_fault.
>>
>> > I've attached a simple patch I wrote some time ago to get the real MFNs
>> Have patched - did not apply clean. Will compile and get some info.
> take the print_data function and just jam it in the tt_bo_vm_fault code

Linking problems. But compiled and run.

!!! CANNOT lookup_address()!!! Returns NULL on bare AND Xen, Before AND After vm_insert/remap_pfn. The address looked up is what the fault_handler passes in. Had to add a NULL check in print_data.

Bare-boot log.
[TTM] ttm_bo_vm_fault: faulting-in pages, TTM_PAGE_FLAGS=0x0
[ Before:]PFN: Failed lookup_address of 0x7fd82e9aa000
[ After :]PFN: Failed lookup_address of 0x7fd82e9aa000

Ring any bells?

>> > There is an extra flag that the PTE can have when running under Xen: _PAGE_IOMAP.
>> > This signifies that the PFN is actually the MFN. In this case though
>> > it shouldn't be enabled b/c the memory is actually gathered from
>> > alloc_page. But if it is, it might be the culprit.

>> I think the problem lies in the vm_insert_pfn/page/mixed family of functions.
>> These are only used (grep'ed kernel tree) and invariably for mmaping.
>> Scsi-tgt, mspec, some media/video, poch,android in staging and ttm
>> - and, surprise - xen/blktap/ring.c and device.c
>> - which both check XENFEAT_auto_translated_physmap
>>
>> Pls. look at xen/blktap/ring.c - it looks to be what we need
>
> Let me take a look at it tomorrow. Bit swamped.
>
On Fri, Mar 05, 2010 at 01:16:13PM +0530, Arvind R wrote:
> On Thu, Mar 4, 2010 at 11:55 PM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > take the print_data function and just jam it in the ttm_bo_vm_fault code
> Linking problems. But compiled and run
> !!! CANNOT lookup_address()!!! Returns NULL on bare AND Xen
> Before AND After vm_insert/remap_pfn. Address looked_up is what
> fault_handler passes in. Had to add a NULL check in print_data.

The "after" is a bit of a surprise. I would have thought it would have updated the page table with the new PFN. But maybe it did, but for a different address (since it does not actually use the 'address' field but __va(pfn) << PAGE_SHIFT as the address).

> Bare-boot log.
> [TTM] ttm_bo_vm_fault: faulting-in pages, TTM_PAGE_FLAGS=0x0
> [ Before:]PFN: Failed lookup_address of 0x7fd82e9aa000
> [ After :]PFN: Failed lookup_address of 0x7fd82e9aa000
>
> Ring any bells?

Yeah... Can you also instrument the code to print the PFN? The code goes through insert_pfn->pfn_pte, which calls xen_make_pte, which ends up doing pte_pfn_to_mfn. That routine does a pfn_to_mfn, which does a get_phys_to_machine(pfn). The last routine consults the PFN->MFN lookup table and finds the MFN that corresponds to this PFN. Since the memory was allocated from ... well, this is the big question.

Is the memory allocated from normal kernel space, or is it really backed by the video card? In your previous e-mails you mentioned that is_iomem is set to zero, which implies that the memory for these functions is NOT video-memory backed.

> >> > There is an extra flag that the PTE can have when running under Xen: _PAGE_IOMAP.
> >> > This signifies that the PFN is actually the MFN. In this case though
> >> > it shouldn't be enabled b/c the memory is actually gathered from
> >> > alloc_page. But if it is, it might be the culprit.
>
> >> I think the problem lies in the vm_insert_pfn/page/mixed family of functions.
> >> These are only used (grep'ed kernel tree) and invariably for mmaping.
> >> Scsi-tgt, mspec, some media/video, poch, android in staging and ttm
> >> - and, surprise - xen/blktap/ring.c and device.c
> >> - which both check XENFEAT_auto_translated_physmap
> >>
> >> Pls. look at xen/blktap/ring.c - it looks to be what we need
> >
> > Let me take a look at it tomorrow. Bit swamped.

I started going through the function allocations that were done and found this in ttm_bo_mmap:

    vma->vm_flags |= VM_RESERVED | VM_IO | VM_MIXEDMAP | VM_DONTEXPAND;

The VM_IO is OK if the memory being referenced is the video driver memory. _BUT_ if the memory is allocated through alloc_page (ttm_tt_alloc_page) or kmalloc, then this will cause us headaches. You might want to check in ttm_bo_vm_fault what the vma->vm_flags are and whether VM_IO is set.

(FYI, look at http://git.kernel.org/?p=linux/kernel/git/konrad/xen.git;a=commit;h=e84db8b7136d1b4a393dbd982201d0c5a3794333)

If VM_IO is set, change ttm_bo_mmap so that it does not set VM_IO and see how that works. Though I am not sure if ttm_bo_mmap is used by the nvidia driver.

Attached is a re-write of the debug patch I sent earlier. I compile tested it but haven't yet run it (just doing that now).
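A minimal user-space sketch of the PFN->MFN step described in this message, assuming a toy lookup table in place of the kernel's real one. All `toy_` names are hypothetical, the bit chosen for the IOMAP flag is a stand-in rather than the kernel's definition, and the two table entries are taken from the PFN/MFN pairs in the Xen-boot trace posted in this thread. It only illustrates the rule stated above: without the IOMAP flag the frame number is translated through the table, with it the frame number is used as the MFN directly.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in values for illustration only, not copied from kernel headers. */
#define TOY_PAGE_IOMAP  (1ULL << 10)
#define TOY_PAGE_SHIFT  12
#define TOY_INVALID_MFN UINT64_MAX

struct toy_p2m_entry {
    uint64_t pfn;
    uint64_t mfn;
};

/* Toy PFN->MFN table; pairs match the Xen-boot trace in this thread. */
static const struct toy_p2m_entry toy_p2m[] = {
    { 0x16042, 0x68042 },
    { 0x16043, 0x68043 },
};

/* Look up the machine frame behind a pseudo-physical frame. */
static uint64_t toy_pfn_to_mfn(uint64_t pfn)
{
    for (size_t i = 0; i < sizeof(toy_p2m) / sizeof(toy_p2m[0]); i++)
        if (toy_p2m[i].pfn == pfn)
            return toy_p2m[i].mfn;
    return TOY_INVALID_MFN;
}

/*
 * Build a PTE value from a frame number and flag bits.
 * Without TOY_PAGE_IOMAP the frame is a PFN and gets translated;
 * with it, the frame is treated as an MFN already.
 * Returns 0 (a not-present PTE) if the PFN has no translation.
 */
static uint64_t toy_make_pte(uint64_t frame, uint64_t flags)
{
    uint64_t mfn = frame;

    if (!(flags & TOY_PAGE_IOMAP)) {
        mfn = toy_pfn_to_mfn(frame);
        if (mfn == TOY_INVALID_MFN)
            return 0;
    }
    return (mfn << TOY_PAGE_SHIFT) | flags;
}
```

In this model, a page from alloc_page that is wrongly marked as IOMAP ends up with its PFN written into the PTE untranslated, which is the failure mode suspected above.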
On Sat, Mar 6, 2010 at 1:53 AM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> Yeah... Can you also instrument the code to print the PFN? The code goes
> through insert_pfn->pfn_pte, which calls xen_make_pte, which ends up
> doing pte_pfn_to_mfn. That routine does a pfn_to_mfn which does a
> get_phys_to_machine(pfn). The last routine looks up the PFN->MFN lookup
> table and finds a MFN that corresponds to this PFN. Since the memory
> was allocated from ... well this is the big question.
>
> Is the memory allocated from normal kernel space or is really backed by
> the video card. In your previous e-mails you mentioned that is_iomem is
> set to zero, which implies that the memory for these functions is NOT
> memory backed.
>
> the VM_IO is OK if the memory that is being referenced is the video
> driver memory. _BUT_ if the memory is being allocated through the
> alloc_page (ttm_tt_alloc_page) , or kmalloc, then this will cause us
> headaches. You might want to check in ttm_bo_vm_fault what the
> vma->vm_flags are and if VM_IO is set.
>
> (FYI, look at
> http://git.kernel.org/?p=linux/kernel/git/konrad/xen.git;a=commit;h=e84db8b7136d1b4a393dbd982201d0c5a3794333)

How do you remember these refs?!

> Thought I am not sure if the ttm_bo_mmap is used by the nvidia driver.

U mean nouveau? Only for accelerated graphics.

> Attached is a re-write of the debug patch I sent earlier. I compile
> tested it but haven't yet run it (just doing that now).

Output: (snipped/cut/pasted for easier association)

Trace of Pushbuf Memory Access, Bare-BOOT:
X: OUT_RING: Enter: chan=0x8170a0, id=2, data=0x48000, chan->cur=0x7f0aa3594054
kernel: [TTM] FAULTing-in address=0x7f0aa3594000, bo->buffer_start=0x0
kernel: [BEFORE]PFN: 0x7513f PTE: 0x750001e3 (val:750001e3): [ RW PSE GLB x ] [2M]
kernel: [ AFTER]PFN: 0x7513f PTE: 0x750001e3 (val:750001e3): [ RW PSE GLB x ] [2M]
kernel: [BEFORE]PFN: 0x75144 PTE: 0x750001e3 (val:750001e3): [ RW PSE GLB x ] [2M]
kernel: [ AFTER]PFN: 0x75144 PTE: 0x750001e3 (val:750001e3): [ RW PSE GLB x ] [2M]
< --- and so on for 14 more pages --->
X: OUT_RING: updated data
X: OUT_RING: Exit

Trace of Pushbuf Memory Access, Xen-BOOT:
X: OUT_RING: Enter: chan=0x8170a0, id=2, data=0x44000, chan->cur=0x7f98838df000
kernel: [TTM] FAULTing-in address=0x7f98838df000, bo->buffer_start=0x0
kernel: [BEFORE]PFN: 0x16042 PTE: 0x10000068042067 (val:10000068042067): [mfn:426050->ffff880016042000 USR RW x ] [4K]
kernel: [ AFTER]PFN: 0x16042 PTE: 0x10000068042067 (val:10000068042067): [mfn:426050->ffff880016042000 USR RW x ] [4K]
kernel: [BEFORE]PFN: 0x16043 PTE: 0x10000068043067 (val:10000068043067): [mfn:426051->ffff880016043000 USR RW x ] [4K]
kernel: [ AFTER]PFN: 0x16043 PTE: 0x10000068043067 (val:10000068043067): [mfn:426051->ffff880016043000 USR RW x ] [4K]
< --- and so on for 14 more pages --->
< --- and repeat fault --->
kernel: [TTM] FAULTing-in address=0x7f98838df000, bo->buffer_start=0x0

Do you know what is happening? Is a solution feasible?

Sequence of nouveau operation as I understand it:
1. prepare for user pushbuf write by grabbing memory access rights (exclude GPU access)
2. Do the write
3. finish and release grab

The memory may or may not be on the video card. There is a vram_pushbuf module option which would probably complicate things more. GPU is informed about the address, I suppose, in the prepare and finish pre/postamble to RING access.

and, THANKS hugely for your help.
On Sat, Mar 6, 2010 at 1:46 PM, Arvind R <arvino55@gmail.com> wrote:
> On Sat, Mar 6, 2010 at 1:53 AM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>> Yeah... Can you also instrument the code to print the PFN? The code goes
>> through insert_pfn->pfn_pte, which calls xen_make_pte, which ends up
>> doing pte_pfn_to_mfn. That routine does a pfn_to_mfn which does a
>> get_phys_to_machine(pfn). The last routine looks up the PFN->MFN lookup
>> table and finds a MFN that corresponds to this PFN. Since the memory
>> was allocated from ... well this is the big question.

by ttm_tt_alloc_page, and it is not backed by video memory. The nouveau folks have just added a patch that disables pushbuf in video memory.

>> Is the memory allocated from normal kernel space or is really backed by
>> the video card. In your previous e-mails you mentioned that is_iomem is
>> set to zero, which implies that the memory for these functions is NOT
>> memory backed.

Right. See continuation below.

>> the VM_IO is OK if the memory that is being referenced is the video
>> driver memory. _BUT_ if the memory is being allocated through the
>> alloc_page (ttm_tt_alloc_page) , or kmalloc, then this will cause us
>> headaches. You might want to check in ttm_bo_vm_fault what the
>> vma->vm_flags are and if VM_IO is set.
>>
>> (FYI, look at
>> http://git.kernel.org/?p=linux/kernel/git/konrad/xen.git;a=commit;h=e84db8b7136d1b4a393dbd982201d0c5a3794333)

Sadly, VM_IO is set. Tried not setting it (in ttm_bo_map) - works on bare-boot, but crashes (very) hard on Xen. Tried setting it conditional on io_mem; same result. No logs even, so don't know what happened.

> Output: (snipped/cut/pasted for easier association)
> [...]

Note that the patch now effectively looks up page_address(address).

> Do you know what is happening? Is a solution feasible?
>
> Sequence of nouveau operation as I understand it:
> 1. prepare for user pushbuf write by grabbing memory access rights
> (exclude GPU access)
> 2. Do the write
> 3. finish and release grab
>
> The memory may/maynot be on the video card. There is a vram_pushbuf
> module option which would probably complicate things more.

The option has now been made ineffective by the nouveau folks.

Continuation, after some code reading:

The pushbuf needs to be backed by some memory - any memory. The memory is allocated after the mmap call (which sets up the VM), by the fault handler. The user-space program (X thro' libdrm-nouveau) issues ioctls that are effectively sync_to_cpu/device and writes to the buffer - thereby invoking the fault handler. After writing, ioctls are issued to sync, ending up in nouveau_pushbuf_flush - which treats the pushbuf memory as __force user-space memory and does a DRM_COPY_FROM_USER, which is in fact a copy_from_user into a locally allocated (GFP_KERNEL) buffer, and writes it out to the video card (basically iomem writes). What gets written triggers activity on the GPU (the parameters having been set and associated with some other buffers) and the caller waits on an event_queue for notification.

Strange way of doing things, but guess this is the ground-work for the future of GEM. All that is achieved is to hide buffer allocations from the user - this may be important - cos if the fault handler installs only one page and leaves the rest to be allocated by future faults - the video card hangs in PFIFO errors! Hence the special value of SPECULATIVE_PRE_INSTALL - 16 pages.

I suppose that in xen-boot, the pages are installed to the wrong address. The installed pages need NOT be contiguous - the contiguous pages happen only at the first X invocation on a bare boot. So choosing a different allocator is possible in the fault handler - the problem is in the freeing of the allocation.
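A minimal sketch of the speculative pre-install behaviour described above, assuming a simple array of per-page "installed" flags in place of real page-table entries. The names `toy_prefault` and `TOY_NUM_PREFAULT` are hypothetical; only the count of 16 pages comes from the message. It is a user-space model, not the TTM fault handler.

```c
#include <stdbool.h>
#include <stddef.h>

/* The "16 pages" mentioned in the message; the macro name is hypothetical. */
#define TOY_NUM_PREFAULT 16

/*
 * Model of a fault handler that installs the faulting page plus the
 * pages that follow it, up to TOY_NUM_PREFAULT pages or the end of
 * the buffer, whichever comes first.
 * Returns the number of pages newly installed by this call.
 */
static size_t toy_prefault(bool *installed, size_t num_pages, size_t fault_page)
{
    size_t count = 0;

    for (size_t i = 0; i < TOY_NUM_PREFAULT; i++) {
        size_t page = fault_page + i;

        if (page >= num_pages)
            break;              /* ran off the end of the buffer */
        if (!installed[page]) {
            installed[page] = true;
            count++;
        }
    }
    return count;
}
```

With a 32-page buffer, a first fault on page 0 installs pages 0 to 15 in one go, which matches the "two pages shown plus 14 more" pattern in the traces.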
Sorry for the messed-up postings.
On Sun, Mar 7, 2010 at 2:29 AM, Arvind R <arvino55@gmail.com> wrote:
> On Sat, Mar 6, 2010 at 1:53 AM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>>> (FYI, look at
>>> http://git.kernel.org/?p=linux/kernel/git/konrad/xen.git;a=commit;h=e84db8b7136d1b4a393dbd982201d0c5a3794333)

THAT SOLVED THE FAULTING; OUT_RING now completes under Xen.

My typo and testing mistakes.
Patched ttm_bo_mmap:

    vma->vm_flags |= VM_RESERVED | VM_MIXEDMAP | VM_DONTEXPAND;
    if (bo->type != ttm_bo_type_device)
            vma->vm_flags |= VM_IO;

Then, put sleep and exit in libdrm OUT_RING.
The fault-handler worked fine!

One question - How to get DMA addresses for user-buffers under Xen.
Will work on that.

HUGE THANKS!
On Sun, Mar 07, 2010 at 05:26:12AM +0530, Arvind R wrote:
> >>> (FYI, look at
> >>> http://git.kernel.org/?p=linux/kernel/git/konrad/xen.git;a=commit;h=e84db8b7136d1b4a393dbd982201d0c5a3794333)
>
> THAT SOLVED THE FAULTING; OUT_RING now completes under Xen.

That is great! Thanks for doing all the hard work in digging through the code.

> My typo and testing mistakes.
> Patched ttm_bo_mmap
> vma->vm_flags |= VM_RESERVED | VM_MIXEDMAP | VM_DONTEXPAND;
> if (bo->type != ttm_bo_type_device)
> vma->vm_flags |= VM_IO;
>
> Then, put sleep and exit in libdrm OUT_RING.
> The fault-handler worked fine!

So this means you got graphics on the screen? Or at least that Kernel Mode Setting and the DRM parts show fancy graphics during boot?

> One question - How to get DMA addresses for user-buffers under Xen.

This is the X part, right? Where the X driver takes control of the GPU and starts having fun? I am not that familiar with how the nouveau DRM module hands over the pointers and such to the X driver. Does it reset it and start from scratch (as if you had no KMS enabled)? Or does it use the allocated buffers and such and then ask for more using an ioctl such as DRM_ALLOCATE_SCATTER_GATHER (don't remember if that was the right name)?

But to answer your question, the DMA address is actually the MFN (machine frame number) bit-shifted left by twelve, with an offset added. The debug patch I provided gets that from the PTE value:

    if (xen_domain()) {
    +       phys = (pte_mfn(*pte) << PAGE_SHIFT) + offset;

'phys' now holds the physical address that the PCI bus (and the video card) would use to request data. Please keep in mind that 'pte_mfn' is a special Xen function. Normally one would do 'pte'.

There is a layer of indirection in the Linux pvops kernel that makes this a bit funny. Most of the time you get something called a GPFN, which is a pseudo-physical MFN. Then there is a translation of PFN to MFN (or vice versa). For pages that are being utilized for PCI devices (and that have the _PAGE_IOMAP PTE flag set), the GPFN is actually the MFN, while for the rest (like the pages allocated by the mmap and then stitched up in the ttm_bo_vm_fault handler), it is the PFN.

.. back to the DMA part. When kernel subsystems do DMA they go through a PCI DMA API. This API has things such as 'dma_map_page', which through layers of indirection calls the Xen SWIOTLB layer. The Xen SWIOTLB is smart enough (actually, enlighten.c is) to distinguish whether the page has _PAGE_IOMAP set and to figure out whether the PTE holds an MFN or a PFN. Either way, the PCI DMA API _always_ returns the DMA address for pages. So as long as a user buffer has a 'struct page' backing it, it should be possible to get the DMA address.

Hopefully I've not confused this matter :-(

> Will work on that.
>
> HUGE THANKS!

Oh, thank you!
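The "MFN shifted by twelve plus offset" arithmetic above can be checked against the Xen-boot trace from earlier in the thread. This is a user-space sketch only: `toy_pte_mfn` and `toy_bus_addr` are hypothetical helpers, and the frame-number mask (bits 12 to 45 of the PTE) is an assumption made for the illustration, not a value quoted in the thread.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT   12
/* Assumed frame-number field of a PTE: bits 12..45. Illustrative only. */
#define TOY_PTE_PFN_MASK 0x00003ffffffff000ULL

/* Extract the machine frame number from a raw PTE value. */
static uint64_t toy_pte_mfn(uint64_t pte_val)
{
    return (pte_val & TOY_PTE_PFN_MASK) >> TOY_PAGE_SHIFT;
}

/*
 * Compute the address a device would put on the bus for a given
 * virtual address: machine frame from the PTE, shifted up by the
 * page size, plus the offset of the address within its page.
 */
static uint64_t toy_bus_addr(uint64_t pte_val, uint64_t vaddr)
{
    uint64_t offset = vaddr & ((1ULL << TOY_PAGE_SHIFT) - 1);

    return (toy_pte_mfn(pte_val) << TOY_PAGE_SHIFT) + offset;
}
```

For the traced PTE value 0x10000068042067, the extracted frame is 0x68042, which is 426050 in decimal and agrees with the `mfn:426050` field printed by the debug patch.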
On Mon, Mar 8, 2010 at 11:21 PM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> On Sun, Mar 07, 2010 at 05:26:12AM +0530, Arvind R wrote:
>> THAT SOLVED THE FAULTING; OUT_RING now completes under Xen.
>
> That is great! Thanks for doing all the hard-work in digging through the
> code.
>
> So this means you got graphics on the screen? Or at least that Kernel
> Mode Setting and the DRM parts show fancy graphics during boot?

AT LAST, yes! Patch: (after about 600 reboots!)

diff -Naur nouveau-kernel.orig/drivers/gpu/drm/ttm/ttm_bo_vm.c nouveau-kernel.new/drivers/gpu/drm/ttm/ttm_bo_vm.c
--- nouveau-kernel.orig/drivers/gpu/drm/ttm/ttm_bo_vm.c 2010-01-27 10:19:28.000000000 +0530
+++ nouveau-kernel.new/drivers/gpu/drm/ttm/ttm_bo_vm.c 2010-03-10 17:28:59.000000000 +0530
@@ -271,7 +271,10 @@
 	 */
 
 	vma->vm_private_data = bo;
-	vma->vm_flags |= VM_RESERVED | VM_IO | VM_MIXEDMAP | VM_DONTEXPAND;
+	vma->vm_flags |= VM_RESERVED | VM_MIXEDMAP | VM_DONTEXPAND;
+	if (!((bo->mem.placement & TTM_PL_MASK_MEM) & TTM_PL_FLAG_TT))
+		vma->vm_flags |= VM_IO;
+	vma->vm_page_prot = vma_get_vm_prot(vma->vm_flags);
 	return 0;
 out_unref:
 	ttm_bo_unref(&bo);

The previous patch worked for memory-space exported to user via mmap. That worked for the pushbuf, but not for mode-setting (I guess). The ensuing crashes were hard - no logs, nothing. So had to devise ways of forcing log-writing before crashing (and praying). Then located the iomem problem and had to search the code for the appropriate condition. And setting the vm_page_prot IS important!

Nouveau does kernel-modesetting only. The framebuffer device uses channel 1 and is as regular a framebuffer as any other. 2D graphics operations use channel 2 (xf86-video-nouveau). 3D graphics (gallium) use a channel for every 3D window. There are 128 channels, 0 and 127 being reserved. Every channel has a dma-engine which is user triggered thro' pushbuffer rings. Every DMA has a 1MiB VRAM space which forms one of the targets of DMA ops - the other being in the opaque GPU-space. The BO encapsulates the virtual-address space of the user VM, and the GPU-DMA is provided a constructed PageTable that is consistent with the kernel view of that space. The GEM_NEW ioctl sets up the whole space-management machinery, the user-space is mmaped out, and the operations are triggered thro the pushbuf.

> But to answer your question, the DMA address is actually the MFN
> (machine frame number) which is bitshifted by twelve and an offset
> added. The debug patch I provided gets that from the
>
> PTE value:
>
> if (xen_domain()) {
> + phys = (pte_mfn(*pte) << PAGE_SHIFT) + offset;
>
> The 'phys' now has the physical address that PCI bus (and the video
> card) would utilize to request data to. Please keep in mind that the
> 'pte_mfn' is a special Xen function. Normally one would do 'pte'.
>
> There is a layer of indirection in the Linux pvops kernel that makes
> this a bit funny. Mainly most of the time you get something called GPFN
> which is a psedu-physical MFN. Then there is a translation of PFN to
> MFN (or vice-versa). For pages that are being utilized for PCI devices
> (and that have _PAGE_IOMAP PTE flag set), the GPFN is actually the MFN,
> while for the rest (like the pages allocated by the mmap and then
> stitched up in the ttm_bo_fault handler), it is the PFN.
>
> .. back to the DMA part. When kernel subsystems do DMA they go through a
> PCI DMA API. This API has things such as 'dma_map_page', which through
> layers of indirection calls the Xen SWIOTLB layer. The Xen SWIOTLB is
> smart enough (actually, the enligthen.c) to distinguish if the page has
> _PAGE_IOMAP set or not and to figure out if the PTE has a MFN or PFN.
>
> Hopefully I've not confused this matter :-(

On the contrary, a neat essence of the matter - only wish it was clear to me a month ago :-(

YAHOO! (just a simple shout)
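The condition in the patch above can be modelled in user space to see which placements end up with VM_IO. This is a sketch under stated assumptions: every `TOY_` constant is a stand-in value chosen for the illustration and should be checked against the kernel headers, and `toy_mmap_vm_flags` is a hypothetical helper, not a TTM function. Only the shape of the test, `!((placement & MASK_MEM) & FLAG_TT)`, is taken from the patch.

```c
#include <stdint.h>

/* Stand-in flag values for illustration; not authoritative. */
#define TOY_VM_IO          0x00004000u
#define TOY_VM_DONTEXPAND  0x00040000u
#define TOY_VM_RESERVED    0x00080000u
#define TOY_VM_MIXEDMAP    0x10000000u

#define TOY_PL_FLAG_SYSTEM (1u << 0)
#define TOY_PL_FLAG_TT     (1u << 1)
#define TOY_PL_FLAG_VRAM   (1u << 2)
#define TOY_PL_MASK_MEM    0x0000ffffu

/*
 * Mirror of the patched flag selection: always set the three base
 * flags, and add VM_IO only when the buffer is NOT placed in TT
 * (page-backed, GPU-mappable system memory).
 */
static uint32_t toy_mmap_vm_flags(uint32_t placement)
{
    uint32_t flags = TOY_VM_RESERVED | TOY_VM_MIXEDMAP | TOY_VM_DONTEXPAND;

    if (!((placement & TOY_PL_MASK_MEM) & TOY_PL_FLAG_TT))
        flags |= TOY_VM_IO;
    return flags;
}
```

One thing the model makes visible: with this condition a buffer placed only in system memory still gets VM_IO, just like a VRAM buffer. Whether that is intended is not stated in the thread.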
On Wed, Mar 10, 2010 at 06:20:42PM +0530, Arvind R wrote:
> On Mon, Mar 8, 2010 at 11:21 PM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > So this means you got graphics on the screen? Or at least that Kernel
> > Mode Setting and the DRM parts show fancy graphics during boot?
>
> AT LAST, yes! Patch: (after about 600 reboots!)

Cool, congratulations!

-- Pasi
On 03/10/2010 04:50 AM, Arvind R wrote:
> On Mon, Mar 8, 2010 at 11:21 PM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>> So this means you got graphics on the screen? Or at least that Kernel
>> Mode Setting and the DRM parts show fancy graphics during boot?
>
> AT LAST, yes! Patch: (after about 600 reboots!)
>
> diff -Naur nouveau-kernel.orig/drivers/gpu/drm/ttm/ttm_bo_vm.c nouveau-kernel.new/drivers/gpu/drm/ttm/ttm_bo_vm.c
> --- nouveau-kernel.orig/drivers/gpu/drm/ttm/ttm_bo_vm.c 2010-01-27 10:19:28.000000000 +0530
> +++ nouveau-kernel.new/drivers/gpu/drm/ttm/ttm_bo_vm.c 2010-03-10 17:28:59.000000000 +0530
> @@ -271,7 +271,10 @@
>  	 */
>
>  	vma->vm_private_data = bo;
> -	vma->vm_flags |= VM_RESERVED | VM_IO | VM_MIXEDMAP | VM_DONTEXPAND;
> +	vma->vm_flags |= VM_RESERVED | VM_MIXEDMAP | VM_DONTEXPAND;
> +	if (!((bo->mem.placement & TTM_PL_MASK_MEM) & TTM_PL_FLAG_TT))
> +		vma->vm_flags |= VM_IO;
> +	vma->vm_page_prot = vma_get_vm_prot(vma->vm_flags);
>  	return 0;
>  out_unref:
>  	ttm_bo_unref(&bo);

Cool, nice and simple. Can you write it up as a proper patch for submission to upstream?

Thanks,
J
On Fri, Mar 12, 2010 at 1:45 AM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>> >> THAT SOLVED THE FAULTING; OUT_RING now completes under Xen.
>
> :-)
>
>> > So this means you got graphics on the screen? Or at least that Kernel
>> > Mode Setting and the DRM parts show fancy graphics during boot?
>>
>> AT LAST, yes! Patch: (after aboout 600 reboots!)
>>
>> diff -Naur nouveau-kernel.orig/drivers/gpu/drm/ttm/ttm_bo_vm.c nouveau-kernel.new/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> --- nouveau-kernel.orig/drivers/gpu/drm/ttm/ttm_bo_vm.c 2010-01-27 10:19:28.000000000 +0530
>> +++ nouveau-kernel.new/drivers/gpu/drm/ttm/ttm_bo_vm.c 2010-03-10 17:28:59.000000000 +0530
>> @@ -271,7 +271,10 @@
>>  	 */
>>
>>  	vma->vm_private_data = bo;
>> -	vma->vm_flags |= VM_RESERVED | VM_IO | VM_MIXEDMAP | VM_DONTEXPAND;
>> +	vma->vm_flags |= VM_RESERVED | VM_MIXEDMAP | VM_DONTEXPAND;
>> +	if (!((bo->mem.placement & TTM_PL_MASK_MEM) & TTM_PL_FLAG_TT))
>> +		vma->vm_flags |= VM_IO;
>> +	vma->vm_page_prot = vma_get_vm_prot(vma->vm_flags);
>>  	return 0;
>>  out_unref:
>>  	ttm_bo_unref(&bo);

Sorry for the typo: vma_get_vm_prot in the last added line should be vm_get_page_prot.

>> The previous patch worked for memory-space exported to user via
>> mmap. That worked for the pushbuf, but not for mode-setting (I guess).
>> The ensuing crashes were hard - no logs, nothing. So had to devise
>> ways of forcing log-writing before crashing (and praying). The located
>> iomem problem and had search code for appropriate condition.
>
> Aaah.
>
>> And setting the vm_page_prot IS important!
>>
>> Nouveau does kernel-modesetting only. The framebuffer device uses
>> channel 1 and is as regular a framebuffer as any other. 2D graphics
>> operations use channel 2 (xf86-video-nouveau). 3D graphics (gallium)
>> use a channel for every 3D window. There are 128 channels, 0 and 127
>> being reserved. Every channel has a dma-engine which is user triggered
>
> What happens if you use only one channel? Does it grow to accomodate
> more of the writes to the ring?

One channel for one composition. So channel 1 for the consolefb device. So if X is set to omit acceleration, it works thro the consolefb. Channel 2 is set up for 2D graphics, which is all that xf86-video-nouveau (the DDX component) supports. Channel 3 is set up for 3D acceleration provided by gallium (Mesa) - sort of tunneling thro the DDX layer. If you run glxgears in a window, Channel 4 will be set up for the application. Each 'Channel' is self-contained with pushbufs, dma, bo, gpuobject ....

>> thro' pushbuffer rings. Every DMA has a 1MiB VRAM space which forms one
>> of the targets of DMA ops - the other being in the opaque GPU-space. The
>
> So 1MiB per channel? Is this how the textures get loaded via this 1MiB
> VRAM?

Yes.

>> BO encapsualtes the virtual-address space of the user VM. and the GPU-DMA
>> is provided a constructed PageTable that is consistent with the kernel view of
>> that space. The GEM_NEW ioctl sets up the whole space-management machinery,
>> the user-space is mmaped out, and the operations triggered thro the pushbuf.
>
> So when the write to the RING is done, the GPU accesses the System RAM memory.
> What is then the deal with the 512MB or so video cards? Is that only
> used for putting textures on it?

Half the memory is used as the viewport to the system CPU and the other half is the GPU system. The system/user transfer to the viewport (and controls via the iomem space). The DMA is NOT programmed in the conventional way - it has the lowest-level pagetable created for the instance's 1MiB space (to which it is bound) and the other end is managed by the GPU intelligence.

>> YAHOO! (just a simple shout)
>
> <grins> Thank you for solving this problem! If you ever are in the
> Boston area give me a ring and the beers (or your favorite liquid) is on
> me!

Vice versa if you are around Chennai, India.

Actually, I'm neither an expert on the deep internals of the kernel (though I'm getting to know more about it) nor on the new generation of graphics. Just reading about the graphics devices of today got me frustrated because I could not get it up on Xen, and the debates on TTM and GEM made me think that either something was drastically wrong or something stupid was missed. So this mission.

Video cards have loads of specialised processors - often dozens of them. CUDA is an environment/architecture that allows normal C programs to use these cores - someday graphics cards will also do graphics! So somebody will have to ensure that Xen in the future is enabled for it - it doesn't stop with Direct-Rendering - which also needs enhancements.

Having X accelerated brought down dom0 CPU usage from 15-30% on my system to 1-5% when running a sample WinXP (with meadowcourt PVops drivers) domU!

PS. I sent a mail to your personal address with all the patches I used in the workout - am attaching it here too - in case it is of interest to somebody. You should look at the correct_section_mismatch.patch for what it is worth.