thr3ads.net - Xen devel - [Xen-devel] Fbdev graphics broken in xen/next dom0 [Mar 2010]

Eamon Walsh

2010-Mar-12 20:24 UTC

[Xen-devel] Fbdev graphics broken in xen/next dom0

Hello:

I work in the same group as Dave Quigley and George Coker.  I'm working
on a graphical switcher application for Xen which uses the DirectFB
library on top of Linux VESA fbdev.  This runs in dom0 at the moment.
I'm using the latest xen/next pvops dom0 and xen-unstable hypervisor
compiled from source, with vga=ask so I can boot dom0 in a graphical mode.

The problem I'm having is illustrated by the attached test program that
displays a green background with a white square for 10 seconds when run
as root.  It doesn't work on the xen/next / xen-unstable combo.  The
program runs and exits normally but all I see is a black screen.

The program *does* work on xen/next running on the bare metal.  It also
works using the xen-unstable hypervisor with an older dom0, the 2.6.31.4
kernel with Novell patches.  So I think the issue is in the xen/next
kernel.  I've run the test program on different machines and observed
the same behavior.

The xen-unstable / 2.6.31.4 dom0 combination works and I'm using that
for the moment but I'd like to be using pvops.  I would be happy to run
more tests / provide more data if needed.


-- 

Eamon Walsh 
National Security Agency



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jeremy Fitzhardinge

2010-Mar-12 21:42 UTC

Re: [Xen-devel] Fbdev graphics broken in xen/next dom0

On 03/12/2010 12:24 PM, Eamon Walsh wrote:
> I work in the same group as Dave Quigley and George Coker.  I'm working
> on a graphical switcher application for Xen which uses the DirectFB
> library on top of Linux VESA fbdev.  This runs in dom0 at the moment.
> I'm using the latest xen/next pvops dom0 and xen-unstable hypervisor
> compiled from source, with vga=ask so I can boot dom0 in a graphical mode.
>
> The problem I'm having is illustrated by the attached test program that
> displays a green background with a white square for 10 seconds when run
> as root.  It doesn't work on the xen/next / xen-unstable combo.  The
> program runs and exits normally but all I see is a black screen.
>
> The program *does* work on xen/next running on the bare metal.  It also
> works using the xen-unstable hypervisor with an older dom0, the 2.6.31.4
> kernel with Novell patches.  So I think the issue is in the xen/next
> kernel.  I've run the test program on different machines and observed
> the same behavior.
>
> The xen-unstable / 2.6.31.4 dom0 combination works and I'm using that
> for the moment but I'd like to be using pvops.  I would be happy to run
> more tests / provide more data if needed.
>
What's the hardware?  Do any messages appear either on the dom0 console
or the Xen console?  Does booting with a vga console help?

Thanks,
     J

Eamon Walsh

2010-Mar-13 00:44 UTC

Re: [Xen-devel] Fbdev graphics broken in xen/next dom0

On 03/12/2010 04:42 PM, Jeremy Fitzhardinge wrote:
> On 03/12/2010 12:24 PM, Eamon Walsh wrote:
>> I work in the same group as Dave Quigley and George Coker.  I'm working
>> on a graphical switcher application for Xen which uses the DirectFB
>> library on top of Linux VESA fbdev.  This runs in dom0 at the moment.
>> I'm using the latest xen/next pvops dom0 and xen-unstable hypervisor
>> compiled from source, with vga=ask so I can boot dom0 in a graphical mode.
>>
>> The problem I'm having is illustrated by the attached test program that
>> displays a green background with a white square for 10 seconds when run
>> as root.  It doesn't work on the xen/next / xen-unstable combo.  The
>> program runs and exits normally but all I see is a black screen.
>>
>> The program *does* work on xen/next running on the bare metal.  It also
>> works using the xen-unstable hypervisor with an older dom0, the 2.6.31.4
>> kernel with Novell patches.  So I think the issue is in the xen/next
>> kernel.  I've run the test program on different machines and observed
>> the same behavior.
>>
>> The xen-unstable / 2.6.31.4 dom0 combination works and I'm using that
>> for the moment but I'd like to be using pvops.  I would be happy to run
>> more tests / provide more data if needed.
>
> What's the hardware?  Do any messages appear either on the dom0 console
> or the Xen console?  Does booting with a vga console help?
>
The hardware is a Dell Latitude E6500 with nvidia graphics.  I also see
the issue on a Dell Optiplex 960 desktop with Intel graphics.  No
obvious messages on the consoles.  I am booting in VGA mode.

I have narrowed the problem down: it has something to do with mmap of
/dev/fb0 not syncing.  The attached C code mmaps /dev/fb0 and writes
some random bits.  On a configuration that does work (2.6.31.4 on
4.0-rc6, or xen/next on bare metal) the random bits are visible on the
screen.  With xen/next on 4.0-rc6, nothing is visible.  Calling msync()
before the sleep has no effect.  Also, using write() on /dev/fb0 always
works so it appears to be mmap related.
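
The attachment is not preserved in this archive. As a rough sketch of the kind of test described above (an assumption, not the original program), the helper below maps a file shared and read/write, fills it with a 32-bit pattern, and calls msync() on a best-effort basis. The name fill_via_mmap and its parameters are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map the first `len` bytes of `path` shared and read/write, then fill the
 * mapping with a repeating 32-bit pattern. Returns 0 on success, -1 if the
 * file cannot be opened or mapped.
 * Hypothetical helper: not the test program attached to the original mail.
 */
int fill_via_mmap(const char *path, size_t len, uint32_t pattern)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Store through the mapping; on a framebuffer this should be visible. */
    volatile uint32_t *px = map;
    for (size_t i = 0; i < len / sizeof(*px); i++)
        px[i] = pattern;

    /* Best effort only: the result is ignored because a device mapping
     * may not support syncing at all. */
    (void)msync(map, len, MS_SYNC);

    munmap(map, len);
    close(fd);
    return 0;
}
```

Calling fill_via_mmap("/dev/fb0", len, pattern) as root, with len no larger than the framebuffer, would be the case discussed here. The same helper works on an ordinary file, which makes it easy to check the mmap logic separately from the framebuffer driver.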


-- 

Eamon Walsh 
National Security Agency



Jeremy Fitzhardinge

2010-Mar-13 00:51 UTC

Re: [Xen-devel] Fbdev graphics broken in xen/next dom0

On 03/12/2010 04:44 PM, Eamon Walsh wrote:
> I have narrowed the problem down: it has something to do with mmap of
> /dev/fb0 not syncing.  The attached C code mmaps /dev/fb0 and writes
> some random bits.  On a configuration that does work (2.6.31.4 on
> 4.0-rc6, or xen/next on bare metal) the random bits are visible on the
> screen.  With xen/next on 4.0-rc6, nothing is visible.  Calling msync()
> before the sleep has no effect.  Also, using write() on /dev/fb0 always
> works so it appears to be mmap related.
>
Yes.  I suspect there's a missing VM_IO in there, and so the mmap is
mapping the wrong pages (if you're lucky you might be able to crash the
machine to see something juicy).

     J

Konrad Rzeszutek Wilk

2010-Mar-16 00:46 UTC

Re: [Xen-devel] Fbdev graphics broken in xen/next dom0

On Fri, Mar 12, 2010 at 04:51:30PM -0800, Jeremy Fitzhardinge wrote:
> On 03/12/2010 04:44 PM, Eamon Walsh wrote:
>> I have narrowed the problem down: it has something to do with mmap of
>> /dev/fb0 not syncing.  The attached C code mmaps /dev/fb0 and writes

Is the machine spinning? Meaning, if you start writing to the mmap
region, does the machine look stuck?

>> some random bits.  On a configuration that does work (2.6.31.4 on
>> 4.0-rc6, or xen/next on bare metal) the random bits are visible on the
>> screen.  With xen/next on 4.0-rc6, nothing is visible.  Calling msync()
>> before the sleep has no effect.  Also, using write() on /dev/fb0 always
>> works so it appears to be mmap related.
>>    
>
> Yes.  I suspect there's a missing VM_IO in there, and so the mmap is
> mapping the wrong pages (if you're lucky you might be able to crash the
> machine to see something juicy).

<scratches his head>

The nvidia framebuffer (drivers/video/nvidia/nvidia.c) does this:

        info->screen_base = ioremap(nvidiafb_fix.smem_start, par->FbMapSize);

where the start of memory is obtained via

        nvidiafb_fix.smem_start = pci_resource_start(pd, 1);

I believe the 'ioremap' works pretty well, otherwise we would have other
PCI devices having trouble.

... and in another file (fbmem.c):

static int
fb_mmap(struct file *file, struct vm_area_struct * vma)
..
        start = info->fix.smem_start;
..
        /* This is an IO map - tell maydump to skip this VMA */
        vma->vm_flags |= VM_IO | VM_RESERVED;

.. it _does_ set the VM_IO, but that is OK since the memory is actually
backed by the PCI device.

Eamon, can you provide more detailed serial output? That could shed some
light on this. Another thing you could try, to make sure you are actually
hitting the right mmap, is to instrument fb_mmap. I would recommend
printing out vma->vm_start, vma->vm_end, and start to see if they look
reasonable.



Eamon Walsh

2010-Mar-16 21:52 UTC

Re: [Xen-devel] Fbdev graphics broken in xen/next dom0

On 03/15/2010 08:46 PM, Konrad Rzeszutek Wilk wrote:
> On Fri, Mar 12, 2010 at 04:51:30PM -0800, Jeremy Fitzhardinge wrote:
>> On 03/12/2010 04:44 PM, Eamon Walsh wrote:
>>> I have narrowed the problem down: it has something to do with mmap of
>>> /dev/fb0 not syncing.  The attached C code mmaps /dev/fb0 and writes
>
> Is the machine spinning? Meaning, if you start writing to the mmap
> region, does the machine look stuck?
>
No, the machine keeps running just fine.  However, I tried reading out
of the mmap region and it is definitely not framebuffer memory: on one
of my machines it is filled with some kind of binary data that is not
there if I just read() from /dev/fb0.
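
The comparison described above can be sketched as follows. This is only an illustration under stated assumptions, not the code that was actually run: the helper name count_mmap_read_mismatches is hypothetical, and it simply counts bytes that differ between a read() of the file and a shared mmap() of the same range.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Count the bytes in the first `len` bytes of `path` that differ between a
 * plain read() and a shared mmap() view. Returns -1 on any error.
 * Hypothetical helper for illustration only.
 */
long count_mmap_read_mismatches(const char *path, size_t len)
{
    long diff = -1;
    size_t got = 0;
    unsigned char *buf = NULL;
    const unsigned char *m;
    void *map = MAP_FAILED;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    buf = malloc(len);
    if (!buf)
        goto out;

    /* View 1: read() goes through the driver's file operations. */
    while (got < len) {
        ssize_t n = read(fd, buf + got, len - got);
        if (n <= 0)
            goto out;
        got += (size_t)n;
    }

    /* View 2: whatever pages the kernel installs for the mapping. */
    map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out;

    diff = 0;
    m = map;
    for (size_t i = 0; i < len; i++)
        if (m[i] != buf[i])
            diff++;
out:
    if (map != MAP_FAILED)
        munmap(map, len);
    free(buf);
    close(fd);
    return diff;
}
```

On a quiet console, a large mismatch count for /dev/fb0 would be consistent with the mapping pointing at the wrong pages. A live framebuffer can change between the two reads, so a small nonzero count is not conclusive by itself.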


>>> some random bits.  On a configuration that does work (2.6.31.4 on
>>> 4.0-rc6, or xen/next on bare metal) the random bits are visible on the
>>> screen.  With xen/next on 4.0-rc6, nothing is visible.  Calling msync()
>>> before the sleep has no effect.  Also, using write() on /dev/fb0 always
>>> works so it appears to be mmap related.
>>
>> Yes.  I suspect there's a missing VM_IO in there, and so the mmap is
>> mapping the wrong pages (if you're lucky you might be able to crash the
>> machine to see something juicy).
>
> <scratches his head>
>
> The nvidia framebuffer (drivers/video/nvidia/nvidia.c) does this:
>
>         info->screen_base = ioremap(nvidiafb_fix.smem_start, par->FbMapSize);
>
> where the start of memory is obtained via
>
>         nvidiafb_fix.smem_start = pci_resource_start(pd, 1);
>
> I believe the 'ioremap' works pretty well, otherwise we would have other
> PCI devices having trouble.
>
> ... and in another file (fbmem.c):
>
> static int
> fb_mmap(struct file *file, struct vm_area_struct * vma)
> ..
>         start = info->fix.smem_start;
> ..
>         /* This is an IO map - tell maydump to skip this VMA */
>         vma->vm_flags |= VM_IO | VM_RESERVED;
>
> .. it _does_ set the VM_IO, but that is OK since the memory is actually
> backed by the PCI device.
>
> Eamon, can you provide more detailed serial output? That could shed some
> light on this. Another thing you could try, to make sure you are actually
> hitting the right mmap, is to instrument fb_mmap. I would recommend
> printing out vma->vm_start, vma->vm_end, and start to see if they look
> reasonable.
>
The serial output is attached.

The patch I used to instrument the fb_mmap function and the output it
produced for a couple of runs are also attached.

And I tossed in my kernel .config for good measure.

What else is needed?


-- 

Eamon Walsh 
National Security Agency






Konrad Rzeszutek Wilk

2010-Mar-16 22:19 UTC

Re: [Xen-devel] Fbdev graphics broken in xen/next dom0

> 
> The serial output is attached.
> 
> The patch I used to instrument the fb_mmap function and the output it
> produced for a couple of runs are also attached.
> 
> And I tossed in my kernel .config for good measure.
> 
> What else is needed?
It looks like I confused your email with another person's. You don't seem
to run the nvidia fb, but rather the radeon one.

.. snip ..
> Non-volatile memory driver v1.3
> Linux agpgart interface v0.103
> agpgart-intel 0000:00:00.0: Intel Q45/Q43 Chipset
> agpgart-intel 0000:00:00.0: detected 32764K stolen memory
> agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0xd0000000
> tpm_tis 00:08: 1.2 TPM (device-id 0x4A10, rev-id 78)
> [drm] Initialized drm 1.1.0 20060810
> [drm] radeon defaulting to kernel modesetting.
> [drm] radeon kernel modesetting enabled.
> xen_allocate_pirq: returning irq 16 for gsi 16
> Already setup the GSI :16
> i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> [drm] set up 31M of stolen space
> [drm] TMDS-8: set mode 1280x1024 17
> Console: switching to colour frame buffer device 160x64
> fb0: inteldrmfb frame buffer device
> registered panic notifier
> [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
You look to have an i915 framebuffer on your box.

I *think* that the i915 is not using KMS and the TTM stuff, so the
patch that Arvind posted would probably not help you.
http://www.mail-archive.com/dri-devel@lists.sourceforge.net/msg48668.html

So, let's boot your kernel with these command line parameters to get more
data: debug initcall_debug drm.debug=255

That should spew out some more details.

Next thing I would suggest is to instrument i915_gem_fault. Attached is
a patch that does it (though it is not compile tested nor actually
booted so it might need some hand crafting - sorry).

And the other thing is to read through the steps that Arvind took in the
e-mail thread titled: "Nouveau on dom0". It covers the gamut of things
to troubleshoot this.

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fba37e9..cfcaafd 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -33,6 +33,8 @@
 #include "intel_drv.h"
 #include <linux/swap.h>
 #include <linux/pci.h>
+#include <xen/xen.h>
+#include <asm/xen/page.h>
 
 #define I915_GEM_GPU_DOMAINS	(~(I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT))
 
@@ -1145,6 +1147,143 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
+void print_pte(struct vm_area_struct *vma, char *what, struct page *page, unsigned int pfn, unsigned long address)
+{
+	static const char * const level_name[] =
+	  { "NONE", "4K", "2M", "1G", "NUM" };
+	unsigned long addr = 0;
+	pte_t *pte = NULL;
+	pteval_t val = (pteval_t)0;
+	unsigned int level = 0;
+	unsigned offset;
+	unsigned long phys;
+	pgprotval_t prot;
+	char buf[90];
+	char *str;
+
+	str = buf;
+	// Figure out if the address is pagetable.
+	if (address == 0 && !page && pfn>0) {
+		page = pfn_to_page(pfn);
+	}
+	if (address == 0 && page)
+		addr = (u64)page_address(page);
+
+	if (address && !page)
+		addr = address;
+
+	if (address && page) {
+		addr = (u64)page_address(page);
+		if (address != addr) {
+			if (addr == 0) {
+				str += sprintf(str, "addr(page)==0");
+				addr = address;
+			}
+		}
+	}
+
+	if (pfn != 0 && page) {
+		if (pfn != page_to_pfn(page)) // Gosh!?
+			str += sprintf(str, "pfn!=pfn(page)");
+	}
+	if (pfn != 0 && addr != 0) {
+		if (pfn != virt_to_pfn(addr))
+			str += sprintf(str,"pfn(addr)!=pfn");
+	}
+	pte = lookup_address(addr, &level);
+	if (!pte) {
+		str += sprintf(str,"!pte(addr)");
+		goto print;
+	}
+	offset = addr & ~PAGE_MASK;
+
+	if (xen_domain()) {
+		phys = (pte_mfn(*pte) << PAGE_SHIFT) + offset;		
+		val = pte_val_ma(*pte);
+
+		if (pfn > 0) {
+			if (pte_mfn(*pte) == pfn) {
+				if  (vma->vm_flags & VM_IO)
+					str += sprintf(str,"PHYS");
+				else
+					str += sprintf(str,"BUG: VM_IO not set!");
+			}
+			/* It is a pseudo page ... and the VM_IO flag is set */
+			if (pte_mfn(*pte) != pfn) {
+				if (vma->vm_flags & VM_IO)
+					str += sprintf(str,"BUG: VM_IO flag set!");
+				else
+					str += sprintf(str, "PSEUDO");
+			}
+		} else {
+			str += sprintf(str,"pfn==0");
+		}
+
+	} else {
+		phys = (pte_pfn(*pte) << PAGE_SHIFT) + offset;
+		val = pte_val(*pte);
+	}	
+	prot = pgprot_val(pte_pgprot(*pte));
+
+	if (!prot)
+		str += sprintf(str, "Not present.");
+	else  {
+		if (prot & _PAGE_USER)
+			str += sprintf(str, "USR ");
+		else
+			str += sprintf(str, "    ");
+		if (prot & _PAGE_RW)
+			str += sprintf(str, "RW ");
+		else
+			str += sprintf(str, "ro ");
+		if (prot & _PAGE_PWT)
+			str += sprintf(str, "PWT ");
+		else
+			str += sprintf(str, "    ");
+		if (prot & _PAGE_PCD)
+			str += sprintf(str, "PCD ");
+		else
+			str += sprintf(str, "    ");
+
+		/* Bit 9 has a different meaning on level 3 vs 4 */
+		if (level <= 3) {
+			if (prot & _PAGE_PSE)
+				str += sprintf(str, "PSE ");
+			else
+				str += sprintf(str, "    ");
+		} else {
+			if (prot & _PAGE_PAT)
+				str += sprintf(str, "pat ");
+			else
+				str += sprintf(str, "    ");
+		}
+		if (prot & _PAGE_GLOBAL)
+			str += sprintf(str, "GLB ");
+		else
+			str += sprintf(str, "    ");
+		if (prot & _PAGE_NX)
+			str += sprintf(str, "NX ");
+		else
+			str += sprintf(str, "x  ");
+#ifdef _PAGE_IOMEM
+		if (prot & _PAGE_IOMEM)
+			str += sprintf(str, "IO ");
+		else
+			str += sprintf(str, "   ");
+#endif
+		
+	}
+
+print:
+	printk(KERN_INFO "[%16s]PFN: 0x%lx PTE: 0x%lx (val:%lx): [%s] [%s]\n",
+			what,
+			(unsigned long)pfn,
+			(pte) ? (unsigned long)(pte->pte) : 0,
+			(unsigned long)val,
+			buf,
+			level_name[level]);
+}
+
 /**
  * i915_gem_fault - fault a page into the GTT
  * vma: VMA in question
@@ -1200,8 +1339,10 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	pfn = ((dev->agp->base + obj_priv->gtt_offset) >> PAGE_SHIFT) +
 		page_offset;
 
+	print_pte(vma,"before",  NULL, pfn, 0);
 	/* Finally, remap it using the new GTT offset */
 	ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address, pfn);
+	print_pte(vma, "after",  NULL, pfn, (unsigned long)vmf->virtual_address);
 unlock:
 	mutex_unlock(&dev->struct_mutex);
 


Eamon Walsh

2010-Mar-25 23:55 UTC

Re: [Xen-devel] Fbdev graphics broken in xen/next dom0

On 03/16/2010 06:19 PM, Konrad Rzeszutek Wilk wrote:
>> The serial output is attached.
>>
>> The patch I used to instrument the fb_mmap function and the output it
>> produced for a couple of runs are also attached.
>>
>> And I tossed in my kernel .config for good measure.
>>
>> What else is needed?
>
> It looks like I confused your email with another person's. You don't seem
> to run the nvidia fb, but rather the radeon one.
>
The current machine I am using has Intel integrated graphics but I can
also reproduce the problem on a laptop with nvidia graphics (it runs the
vesafb framebuffer).  After I send this mail I'll recompile on that
machine and see what happens.

I recompiled Xen and pvops/next today.  I included your instrumentation
patch for i915_gem_fault, but it doesn't trigger.  No
instrumentation messages appear.  I even put a print statement at the
top of the function but it never prints.

I have attached the serial console output and dmesg output.  The
initcall and drm debug stuff is present.

Also, I get something new when I run the test program.  It prints out:

# ./silly
Mapped /dev/fb0 at 0x7f3237175000
Killed

Message from syslogd@moss-flapper at Mar 25 19:25:52 ...
 kernel:Bad pagetable: 000f [#1] SMP


And I get the following on the serial console (the deadbeef stuff is the
buffer I just wrote into the mmap):

moss-flapper login: (XEN) d0:v1: reserved bit in page table (ec=000F)
(XEN) Pagetable walk from 00007f3237175000:
(XEN)  L4[0x0fe] = 000000001154a067 00000000001deaec
(XEN)  L3[0x0c8] = 000000001492f067 00000000001db6d1
(XEN)  L2[0x1b8] = 0000000015bc7067 00000000001da569 
(XEN)  L1[0x175] = fffff7fffffff22f ffffffffffffffff
(XEN) ----[ Xen-4.0.0-rc8-pre  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    1
(XEN) RIP:    e033:[<0000003002e8305b>]
(XEN) RFLAGS: 0000000000010206   EM: 0   CONTEXT: pv guest
(XEN) rax: 00007f3237175000   rbx: 0000000000000000   rcx: 0000000000000200
(XEN) rdx: 0000000000001000   rsi: 00007fff42cc42e0   rdi: 00007f3237175000
(XEN) rbp: 00007fff42cc52f0   rsp: 00007fff42cc42c8   r8:  0000000000000001
(XEN) r9:  0000000000000001   r10: 00000000ffffffff   r11: 0000000000001000
(XEN) r12: 00000000004005d0   r13: 00007fff42cc53d0   r14: 0000000000000000
(XEN) r15: 0000000000000000   cr0: 0000000080050033   cr4: 00000000000026f0
(XEN) cr3: 00000000116da000   cr2: 00007f3237175000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=00007fff42cc42c8:
(XEN)    00000000004007e0 cafeababdeadbeef 0000000000000000 cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN)    cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
silly: Corrupted page table at address 7f3237175000
PGD 1deaec067 PUD 1db6d1067 PMD 1da569067 PTE fffffffffffff22f
Bad pagetable: 000f [#1] SMP 
last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
CPU 1 
Modules linked in: nfs fscache bridge stp llc ipt_MASQUERADE iptable_nat nf_nat
nfsd lockd nfs_acl auth_rpcgss export]
Pid: 1775, comm: silly Not tainted 2.6.32-pvops-dom0 #23 OptiPlex 960
RIP: e033:[<0000003002e8305b>]  [<0000003002e8305b>] 0x3002e8305b
RSP: e02b:00007fff42cc42c8  EFLAGS: 00010206
RAX: 00007f3237175000 RBX: 0000000000000000 RCX: 0000000000000200
RDX: 0000000000001000 RSI: 00007fff42cc42e0 RDI: 00007f3237175000
RBP: 00007fff42cc52f0 R08: 0000000000000001 R09: 0000000000000001
R10: 00000000ffffffff R11: 0000000000001000 R12: 00000000004005d0
R13: 00007fff42cc53d0 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f3237162700(0000) GS:ffff880028054000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f3237175000 CR3: 00000001df03a000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process silly (pid: 1775, threadinfo ffff8801df02a000, task ffff8801db68ae60)

RIP  [<0000003002e8305b>] 0x3002e8305b
 RSP <00007fff42cc42c8>
---[ end trace e07c6ddec4199123 ]---
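
For readers decoding the dump above: the error code 000F has the present, write, user and reserved-bit flags set, and the L1 entry printed by Xen has address bits set far above any plausible physical address width. The sketch below is not part of the original mail. The helper names are hypothetical, and the 36-bit physical width used in the usage note is an assumption chosen purely for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits (architectural). */
#define PF_PRESENT (1u << 0) /* fault on a present translation */
#define PF_WRITE   (1u << 1) /* the access was a write */
#define PF_USER    (1u << 2) /* the access came from user mode */
#define PF_RSVD    (1u << 3) /* a reserved bit was set in a paging entry */

/* True if every bit in `mask` is set in error code `ec`. */
bool pf_error_has(unsigned ec, unsigned mask)
{
    return (ec & mask) == mask;
}

/* Frame number stored in bits 12..51 of an x86-64 page-table entry. */
uint64_t pte_frame(uint64_t pte)
{
    return (pte & 0x000ffffffffff000ull) >> 12;
}

/*
 * True if the entry has address bits set at or above `phys_bits`, the
 * CPU's physical address width. The width is a caller-supplied assumption
 * here (12 <= phys_bits <= 52); real code would query CPUID for it.
 */
bool pte_has_reserved_addr_bits(uint64_t pte, unsigned phys_bits)
{
    uint64_t addr_field = 0x000fffffffffffffull; /* bits 0..51 */
    uint64_t valid = (1ull << phys_bits) - 1;    /* bits 0..phys_bits-1 */

    return (pte & addr_field & ~valid) != 0;
}
```

With the values from the walk, pte_frame(0xfffff7fffffff22f) is 0xff7fffffff and pte_has_reserved_addr_bits() reports true for a 36-bit width, while the L2 entry 0000000015bc7067 passes the same check. That matches the "reserved bit in page table" message.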






> .. snip ..

-- 

Eamon Walsh 
National Security Agency





Arvind R

2010-Mar-27 09:14 UTC

Re: [Xen-devel] Fbdev graphics broken in xen/next dom0

On Fri, Mar 26, 2010 at 5:25 AM, Eamon Walsh <ewalsh@tycho.nsa.gov> wrote:
> On 03/16/2010 06:19 PM, Konrad Rzeszutek Wilk wrote:
< --- snip --- >
> I have attached the serial console output and dmesg output.  The
> initcall and drm debug stuff is present.
>
> Also, I get something new when I run the test program.  It prints out:
>
> # ./silly
> Mapped /dev/fb0 at 0x7f3237175000
> Killed
>
> Message from syslogd@moss-flapper at Mar 25 19:25:52 ...
>  kernel:Bad pagetable: 000f [#1] SMP
>
< --- snip --- >
> silly: Corrupted page table at address 7f3237175000
> PGD 1deaec067 PUD 1db6d1067 PMD 1da569067 PTE fffffffffffff22f
> Bad pagetable: 000f [#1] SMP
> last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
> CPU 1
> Modules linked in: nfs fscache bridge stp llc ipt_MASQUERADE iptable_nat
> nf_nat nfsd lockd nfs_acl auth_rpcgss export]
< --- snip --- >
>>> [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
>>>
>> You look to have an i915 framebuffer on your box.
>>
>> I *think* that the i915 is not using KMS and the TTM stuff, so the
>> patch that Arvind posted would probably not help you.
>> http://www.mail-archive.com/dri-devel@lists.sourceforge.net/msg48668.html
>>
>> So, let's boot your kernel with these command line parameters to get more
>> data: debug initcall_debug drm.debug=255
< --- snip --- >
>> e-mail thread titled: "Nouveau on dom0". It covers the gamut of things
>> to troubleshoot this.

This is related and most probably due to the same bit. xf86-video-fbdev works
on bare-metal boot on XenNext with the nouveaufb driver but not on Xen.
Have upgraded whole chain to tip except xen which is 3.4.3rc3
Here is the syslog trace:
kernel: ------------[ cut here ]------------
kernel: WARNING: at arch/x86/mm/pat.c:872 track_pfn_vma_copy+0x4d/0x86()
kernel: Hardware name: System Product Name
kernel: Modules linked in: fbcon font bitblit softcursor nouveau ttm
drm_kms_helper drm cfbcopyarea cfbimgblt cfbfillrect bridge stp llc
ipv6 nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs fuse
kernel: Pid: 5835, comm: Xorg Not tainted 2.6.32-xen0-git20100323+asusp5wd #1
kernel: Call Trace:
kernel:  [<ffffffff8102c834>] ? track_pfn_vma_copy+0x4d/0x86
kernel:  [<ffffffff8102c834>] ? track_pfn_vma_copy+0x4d/0x86
kernel:  [<ffffffff8103ce54>] ? warn_slowpath_common+0x77/0xa3
kernel:  [<ffffffff8102c834>] ? track_pfn_vma_copy+0x4d/0x86
kernel:  [<ffffffff8100c436>] ? xen_leave_lazy_mmu+0x25/0x43
kernel:  [<ffffffff81090c49>] ? copy_page_range+0x76/0x7f8
kernel:  [<ffffffff8100ddc9>] ? xen_force_evtchn_callback+0x9/0xa
kernel:  [<ffffffff8100e572>] ? check_events+0x12/0x20
kernel:  [<ffffffff8100e55f>] ? xen_restore_fl_direct_end+0x0/0x1
kernel:  [<ffffffff8103b1f2>] ? dup_mm+0x276/0x409
kernel:  [<ffffffff8103bd82>] ? copy_process+0x9c8/0x10ff
kernel:  [<ffffffff8103c5ff>] ? do_fork+0x146/0x2c0
kernel:  [<ffffffff810110a3>] ? stub_clone+0x13/0x20
kernel:  [<ffffffff81010d82>] ? system_call_fastpath+0x16/0x1b
kernel: ---[ end trace c58bf004d15b0c42 ]---

Xorg.log ends with the same misleading message as it originally did when
trying accelerated nouveau:
XKB: Failed to compile keymap

fbdev.c calls fbdevHWMapVidmem in xorg-server/hw/xfree86/fbdevhw.c,
which does an mmap as in silly.c.  As far as X is concerned, everything
is fine, but there is obviously a page-fault problem. Will have to set up
debug options and trace :-(
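
For readers without the attachment, the access pattern under discussion (open the device, mmap it shared, then write pixels through the mapping) can be sketched roughly as follows. This is not silly.c itself, which is not reproduced in this thread; the function name `fill_mapping`, the caller-supplied length, and the 4-byte pixel layout are all assumptions made for illustration. A real program would query the framebuffer size from the driver rather than pass it in.

```python
import mmap
import os


def fill_mapping(path, length, pixel):
    """Map `length` bytes of `path` as a shared read/write mapping and
    fill it with the byte pattern `pixel` repeated.

    `path` would be /dev/fb0 in the scenario from this thread; any
    regular file of at least `length` bytes works for a dry run.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        with mmap.mmap(fd, length, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as fb:
            count = length // len(pixel)
            # The mmap call only sets up the mapping. The first store
            # through it is what makes the CPU walk the page tables,
            # which is where a bad pte would be noticed.
            fb[:count * len(pixel)] = pixel * count
    finally:
        os.close(fd)
```

Called as `fill_mapping("/dev/fb0", size, b"\x00\xff\x00\x00")` this would attempt a green fill on a 32-bit BGRX framebuffer, assuming that pixel format and a correct `size`.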

The 'corrupted page table' syndrome is also present in the accelerated
nouveau with AGP cards - so it may be linked to this problem. At least
this problem can be repeated on many platforms :-)


Jeremy Fitzhardinge

2010-Mar-27 21:52 UTC

Re: [Xen-devel] Fbdev graphics broken in xen/next dom0

On 03/27/2010 02:14 AM, Arvind R wrote:
> On Fri, Mar 26, 2010 at 5:25 AM, Eamon Walsh <ewalsh@tycho.nsa.gov> wrote:
>> On 03/16/2010 06:19 PM, Konrad Rzeszutek Wilk wrote:
> < --- snip --- >
>> I have attached the serial console output and dmesg output.  The
>> initcall and drm debug stuff is present.
>>
>> Also, I get something new when I run the test program.  It prints out:
>>
>> # ./silly
>> Mapped /dev/fb0 at 0x7f3237175000
>> Killed
>>
>> Message from syslogd@moss-flapper at Mar 25 19:25:52 ...
>>   kernel:Bad pagetable: 000f [#1] SMP
>>
>>      
> <  --- snip --->
>    
>> silly: Corrupted page table at address 7f3237175000
>> PGD 1deaec067 PUD 1db6d1067 PMD 1da569067 PTE fffffffffffff22f
>> Bad pagetable: 000f [#1] SMP
>> last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
>> CPU 1
>> Modules linked in: nfs fscache bridge stp llc ipt_MASQUERADE iptable_nat
>> nf_nat nfsd lockd nfs_acl auth_rpcgss export]
>>      
> <  --- snip --->
>    
>>>> [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
>>>>
>>>>          
>>> You look to have a i915 framebuffer on your box.
>>>
>>> I *think* that the i915 is not using KMS and the TTM stuff, so the
>>> patch that Arvind posted would probably not help you.
>>>
>>> http://www.mail-archive.com/dri-devel@lists.sourceforge.net/msg48668.html
>>>
>>> So, lets boot your kernel with these command line parameters to get more
>>> data: debug initcall_debug drm.debug=255
>>>        
> <  --- snip --->
>
>    
>>> e-mail thread titled: "Nouveau on dom0". It covers the gamma of things
>>> to troubleshoot this.
>>>        
> This is related and most probably due to the same bit. xf86-video-fbdev works
> on bare-metal boot on XenNext with the nouveaufb driver but not on Xen.
> Have upgraded whole chain to tip except xen which is 3.4.3rc3
> Here is the syslog trace:
> kernel: ------------[ cut here ]------------
> kernel: WARNING: at arch/x86/mm/pat.c:872 track_pfn_vma_copy+0x4d/0x86()
> kernel: Hardware name: System Product Name
> kernel: Modules linked in: fbcon font bitblit softcursor nouveau ttm
> drm_kms_helper drm cfbcopyarea cfbimgblt cfbfillrect bridge stp llc
> ipv6 nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs fuse
> kernel: Pid: 5835, comm: Xorg Not tainted 2.6.32-xen0-git20100323+asusp5wd #1
> kernel: Call Trace:
> kernel:  [<ffffffff8102c834>] ? track_pfn_vma_copy+0x4d/0x86
> kernel:  [<ffffffff8102c834>] ? track_pfn_vma_copy+0x4d/0x86
> kernel:  [<ffffffff8103ce54>] ? warn_slowpath_common+0x77/0xa3
> kernel:  [<ffffffff8102c834>] ? track_pfn_vma_copy+0x4d/0x86
> kernel:  [<ffffffff8100c436>] ? xen_leave_lazy_mmu+0x25/0x43
> kernel:  [<ffffffff81090c49>] ? copy_page_range+0x76/0x7f8
> kernel:  [<ffffffff8100ddc9>] ? xen_force_evtchn_callback+0x9/0xa
> kernel:  [<ffffffff8100e572>] ? check_events+0x12/0x20
> kernel:  [<ffffffff8100e55f>] ? xen_restore_fl_direct_end+0x0/0x1
> kernel:  [<ffffffff8103b1f2>] ? dup_mm+0x276/0x409
> kernel:  [<ffffffff8103bd82>] ? copy_process+0x9c8/0x10ff
> kernel:  [<ffffffff8103c5ff>] ? do_fork+0x146/0x2c0
> kernel:  [<ffffffff810110a3>] ? stub_clone+0x13/0x20
> kernel:  [<ffffffff81010d82>] ? system_call_fastpath+0x16/0x1b
> kernel: ---[ end trace c58bf004d15b0c42 ]---
>
> Xorg.log ends with the same message as originally with trying
> accelerated nouveau with misleading
> XKB: Failed to compile keymap
>
> fbdev.c calls fbdevHWMapVidmem in xorg-server/hw/xfree86/fbdevhw.c
> which does a mmap as in silly.c.  As far as X is concerned, everything
> is fine, but there is obviously a page-fault problem. Will have to setup
> debug options and trace :-(
>
> The 'corrupted page table' syndrome is also present in the accelerated
> nouveau with AGP cards - so it may be linked to this problem. At least
> this problem can be repeated on many platforms :-)
>    
The "corrupt pagetable" comes from the pte having invalid reserved bits
set in it.  I think the failure path is this:

The bad bits get set because someone is doing a pfn->mfn conversion on a
page which is already an mfn, and doesn't have a valid pfn->mfn mapping,
and the result of the conversion is either 0xff... or 0x7f... (I forget
right now).  But either way, a whole lot of bits get set, but nothing
useful.  I'm not quite sure why Xen isn't complaining about this at
set-pte time, but perhaps it looks vaguely valid to it (perhaps it sees
the invalid flags, knows the pte can't be used to access anything, and
allows it to be set?).  But this fault is happening because usermode
gets a tlb miss, and the CPU finds a pte with reserved bits set, and
raises the fault.
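
The arithmetic behind this can be checked against the values in the oops quoted above (PTE fffffffffffff22f, error code 000f). The sketch below is only an illustration: it assumes the failed lookup yields an all-ones frame number (one of the two possibilities mentioned), and the helper names and the 36-bit physical address width are hypothetical choices, not taken from the kernel. The bit names for the error code and the low pte flags are the architectural x86 ones.

```python
PAGE_SHIFT = 12
MASK64 = (1 << 64) - 1

# Architectural x86 page-fault error code bits.
ERR_BITS = {0: "present", 1: "write", 2: "user",
            3: "reserved-bit", 4: "instruction-fetch"}

# Architectural names for the low pte flag bits.
PTE_FLAGS = {0: "P", 1: "RW", 2: "US", 3: "PWT", 4: "PCD",
             5: "A", 6: "D", 7: "PAT", 8: "G"}

# Assumption: the failed pfn->mfn lookup returns all ones.
INVALID_FRAME = MASK64


def decode_bits(value, names):
    """Return the names of the bits that are set in `value`."""
    return [name for bit, name in sorted(names.items())
            if (value >> bit) & 1]


def make_pte(frame, flags):
    """Build a 64-bit pte from a frame number and flag bits."""
    return ((frame << PAGE_SHIFT) | flags) & MASK64


def reserved_bits(pte, maxphyaddr):
    """Return the address bits set at or above the CPU's physical
    address width (bits maxphyaddr..51), which must be zero."""
    mask = ((1 << 52) - 1) & ~((1 << maxphyaddr) - 1)
    return pte & mask


logged_pte = 0xfffffffffffff22f
print(hex(make_pte(INVALID_FRAME, logged_pte & 0xfff)))
print(decode_bits(0x000f, ERR_BITS))
print(decode_bits(logged_pte & 0xfff, PTE_FLAGS))
print(hex(reserved_bits(logged_pte, 36)))  # 36 is an assumed width
```

Under those assumptions an all-ones frame combined with flags 0x22f gives back exactly the logged pte, and the reserved-bit flag in the error code fits a pte whose address field extends past the physical address width.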

I'm not sure about the mm/pat.c warning though.  I had a quick look at
that code, but it wasn't obvious to me what's going on there.  Something
about handling the IO mapping during a fork().  Not sure if it's related
or not.

     J


Arvind R

2010-Mar-28 09:33 UTC

Re: [Xen-devel] Fbdev graphics broken in xen/next dom0

On Sun, Mar 28, 2010 at 3:22 AM, Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> On 03/27/2010 02:14 AM, Arvind R wrote:
>>
>> On Fri, Mar 26, 2010 at 5:25 AM, Eamon Walsh <ewalsh@tycho.nsa.gov> wrote:
>>
>>>
>>> On 03/16/2010 06:19 PM, Konrad Rzeszutek Wilk wrote:
>>>
>>
>> <  --- snip --->
>>
>>>
>>> I have attached the serial console output and dmesg output.  The
>>> initcall and drm debug stuff is present.
>>>
>>> Also, I get something new when I run the test program.  It prints out:
>>>
>>> # ./silly
>>> Mapped /dev/fb0 at 0x7f3237175000
>>> Killed
>>>
>>> Message from syslogd@moss-flapper at Mar 25 19:25:52 ...
>>>  kernel:Bad pagetable: 000f [#1] SMP
>>>
>>>
>>
>> <  --- snip --->
>>
>>>
>>> silly: Corrupted page table at address 7f3237175000
>>> PGD 1deaec067 PUD 1db6d1067 PMD 1da569067 PTE fffffffffffff22f
>>> Bad pagetable: 000f [#1] SMP
>>> last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
>>> CPU 1
>>> Modules linked in: nfs fscache bridge stp llc ipt_MASQUERADE iptable_nat
>>> nf_nat nfsd lockd nfs_acl auth_rpcgss export]
>>>
>>
>> <  --- snip --->
>>
>>>>>
>>>>> [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
>>>>>
>>>>>
>>>>
>>>> You look to have a i915 framebuffer on your box.
>>>>
>>>> I *think* that the i915 is not using KMS and the TTM stuff, so the
>>>> patch that Arvind posted would probably not help you.
>>>>
>>>>
>>>> http://www.mail-archive.com/dri-devel@lists.sourceforge.net/msg48668.html
>>>>
>>>> So, lets boot your kernel with these command line parameters to get more
>>>> data: debug initcall_debug drm.debug=255
>>>>
>>
>> <  --- snip --->
>>
>>
>>>>
>>>> e-mail thread titled: "Nouveau on dom0". It covers the gamma of things
>>>> to troubleshoot this.
>>>>
>>
>> This is related and most probably due to the same bit. xf86-video-fbdev works
>> on bare-metal boot on XenNext with the nouveaufb driver but not on Xen.
>> Have upgraded whole chain to tip except xen which is 3.4.3rc3
>> Here is the syslog trace:
>> kernel: ------------[ cut here ]------------
>> kernel: WARNING: at arch/x86/mm/pat.c:872 track_pfn_vma_copy+0x4d/0x86()
>> kernel: Hardware name: System Product Name
>> kernel: Modules linked in: fbcon font bitblit softcursor nouveau ttm
>> drm_kms_helper drm cfbcopyarea cfbimgblt cfbfillrect bridge stp llc
>> ipv6 nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs fuse
>> kernel: Pid: 5835, comm: Xorg Not tainted 2.6.32-xen0-git20100323+asusp5wd #1
>> kernel: Call Trace:
>> kernel:  [<ffffffff8102c834>] ? track_pfn_vma_copy+0x4d/0x86
>> kernel:  [<ffffffff8102c834>] ? track_pfn_vma_copy+0x4d/0x86
>> kernel:  [<ffffffff8103ce54>] ? warn_slowpath_common+0x77/0xa3
>> kernel:  [<ffffffff8102c834>] ? track_pfn_vma_copy+0x4d/0x86
>> kernel:  [<ffffffff8100c436>] ? xen_leave_lazy_mmu+0x25/0x43
>> kernel:  [<ffffffff81090c49>] ? copy_page_range+0x76/0x7f8
>> kernel:  [<ffffffff8100ddc9>] ? xen_force_evtchn_callback+0x9/0xa
>> kernel:  [<ffffffff8100e572>] ? check_events+0x12/0x20
>> kernel:  [<ffffffff8100e55f>] ? xen_restore_fl_direct_end+0x0/0x1
>> kernel:  [<ffffffff8103b1f2>] ? dup_mm+0x276/0x409
>> kernel:  [<ffffffff8103bd82>] ? copy_process+0x9c8/0x10ff
>> kernel:  [<ffffffff8103c5ff>] ? do_fork+0x146/0x2c0
>> kernel:  [<ffffffff810110a3>] ? stub_clone+0x13/0x20
>> kernel:  [<ffffffff81010d82>] ? system_call_fastpath+0x16/0x1b
>> kernel: ---[ end trace c58bf004d15b0c42 ]---
>>
>> Xorg.log ends with the same message as originally with trying
>> accelerated nouveau with misleading
>> XKB: Failed to compile keymap
>>
>> fbdev.c calls fbdevHWMapVidmem in xorg-server/hw/xfree86/fbdevhw.c
>> which does a mmap as in silly.c.  As far as X is concerned, everything
>> is fine, but there is obviously a page-fault problem. Will have to setup
>> debug options and trace :-(
>>
>> The 'corrupted page table' syndrome is also present in the accelerated
>> nouveau with AGP cards - so it may be linked to this problem. At least
>> this problem can be repeated on many platforms :-)
>>
>
> The "corrupt pagetable" comes from the pte having invalid reserved bits set
> in it.  I think the failure path is this:
>
> The bad bits get set because someone is doing a pfn->mfn conversion on a
> page which is already an mfn, and doesn't have a valid pfn->mfn mapping, and
> the result of the conversion is either 0xff... or 0x7f... (I forget right
> now).  But either way, a whole lot of bits get set, but nothing useful.  I'm
> not quite sure why Xen isn't complaining about this at set-pte time, but
> perhaps it looks vaguely valid to it (perhaps it sees the invalid flags,
> knows the pte can't be used to access anything, and allows it to be set?).

OK

> But this fault is happening because usermode gets a tlb miss, and the CPU
> finds a pte with reserved bits set, and raises the fault.

Sorry, no faults!

> I'm not sure about the mm/pat.c warning thought.  I had a quick look at that
> code, but it wasn't obvious to me what's going on there.  Something about
> handing the IO mapping during a fork().  Not sure if its related or not.
>
>    J

I was mistaken in assuming a fault. My guess is that Jeremy's failure-path
reasoning is right, minus the fault. The hang occurs after the kernel mode
setting has completed, but usermode (which thinks all is hunky-dory) is somehow
unable to create or write to its map of the framebuffer. The system stays
responsive, but there are no consoles.

The FBDev DDX driver mmaps the framebuffer device once, during initialization,
in fbdevHWMapVidmem. Subsequent calls return the previously mapped address.
But unfortunately, the first mmap of the device finds it already mapped by the
console drivers (I presume), with VM_IO set in the shareable mapping. Is this
the first case where the mapped area is iomem (backed by the graphics card
memory) and is already mapped?

In mm/mmap.c mmap_region I see the vma created for the mmap, and it does
not have VM_IO set initially. The driver's f_ops->mmap should be able to
select it. But the common drm_mmap entry point is not being entered at all in
either the bare-boot (working) or the xen-boot (not working) case!

What am I missing?
