thr3ads.net - Xen devel - [Xen-devel] Continuing problems booting [Feb 2009]

M A Young

2009-Feb-20 12:50 UTC

[Xen-devel] Continuing problems booting

I am still having problems getting my dom0 machines to boot - I am not 
sure it is a single problem but there do still seem to be ata issues. This 
is with the recent apic patch applied. The full boot log (compressed) is 
attached up to the point it stopped responding.

 	Michael Young

ata_piix 0000:00:1f.1: enabling device (0005 -> 0007)
xen_set_ioapic_routing: irq 18 gsi 18 vector 160 ioapic 0 pin 18 triggering 1 polarity 1
ata_piix 0000:00:1f.1: PCI INT A -> GSI 18 (level, low) -> IRQ 18
scsi0 : ata_piix
scsi1 : ata_piix
ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14
ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xffa8 irq 15
ata1.00: qc timeout (cmd 0xef)
ata1.00: failed to IDENTIFY (SPINUP failed, err_mask=0x4)
ata1.00: qc timeout (cmd 0xef)
ata1.00: failed to IDENTIFY (SPINUP failed, err_mask=0x4)
ata1.00: qc timeout (cmd 0xef)
ata1.00: failed to IDENTIFY (SPINUP failed, err_mask=0x4)
ata2.00: ATAPI: CD-952E/AKV, R7AR, max UDMA/33
ata2.00: configured for UDMA/33
ata2.00: qc timeout (cmd 0xa0)
ata2.00: TEST_UNIT_READY failed (err_mask=0x5)
ata2.00: configured for UDMA/33
ata2.00: qc timeout (cmd 0xa0)
ata2.00: TEST_UNIT_READY failed (err_mask=0x5)
ata2.00: limiting speed to UDMA/33:PIO3
ata2.00: configured for UDMA/33
ata2.00: qc timeout (cmd 0xa0)
ata2.00: TEST_UNIT_READY failed (err_mask=0x5)
ata2.00: disabled
ata2: soft resetting link
ata2: EH complete
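
For reading the err_mask values in the log above, here is a minimal Python sketch. It assumes the AC_ERR_* bit assignments from enum ata_completion_errors in include/linux/libata.h (worth checking against the tree actually in use); decode_err_mask is just an illustrative helper name.

```python
# Bit positions assumed from enum ata_completion_errors in
# include/linux/libata.h; verify against the kernel tree in use.
AC_ERR_BITS = {
    0: "AC_ERR_DEV",         # device reported error
    1: "AC_ERR_HSM",         # host state machine violation
    2: "AC_ERR_TIMEOUT",     # command timed out
    3: "AC_ERR_MEDIA",       # media error
    4: "AC_ERR_ATA_BUS",     # ATA bus error
    5: "AC_ERR_HOST_BUS",    # host bus error
    6: "AC_ERR_SYSTEM",      # system error
    7: "AC_ERR_INVALID",     # invalid argument
    8: "AC_ERR_OTHER",       # unknown
    9: "AC_ERR_NODEV_HINT",  # polling device detection hint
    10: "AC_ERR_NCQ",        # marker for offending NCQ command
}


def decode_err_mask(mask):
    """Return the names of all bits set in a libata err_mask value."""
    return [name for bit, name in sorted(AC_ERR_BITS.items())
            if mask & (1 << bit)]


if __name__ == "__main__":
    # The two values that appear in the boot log.
    for mask in (0x4, 0x5):
        print(hex(mask), decode_err_mask(mask))
```

Under that assumption 0x4 decodes to a timeout only, and 0x5 to a timeout plus a device-reported error.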


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

M A Young

2009-Feb-20 16:19 UTC

[Xen-devel] Re: Continuing problems booting

This log extract is from the same hardware booting off the USB stick. It
contains a lot of traceback, starting with:
======================================================
[ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
2.6.29-0.135.rc5.git3.fc10.i686.PAE #1
------------------------------------------------------
khubd/245 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
  (&retval->lock){......}, at: [<c04a5b9f>] dma_pool_alloc+0x1d/0x247

and this task is already holding:
  (&ehci->lock){-.....}, at: [<c0626797>] ehci_urb_enqueue+0xac/0xc6b
which would create a new lock dependency:
  (&ehci->lock){-.....} -> (&retval->lock){......}

but this new dependency connects a HARDIRQ-irq-safe lock:
  (&ehci->lock){-.....}
... which became HARDIRQ-irq-safe at:
   [<c0456ef2>] __lock_acquire+0x241/0xb1c
   [<c0457820>] lock_acquire+0x53/0x75
   [<c06fc2e3>] _spin_lock+0x1e/0x4e
   [<c0627fd6>] ehci_irq+0x21/0x193
   [<c0615829>] usb_hcd_irq+0x38/0x93
   [<c0477674>] handle_IRQ_event+0x1a/0x4b
   [<c0478890>] handle_level_irq+0x64/0xac
   [<ffffffff>] 0xffffffff

to a HARDIRQ-irq-unsafe lock:
  (purge_lock){+.+...}
... which became HARDIRQ-irq-unsafe at:
...  [<c0456f6f>] __lock_acquire+0x2be/0xb1c
   [<c0457820>] lock_acquire+0x53/0x75
   [<c06fc2e3>] _spin_lock+0x1e/0x4e
   [<c04a106d>] __purge_vmap_area_lazy+0x39/0x145
   [<c04a2362>] vm_unmap_aliases+0x150/0x159
   [<c04061e0>] xen_create_contiguous_region+0x4c/0xd8
   [<c056103d>] xen_swiotlb_fixup+0x6e/0x99
   [<c08f1c2d>] swiotlb_alloc_boot+0x2e/0x35
   [<c08fc3b2>] swiotlb_init_with_default_size+0x2f/0xdb
   [<c08fc46b>] swiotlb_init+0xd/0xf
   [<c08f1bf2>] pci_swiotlb_init+0x41/0x4e
   [<c08e53b8>] pci_iommu_alloc+0x8/0xa
   [<c08f283c>] mem_init+0xe/0x2b3
   [<c08db7ff>] start_kernel+0x26b/0x31a
   [<c08db096>] i386_start_kernel+0x85/0x8d
   [<c08e0e7a>] xen_start_kernel+0x4bc/0x4c4
   [<ffffffff>] 0xffffffff

The full crash is in the attached log (I didn't catch the entire log, so
there is an earlier gap). Later there are further errors which eventually
cause xen to give up.

BUG: unable to handle kernel NULL pointer dereference at 000000a8
IP: [<c0680000>] rtnetlink_net_exit+0x11/0x1e
*pdpt = 0000000004c33001 *pde = 0000000000000000
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/virtual/vtconsole/vtcon0/uevent
Modules linked in: pata_acpi ata_generic i915 drm i2c_algo_bit i2c_core

BUG: unable to handle kernel NULL pointer dereference at 00000005
IP: [<c065b8e5>] dmi_get_system_info+0x0/0xc
*pdpt = 0000000004c33001 *pde = 0000000000000000
Oops: 0002 [#2] SMP
last sysfs file: /sys/devices/virtual/vtconsole/vtcon0/uevent
Modules linked in: pata_acpi ata_generic i915 drm i2c_algo_bit i2c_core

BUG: unable to handle kernel NULL pointer dereference at 00000005
IP: [<c065b8e5>] dmi_get_system_info+0x0/0xc
*pdpt = 0000000004c33001 *pde = 0000000000000000
Oops: 0002 [#3] SMP
last sysfs file: /sys/devices/virtual/vtconsole/vtcon0/uevent
Modules linked in: pata_acpi ata_generic i915 drm i2c_algo_bit i2c_core

BUG: unable to handle kernel NULL pointer dereference at 00000005
IP: [<c065b8e5>] dmi_get_system_info+0x0/0xc
*pdpt = 0000000004c33001 *pde = 0000000000000000
Oops: 0002 [#4] SMP
last sysfs file: /sys/devices/virtual/vtconsole/vtcon0/uevent
Modules linked in: pata_acpi ata_generic i915 drm i2c_algo_bit i2c_core

BUG: unable to handle kernel NULL pointer dereference at 00000005
IP:
(XEN) domain_crash_sync called from entry.S (ff1a2c9e)
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-3.3.1  x86_32p  debug=n  Not tainted ]----
(XEN) CPU:    0
(XEN) EIP:    0061:[<c0545ef5>]
(XEN) EFLAGS: 00010206   EM: 1   CONTEXT: pv guest
(XEN) eax: da434171   ebx: da43415e   ecx: c07e154f   edx: 7fffffff
(XEN) esi: da434108   edi: c07cf380   ebp: da4340ec   esp: da433fcc
(XEN) cr0: 8005003b   cr4: 000006f0   cr3: 04c37000   cr2: da433fdc
(XEN) ds: 007b   es: 007b   fs: 00d8   gs: 0033   ss: 0069   cs: 0061
(XEN) Guest stack trace from esp=da433fcc:
(XEN)    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)    00000000 00000000 00000000 00000000 00000000
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.

 	Michael Young

M A Young

2009-Feb-21 12:34 UTC

[Xen-devel] Re: Continuing problems booting

And here are a couple more. First I get this traceback with a dom0 enabled
kernel not running under xen at the start of the boot log
BUG: spinlock bad magic on CPU#0, swapper/0 (Not tainted)
  lock: ffffffff81a39c90, .magic: 00000000, .owner: swapper/0, .owner_cpu: 0
Pid: 0, comm: swapper Not tainted 2.6.29-0.135.rc5.git3.fc10.x86_64 #1
Call Trace:
  [<ffffffff811f0af7>] spin_bug+0xb9/0xd8
  [<ffffffff811f0b46>] _raw_spin_unlock+0x30/0xb9
  [<ffffffff8143f17c>] _spin_unlock+0x35/0x50
  [<ffffffff8102ebd5>] ? flat_send_IPI_mask+0x1f/0x35
  [<ffffffff810402ef>] native_flush_tlb_others+0xf6/0x119
  [<ffffffff810403a6>] flush_tlb_all+0x2a/0x60
  [<ffffffff810f1f07>] __purge_vmap_area_lazy+0x142/0x1bc
  [<ffffffff810f1e1f>] ? __purge_vmap_area_lazy+0x5a/0x1bc
  [<ffffffff811ee6d4>] ? __bitmap_weight+0x4d/0xac
  [<ffffffff810f25dc>] free_unmap_vmap_area_noflush+0x80/0x9b
  [<ffffffff810f1941>] ? find_vmap_area+0x5b/0x7b
  [<ffffffff810f262b>] remove_vm_area+0x34/0x97
  [<ffffffff810f27ad>] __vunmap+0x50/0x103
  [<ffffffff810857ff>] ? trace_hardirqs_on_caller+0x140/0x17a
  [<ffffffff81392be0>] ? neigh_proxy_process+0xad/0x124
  [<ffffffff810f2899>] vunmap+0x39/0x4f
  [<ffffffff81440dde>] text_poke+0x13c/0x186
  [<ffffffff8116df16>] ? __sysfs_put+0x1c/0x41
  [<ffffffff81446641>] ? _etext+0x0/0x3
  [<ffffffff8101a765>] alternatives_smp_unlock+0x59/0x85
  [<ffffffff8101aa31>] alternatives_smp_switch+0x16a/0x1bd
  [<ffffffff816ca9ef>] alternative_instructions+0x110/0x166
  [<ffffffff816cb241>] ? identify_boot_cpu+0x23/0x5b
  [<ffffffff816cb3c8>] check_bugs+0x21/0x54
  [<ffffffff816beffe>] start_kernel+0x410/0x43b
  [<ffffffff816be140>] ? early_idt_handler+0x0/0x71
  [<ffffffff816be2ce>] x86_64_start_reservations+0xb9/0xd4
  [<ffffffff816be000>] ? _sinittext+0x0/0x140
  [<ffffffff816be3d6>] x86_64_start_kernel+0xed/0x110

Secondly I get this crash when trying to start xen under qemu-kvm. 
Something similar is happening when I try to start xen directly, but I 
can't do serial logging on this computer so I can't be sure.

  \ \/ /___ _ __   |___ / |___ / / |
   \  // _ \ '_ \    |_ \   |_ \ | |
   /  \  __/ | | |  ___) | ___) || |
  /_/\_\___|_| |_| |____(_)____(_)_|

(XEN) Xen version 3.3.1 (michael@home) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) Tue Feb  3 23:13:03 GMT 2009
(XEN) Latest ChangeSet: unavailable
(XEN) Command line: console=com1
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN) Disc information:
(XEN)  Found 0 MBR signatures
(XEN)  Found 0 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 000000000009fc00 (usable)
(XEN)  000000000009fc00 - 00000000000a0000 (reserved)
(XEN)  00000000000e8000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 000000003fff0000 (usable)
(XEN)  000000003fff0000 - 0000000040000000 (ACPI data)
(XEN)  00000000fffbd000 - 0000000100000000 (reserved)
(XEN) System RAM: 1023MB (1048124kB)
(XEN) ACPI: RSDP 000FB9D0, 0014 (r0 QEMU  )
(XEN) ACPI: RSDT 3FFF0000, 002C (r1 QEMU   QEMURSDT        1 QEMU        1)
(XEN) ACPI: FACP 3FFF002C, 0074 (r1 QEMU   QEMUFACP        1 QEMU        1)
(XEN) ACPI: DSDT 3FFF0100, 253C (r1   BXPC   BXDSDT        1 INTL 20061109)
(XEN) ACPI: FACS 3FFF00C0, 0040
(XEN) ACPI: APIC 3FFF2640, 00E0 (r1 QEMU   QEMUAPIC        1 QEMU        1)
(XEN) Xen heap: 14MB (14632kB)
(XEN) Domain heap initialised
(XEN) Processor #0 6:2 APIC version 20
(XEN) IOAPIC[0]: apic_id 1, version 17, address 0xfec00000, GSI 0-23
(XEN) Enabling APIC mode:  Flat.  Using 1 I/O APICs
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 2394.081 MHz processor.
(XEN) CPU0: Intel QEMU Virtual CPU version 0.9.1 stepping 03
(XEN) Total of 1 processors activated.
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using new ACK method
(XEN) Platform timer is 3.579MHz ACPI PM Timer
(XEN) Brought up 1 CPUs
(XEN) I/O virtualisation disabled
(XEN) *** LOADING DOMAIN 0 ***
(XEN)  Xen  kernel: 64-bit, lsb, compat32
(XEN)  Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x239bbc0
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   0000000038000000->000000003c000000 (221906 pages to be allocated)
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff81000000->ffffffff8239bbc0
(XEN)  Init. ramdisk: ffffffff8239c000->ffffffff82f5f000
(XEN)  Phys-Mach map: ffffffff82f5f000->ffffffff83130690
(XEN)  Start info:    ffffffff83131000->ffffffff831314a4
(XEN)  Page tables:   ffffffff83132000->ffffffff8314f000
(XEN)  Boot stack:    ffffffff8314f000->ffffffff83150000
(XEN)  TOTAL:         ffffffff80000000->ffffffff83400000
(XEN)  ENTRY ADDRESS: ffffffff816be200
(XEN) Dom0 has maximum 1 VCPUs
(XEN) Scrubbing Free RAM: done.
(XEN) Xen trace buffers: disabled
(XEN) Std. Loglevel: Errors and warnings
(XEN) Guest Loglevel: Nothing (Rate-limited: Errors and warnings)
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 120kB init memory.
(XEN) d0:v0: unhandled page fault (ec=0000)
(XEN) Pagetable walk from 0000000000000028:
(XEN)  L4[0x000] = 0000000000000000 ffffffffffffffff
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-3.3.1  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e033:[<ffffffff816c5315>]
(XEN) RFLAGS: 0000000000000296   EM: 1   CONTEXT: pv guest
(XEN) rax: 0000000000000000   rbx: 0000000000000000   rcx: 0000000000000000
(XEN) rdx: 0000000000000000   rsi: ffffffff83131000   rdi: ffffffff83131000
(XEN) rbp: ffffffff81695ff8   rsp: ffffffff81695f90   r8:  0000000000000000
(XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
(XEN) r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000000
(XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000006b0
(XEN) cr3: 000000003b132000   cr2: 0000000000000028
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=ffffffff81695f90:
(XEN)    0000000000000000 0000000000000000 0000000000000000 ffffffff816c5315
(XEN)    000000010000e030 0000000000010096 ffffffff81695fd8 000000000000e02b
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 ffffffff816b6000 ffffffff816b6000
(XEN)    ffffffff816b6000 ffffffff816b6000 ffffffff816b6000 ffffffff816b6000
(XEN)    ffffffff816b6000 ffffffff816b6000 ffffffff816b6000 ffffffff816b6000
(XEN)    ffffffff816b6000 ffffffff816b6000 ffffffff816b6000 ffffffff816b6000
(XEN)    ffffffff816b6000 ffffffff816b6000 ffffffff816b6000 ffffffff816b6000
(XEN)    ffffffff816b6000 ffffffff816b6000 ffffffff816b6000 ffffffff816b6000
(XEN)    ffffffff816b6000 ffffffff816b6000 ffffffff816b6000 ffffffff816b6000
(XEN)    ffffffff816b6000 ffffffff816b6000 ffffffff816b6000 ffffffff816b6000
(XEN)    ffffffff816b6000 ffffffff816b6000 ffffffff816b6000 ffffffff816b6000
(XEN)    ffffffff816b6000 ffffffff816b6000 ffffffff816b6000 ffffffff816b6000
(XEN)    ffffffff816b6000 ffffffff816b6000 ffffffff816b6000 ffffffff816b6000
(XEN)    ffffffff816b6000 ffffffff816b6000 ffffffff816b6000 ffffffff816b6000
(XEN)    ffffffff816b6000 ffffffff816b6000 ffffffff816b6000 ffffffff816b6000
(XEN)    ffffffff816b6000 ffffffff816b6000 ffffffff816b6000 ffffffff816b6000
(XEN)    ffffffff816b6000 ffffffff816b6000 ffffffff816b6000 ffffffff816b6000
(XEN)    ffffffff816b6000 ffffffff816b6000 ffffffff816b6000 ffffffff816b6000
(XEN)    ffffffff816b6000 ffffffff816b6000 ffffffff816b6000 ffffffff816b6000
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.
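
As an aid to reading the fault report above, here is a minimal Python sketch that decodes the page fault error code. It assumes the architectural x86 bit layout (bit 0 present, bit 1 write, bit 2 user, bit 3 reserved bit, bit 4 instruction fetch); the helper names are made up for illustration.

```python
def decode_pf_error_code(ec):
    """Split an x86 page fault error code into its flag bits."""
    return {
        "present": bool(ec & 0x01),            # 0: not-present page, 1: protection violation
        "write": bool(ec & 0x02),              # 0: read access, 1: write access
        "user": bool(ec & 0x04),               # 0: supervisor mode, 1: user mode
        "reserved_bit": bool(ec & 0x08),       # reserved bit set in a page table entry
        "instruction_fetch": bool(ec & 0x10),  # fault happened on an instruction fetch
    }


def describe_fault(ec, cr2):
    """Build a one-line description from the error code and the faulting address."""
    bits = decode_pf_error_code(ec)
    if bits["instruction_fetch"]:
        access = "instruction fetch"
    elif bits["write"]:
        access = "write"
    else:
        access = "read"
    mode = "user" if bits["user"] else "supervisor"
    cause = "protection violation" if bits["present"] else "not-present page"
    return "%s %s at 0x%x: %s" % (mode, access, cr2, cause)


if __name__ == "__main__":
    # ec and cr2 as printed by Xen in the crash above.
    print(describe_fault(0x0000, 0x28))
```

For ec=0000 with cr2=0x28 this gives a supervisor read of a not-present page at a small offset from zero, which looks like a dereference through a NULL pointer.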

 	Michael Young

Jeremy Fitzhardinge

2009-Feb-21 15:43 UTC

Re: [Xen-devel] Re: Continuing problems booting

M A Young wrote:
> And here are a couple more. First I get this traceback with a dom0
> enabled kernel not running under xen at the start of the boot log
> BUG: spinlock bad magic on CPU#0, swapper/0 (Not tainted)
>  lock: ffffffff81a39c90, .magic: 00000000, .owner: swapper/0, .owner_cpu: 0
> Pid: 0, comm: swapper Not tainted 2.6.29-0.135.rc5.git3.fc10.x86_64 #1

Oh, I thought I'd fixed that (hm, must have lost the change somewhere).
Does it keep going OK anyway?

> Call Trace:
>  [<ffffffff811f0af7>] spin_bug+0xb9/0xd8
>  [<ffffffff811f0b46>] _raw_spin_unlock+0x30/0xb9
>  [<ffffffff8143f17c>] _spin_unlock+0x35/0x50
>  [<ffffffff8102ebd5>] ? flat_send_IPI_mask+0x1f/0x35
>  [<ffffffff810402ef>] native_flush_tlb_others+0xf6/0x119
>  [<ffffffff810403a6>] flush_tlb_all+0x2a/0x60
>  [<ffffffff810f1f07>] __purge_vmap_area_lazy+0x142/0x1bc
>  [<ffffffff810f1e1f>] ? __purge_vmap_area_lazy+0x5a/0x1bc
>  [<ffffffff811ee6d4>] ? __bitmap_weight+0x4d/0xac
>  [<ffffffff810f25dc>] free_unmap_vmap_area_noflush+0x80/0x9b
>  [<ffffffff810f1941>] ? find_vmap_area+0x5b/0x7b
>  [<ffffffff810f262b>] remove_vm_area+0x34/0x97
>  [<ffffffff810f27ad>] __vunmap+0x50/0x103
>  [<ffffffff810857ff>] ? trace_hardirqs_on_caller+0x140/0x17a
>  [<ffffffff81392be0>] ? neigh_proxy_process+0xad/0x124
>  [<ffffffff810f2899>] vunmap+0x39/0x4f
>  [<ffffffff81440dde>] text_poke+0x13c/0x186
>  [<ffffffff8116df16>] ? __sysfs_put+0x1c/0x41
>  [<ffffffff81446641>] ? _etext+0x0/0x3
>  [<ffffffff8101a765>] alternatives_smp_unlock+0x59/0x85
>  [<ffffffff8101aa31>] alternatives_smp_switch+0x16a/0x1bd
>  [<ffffffff816ca9ef>] alternative_instructions+0x110/0x166
>  [<ffffffff816cb241>] ? identify_boot_cpu+0x23/0x5b
>  [<ffffffff816cb3c8>] check_bugs+0x21/0x54
>  [<ffffffff816beffe>] start_kernel+0x410/0x43b
>  [<ffffffff816be140>] ? early_idt_handler+0x0/0x71
>  [<ffffffff816be2ce>] x86_64_start_reservations+0xb9/0xd4
>  [<ffffffff816be000>] ? _sinittext+0x0/0x140
>  [<ffffffff816be3d6>] x86_64_start_kernel+0xed/0x110
>
> Secondly I get this crash when trying to start xen under qemu-kvm.
> Something similar is happening when I try to start xen directly, but I
> can't do serial logging on this computer so I can't be sure.

Interesting.  Xen has certainly revealed bugs in kvm's pagetable
management before, so it wouldn't surprise me if they've broken
something again (apparently they're not in the habit of testing with
Xen).  Report it to kvm-devel <kvm@vger.kernel.org>

    J

M A Young

2009-Feb-21 15:52 UTC

Re: [Xen-devel] Re: Continuing problems booting

On Sat, 21 Feb 2009, Jeremy Fitzhardinge wrote:
> M A Young wrote:
>> And here are a couple more. First I get this traceback with a dom0 enabled
>> kernel not running under xen at the start of the boot log
>> BUG: spinlock bad magic on CPU#0, swapper/0 (Not tainted)
>>  lock: ffffffff81a39c90, .magic: 00000000, .owner: swapper/0, .owner_cpu: 0
>> Pid: 0, comm: swapper Not tainted 2.6.29-0.135.rc5.git3.fc10.x86_64 #1
>
> Oh, I thought I'd fixed that (hm, must have lost the change somewhere).  Does
> it keep going OK anyway?

Yes, that one doesn't cause any obvious problems.

>> Secondly I get this crash when trying to start xen under qemu-kvm.
>> Something similar is happening when I try to start xen directly, but I
>> can't do serial logging on this computer so I can't be sure.
>
> Interesting.  Xen has certainly revealed bugs in kvm's pagetable management
> before, so it wouldn't surprise me if they've broken something again
> (apparently they're not in the habit of testing with Xen).  Report it to
> kvm-devel <kvm@vger.kernel.org>

I was hoping that wasn't kvm related because something is crashing my
x86_64 system when I try to boot it directly into xen at about the same
point (though I don't have any good way of catching the logging
information so I can't be sure).

 	Michael Young

M A Young

2009-Feb-21 22:26 UTC

Re: [Xen-devel] Re: Continuing problems booting

On Sat, 21 Feb 2009, Jeremy Fitzhardinge wrote:
> Interesting.  Xen has certainly revealed bugs in kvm's pagetable management
> before, so it wouldn't surprise me if they've broken something again
> (apparently they're not in the habit of testing with Xen).  Report it to
> kvm-devel <kvm@vger.kernel.org>

Further testing (and playing with the xen settings so I get to see the
logging) reveals that it isn't a kvm problem, because I get the same crash
trying to boot xen directly on the computer.

 	Michael Young

Jeremy Fitzhardinge

2009-Feb-21 22:59 UTC

Re: [Xen-devel] Re: Continuing problems booting

Ok, Keir's problem.

But try doing a very clean rebuild; I spent a good chunk of last night bisecting
something that turned out to be a misbuild...

	J

M A Young <m.a.young@durham.ac.uk> wrote:
>On Sat, 21 Feb 2009, Jeremy Fitzhardinge wrote:
>
>> Interesting.  Xen has certainly revealed bugs in kvm's pagetable management
>> before, so it wouldn't surprise me if they've broken something again
>> (apparently they're not in the habit of testing with Xen).  Report it to
>> kvm-devel <kvm@vger.kernel.org>
>
>Further testing (and playing with the xen settings so I get to see the
>logging) reveals that it isn't a kvm problem, because I get the same crash
>trying to boot xen directly on the computer.
>
> 	Michael Young
-- 
Sent from my Android phone with K-9. Please excuse my brevity.

M A Young

2009-Feb-21 23:57 UTC

Re: [Xen-devel] Re: Continuing problems booting

On Sat, 21 Feb 2009, Jeremy Fitzhardinge wrote:
> Ok, Keir's problem.
>
> But try doing a very clean rebuild; I spent a good chunk of last night
> bisecting something that turned out to be a misbuild...

In this case I think a misbuild is unlikely because I have seen the
behaviour with two or three kernels, and the xen package is a straight
rpmbuild --rebuild of xen-3.3.1-3.fc11.src.rpm from Fedora rawhide.

 	Michael Young

Keir Fraser

2009-Feb-22 04:47 UTC

Re: [Xen-devel] Re: Continuing problems booting

All the problems occur after your dom0 pv_ops kernel has started execution,
Jeremy. ;-)

 -- Keir

On 21/02/2009 14:59, "Jeremy Fitzhardinge" <jeremy@goop.org> wrote:

> Ok, Keir's problem.
>
> But try doing a very clean rebuild; I spent a good chunk of last night
> bisecting something that turned out to be a misbuild...
>
> J
>
> M A Young <m.a.young@durham.ac.uk> wrote:
>> On Sat, 21 Feb 2009, Jeremy Fitzhardinge wrote:
>>
>>> Interesting.  Xen has certainly revealed bugs in kvm's pagetable management
>>> before, so it wouldn't surprise me if they've broken something again
>>> (apparently they're not in the habit of testing with Xen).  Report it to
>>> kvm-devel <kvm@vger.kernel.org>
>>
>> Further testing (and playing with the xen settings so I get to see the
>> logging) reveals that it isn't a kvm problem, because I get the same crash
>> trying to boot xen directly on the computer.
>>
>> Michael Young
> --
> Sent from my Android phone with K-9. Please excuse my brevity.

Jeremy Fitzhardinge

2009-Feb-22 06:54 UTC

Re: [Xen-devel] Re: Continuing problems booting

M A Young wrote:> And here are a couple more. First I get this traceback with a dom0 
> enabled kernel not running under xen atthe start of the boot log
> BUG: spinlock bad magic on CPU#0, swapper/0 (Not tainted)
>  lock: ffffffff81a39c90, .magic: 00000000, .owner: swapper/0, 
> .owner_cpu: 0
> Pid: 0, comm: swapper Not tainted 2.6.29-0.135.rc5.git3.fc10.x86_64 #1
> Call Trace:
>  [<ffffffff811f0af7>] spin_bug+0xb9/0xd8
>  [<ffffffff811f0b46>] _raw_spin_unlock+0x30/0xb9
>  [<ffffffff8143f17c>] _spin_unlock+0x35/0x50
>  [<ffffffff8102ebd5>] ? flat_send_IPI_mask+0x1f/0x35
>  [<ffffffff810402ef>] native_flush_tlb_others+0xf6/0x119
>  [<ffffffff810403a6>] flush_tlb_all+0x2a/0x60
>  [<ffffffff810f1f07>] __purge_vmap_area_lazy+0x142/0x1bc
>  [<ffffffff810f1e1f>] ? __purge_vmap_area_lazy+0x5a/0x1bc
>  [<ffffffff811ee6d4>] ? __bitmap_weight+0x4d/0xac
>  [<ffffffff810f25dc>] free_unmap_vmap_area_noflush+0x80/0x9b
>  [<ffffffff810f1941>] ? find_vmap_area+0x5b/0x7b
>  [<ffffffff810f262b>] remove_vm_area+0x34/0x97
>  [<ffffffff810f27ad>] __vunmap+0x50/0x103
>  [<ffffffff810857ff>] ? trace_hardirqs_on_caller+0x140/0x17a
>  [<ffffffff81392be0>] ? neigh_proxy_process+0xad/0x124
>  [<ffffffff810f2899>] vunmap+0x39/0x4f
>  [<ffffffff81440dde>] text_poke+0x13c/0x186
>  [<ffffffff8116df16>] ? __sysfs_put+0x1c/0x41
>  [<ffffffff81446641>] ? _etext+0x0/0x3
>  [<ffffffff8101a765>] alternatives_smp_unlock+0x59/0x85
>  [<ffffffff8101aa31>] alternatives_smp_switch+0x16a/0x1bd
>  [<ffffffff816ca9ef>] alternative_instructions+0x110/0x166
>  [<ffffffff816cb241>] ? identify_boot_cpu+0x23/0x5b
>  [<ffffffff816cb3c8>] check_bugs+0x21/0x54
>  [<ffffffff816beffe>] start_kernel+0x410/0x43b
>  [<ffffffff816be140>] ? early_idt_handler+0x0/0x71
>  [<ffffffff816be2ce>] x86_64_start_reservations+0xb9/0xd4
>  [<ffffffff816be000>] ? _sinittext+0x0/0x140
>  [<ffffffff816be3d6>] x86_64_start_kernel+0xed/0x110
>
> Secondly I get this crash when trying to start xen under qemu-kvm. 
> Something similar is happening when I try to start xen directly, but I 
> can''t do serial logging on this computer so I can''t be
sure.
>
>  \ \/ /___ _ __   |___ / |___ / / |
>   \  // _ \ ''_ \    |_ \   |_ \ | |
>   /  \  __/ | | |  ___) | ___) || |
>  /_/\_\___|_| |_| |____(_)____(_)_|
>
> (XEN) Xen version 3.3.1 (michael@home) (gcc version 4.3.2 20081105 
> (Red Hat 4.3.2-7) (GCC) ) Tue Feb  3 23:13:03 GMT 2009
> (XEN) Latest ChangeSet: unavailable
> (XEN) Command line: console=com1
> (XEN) Video information:
> (XEN)  VGA is text mode 80x25, font 8x16
> (XEN) Disc information:
> (XEN)  Found 0 MBR signatures
> (XEN)  Found 0 EDD information structures
> (XEN) Xen-e820 RAM map:
> (XEN)  0000000000000000 - 000000000009fc00 (usable)
> (XEN)  000000000009fc00 - 00000000000a0000 (reserved)
> (XEN)  00000000000e8000 - 0000000000100000 (reserved)
> (XEN)  0000000000100000 - 000000003fff0000 (usable)
> (XEN)  000000003fff0000 - 0000000040000000 (ACPI data)
> (XEN)  00000000fffbd000 - 0000000100000000 (reserved)
> (XEN) System RAM: 1023MB (1048124kB)
> (XEN) ACPI: RSDP 000FB9D0, 0014 (r0 QEMU  )
> (XEN) ACPI: RSDT 3FFF0000, 002C (r1 QEMU   QEMURSDT        1 
> QEMU        1)
> (XEN) ACPI: FACP 3FFF002C, 0074 (r1 QEMU   QEMUFACP        1 
> QEMU        1)
> (XEN) ACPI: DSDT 3FFF0100, 253C (r1   BXPC   BXDSDT        1 INTL 
> 20061109)
> (XEN) ACPI: FACS 3FFF00C0, 0040
> (XEN) ACPI: APIC 3FFF2640, 00E0 (r1 QEMU   QEMUAPIC        1 
> QEMU        1)
> (XEN) Xen heap: 14MB (14632kB)
> (XEN) Domain heap initialised
> (XEN) Processor #0 6:2 APIC version 20
> (XEN) IOAPIC[0]: apic_id 1, version 17, address 0xfec00000, GSI 0-23
> (XEN) Enabling APIC mode:  Flat.  Using 1 I/O APICs
> (XEN) Using scheduler: SMP Credit Scheduler (credit)
> (XEN) Detected 2394.081 MHz processor.
> (XEN) CPU0: Intel QEMU Virtual CPU version 0.9.1 stepping 03
> (XEN) Total of 1 processors activated.
> (XEN) ENABLING IO-APIC IRQs
> (XEN)  -> Using new ACK method
> (XEN) Platform timer is 3.579MHz ACPI PM Timer
> (XEN) Brought up 1 CPUs
> (XEN) I/O virtualisation disabled
> (XEN) *** LOADING DOMAIN 0 ***
> (XEN)  Xen  kernel: 64-bit, lsb, compat32
> (XEN)  Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x239bbc0
> (XEN) PHYSICAL MEMORY ARRANGEMENT:
> (XEN)  Dom0 alloc.:   0000000038000000->000000003c000000 (221906 pages 
> to be allocated)
> (XEN) VIRTUAL MEMORY ARRANGEMENT:
> (XEN)  Loaded kernel: ffffffff81000000->ffffffff8239bbc0
> (XEN)  Init. ramdisk: ffffffff8239c000->ffffffff82f5f000
> (XEN)  Phys-Mach map: ffffffff82f5f000->ffffffff83130690
> (XEN)  Start info:    ffffffff83131000->ffffffff831314a4
> (XEN)  Page tables:   ffffffff83132000->ffffffff8314f000
> (XEN)  Boot stack:    ffffffff8314f000->ffffffff83150000
> (XEN)  TOTAL:         ffffffff80000000->ffffffff83400000
> (XEN)  ENTRY ADDRESS: ffffffff816be200
> (XEN) Dom0 has maximum 1 VCPUs
> (XEN) Scrubbing Free RAM: done.
> (XEN) Xen trace buffers: disabled
> (XEN) Std. Loglevel: Errors and warnings
> (XEN) Guest Loglevel: Nothing (Rate-limited: Errors and warnings)
> (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch
> input to Xen)
> (XEN) Freed 120kB init memory.
> (XEN) d0:v0: unhandled page fault (ec=0000)
> (XEN) Pagetable walk from 0000000000000028:
> (XEN)  L4[0x000] = 0000000000000000 ffffffffffffffff
> (XEN) domain_crash_sync called from entry.S
> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
> (XEN) ----[ Xen-3.3.1  x86_64  debug=n  Not tainted ]----
> (XEN) CPU:    0
> (XEN) RIP:    e033:[<ffffffff816c5315>]
What does this correspond to in the kernel?

$ gdb vmlinux
(gdb) x/i 0xffffffff816c5315


    J

M A Young

2009-Feb-22 09:38 UTC

Re: [Xen-devel] Re: Continuing problems booting

On Sat, 21 Feb 2009, Jeremy Fitzhardinge wrote:
>> ...
>> (XEN) d0:v0: unhandled page fault (ec=0000)
>> (XEN) Pagetable walk from 0000000000000028:
>> (XEN)  L4[0x000] = 0000000000000000 ffffffffffffffff
>> (XEN) domain_crash_sync called from entry.S
>> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
>> (XEN) ----[ Xen-3.3.1  x86_64  debug=n  Not tainted ]----
>> (XEN) CPU:    0
>> (XEN) RIP:    e033:[<ffffffff816c5315>]
>
> What does this correspond to in the kernel?
>
> $ gdb vmlinux
> (gdb) x/i 0xffffffff816c5315
0xffffffff816c5315 <xen_start_kernel+16>:	mov    %gs:0x28,%rax

 	Michael Young

M A Young

2009-Feb-22 14:58 UTC

Re: [Xen-devel] Re: Continuing problems booting

On Sun, 22 Feb 2009, M A Young wrote:
> On Sat, 21 Feb 2009, Jeremy Fitzhardinge wrote:
>
>>> ...
>>> (XEN) d0:v0: unhandled page fault (ec=0000)
>>> (XEN) Pagetable walk from 0000000000000028:
>>> (XEN)  L4[0x000] = 0000000000000000 ffffffffffffffff
>>> (XEN) domain_crash_sync called from entry.S
>>> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
>>> (XEN) ----[ Xen-3.3.1  x86_64  debug=n  Not tainted ]----
>>> (XEN) CPU:    0
>>> (XEN) RIP:    e033:[<ffffffff816c5315>]
>> 
>> What does this correspond to in the kernel?
>> 
>> $ gdb vmlinux
>> (gdb) x/i 0xffffffff816c5315
>
> 0xffffffff816c5315 <xen_start_kernel+16>:	mov    %gs:0x28,%rax
This is from
0xffffffff816c5305 <xen_start_kernel>:	push   %rbp
0xffffffff816c5306 <xen_start_kernel+1>:	mov    %rsp,%rbp
0xffffffff816c5309 <xen_start_kernel+4>:	push   %rbx
0xffffffff816c530a <xen_start_kernel+5>:	sub    $0x18,%rsp
0xffffffff816c530e <xen_start_kernel+9>:
     mov    0x333e23(%rip),%rdi        # 0xffffffff819f9138 
<xen_start_info>
0xffffffff816c5315 <xen_start_kernel+16>:	mov    %gs:0x28,%rax
0xffffffff816c531e <xen_start_kernel+25>:	mov    %rax,-0x18(%rbp)
0xffffffff816c5322 <xen_start_kernel+29>:	xor    %eax,%eax
0xffffffff816c5324 <xen_start_kernel+31>:	test   %rdi,%rdi
0xffffffff816c5327 <xen_start_kernel+34>:
     je     0xffffffff816c5827 <xen_start_kernel+1314>
0xffffffff816c532d <xen_start_kernel+40>:
     movl   $0x1,0x333df9(%rip)        # 0xffffffff819f9130 
<xen_domain_type>
...

which is generated if CONFIG_CC_STACKPROTECTOR=y (also
CONFIG_CC_OPTIMIZE_FOR_SIZE=y, though I don't know if the latter is
important). If these aren't set, the compiler produces different code,
and the boot process gets a bit further before crashing.
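
(A note for readers decoding the trace: the faulting address 0x28 equals the offset in "mov %gs:0x28,%rax", which would be consistent with the stack-protector canary being read while the GS base is still zero. That reading is an inference from the log, not something confirmed in this thread. The sketch below decodes the "ec=" value using the architectural x86 page-fault error-code bits; the function name is made up for illustration, and how Xen presents the user/supervisor bit to a PV guest is not covered here.)

```python
# Architectural x86 page-fault error-code bits.
# Each entry: (mask, meaning when set, meaning when clear or None).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write", "read"),
    (0x04, "user mode", "supervisor mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error_code(ec):
    """Hypothetical helper: describe a page-fault error code as a list of strings."""
    parts = []
    for mask, if_set, if_clear in PF_BITS:
        text = if_set if ec & mask else if_clear
        if text is not None:
            parts.append(text)
    return parts


# The value reported in the log above.
print("ec=0000 ->", decode_pf_error_code(0x0000))
# ec=0000 -> ['page not present', 'read', 'supervisor mode']
```

So ec=0000 describes a read from an unmapped page, which fits a load from address 0x28.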

 	Michael Young

Jeremy Fitzhardinge

2009-Feb-22 17:13 UTC

Re: [Xen-devel] Re: Continuing problems booting

M A Young wrote:
> On Sun, 22 Feb 2009, M A Young wrote:
>
>> On Sat, 21 Feb 2009, Jeremy Fitzhardinge wrote:
>>
>>>> ...
>>>> (XEN) d0:v0: unhandled page fault (ec=0000)
>>>> (XEN) Pagetable walk from 0000000000000028:
>>>> (XEN)  L4[0x000] = 0000000000000000 ffffffffffffffff
>>>> (XEN) domain_crash_sync called from entry.S
>>>> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
>>>> (XEN) ----[ Xen-3.3.1  x86_64  debug=n  Not tainted ]----
>>>> (XEN) CPU:    0
>>>> (XEN) RIP:    e033:[<ffffffff816c5315>]
>>>
>>> What does this correspond to in the kernel?
>>>
>>> $ gdb vmlinux
>>> (gdb) x/i 0xffffffff816c5315
>>
>> 0xffffffff816c5315 <xen_start_kernel+16>:    mov    %gs:0x28,%rax
>
> This is from
> 0xffffffff816c5305 <xen_start_kernel>:    push   %rbp
> 0xffffffff816c5306 <xen_start_kernel+1>:    mov    %rsp,%rbp
> 0xffffffff816c5309 <xen_start_kernel+4>:    push   %rbx
> 0xffffffff816c530a <xen_start_kernel+5>:    sub    $0x18,%rsp
> 0xffffffff816c530e <xen_start_kernel+9>:
>     mov    0x333e23(%rip),%rdi        # 0xffffffff819f9138 
> <xen_start_info>
> 0xffffffff816c5315 <xen_start_kernel+16>:    mov    %gs:0x28,%rax
> 0xffffffff816c531e <xen_start_kernel+25>:    mov    %rax,-0x18(%rbp)
> 0xffffffff816c5322 <xen_start_kernel+29>:    xor    %eax,%eax
> 0xffffffff816c5324 <xen_start_kernel+31>:    test   %rdi,%rdi
> 0xffffffff816c5327 <xen_start_kernel+34>:
>     je     0xffffffff816c5827 <xen_start_kernel+1314>
> 0xffffffff816c532d <xen_start_kernel+40>:
>     movl   $0x1,0x333df9(%rip)        # 0xffffffff819f9130 
> <xen_domain_type>
> ...
>
> which is generated if CONFIG_CC_STACKPROTECTOR=y (also
> CONFIG_CC_OPTIMIZE_FOR_SIZE=y, though I don't know if the latter is
> important). If these aren't set, the compiler produces different code,
> and the boot process gets a bit further before crashing.
Hm, yes, I guess there's something to stop stack-protector from adding
stuff to xen_start_kernel().

But I'm more interested in the crash you see when you have stack
protector off.  What are the symptoms?

    J

M A Young

2009-Feb-22 23:02 UTC

Re: [Xen-devel] Re: Continuing problems booting

On Sun, 22 Feb 2009, Jeremy Fitzhardinge wrote:
> Hm, yes, I guess there's something to stop stack-protector from adding stuff
> to xen_start_kernel().
>
> But I'm more interested in the crash you see when you have stack protector
> off.  What are the symptoms?
That seems to have been pci related problems due to not setting pci=nomsi.
Beyond that I still get ata timeouts, and IRQ problems. The end of one
traceback (I couldn't see the whole thing) looked similar to the one I
mentioned for i686 at
http://lists.xensource.com/archives/html/xen-devel/2009-02/msg00832.html
and another started with print_irq_inversion_bug

 	Michael Young

Jeremy Fitzhardinge

2009-Feb-22 23:19 UTC

Re: [Xen-devel] Re: Continuing problems booting

M A Young wrote:
> On Sun, 22 Feb 2009, Jeremy Fitzhardinge wrote:
>
>> Hm, yes, I guess there's something to stop stack-protector from
>> adding stuff to xen_start_kernel().
>>
>> But I'm more interested in the crash you see when you have stack
>> protector off.  What are the symptoms?
>
> That seems to have been pci related problems due to not setting 
> pci=nomsi. 
Ah, yes.  I should probably find a way to code that rather than relying 
on the user to put it on the command line.
> Beyond that I still get ata timeouts, and IRQ problems.
That's a pity; I was hoping those problems would be behind us...  Do you
have any log messages relating to them?
> The end of one traceback (I couldn't see the whole thing) looked
> similar to the one I mentioned for i686 at
> http://lists.xensource.com/archives/html/xen-devel/2009-02/msg00832.html
> and another started with print_irq_inversion_bug
The USB lockdep error isn't terribly worrying if the machine doesn't
actually lock up.  I'm still working on a good way to fix that one.

The NULL pointer dereferences are much more of a worry, and a bit
random.  Looks like the stack pointer has got trashed or something, so
it's not giving any useful information.  Do you know what was going on
before then?

    J

M A Young

2009-Feb-23 00:20 UTC

Re: [Xen-devel] Re: Continuing problems booting

On Sun, 22 Feb 2009, Jeremy Fitzhardinge wrote:
> That's a pity; I was hoping those problems would be behind us...  Do you have
> any log messages relating to them?
ata is still timing out, and the devices don't work. Some
errors from a kvm boot are
ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc000 irq 14
ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc008 irq 15
ata2.00: ATAPI: QEMU DVD-ROM, 0.9.1, max UDMA/100
ata2.00: configured for MWDMA2
Clocksource tsc unstable (delta = -2149391501 ns)
ata2.00: qc timeout (cmd 0xa0)
ata2.00: TEST_UNIT_READY failed (err_mask=0x4)
ata2.00: configured for MWDMA2
ata2.00: qc timeout (cmd 0xa0)
ata2.00: TEST_UNIT_READY failed (err_mask=0x4)
ata2.00: limiting speed to MWDMA2:PIO3
ata2.00: configured for MWDMA2

ata2.00: TEST_UNIT_READY failed (err_mask=0x4)
ata2.00: disabled
ata2: soft resetting link
ata2: EH complete
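
(A note for readers of the log: the err_mask values can be decoded against libata's AC_ERR_* bits. The table below is written from memory of include/linux/libata.h and should be treated as an assumption to check against the kernel tree in use; the helper name is hypothetical.)

```python
# AC_ERR_* completion-error bits, as recalled from include/linux/libata.h.
# Assumption: verify these against the kernel source you are running.
AC_ERR_BITS = [
    (1 << 0, "AC_ERR_DEV (device reported error)"),
    (1 << 1, "AC_ERR_HSM (host state machine violation)"),
    (1 << 2, "AC_ERR_TIMEOUT (timeout)"),
    (1 << 3, "AC_ERR_MEDIA (media error)"),
    (1 << 4, "AC_ERR_ATA_BUS (ATA bus error)"),
    (1 << 5, "AC_ERR_HOST_BUS (host bus error)"),
    (1 << 6, "AC_ERR_SYSTEM (system error)"),
    (1 << 7, "AC_ERR_INVALID (invalid argument)"),
    (1 << 8, "AC_ERR_OTHER (unknown)"),
]


def decode_err_mask(err_mask):
    """Hypothetical helper: list the AC_ERR_* bits set in a libata err_mask."""
    return [name for bit, name in AC_ERR_BITS if err_mask & bit]


# 0x4 appears in the kvm log above; 0x5 appears in the first message of the thread.
for mask in (0x4, 0x5):
    print(hex(mask), decode_err_mask(mask))
```

Under that table, err_mask=0x4 is a pure timeout, which matches the "qc timeout" lines: the command was issued but no completion arrived.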
> The NULL pointer dereferences are much more of a worry, and a bit random. 
> Looks like the stack pointer has got trashed or something, so it's not giving
> any useful information.  Do you know what was going on before then?
I haven't yet seen the NULL pointer dereferences in x86_64, but for i686
boot they would just be part of the ordinary system startup, which I think
in that case might have got far enough to try to fire off an X boot
screen (the kernel was from before the last couple of days of drm related
fixes were applied).

 	Michael Young

Jeremy Fitzhardinge

2009-Feb-23 06:27 UTC

Re: [Xen-devel] Re: Continuing problems booting

M A Young wrote:
> On Sun, 22 Feb 2009, Jeremy Fitzhardinge wrote:
>
>> That's a pity; I was hoping those problems would be behind us...  Do
>> you have any log messages relating to them?
>
> ata is still timing out, and the devices don't work. Some errors from
> a kvm boot are
> ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc000 irq 14
> ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc008 irq 15
> ata2.00: ATAPI: QEMU DVD-ROM, 0.9.1, max UDMA/100
> ata2.00: configured for MWDMA2
> Clocksource tsc unstable (delta = -2149391501 ns)
> ata2.00: qc timeout (cmd 0xa0)
> ata2.00: TEST_UNIT_READY failed (err_mask=0x4)
> ata2.00: configured for MWDMA2
> ata2.00: qc timeout (cmd 0xa0)
> ata2.00: TEST_UNIT_READY failed (err_mask=0x4)
> ata2.00: limiting speed to MWDMA2:PIO3
> ata2.00: configured for MWDMA2
>
> ata2.00: TEST_UNIT_READY failed (err_mask=0x4)
> ata2.00: disabled
> ata2: soft resetting link
> ata2: EH complete
This is under qemu/kvm?

Can you include the complete boot output, and compare a native boot of 
the dom0 kernel too?
> I haven't yet seen the NULL pointer dereferences in x86_64, but for
> i686 boot they would just be part of the ordinary system startup,
> which I think in that case might have got far enough to try to fire
> off an X boot screen (the kernel was from before the last couple of
> days of drm related fixes were applied).
Hm, I don't see what would be causing them, regardless.

    J

M A Young

2009-Feb-25 00:56 UTC

Re: [Xen-devel] Re: Continuing problems booting

On Sun, 22 Feb 2009, Jeremy Fitzhardinge wrote:
> M A Young wrote:
>> ...
>> ata2.00: TEST_UNIT_READY failed (err_mask=0x4)
>> ata2.00: disabled
>> ata2: soft resetting link
>> ata2: EH complete
>
> This is under qemu/kvm?
>
> Can you include the complete boot output, and compare a native boot of the 
> dom0 kernel too?
My x86_64 system has started working with the latest set of patches, I 
suspect as a result of the mtrr related smp changes. If QEMU is still 
broken, I will submit the boot log once I get a chance to test it.

 	Michael Young

M A Young

2009-Feb-27 23:27 UTC

Re: [Xen-devel] Re: Continuing problems booting

On Wed, 25 Feb 2009, M A Young wrote:
> My x86_64 system has started working with the latest set of patches, I 
> suspect as a result of the mtrr related smp changes. If QEMU is still broken,
> I will submit the boot log once I get a chance to test it.
The QEMU boot still fails (pvops patch up-to-date as of 2 days ago), 
probably because it is emulating an ide style cdrom drive which I believe 
xen still has problems with (I have similar problems booting a different 
system with ide disks). The boot log (bzipped) is attached.

 	Michael Young

Jeremy Fitzhardinge

2009-Feb-28 01:18 UTC

Re: [Xen-devel] Re: Continuing problems booting

M A Young wrote:
> On Wed, 25 Feb 2009, M A Young wrote:
>
>> My x86_64 system has started working with the latest set of patches,
>> I suspect as a result of the mtrr related smp changes. If QEMU is
>> still broken, I will submit the boot log once I get a chance to test it.
>
> The QEMU boot still fails (pvops patch up-to-date as of 2 days ago), 
> probably because it is emulating an ide style cdrom drive which I 
> believe xen still has problems with (I have similar problems booting a 
> different system with ide disks). The boot log (bzipped) is attached.
(bzip's a bit of an overkill, it's only 7k.)

Yes, I think the legacy interrupts are not being set up completely, but
I'm not quite sure how they should be set up.  Will look into it.

    J

Gerd Hoffmann

2009-Mar-02 10:05 UTC

Re: [Xen-devel] Re: Continuing problems booting

Jeremy Fitzhardinge wrote:
> Yes, I think the legacy interrupts are not being set up completely, but
> I'm not quite sure how they should be set up.  Will look into it.
FYI: Recently my dated, apic-less i386 laptop started to successfully
     boot the pv_ops/dom0 kernel, all the way up to userspace.

cheers,
  Gerd


M A Young

2009-Mar-02 10:38 UTC

Re: [Xen-devel] Re: Continuing problems booting

On Mon, 2 Mar 2009, Gerd Hoffmann wrote:
> Jeremy Fitzhardinge wrote:
>> Yes, I think the legacy interrupts are not being set up completely, but
>> I'm not quite sure how they should be set up.  Will look into it.
>
> FYI: Recently my dated, apic-less i386 laptop started to successfully
>     boot the pv_ops/dom0 kernel, all the way up to userspace.
How recently? I know there was a fix in the first half of last week 
relating to smp mtrr that got one of my machines working, but the problem 
I am having wasn't fixed by that.

 	Michael Young

Gerd Hoffmann

2009-Mar-02 10:56 UTC

Re: [Xen-devel] Re: Continuing problems booting

M A Young wrote:
> On Mon, 2 Mar 2009, Gerd Hoffmann wrote:
>
>> Jeremy Fitzhardinge wrote:
>>> Yes, I think the legacy interrupts are not being set up completely, but
>>> I'm not quite sure how they should be set up.  Will look into it.
>>
>> FYI: Recently my dated, apic-less i386 laptop started to successfully
>>     boot the pv_ops/dom0 kernel, all the way up to userspace.
> 
> How recently? I know there was a fix in the first half of last week
> relating to smp mtrr that got one of my machines working, but the
> problem I am having wasn't fixed by that.
Sometime last week.  Didn't try for a while before that (early Feb IIRC),
so there are quite a few candidates which could have fixed it ...

cheers,
  Gerd



Jeremy Fitzhardinge

2009-Mar-05 01:52 UTC

Re: [Xen-devel] Re: Continuing problems booting

M A Young wrote:
> On Wed, 25 Feb 2009, M A Young wrote:
>
>> My x86_64 system has started working with the latest set of patches,
>> I suspect as a result of the mtrr related smp changes. If QEMU is
>> still broken, I will submit the boot log once I get a chance to test it.
>
> The QEMU boot still fails (pvops patch up-to-date as of 2 days ago), 
> probably because it is emulating an ide style cdrom drive which I 
> believe xen still has problems with (I have similar problems booting a 
> different system with ide disks). The boot log (bzipped) is attached. 
I committed a change to properly initialize legacy irqs, which might 
help with IDE devices.

    J

Jeremy Fitzhardinge

2009-Mar-05 05:15 UTC

Re: [Xen-devel] Re: Continuing problems booting

Gerd Hoffmann wrote:
> Jeremy Fitzhardinge wrote:
>
>> Yes, I think the legacy interrupts are not being set up completely, but
>> I'm not quite sure how they should be set up.  Will look into it.
>>
>
> FYI: Recently my dated, apic-less i386 laptop started to successfully
>      boot the pv_ops/dom0 kernel, all the way up to userspace.
>   
Do you get a vga console?  Can you start domains?

x86-32 booting to usermode dom0, but only with serial console and domain 
creation fails (SIGBUS in the domain builder, so I'm hoping it's related
to the hvm qemu crash).

    J

Gerd Hoffmann

2009-Mar-05 07:35 UTC

Re: [Xen-devel] Re: Continuing problems booting

Jeremy Fitzhardinge wrote:
> Gerd Hoffmann wrote:
>> Jeremy Fitzhardinge wrote:
>>
>>> Yes, I think the legacy interrupts are not being set up completely, but
>>> I'm not quite sure how they should be set up.  Will look into it.
>>>
>>
>> FYI: Recently my dated, apic-less i386 laptop started to successfully
>>      boot the pv_ops/dom0 kernel, all the way up to userspace.
>>   
> 
> Do you get a vga console?  Can you start domains?
gfx console works (i.e. kernel /xen-3.3.gz vga=gfx-1024x768x16).
vga text console didn't last time I tried.
> x86-32 booting to usermode dom0, but only with serial console and domain
> creation fails (SIGBUS in the domain builder, so I'm hoping it's related
> to the hvm qemu crash).
Didn't try yet, the machine is heavily underpowered for serious
virtualization work, it has 192 MB RAM only.  And hvm doesn't work
anyway because the box is way too old for that (Pentium III).

cheers,
  Gerd

Gerd Hoffmann

2009-Mar-05 11:01 UTC

Re: [Xen-devel] Re: Continuing problems booting

Gerd Hoffmann wrote:
> Jeremy Fitzhardinge wrote:
>> Gerd Hoffmann wrote:
>>> Jeremy Fitzhardinge wrote:
>>>
>>>> Yes, I think the legacy interrupts are not being set up completely, but
>>>> I'm not quite sure how they should be set up.  Will look into it.
>>>>
>>> FYI: Recently my dated, apic-less i386 laptop started to successfully
>>>      boot the pv_ops/dom0 kernel, all the way up to userspace.
>>>
>> Do you get a vga console?  Can you start domains?
>
> gfx console works (i.e. kernel /xen-3.3.gz vga=gfx-1024x768x16).
> vga text console didn't last time I tried.
Update: latest kernel (rc7 based) crashes.  rc6 from sometime last week
is the working one.  rc7 messages:

unhandled page fault (ec=0003)
page table walk from c1c55000
 l1[0x055] = 9c55061 1c55

-> rw access to r/o page?

EIP c140f4c3

c140f2d2 <alloc_bootmem_core>:
[ ... ]
c140f4c3:       f3 ab                   rep stos %eax,%es:(%edi)

-> memset(page,0,PAGE_SIZE) ?

Dom0 domain builder says page tables are at c1c55000 -> c1c6a000

/me guesses the initial page tables are released to the page allocator,
but they are still mapped r/o  =>  boom as soon as one happens to get
allocated.  Which probably happens very soon on memory-constrained
machines like mine, while others might stay up longer and show strange
bugs later on ;)
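
(A note for readers checking the guess against the numbers in the log: the sketch below decodes the reported L1 entry and the table index for the faulting address. It assumes the standard x86 PTE flag bits and 512-entry tables (PAE); with 1024-entry tables the index for this address comes out the same. The helper names are hypothetical.)

```python
PAGE_SHIFT = 12

# Standard low flag bits of an x86 page-table entry.
PTE_FLAGS = {
    0x001: "present",
    0x002: "writable",
    0x004: "user",
    0x020: "accessed",
    0x040: "dirty",
}


def l1_index(vaddr, entries_per_table=512):
    """Index into the lowest-level page table for a virtual address."""
    return (vaddr >> PAGE_SHIFT) & (entries_per_table - 1)


def pte_flags(pte):
    """Names of the flag bits set in a page-table entry, lowest bit first."""
    return [name for bit, name in sorted(PTE_FLAGS.items()) if pte & bit]


vaddr = 0xC1C55000   # "page table walk from c1c55000"
pte = 0x9C55061      # "l1[0x055] = 9c55061"

print(hex(l1_index(vaddr)))    # 0x55, matching l1[0x055]
print(pte_flags(pte))          # ['present', 'accessed', 'dirty']
print(hex(pte >> PAGE_SHIFT))  # 0x9c55, the frame the entry points at
```

The entry is present but its writable bit is clear, and ec=0003 means a write to a present page, so the numbers are consistent with a write hitting a read-only mapping.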

BTW: The trick to see the messages on the laptop screen is:

 kernel /xen-3.3.gz vga=text-80x50,keep
 module /vmlinuz-2.6.29-rc7-tip-kraxel ro root=/dev/zen/rawhide \
        console=hvc0
>> x86-32 booting to usermode dom0, but only with serial console and domain
>> creation fails (SIGBUS in the domain builder, so I'm hoping it's related
>> to the hvm qemu crash).
>
> Didn't try yet, the machine is heavily underpowered for serious
> virtualization work, it has 192 MB RAM only.  And hvm doesn't work
> anyway because the box is way too old for that (Pentium III).
Doesn't work.  I get messages about failed multicalls with
remap_page_range and privcmd_ioctl in the stack trace.  Most likely
mapping the guest pages in the domain builder doesn't work.  No surprise
this leads to SIGBUS.

HTH,
  Gerd


Pasi Kärkkäinen

2009-Mar-05 15:57 UTC

Re: [Xen-devel] Re: Continuing problems booting

On Thu, Mar 05, 2009 at 08:35:29AM +0100, Gerd Hoffmann wrote:
> Jeremy Fitzhardinge wrote:
> > Gerd Hoffmann wrote:
> >> Jeremy Fitzhardinge wrote:
> >>
> >>> Yes, I think the legacy interrupts are not being set up completely, but
> >>> I'm not quite sure how they should be set up.  Will look into it.
> >>>
> >>
> >> FYI: Recently my dated, apic-less i386 laptop started to successfully
> >>      boot the pv_ops/dom0 kernel, all the way up to userspace.
> >>   
> > 
> > Do you get a vga console?  Can you start domains?
> 
> gfx console works (i.e. kernel /xen-3.3.gz vga=gfx-1024x768x16).
> vga text console didn't last time I tried.
>
VGA text console doesn't work for me either.

I've only managed to get the serial console working.. although I haven't
tried any graphics modes yet.

Maybe I should play with these again..

-- Pasi

Jeremy Fitzhardinge

2009-Mar-05 18:56 UTC

Re: [Xen-devel] Re: Continuing problems booting

Gerd Hoffmann wrote:
> Update: latest kernel (rc7 based) crashes.  rc6 from sometime last week
> is the working one.  rc7 messages:
>
> unhandled page fault (ec=0003)
> page table walk from c1c55000
>  l1[0x055] = 9c55061 1c55
>
> -> rw access to r/o page?
>
> EIP c140f4c3
>
> c140f2d2 <alloc_bootmem_core>:
> [ ... ]
> c140f4c3:       f3 ab                   rep stos %eax,%es:(%edi)
>
> -> memset(page,0,PAGE_SIZE) ?
>
> Dom0 domain builder says page tables are at c1c55000 -> c1c6a000
>
> /me guesses the initial page tables are released to the page allocator,
> but they are still mapped r/o  =>  boom as soon as one happens to get
> allocated.  Which probably happens very soon on memory-constrained
> machines like mine, while others might stay up longer and show strange
> bugs later on ;)
>
Hm.  You should see "XEN PAGETABLES" in the early reservations, which
should protect them from then on.  Oh, look, it's only doing it in the
64-bit setup.
> BTW: The trick to see the messages on the laptop screen is:
>
>  kernel /xen-3.3.gz vga=text-80x50,keep
>  module /vmlinuz-2.6.29-rc7-tip-kraxel ro root=/dev/zen/rawhide \
>         console=hvc0
>   
Hm, OK.  But normal vga console should work.  Works fine on 64-bit; 
can't think of why they might differ...

    J

M A Young

2009-Mar-05 20:23 UTC

Re: [Xen-devel] Re: Continuing problems booting

On Wed, 4 Mar 2009, Jeremy Fitzhardinge wrote:
> M A Young wrote:
>> The QEMU boot still fails (pvops patch up-to-date as of 2 days ago),
>> probably because it is emulating an ide style cdrom drive which I believe
>> xen still has problems with (I have similar problems booting a different
>> system with ide disks). The boot log (bzipped) is attached.
>
> I committed a change to properly initialize legacy irqs, which might help 
> with IDE devices.
Yes, a recent update allows my qemu test environment to see its disk. It 
crashes later, but in what looks to be a non-xen related way. However 
booting the kernel non-xen crashes much faster with the traceback

RAMDISK: gzip decompressor not configured!
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<(null)>] (null)
PGD 0
Oops: 0010 [#1] SMP
last sysfs file:
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.29-0.114.2.6.rc7.fc10.x86_64 #1
RIP: 0010:[<0000000000000000>]  [<(null)>] (null)
RSP: 0018:ffff88003f76de38  EFLAGS: 00010246
RAX: 0000000000000001 RBX: ffff88003b22dff0 RCX: ffffffff8162e4dc
RDX: ffffffff8162e49c RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88003f76ded0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000046 R11: ffff88003f76dde0 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff81625fa8
FS:  0000000000000000(0000) GS:ffff880003000000(0000) 
knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff88003f76c000, task 
ffff88003f770000)
Stack:
  ffffffff8162e7a8 ffffffff8162e471 ffffffff811a5fca ffff880000000010
  ffffffff814c6f0d 000000013f76de70 ffffffff00000000 ffff88003f76dea0
  ffffffff814f0922 0000000000000000 ffffffff814c6d54 000000005c2d2f7c
Call Trace:
  [<ffffffff8162e7a8>] ? rd_load_image+0x27b/0x4dd
  [<ffffffff8162e471>] ? error+0x0/0x2b
  [<ffffffff811a5fca>] ? sscanf+0x38/0x3a
  [<ffffffff8162eaa8>] initrd_load+0x31/0x2ed
  [<ffffffff8162e37f>] prepare_namespace+0xe2/0x19d
  [<ffffffff8162d73f>] kernel_init+0x21a/0x22a
  [<ffffffff81012e6a>] child_rip+0xa/0x20
  [<ffffffff81012850>] ? restore_args+0x0/0x30
  [<ffffffff8162d525>] ? kernel_init+0x0/0x22a
  [<ffffffff81012e60>] ? child_rip+0x0/0x20
Code:  Bad RIP value.
RIP  [<(null)>] (null)
  RSP <ffff88003f76de38>
CR2: 0000000000000000
---[ end trace a678a5d887494ac4 ]---
swapper used greatest stack depth: 4272 bytes left
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: swapper Tainted: G      D 
2.6.29-0.114.2.6.rc7.fc10.x86_64 #1
Call Trace:
  [<ffffffff813a6544>] panic+0x7a/0x13b
  [<ffffffff81071368>] ? trace_hardirqs_on_caller+0x1f/0x151
  [<ffffffff813a91b3>] ? _write_unlock_irq+0x30/0x3b
  [<ffffffff8105092f>] ? do_exit+0x37c/0x8a9
  [<ffffffff81050636>] do_exit+0x83/0x8a9
  [<ffffffff813a6646>] ? printk+0x41/0x43
  [<ffffffff813aa944>] oops_end+0xbf/0xc7
  [<ffffffff81032e8d>] no_context+0x1f2/0x201
  [<ffffffff81033046>] __bad_area_nosemaphore+0x1aa/0x1d0
  [<ffffffff8102e046>] ? pvclock_clocksource_read+0x47/0x83
  [<ffffffff811ab221>] ? debug_check_no_obj_freed+0x152/0x1c8
  [<ffffffff813abd09>] ? do_page_fault+0x11a/0x27f
  [<ffffffff8103307f>] bad_area_nosemaphore+0x13/0x15
  [<ffffffff813abd71>] do_page_fault+0x182/0x27f
  [<ffffffff813a9d65>] page_fault+0x25/0x30
  [<ffffffff8162e4dc>] ? compr_flush+0x0/0x51
  [<ffffffff8162e49c>] ? compr_fill+0x0/0x40
  [<ffffffff8162e7a8>] ? rd_load_image+0x27b/0x4dd
  [<ffffffff8162e471>] ? error+0x0/0x2b
  [<ffffffff811a5fca>] ? sscanf+0x38/0x3a
  [<ffffffff8162eaa8>] initrd_load+0x31/0x2ed
  [<ffffffff8162e37f>] prepare_namespace+0xe2/0x19d
  [<ffffffff8162d73f>] kernel_init+0x21a/0x22a
  [<ffffffff81012e6a>] child_rip+0xa/0x20
  [<ffffffff81012850>] ? restore_args+0x0/0x30
  [<ffffffff8162d525>] ? kernel_init+0x0/0x22a
  [<ffffffff81012e60>] ? child_rip+0x0/0x20
so that might indicate that some recent change breaks a non-dom0 boot 
(if I haven't done something to break it myself).

 	Michael Young

Jeremy Fitzhardinge

2009-Mar-05 20:34 UTC

Re: [Xen-devel] Re: Continuing problems booting

M A Young wrote:
> On Wed, 4 Mar 2009, Jeremy Fitzhardinge wrote:
>
>> M A Young wrote:
>>> The QEMU boot still fails (pvops patch up-to-date as of 2 days ago),
>>> probably because it is emulating an ide style cdrom drive which I
>>> believe xen still has problems with (I have similar problems booting
>>> a different system with ide disks). The boot log (bzipped) is attached.
>>
>> I committed a change to properly initialize legacy irqs, which might 
>> help with IDE devices.
>
> Yes, a recent update allows my qemu test environment to see its disk. 
> It crashes later, but in what looks to be a non-xen related way. 
> However booting the kernel non-xen crashes much faster with the traceback
>
> RAMDISK: gzip decompressor not configured!
It looks like you need to configure gzip compression for your initrd.
Also, make sure you don't use any of the other compression algorithms
for the kernel itself, or Xen won't be able to parse them.
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<(null)>] (null)
This looks like a bug.  HPA, will it fall down a NULL function pointer 
if you leave compression out?

    J

> PGD 0
> Oops: 0010 [#1] SMP
> last sysfs file:
> CPU 0
> Modules linked in:
> Pid: 1, comm: swapper Not tainted 2.6.29-0.114.2.6.rc7.fc10.x86_64 #1
> RIP: 0010:[<0000000000000000>]  [<(null)>] (null)
> RSP: 0018:ffff88003f76de38  EFLAGS: 00010246
> RAX: 0000000000000001 RBX: ffff88003b22dff0 RCX: ffffffff8162e4dc
> RDX: ffffffff8162e49c RSI: 0000000000000000 RDI: 0000000000000000
> RBP: ffff88003f76ded0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000046 R11: ffff88003f76dde0 R12: 0000000000000000
> R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff81625fa8
> FS:  0000000000000000(0000) GS:ffff880003000000(0000) 
> knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 1, threadinfo ffff88003f76c000, task 
> ffff88003f770000)
> Stack:
>  ffffffff8162e7a8 ffffffff8162e471 ffffffff811a5fca ffff880000000010
>  ffffffff814c6f0d 000000013f76de70 ffffffff00000000 ffff88003f76dea0
>  ffffffff814f0922 0000000000000000 ffffffff814c6d54 000000005c2d2f7c
> Call Trace:
>  [<ffffffff8162e7a8>] ? rd_load_image+0x27b/0x4dd
>  [<ffffffff8162e471>] ? error+0x0/0x2b
>  [<ffffffff811a5fca>] ? sscanf+0x38/0x3a
>  [<ffffffff8162eaa8>] initrd_load+0x31/0x2ed
>  [<ffffffff8162e37f>] prepare_namespace+0xe2/0x19d
>  [<ffffffff8162d73f>] kernel_init+0x21a/0x22a
>  [<ffffffff81012e6a>] child_rip+0xa/0x20
>  [<ffffffff81012850>] ? restore_args+0x0/0x30
>  [<ffffffff8162d525>] ? kernel_init+0x0/0x22a
>  [<ffffffff81012e60>] ? child_rip+0x0/0x20
> Code:  Bad RIP value.
> RIP  [<(null)>] (null)
>  RSP <ffff88003f76de38>
> CR2: 0000000000000000
> ---[ end trace a678a5d887494ac4 ]---
> swapper used greatest stack depth: 4272 bytes left
> Kernel panic - not syncing: Attempted to kill init!
> Pid: 1, comm: swapper Tainted: G      D 
> 2.6.29-0.114.2.6.rc7.fc10.x86_64 #1
> Call Trace:
>  [<ffffffff813a6544>] panic+0x7a/0x13b
>  [<ffffffff81071368>] ? trace_hardirqs_on_caller+0x1f/0x151
>  [<ffffffff813a91b3>] ? _write_unlock_irq+0x30/0x3b
>  [<ffffffff8105092f>] ? do_exit+0x37c/0x8a9
>  [<ffffffff81050636>] do_exit+0x83/0x8a9
>  [<ffffffff813a6646>] ? printk+0x41/0x43
>  [<ffffffff813aa944>] oops_end+0xbf/0xc7
>  [<ffffffff81032e8d>] no_context+0x1f2/0x201
>  [<ffffffff81033046>] __bad_area_nosemaphore+0x1aa/0x1d0
>  [<ffffffff8102e046>] ? pvclock_clocksource_read+0x47/0x83
>  [<ffffffff811ab221>] ? debug_check_no_obj_freed+0x152/0x1c8
>  [<ffffffff813abd09>] ? do_page_fault+0x11a/0x27f
>  [<ffffffff8103307f>] bad_area_nosemaphore+0x13/0x15
>  [<ffffffff813abd71>] do_page_fault+0x182/0x27f
>  [<ffffffff813a9d65>] page_fault+0x25/0x30
>  [<ffffffff8162e4dc>] ? compr_flush+0x0/0x51
>  [<ffffffff8162e49c>] ? compr_fill+0x0/0x40
>  [<ffffffff8162e7a8>] ? rd_load_image+0x27b/0x4dd
>  [<ffffffff8162e471>] ? error+0x0/0x2b
>  [<ffffffff811a5fca>] ? sscanf+0x38/0x3a
>  [<ffffffff8162eaa8>] initrd_load+0x31/0x2ed
>  [<ffffffff8162e37f>] prepare_namespace+0xe2/0x19d
>  [<ffffffff8162d73f>] kernel_init+0x21a/0x22a
>  [<ffffffff81012e6a>] child_rip+0xa/0x20
>  [<ffffffff81012850>] ? restore_args+0x0/0x30
>  [<ffffffff8162d525>] ? kernel_init+0x0/0x22a
>  [<ffffffff81012e60>] ? child_rip+0x0/0x20
> so that might indicate that some recent change breaks a non-dom0 boot
> (if I haven't done something to break it myself).
>
>     Michael Young

M A Young

2009-Mar-06 01:07 UTC

Re: [Xen-devel] Re: Continuing problems booting

On Thu, 5 Mar 2009, Jeremy Fitzhardinge wrote:
>> Yes, a recent update allows my qemu test environment to see its disk. It
>> crashes later, but in what looks to be a non-xen related way. However
>> booting the kernel non-xen crashes much faster with the traceback
>>
>> RAMDISK: gzip decompressor not configured!
>
> It looks like you need to configure gzip compression for your initrd. Also,
> make sure you don't use any of the other compression algorithms for the
> kernel itself, or Xen won't be able to parse them.
>
>> BUG: unable to handle kernel NULL pointer dereference at (null)
>> IP: [<(null)>] (null)
>
> This looks like a bug.  HPA, will it fall down a NULL function pointer if you
> leave compression out?

It now boots (xen and non-xen) if I build with CONFIG_RD_GZIP=y (and work
around the problems of my current livecd generating situation). I get the
impression that the boot is expected to fail if this or an equivalent
CONFIG_RD_BZIP2 or CONFIG_RD_LZMA setting is missing, but it should do so
more gracefully so I agree this is a bug.

 	Michael Young
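
The oops quoted earlier in the thread shows RIP at address 0 inside
rd_load_image right after the "gzip decompressor not configured!" message,
which is consistent with a call through a NULL function pointer. The
following is a minimal userspace sketch of the more graceful behaviour asked
for above, not the kernel's own code: struct compress_format,
identify_format, load_compressed_image and fake_gunzip are hypothetical
names, and the assumption is that a format whose support is not built in is
represented by a NULL decompressor entry.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical decompressor signature; the kernel's real one differs. */
typedef int (*decompress_fn)(const unsigned char *in, size_t len);

struct compress_format {
    unsigned char magic[2];      /* first two bytes of the image */
    const char *name;            /* used only for diagnostics */
    decompress_fn decompressor;  /* NULL when support is not built in */
};

/* Stand-in for a built-in gzip decompressor; always reports success. */
int fake_gunzip(const unsigned char *in, size_t len)
{
    (void)in;
    (void)len;
    return 0;
}

/* Look up the format by magic bytes; returns NULL if nothing matches. */
const struct compress_format *
identify_format(const struct compress_format *table, size_t n,
                const unsigned char *buf, size_t len)
{
    if (len < 2)
        return NULL;
    for (size_t i = 0; i < n; i++)
        if (table[i].magic[0] == buf[0] && table[i].magic[1] == buf[1])
            return &table[i];
    return NULL;
}

/*
 * Returns -EINVAL for an unknown format and -ENOSYS for a known format
 * with no decompressor, instead of calling through a NULL pointer.
 */
int load_compressed_image(const struct compress_format *table, size_t n,
                          const unsigned char *buf, size_t len)
{
    const struct compress_format *fmt = identify_format(table, n, buf, len);

    if (!fmt)
        return -EINVAL;
    if (!fmt->decompressor)
        return -ENOSYS;
    return fmt->decompressor(buf, len);
}
```

With a table whose gzip entry has a NULL decompressor,
load_compressed_image returns -ENOSYS for a buffer starting with the gzip
magic bytes 0x1f 0x8b, where an unchecked call would jump to address 0.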

Gerd Hoffmann

2009-Mar-06 14:15 UTC

Re: [Xen-devel] Re: Continuing problems booting

Jeremy Fitzhardinge wrote:
> Gerd Hoffmann wrote:
> Hm.  You should see "XEN PAGETABLES" in the early reservations, which
> should protect them from then on.  Oh, look, it's only doing it in the
> 64-bit setup.

I see it is fixed in latest git.

Now the i386 machine shows other issues, device drivers fail to
register, errno 38 (ENOSYS).  Huh?  Also Shift-PgUp doesn't work, which
makes me think it is an interrupt issue.  APCI-less box.  The x86_64
machine (with IO-APIC) is doing fine.

>> BTW: The trick to see the messages on the laptop screen is:
>>
>>  kernel /xen-3.3.gz vga=text-80x50,keep
>>  module /vmlinuz-2.6.29-rc7-tip-kraxel ro root=/dev/zen/rawhide \
>>         console=hvc0
>>
>
> Hm, OK.  But normal vga console should work.

I can watch the cursor moving, just no characters appear on the screen.
Maybe some I/O port access issue?  So the color palette is foobar and
it prints black on black?

> Works fine on 64-bit;
> can't think of why they might differ...

My x86_64 machine is headless, so I can't comment on that ;)

cheers,
  Gerd

Gerd Hoffmann

2009-Mar-06 14:56 UTC

Re: [Xen-devel] Re: Continuing problems booting

> I can watch the cursor moving, just no characters appear on the screen.
>  Maybe some I/O port access issue?  So the color palette is foobar and
> it prints black on black?
Hmm, funny thing is, earlyprintk=vga _does_ print something. Looks
incomplete though (scrolls by quickly) and of course the early console
stops anyway as soon as the vga console takes over.

cheers,
  Gerd



Jeremy Fitzhardinge

2009-Mar-06 15:20 UTC

Re: [Xen-devel] Re: Continuing problems booting

Gerd Hoffmann wrote:
> Jeremy Fitzhardinge wrote:
>
>> Gerd Hoffmann wrote:
>> Hm.  You should see "XEN PAGETABLES" in the early reservations, which
>> should protect them from then on.  Oh, look, it's only doing it in the
>> 64-bit setup.
>>
>
> I see it is fixed in latest git.
>
> Now the i386 machine shows other issues, device drivers fail to
> register, errno 38 (ENOSYS).  Huh?  Also Shift-PgUp doesn't work, which
> makes me think it is an interrupt issue.  APCI-less box.  The x86_64
> machine (with IO-APIC) is doing fine.
>

Hm.  I was not really planning on supporting no-ACPI; I've only hooked
acpi_register_gsi, which is called via acpi_pci_irq_enable.  I guess
you'd need to do something in pirq_enable_irq as well, and I'm not sure
if all the stuff gets set up properly for IO_APIC_get_PCI_irq_vector to
work.

> I can watch the cursor moving, just no characters appear on the screen.
>  Maybe some I/O port access issue?  So the color palette is foobar and
> it prints black on black?
>

Yes, that's what I see too.  I spent some time staring at the vga
framebuffer mapping and I can't see anything wrong with it at all - and
I think the vga code can see its own framebuffer because it actually
tests to see if it can write and read back from it.

The fact that the cursor moves around suggests that IO ports are the
only thing that *are* working, but perhaps the io bitmap is coming in to
play (hm, there's the ring 1 vs ring 3 difference between 32 and 64 bit).

Aside from that, there's the palette, as you suggested, and the
character generator might be all empty too, I guess.

Also, when I try to start X it just spins there allocating memory until
everything falls over (used to crash, before the pagetable reservation
fix).  I'm guessing it's related, but I haven't looked into it yet.

    J

Jeremy Fitzhardinge

2009-Mar-06 15:21 UTC

Re: [Xen-devel] Re: Continuing problems booting

Gerd Hoffmann wrote:
>> I can watch the cursor moving, just no characters appear on the screen.
>>  Maybe some I/O port access issue?  So the color palette is foobar and
>> it prints black on black?
>>
>
> Hmm, funny thing is, earlyprintk=vga _does_ print something. Looks
> incomplete though (scrolls by quickly) and of course the early console
> stops anyway as soon as the vga console takes over.

Ah, yes. I'd noticed that and forgotten about it.  Well.  What the hell
does that mean?

    J


Gerd Hoffmann

2009-Mar-06 15:34 UTC

Re: [Xen-devel] Re: Continuing problems booting

Hi,
>> Now the i386 machine shows other issues, device drivers fail to
>> register, errno 38 (ENOSYS).  Huh?  Also Shift-PgUp doesn't work, which
>> makes me think it is an interrupt issue.  APCI-less box.  The x86_64
>> machine (with IO-APIC) is doing fine.
>
> Hm.  I was not really planning on supporting no-ACPI; I've only hooked
> acpi_register_gsi, which is called via acpi_pci_irq_enable.  I guess
> you'd need to do something in pirq_enable_irq as well, and I'm not sure
> if all the stuff gets set up properly for IO_APIC_get_PCI_irq_vector to
> work.

I have -rc6 kernel which *does* boot.  Maybe that was by accident ;)

The kernel initializes legacy interrupts anyway.  I think you don't need
to do more to handle apic-less machines, no?

cheers,
  Gerd

Jeremy Fitzhardinge

2009-Mar-06 16:08 UTC

Re: [Xen-devel] Re: Continuing problems booting

Gerd Hoffmann wrote:
>> Hm.  I was not really planning on supporting no-ACPI; I've only hooked
>> acpi_register_gsi, which is called via acpi_pci_irq_enable.  I guess
>> you'd need to do something in pirq_enable_irq as well, and I'm not sure
>> if all the stuff gets set up properly for IO_APIC_get_PCI_irq_vector to
>> work.
>>
>
> I have -rc6 kernel which *does* boot.  Maybe that was by accident ;)
>

Hm, well I don't remember adding any new ENOSYSes in there, so perhaps
the core kernel has changed in some way under us.

I'm guessing it's this:

static int
__setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
{
	struct irqaction *old, **old_ptr;
	const char *old_name = NULL;
	unsigned long flags;
	int shared = 0;
	int ret;

	if (!desc)
		return -EINVAL;

	if (desc->chip == &no_irq_chip)
		return -ENOSYS;
...


which gets called from request_irq.  So that means that the desc is
getting allocated but the chip hasn't been set up.  Are you using sparse
irqs?

> The kernel initializes legacy interrupts anyway.  I think you don't need
> to do more to handle apic-less machines, no?
>

I guess not, if it's only using irqs < 16.  How old is this machine
anyway; do you really mean it's a literal i386?

But the info I'm using to set up the legacy interrupts comes from acpi
tables, I think, so perhaps it's misprogramming the legacy interrupts,
whereas before they just happened to work in their default config (???).

    J
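
The two early checks quoted from __setup_irq() can be exercised in isolation
with a small userspace mock. This is only a sketch under stated assumptions:
struct irq_chip and struct irq_desc are cut down to the one field the checks
look at, xt_pic_chip and setup_irq_checks are hypothetical names, and nothing
here models the rest of request_irq.

```c
#include <errno.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel types; only what the checks need. */
struct irq_chip { const char *name; };
struct irq_desc { struct irq_chip *chip; };

/* Placeholder chip that a descriptor points at before any setup. */
struct irq_chip no_irq_chip = { "none" };

/* Hypothetical chip standing in for one installed by irq setup code. */
struct irq_chip xt_pic_chip = { "XT-PIC" };

/* Mirrors only the two early-return checks from the quoted excerpt. */
int setup_irq_checks(const struct irq_desc *desc)
{
    if (!desc)
        return -EINVAL;   /* no descriptor at all */
    if (desc->chip == &no_irq_chip)
        return -ENOSYS;   /* descriptor exists, chip never set up */
    return 0;
}
```

A descriptor that was allocated but still points at no_irq_chip yields
-ENOSYS, which would match the "errno 38 (ENOSYS)" driver registration
failures reported above if the legacy irqs were never given a chip.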

M A Young

2009-Mar-06 19:35 UTC

Re: [Xen-devel] Re: Continuing problems booting

On Thu, 5 Mar 2009, H. Peter Anvin wrote:
> Jeremy Fitzhardinge wrote:
>>
>> This looks like a bug.  HPA, will it fall down a NULL function pointer
>> if you leave compression out?
>>
>
> It's not supposed to, obviously, but this could be a bug.  Could the OP
> please post his .config?

The config is attached. It might be specific to x86_64 as I tried to
reproduce it on i686 PAE with a similar kernel but failed.

 	Michael Young

Gerd Hoffmann

2009-Mar-09 08:02 UTC

Re: [Xen-devel] Re: Continuing problems booting

Jeremy Fitzhardinge wrote:
> static int
> __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
> {
>     struct irqaction *old, **old_ptr;
>     const char *old_name = NULL;
>     unsigned long flags;
>     int shared = 0;
>     int ret;
> 
>     if (!desc)
>         return -EINVAL;
> 
>     if (desc->chip == &no_irq_chip)
>         return -ENOSYS;
> ...

I can try to sprinkle in some printk's to figure ...

> which gets called from request_irq.  So that means that the desc is
> getting allocated but the chip hasn't been set up.  Are you using sparse
> irqs?

It's enabled, yes (config is derived from default fedora one ...)

>> The kernel initializes legacy interrupts anyway.  I think you don't need
>> to do more to handle apic-less machines, no?
>
> I guess not, if it's only using irqs < 16.  How old is this machine
> anyway; do you really mean it's a literal i386?

Pentium III (~2001), /proc/interrupts on bare metal looks like this:

           CPU0
  0:     617526    XT-PIC-XT        timer
  1:        286    XT-PIC-XT        i8042
  2:          0    XT-PIC-XT        cascade
  3:          3    XT-PIC-XT
  4:          1    XT-PIC-XT
  5:          1    XT-PIC-XT        Intel 440MX Modem, Intel 440MX
  6:          1    XT-PIC-XT
  7:          1    XT-PIC-XT
  8:          1    XT-PIC-XT        rtc0
  9:          5    XT-PIC-XT        acpi
 10:         50    XT-PIC-XT        yenta, firewire_ohci
 11:      13186    XT-PIC-XT        uhci_hcd:usb1, eth0
 12:       6131    XT-PIC-XT        i8042
 14:      56351    XT-PIC-XT        ata_piix
 15:          0    XT-PIC-XT        ata_piix
NMI:          0   Non-maskable interrupts
LOC:          0   Local timer interrupts
RES:          0   Rescheduling interrupts
CAL:          0   Function call interrupts
TLB:          0   TLB shootdowns
TRM:          0   Thermal event interrupts
SPU:          0   Spurious interrupts
ERR:          0
MIS:          0

> But the info I'm using to set up the legacy interrupts comes from acpi
> tables, I think, so perhaps it's misprogramming the legacy interrupts,
> whereas before they just happened to work in their default config (???).

As the *registration* fails already I don't think it is misprogramming.

cheers,
  Gerd

Gerd Hoffmann

2009-Mar-09 13:20 UTC

Re: [Xen-devel] Re: Continuing problems booting

Hi,
> But the info I'm using to set up the legacy interrupts comes from acpi
> tables, I think, so perhaps it's misprogramming the legacy interrupts,
> whereas before they just happened to work in their default config (???).

Well, if there is no info in the acpi tables, you'll ignore the IRQ
altogether ...

cheers,
  Gerd


Gerd Hoffmann

2009-Mar-09 15:38 UTC

Re: [Xen-devel] Re: Continuing problems booting

Jeremy Fitzhardinge wrote:
> Gerd Hoffmann wrote:
>>> I can watch the cursor moving, just no characters appear on the screen.
>>>  Maybe some I/O port access issue?  So the color palette is foobar and
>>> it prints black on black?
>>>
>>
>> Hmm, funny thing is, earlyprintk=vga _does_ print something. Looks
>> incomplete though (scrolls by quickly) and of course the early console
>> stops anyway as soon as the vga console takes over.
>
> Ah, yes. I'd noticed that and forgotten about it.  Well.  What the hell
> does that mean?

Memory mapping issue.  Seems to break during init_memory_mapping().

I can slowly print lines on /dev/tty0 using a shell loop with a sleep in
there.  Doesn't print anything on the screen.  I can see the stuff
printed by earlyvga scroll through the screen though.  Thus vga register
access (for panning) works just fine.  Accessing the memory probably
ends up somewhere else due to the mappings not being set up correctly.

Last line of earlyvga output is this:
  init_memory_mapping: 0000000000000000-000000001d001000

Note that init_memory_mapping() has this close to the end of the function:

   #ifdef CONFIG_X86_32
        early_ioremap_page_table_range_init();
        load_cr3(swapper_pg_dir);
   #endif

i.e. early iomap setup is different in 32bit and 64bit.  Which would
also explain why vgacon works just fine in 64bit mode.

cheers,
  Gerd

Gerd Hoffmann

2009-Mar-09 15:56 UTC

Re: [Xen-devel] Re: Continuing problems booting

Gerd Hoffmann wrote:
> i.e. early iomap setup is different in 32bit and 64bit.  Which would
> also explain why vgacon works just fine in 64bit mode.

I think it is something else.

arch/x86/mm/init_64.c, phys_pte_init():

        /*
         * We will re-use the existing mapping.
         * Xen for example has some special requirements, like mapping
         * pagetable pages as RO. So assume someone who pre-setup
         * these mappings are more intelligent.
         */
        if (pte_val(*pte)) {
                pages++;
                continue;
         }

I think that does also make sure vga mappings are not overwritten with
something else.  32bit seems to have no equivalent for this though ...
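
The effect of that check can be shown with a simplified userspace model. It
is a sketch only: a PTE is reduced to a plain 64-bit value,
fill_ptes_keep_existing and PAGE_PRESENT_RW are hypothetical names, and the
real phys_pte_init() does considerably more than this loop.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pte_t;          /* simplified: a PTE is just a number */

#define PAGE_SHIFT      12
#define PAGE_PRESENT_RW 0x3ULL   /* hypothetical flag bits for new entries */

/*
 * Map n consecutive pages starting at paddr, but leave any entry that is
 * already non-zero untouched. Returns the number of entries counted,
 * pre-existing ones included, as in the quoted excerpt.
 */
size_t fill_ptes_keep_existing(pte_t *table, size_t n, uint64_t paddr)
{
    size_t pages = 0;

    for (size_t i = 0; i < n; i++, paddr += 1ULL << PAGE_SHIFT) {
        if (table[i]) {          /* someone pre-set this mapping: reuse it */
            pages++;
            continue;
        }
        table[i] = paddr | PAGE_PRESENT_RW;
        pages++;
    }
    return pages;
}
```

An entry that was set up before the loop runs, such as a pre-existing
mapping of the VGA window, keeps its original value and flags while the
empty slots around it are filled in.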

cheers,
  Gerd


Jeremy Fitzhardinge

2009-Mar-09 16:24 UTC

Re: [Xen-devel] Re: Continuing problems booting

Gerd Hoffmann wrote:
>   Hi,
>
>
>> But the info I'm using to set up the legacy interrupts comes from acpi
>> tables, I think, so perhaps it's misprogramming the legacy interrupts,
>> whereas before they just happened to work in their default config (???).
>>
>
> Well, if there is no info in the acpi tables, you'll ignore the IRQ
> altogether ...
>

Yes, that's what I suspected.   Can you write up a proper patch?

Thanks,
    J

Jeremy Fitzhardinge

2009-Mar-09 16:35 UTC

Re: [Xen-devel] Re: Continuing problems booting

Gerd Hoffmann wrote:
> Gerd Hoffmann wrote:
>   
>> i.e. early iomap setup is different in 32bit and 64bit.  Which would
>> also explain why vgacon works just fine in 64bit mode.
>>     
>
> I think it is something else.
>
> arch/x86/mm/init_64.c, phys_pte_init():
>
>         /*
>          * We will re-use the existing mapping.
>          * Xen for example has some special requirements, like mapping
>          * pagetable pages as RO. So assume someone who pre-setup
>          * these mappings are more intelligent.
>          */
>         if (pte_val(*pte)) {
>                 pages++;
>                 continue;
>          }
>
> I think that does also make sure vga mappings are not overwritten with
> something else.  32bit seems to have no equivalent for this though ...
>

It does, in a fairly hacky and disgusting way.  During boot, we use a
special version of xen_set_pte which ignores attempts to convert a
RO->RW mapping, in order to protect existing pagetable mappings.  But it
probably won't help if someone is trying to replace the ISA mappings
with something else.  In theory all those mappings should be created
with _PAGE_IOMAP anyway, so we'd do the right thing; but I don't think
that's happening.  In the meantime I could extend the hack to look for
attempts to overwrite _PAGE_IOMAP mappings or something...  Or force
_PAGE_IOMAP on for pfns in the ISA window.

Fortunately there seems to be an active attempt to unify 32 and 64-bit
mapping creation, which should help (so long as it converges on the 64-bit
code, which is more sensible).

    J
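
The boot-time behaviour described above, ignoring attempts to turn a
read-only mapping into a writable one, can be modelled as a small masking
function. This is a userspace sketch and an assumption about the shape of the
hack rather than the actual Xen code: mask_rw, PTE_PRESENT and PTE_RW are
illustrative names and a PTE is reduced to a 64-bit value.

```c
#include <stdint.h>

typedef uint64_t pte_t;      /* simplified: a PTE is just a number */

#define PTE_PRESENT 0x1ULL
#define PTE_RW      0x2ULL

/*
 * If the slot already holds a present entry, the new value may carry the
 * RW bit only when the old entry had it. Everything else in the new value,
 * including the address, passes through unchanged.
 */
pte_t mask_rw(pte_t old, pte_t new_pte)
{
    if (old & PTE_PRESENT)
        new_pte &= (old & PTE_RW) | ~PTE_RW;
    return new_pte;
}
```

Because only the RW bit is filtered, a write that points the slot at a
different page still goes through, which is the limitation noted above for
the ISA mappings.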

Gerd Hoffmann

2009-Mar-10 09:39 UTC

Re: [Xen-devel] Re: Continuing problems booting

Jeremy Fitzhardinge wrote:
>
> Yes, that's what I suspected.   Can you write up a proper patch?

I've two patches for you.  The first turns the silly "xen-pirq-pirq" in
/proc/interrupts into something useful.  The second does proper legacy
irq setup on top of that.

enjoy,
  Gerd




Jeremy Fitzhardinge

2009-Mar-10 17:13 UTC

Re: [Xen-devel] Re: Continuing problems booting

Gerd Hoffmann wrote:
> Jeremy Fitzhardinge wrote:
>
>> Yes, that's what I suspected.   Can you write up a proper patch?
>>
>
> I've two patches for you.  The first turns the silly "xen-pirq-pirq" in
> /proc/interrupts into something useful.  The second does proper legacy
> irq setup on top of that.
>

Could you s-o-b them too?
> +	if (0 == nr_ioapics) {
> +		for (irq=0; irq < NR_IRQS_LEGACY; irq++)
> +			xen_allocate_pirq(irq, "legacy");
> +		return;
> +	}
>

I guess the assumption here is that if there's no ioapics, we don't have
acpi?  Or I guess it doesn't matter because we can't program the
triggering anyway.
> +
>  	/* Pre-allocate legacy irqs */
>  	for (irq=0; irq < NR_IRQS_LEGACY; irq++) {
> -		int trigger, polarity;
> -
> -		if (acpi_get_override_irq(irq, &trigger, &polarity) == -1)
> -			continue;
> +		int trigger= 1, polarity = 0;
>  
> +		acpi_get_override_irq(irq, &trigger, &polarity);
>  		xen_register_gsi(irq,
>  			trigger ? ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE,
>  			polarity ? ACPI_ACTIVE_LOW : ACPI_ACTIVE_HIGH);
I don't think this is correct, for two reasons.  1: I think the default
ISA triggering is edge/active low, so this will result in screaming
interrupts if we ever use the defaults, but 2: acpi_get_override_irq()
returns the appropriate default for ISA anyway, and we shouldn't do
anything if it fails (otherwise we might try to do things to magic-irq 2
which could upset things, though I suspect Xen will stop anything really
bad from happening).

    J


Gerd Hoffmann

2009-Mar-10 22:00 UTC

Re: [Xen-devel] Re: Continuing problems booting

Jeremy Fitzhardinge wrote:
> Could you s-o-b them too?

Sure.
>> +    if (0 == nr_ioapics) {
>> +        for (irq=0; irq < NR_IRQS_LEGACY; irq++)
>> +            xen_allocate_pirq(irq, "legacy");
>> +        return;
>> +    }
>>   
> 
> I guess the assumption here is that if there's no ioapics, we don't have
> acpi?  Or I guess it doesn't matter because we can't program the
> triggering anyway.

We can't program the trigger anyway.  My machine has acpi.  Nevertheless
acpi_get_override_irq fails due to no ioapic being present.
>>      /* Pre-allocate legacy irqs */
>>      for (irq=0; irq < NR_IRQS_LEGACY; irq++) {
>> -        int trigger, polarity;
>> -
>> -        if (acpi_get_override_irq(irq, &trigger, &polarity) == -1)
>> -            continue;
>> +        int trigger= 1, polarity = 0;
>>  
>> +        acpi_get_override_irq(irq, &trigger, &polarity);
>>          xen_register_gsi(irq,
>>              trigger ? ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE,
>>              polarity ? ACPI_ACTIVE_LOW : ACPI_ACTIVE_HIGH);
> 
> 2: acpi_get_override_irq()
> returns the appropriate default for ISA anyway, and we shouldn't do
> anything if it fails

Ok.  So the old code should be fine and we just need the additional loop
to handle the ioapic-less case.  Will send updated patches tomorrow.

cheers,
  Gerd
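
The outcome agreed on above (keep the old per-irq loop, add a separate loop
for the ioapic-less case) can be sketched as a userspace mock. This is not
the reposted patch, which does not appear in the thread: the enum, the
override_fn callback standing in for acpi_get_override_irq(), and
toy_override are all hypothetical, and polarity is looked up but not
recorded.

```c
#define NR_IRQS_LEGACY 16

enum irq_setup { IRQ_SKIPPED = 0, IRQ_PLAIN_PIRQ, IRQ_EDGE, IRQ_LEVEL };

/* Hypothetical stand-in for acpi_get_override_irq(); -1 means failure. */
typedef int (*override_fn)(int irq, int *trigger, int *polarity);

void preallocate_legacy_irqs(int nr_ioapics, override_fn get_override,
                             enum irq_setup out[NR_IRQS_LEGACY])
{
    int irq;

    if (nr_ioapics == 0) {
        /* No IO-APIC: triggering cannot be programmed, just allocate. */
        for (irq = 0; irq < NR_IRQS_LEGACY; irq++)
            out[irq] = IRQ_PLAIN_PIRQ;
        return;
    }

    /* Old behaviour: skip any irq whose override lookup fails. */
    for (irq = 0; irq < NR_IRQS_LEGACY; irq++) {
        int trigger, polarity;

        if (get_override(irq, &trigger, &polarity) == -1) {
            out[irq] = IRQ_SKIPPED;
            continue;
        }
        out[irq] = trigger ? IRQ_LEVEL : IRQ_EDGE;
    }
}

/* Toy lookup: fails for irq 2, reports level for irq 9, edge otherwise. */
int toy_override(int irq, int *trigger, int *polarity)
{
    if (irq == 2)
        return -1;
    *trigger = (irq == 9);
    *polarity = 0;
    return 0;
}
```

With nr_ioapics set to 0 every legacy irq is simply allocated, while with an
IO-APIC present an irq whose lookup fails is left alone, as requested in the
review.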



Jeremy Fitzhardinge

2009-Mar-10 22:46 UTC

Re: [Xen-devel] Re: Continuing problems booting

Gerd Hoffmann wrote:
> Ok.  So the old code should be fine and we just need the additional loop
> to handle the ioapic-less case.  Will send updated patches tomorrow.
>

OK.  I already applied them as-is just to check nothing breaks.  I'll
replace them when you repost.

    J

Gerd Hoffmann

2009-Mar-11 10:33 UTC

Re: [Xen-devel] Re: Continuing problems booting

Jeremy Fitzhardinge wrote:
> Gerd Hoffmann wrote:
>> Ok.  So the old code should be fine and we just need the additional loop
>> to handle the ioapic-less case.  Will send updated patches tomorrow.
>>
>
> OK.  I already applied them as-is just to check nothing breaks.  I'll
> replace them when you repost.

Here we go.  Fixed ioapic loop as discussed, also updated names to be
more descriptive, looks like this now:

[root@xeni ~]# grep pirq /proc/interrupts
  1:          2          0          0          0  xen-pirq-ioapic-edge   i8042
  3:          3          0          0          0  xen-pirq-ioapic-edge
  4:          3          0          0          0  xen-pirq-ioapic-edge
  7:          0          0          0          0  xen-pirq-ioapic-edge   parport0
  8:          1          0          0          0  xen-pirq-ioapic-edge   rtc0
  9:          0          0          0          0  xen-pirq-ioapic-level  acpi
 12:          4          0          0          0  xen-pirq-ioapic-edge   i8042
 16:          0          0          0          0  xen-pirq-ioapic-level  uhci_hcd:usb3, uhci_hcd:usb8
 18:          0          0          0          0  xen-pirq-ioapic-level  uhci_hcd:usb5
 19:       5288          0          0          0  xen-pirq-ioapic-level  ehci_hcd:usb1, uhci_hcd:usb7, ahci
 20:        524          0          0          0  xen-pirq-ioapic-level  eth0
 21:          0          0          0          0  xen-pirq-ioapic-level  uhci_hcd:usb4
 22:        242          0          0          0  xen-pirq-ioapic-level  HDA Intel
 23:          0          0          0          0  xen-pirq-ioapic-level  ehci_hcd:usb2, uhci_hcd:usb6

[root@zen ~]# grep pirq /proc/interrupts
  1:          8  xen-pirq-xt-pic    i8042
  3:          5  xen-pirq-xt-pic
  4:          1  xen-pirq-xt-pic
  5:          0  xen-pirq-xt-pic    Intel 440MX, Intel 440MX Modem
  6:          1  xen-pirq-xt-pic
  7:          1  xen-pirq-xt-pic
  8:          1  xen-pirq-xt-pic    rtc0
 10:     200002  xen-pirq-xt-pic    yenta, firewire_ohci
 11:        196  xen-pirq-xt-pic    uhci_hcd:usb1, eth0
 12:        107  xen-pirq-xt-pic    i8042
 14:       2840  xen-pirq-xt-pic    ata_piix
 15:          0  xen-pirq-xt-pic    ata_piix

cheers,
  Gerd


