Recently I tried to move the shared_info page in the pvops kernel during shutdown, see "xen PVonHVM: move shared_info to MMIO before kexec" patch. But this change was reverted because it caused reboot failures because the actual moving of the shared page is fragile. So my question is what can be done about it? Right now the shared_info page is in the middle of the RAM and will be overwritten during kexec because its area is not reserved. Can the toolstack help to provide a reserved page which can then be used by the kernel? Olaf
On 27/08/2012 18:55, "Olaf Hering" <olaf@aepfle.de> wrote:> > Recently I tried to move the shared_info page in the pvops kernel during > shutdown, see "xen PVonHVM: move shared_info to MMIO before kexec" > patch. But this change was reverted because it caused reboot failures > because the actual moving of the shared page is fragile.How was it fragile? Moving it into MMIO should just work? -- Keir> So my question is what can be done about it? > Right now the shared_info page is in the middle of the RAM and will be > overwritten during kexec because its area is not reserved. > Can the toolstack help to provide a reserved page which can then be used > by the kernel? > > Olaf > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
On Mon, Aug 27, Keir Fraser wrote:> On 27/08/2012 18:55, "Olaf Hering" <olaf@aepfle.de> wrote: > > > > > Recently I tried to move the shared_info page in the pvops kernel during > > shutdown, see "xen PVonHVM: move shared_info to MMIO before kexec" > > patch. But this change was reverted because it caused reboot failures > > because the actual moving of the shared page is fragile. > > How was it fragile? Moving it into MMIO should just work?The moving itself is not the issue, but the possible access to the page during the move. Its not atomic, nor can it easily be atomic. Olaf
On 27/08/2012 22:32, "Olaf Hering" <olaf@aepfle.de> wrote:>>> Recently I tried to move the shared_info page in the pvops kernel during >>> shutdown, see "xen PVonHVM: move shared_info to MMIO before kexec" >>> patch. But this change was reverted because it caused reboot failures >>> because the actual moving of the shared page is fragile. >> >> How was it fragile? Moving it into MMIO should just work? > > The moving itself is not the issue, but the possible access to the page > during the move. Its not atomic, nor can it easily be atomic.Why not map it into MMIO in the first place, rather than into the middle of RAM? Do that early during boot, and early during resume from save/restore/migrate (i.e., places you presumably already map the shared_info page, but into the middle of RAM)? -- Keir
On Mon, Aug 27, Keir Fraser wrote:> On 27/08/2012 22:32, "Olaf Hering" <olaf@aepfle.de> wrote: > > >>> Recently I tried to move the shared_info page in the pvops kernel during > >>> shutdown, see "xen PVonHVM: move shared_info to MMIO before kexec" > >>> patch. But this change was reverted because it caused reboot failures > >>> because the actual moving of the shared page is fragile. > >> > >> How was it fragile? Moving it into MMIO should just work? > > > > The moving itself is not the issue, but the possible access to the page > > during the move. Its not atomic, nor can it easily be atomic. > > Why not map it into MMIO in the first place, rather than into the middle of > RAM? Do that early during boot, and early during resume from > save/restore/migrate (i.e., places you presumably already map the > shared_info page, but into the middle of RAM)?I think there is no easy way to know where the PCI device is located at the time the shared info page is configured in the early kernel setup. Olaf
On 27/08/2012 22:56, "Olaf Hering" <olaf@aepfle.de> wrote:> On Mon, Aug 27, Keir Fraser wrote: > >> On 27/08/2012 22:32, "Olaf Hering" <olaf@aepfle.de> wrote: >> >>>>> Recently I tried to move the shared_info page in the pvops kernel during >>>>> shutdown, see "xen PVonHVM: move shared_info to MMIO before kexec" >>>>> patch. But this change was reverted because it caused reboot failures >>>>> because the actual moving of the shared page is fragile. >>>> >>>> How was it fragile? Moving it into MMIO should just work? >>> >>> The moving itself is not the issue, but the possible access to the page >>> during the move. Its not atomic, nor can it easily be atomic. >> >> Why not map it into MMIO in the first place, rather than into the middle of >> RAM? Do that early during boot, and early during resume from >> save/restore/migrate (i.e., places you presumably already map the >> shared_info page, but into the middle of RAM)? > > I think there is no easy way to know where the PCI device is located at > the time the shared info page is configured in the early kernel setup.How about we guarantee that guests can use the 1MB in address range 0xFED00000-0xFEDFFFFF for whatever mappings they like, guaranteed unused (or at least, mapped with nothing useful) when guest kernel starts? This is already, and has always been, the case. But we can etch it in stone quite easily. :) Then you can stick shared_info at fed00000 always, plus there''s space there for plenty of other stuff that might one day be useful to map very early. -- Keir> Olaf
On Tue, Aug 28, Keir Fraser wrote:> How about we guarantee that guests can use the 1MB in address range > 0xFED00000-0xFEDFFFFF for whatever mappings they like, guaranteed unused (or > at least, mapped with nothing useful) when guest kernel starts? > > This is already, and has always been, the case. But we can etch it in stone > quite easily. :)0xFED00000UL is appearently the hpet base adress. But if there is room after that, then lets use that. However, I''m not familiar with these things. Should the area appear in the E820 map as reserverd? If so, where is the code which configures the guests E820 map? Olaf
On 28/08/2012 09:23, "Olaf Hering" <olaf@aepfle.de> wrote:> On Tue, Aug 28, Keir Fraser wrote: > >> How about we guarantee that guests can use the 1MB in address range >> 0xFED00000-0xFEDFFFFF for whatever mappings they like, guaranteed unused (or >> at least, mapped with nothing useful) when guest kernel starts? >> >> This is already, and has always been, the case. But we can etch it in stone >> quite easily. :) > > 0xFED00000UL is appearently the hpet base adress. But if there is room > after that, then lets use that. However, I''m not familiar with these > things. Should the area appear in the E820 map as reserverd? If so, > where is the code which configures the guests E820 map?Okay, that was a bit too clever, trying to hide between IOAPIC and LAPIC pages. How about a bit lower in memory -- FE700000-FE7FFFFF? Everything in range FC000000-FFFFFFFF should already be marked E820_RESERVED. You can test that, and also see tools/firmware/hvmloader/e820.c:build_e820_table() (and note that RESERVED_MEMBASE == FC000000). Can document in hvmloader/config.h and have mem_alloc() test against it rather than hvm_info->reserved_mem_pgstart. -- Keir> Olaf
On Tue, Aug 28, Keir Fraser wrote:> Okay, that was a bit too clever, trying to hide between IOAPIC and LAPIC > pages. How about a bit lower in memory -- FE700000-FE7FFFFF? > > Everything in range FC000000-FFFFFFFF should already be marked > E820_RESERVED. You can test that, and also see > tools/firmware/hvmloader/e820.c:build_e820_table() (and note that > RESERVED_MEMBASE == FC000000).Yes, FC000000-FFFFFFFF has already an E820_RESERVED entry. Within that range the kernel finds the IOAPIC, LAPIC and the HPET, perhaps because they are listed in the ACPI table or because they are found by other ways. To make the location of the of the shared pages configurable from the tools, does tools/firmware/hvmloader/acpi/dsdt.asl have a way to describe such special region? Maybe the kernel parses that table early enough, before the shared_info page gets used. Olaf
On 28/08/2012 13:42, "Olaf Hering" <olaf@aepfle.de> wrote:> On Tue, Aug 28, Keir Fraser wrote: > >> Okay, that was a bit too clever, trying to hide between IOAPIC and LAPIC >> pages. How about a bit lower in memory -- FE700000-FE7FFFFF? >> >> Everything in range FC000000-FFFFFFFF should already be marked >> E820_RESERVED. You can test that, and also see >> tools/firmware/hvmloader/e820.c:build_e820_table() (and note that >> RESERVED_MEMBASE == FC000000). > > Yes, FC000000-FFFFFFFF has already an E820_RESERVED entry. Within that > range the kernel finds the IOAPIC, LAPIC and the HPET, perhaps because > they are listed in the ACPI table or because they are found by other > ways.Yes they are all listed in various ACPI tables.> To make the location of the of the shared pages configurable from the > tools, does tools/firmware/hvmloader/acpi/dsdt.asl have a way to > describe such special region? Maybe the kernel parses that table early > enough, before the shared_info page gets used.If you have access to the ASL parser that early that would be awesome. You can then just add a new name or something in the DSDT, like "Name (\XENR, 0xFE700000)" and then evaluate that Name object during boot to get the numeric value (acpi_evaluate_object? Or acpi_evaluate_integer? Just guessing!). I have my doubts though... Some static ACPI tables are parsed very early, but you need some more sophistication brought online before you can parse out the DSDT and eval its contents. Well worth checking however, as sticking this in the DSDT would be about the best option if it can work. -- Keir> Olaf
Konrad Rzeszutek Wilk
2012-Aug-31 23:43 UTC
Re: fixed location of share info page in HVM guests
On Tue, Aug 28, 2012 at 02:35:19PM +0100, Keir Fraser wrote:> On 28/08/2012 13:42, "Olaf Hering" <olaf@aepfle.de> wrote: > > > On Tue, Aug 28, Keir Fraser wrote: > > > >> Okay, that was a bit too clever, trying to hide between IOAPIC and LAPIC > >> pages. How about a bit lower in memory -- FE700000-FE7FFFFF? > >> > >> Everything in range FC000000-FFFFFFFF should already be marked > >> E820_RESERVED. You can test that, and also see > >> tools/firmware/hvmloader/e820.c:build_e820_table() (and note that > >> RESERVED_MEMBASE == FC000000). > > > > Yes, FC000000-FFFFFFFF has already an E820_RESERVED entry. Within that > > range the kernel finds the IOAPIC, LAPIC and the HPET, perhaps because > > they are listed in the ACPI table or because they are found by other > > ways. > > Yes they are all listed in various ACPI tables. > > > To make the location of the of the shared pages configurable from the > > tools, does tools/firmware/hvmloader/acpi/dsdt.asl have a way to > > describe such special region? Maybe the kernel parses that table earlyIsn''t there also a magic string of said structure? If so we should also check for the magic string to make sure we are mapping the proper "thing." It can be done similary to how iBFT or EBDA is found.
On 22/10/2012 19:50, "Olaf Hering" <olaf@aepfle.de> wrote:>> Okay, that was a bit too clever, trying to hide between IOAPIC and LAPIC >> pages. How about a bit lower in memory -- FE700000-FE7FFFFF? >> >> Everything in range FC000000-FFFFFFFF should already be marked >> E820_RESERVED. You can test that, and also see >> tools/firmware/hvmloader/e820.c:build_e820_table() (and note that >> RESERVED_MEMBASE == FC000000). >> >> Can document in hvmloader/config.h and have mem_alloc() test against it >> rather than hvm_info->reserved_mem_pgstart. > > I came up with this (perhaps whitespace damaged) change, the guest still boots > ok. > The asl part is just copy&paste from the HPET part.Is there any point encoding this in the DSDT? I thought the guest OS needs to access the shared info area earlier than it is able to decode the DSDT? -- Keir
On Tue, Aug 28, Keir Fraser wrote:> On 28/08/2012 09:23, "Olaf Hering" <olaf@aepfle.de> wrote: > > > On Tue, Aug 28, Keir Fraser wrote: > > > >> How about we guarantee that guests can use the 1MB in address range > >> 0xFED00000-0xFEDFFFFF for whatever mappings they like, guaranteed unused (or > >> at least, mapped with nothing useful) when guest kernel starts? > >> > >> This is already, and has always been, the case. But we can etch it in stone > >> quite easily. :) > > > > 0xFED00000UL is appearently the hpet base adress. But if there is room > > after that, then lets use that. However, I''m not familiar with these > > things. Should the area appear in the E820 map as reserverd? If so, > > where is the code which configures the guests E820 map? > > Okay, that was a bit too clever, trying to hide between IOAPIC and LAPIC > pages. How about a bit lower in memory -- FE700000-FE7FFFFF? > > Everything in range FC000000-FFFFFFFF should already be marked > E820_RESERVED. You can test that, and also see > tools/firmware/hvmloader/e820.c:build_e820_table() (and note that > RESERVED_MEMBASE == FC000000). > > Can document in hvmloader/config.h and have mem_alloc() test against it > rather than hvm_info->reserved_mem_pgstart.I came up with this (perhaps whitespace damaged) change, the guest still boots ok. The asl part is just copy&paste from the HPET part. Olaf --- tools/firmware/hvmloader/acpi/dsdt.asl | 23 +++++++++++++++++++++++ tools/firmware/hvmloader/config.h | 2 ++ tools/firmware/hvmloader/util.c | 8 ++++++++ xen/arch/x86/hvm/hvm.c | 4 ++++ 4 files changed, 37 insertions(+) Index: xen-4.2.0-testing/tools/firmware/hvmloader/acpi/dsdt.asl ==================================================================--- xen-4.2.0-testing.orig/tools/firmware/hvmloader/acpi/dsdt.asl +++ xen-4.2.0-testing/tools/firmware/hvmloader/acpi/dsdt.asl @@ -50,6 +50,7 @@ DefinitionBlock ("DSDT.aml", "DSDT", 2, UAR1, 1, UAR2, 1, LTP1, 1, + XENR, 1, HPET, 1, Offset(4), PMIN, 32, @@ -166,6 +167,28 @@ DefinitionBlock ("DSDT.aml", "DSDT", 2, Return (PRT0) } + Device(XENR) { + Name(_UID, 0) + Method (_STA, 0, NotSerialized) { + If(LEqual(\_SB.XENR, 0)) { + Return(0x00) + } Else { + Return(0x0F) + } + } + Name(_CRS, ResourceTemplate() { + DWordMemory( + ResourceConsumer, PosDecode, MinFixed, MaxFixed, + NonCacheable, ReadWrite, + 0x00000000, + 0xFE700000, + 0xFE7FFFFF, + 0x00000000, + 0x00100000 /* 1M memory: FE700000 - FE7FFFFF */ + ) + }) + } + Device(HPET) { Name(_HID, EISAID("PNP0103")) Name(_UID, 0) Index: xen-4.2.0-testing/tools/firmware/hvmloader/config.h ==================================================================--- xen-4.2.0-testing.orig/tools/firmware/hvmloader/config.h +++ xen-4.2.0-testing/tools/firmware/hvmloader/config.h @@ -66,6 +66,8 @@ extern unsigned long pci_mem_start, pci_ /* NB. ACPI_INFO_PHYSICAL_ADDRESS *MUST* match definition in acpi/dsdt.asl! */ #define ACPI_INFO_PHYSICAL_ADDRESS 0xFC000000 #define RESERVED_MEMORY_DYNAMIC 0xFC001000 +#define HVM_SHARED_INFO_AREA 0xFE700000 +#define HVM_SHARED_INFO_SIZE 0x00100000 extern unsigned long scratch_start; Index: xen-4.2.0-testing/tools/firmware/hvmloader/util.c ==================================================================--- xen-4.2.0-testing.orig/tools/firmware/hvmloader/util.c +++ xen-4.2.0-testing/tools/firmware/hvmloader/util.c @@ -433,11 +433,19 @@ void *mem_alloc(uint32_t size, uint32_t if ( align < 16 ) align = 16; +retry: s = (reserve + align) & ~(align - 1); e = s + size - 1; BUG_ON((e < s) || (e >> PAGE_SHIFT) >= hvm_info->reserved_mem_pgstart); + /* Skip the shared info region */ + if (s <= HVM_SHARED_INFO_AREA && e >= HVM_SHARED_INFO_AREA) { + printf("%s: skipping HVM_SHARED_INFO_AREA: s %08x e %08x r %08x", __func__, s, e, reserve); + reserve = HVM_SHARED_INFO_AREA + HVM_SHARED_INFO_SIZE - 1; + goto retry; + } + while ( (reserve >> PAGE_SHIFT) != (e >> PAGE_SHIFT) ) { reserve += PAGE_SIZE;
>>> On 22.10.12 at 20:50, Olaf Hering <olaf@aepfle.de> wrote: > I came up with this (perhaps whitespace damaged) change, the guest still > boots ok. > The asl part is just copy&paste from the HPET part.... and, pending a response to Keir''s question on its need in the first place, need to be fixed:> --- xen-4.2.0-testing.orig/tools/firmware/hvmloader/acpi/dsdt.asl > +++ xen-4.2.0-testing/tools/firmware/hvmloader/acpi/dsdt.asl > @@ -50,6 +50,7 @@ DefinitionBlock ("DSDT.aml", "DSDT", 2, > UAR1, 1, > UAR2, 1, > LTP1, 1, > + XENR, 1, > HPET, 1,This gets things out of sync with build.c''s struct acpi_info. Which raises the question whether, even if the below is needed, this change is necessary.> Offset(4), > PMIN, 32, > @@ -166,6 +167,28 @@ DefinitionBlock ("DSDT.aml", "DSDT", 2, > Return (PRT0) > } > > + Device(XENR) { > + Name(_UID, 0) > + Method (_STA, 0, NotSerialized) { > + If(LEqual(\_SB.XENR, 0)) { > + Return(0x00) > + } Else { > + Return(0x0F) > + }Especially with the missing build.c change, this ought to be unconditionally returning 0x0F or get dropped (depending on whether _STA is required, which I don''t recall off the top of my head, but looking at other device declarations it doesn''t look like so). Jan> + } > + Name(_CRS, ResourceTemplate() { > + DWordMemory( > + ResourceConsumer, PosDecode, MinFixed, MaxFixed, > + NonCacheable, ReadWrite, > + 0x00000000, > + 0xFE700000, > + 0xFE7FFFFF, > + 0x00000000, > + 0x00100000 /* 1M memory: FE700000 - FE7FFFFF */ > + ) > + }) > + } > + > Device(HPET) { > Name(_HID, EISAID("PNP0103")) > Name(_UID, 0) > Index: xen-4.2.0-testing/tools/firmware/hvmloader/config.h > ==================================================================> --- xen-4.2.0-testing.orig/tools/firmware/hvmloader/config.h > +++ xen-4.2.0-testing/tools/firmware/hvmloader/config.h > @@ -66,6 +66,8 @@ extern unsigned long pci_mem_start, pci_ > /* NB. ACPI_INFO_PHYSICAL_ADDRESS *MUST* match definition in acpi/dsdt.asl! > */ > #define ACPI_INFO_PHYSICAL_ADDRESS 0xFC000000 > #define RESERVED_MEMORY_DYNAMIC 0xFC001000 > +#define HVM_SHARED_INFO_AREA 0xFE700000 > +#define HVM_SHARED_INFO_SIZE 0x00100000 > > extern unsigned long scratch_start; > > Index: xen-4.2.0-testing/tools/firmware/hvmloader/util.c > ==================================================================> --- xen-4.2.0-testing.orig/tools/firmware/hvmloader/util.c > +++ xen-4.2.0-testing/tools/firmware/hvmloader/util.c > @@ -433,11 +433,19 @@ void *mem_alloc(uint32_t size, uint32_t > if ( align < 16 ) > align = 16; > > +retry: > s = (reserve + align) & ~(align - 1); > e = s + size - 1; > > BUG_ON((e < s) || (e >> PAGE_SHIFT) >= hvm_info->reserved_mem_pgstart); > > + /* Skip the shared info region */ > + if (s <= HVM_SHARED_INFO_AREA && e >= HVM_SHARED_INFO_AREA) { > + printf("%s: skipping HVM_SHARED_INFO_AREA: s %08x e %08x r > %08x", __func__, s, e, reserve); > + reserve = HVM_SHARED_INFO_AREA + HVM_SHARED_INFO_SIZE - 1; > + goto retry; > + } > + > while ( (reserve >> PAGE_SHIFT) != (e >> PAGE_SHIFT) ) > { > reserve += PAGE_SIZE; > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
On Fri, Aug 31, Konrad Rzeszutek Wilk wrote:> On Tue, Aug 28, 2012 at 02:35:19PM +0100, Keir Fraser wrote: > > On 28/08/2012 13:42, "Olaf Hering" <olaf@aepfle.de> wrote: > > > > > On Tue, Aug 28, Keir Fraser wrote: > > > > > >> Okay, that was a bit too clever, trying to hide between IOAPIC and LAPIC > > >> pages. How about a bit lower in memory -- FE700000-FE7FFFFF? > > >> > > >> Everything in range FC000000-FFFFFFFF should already be marked > > >> E820_RESERVED. You can test that, and also see > > >> tools/firmware/hvmloader/e820.c:build_e820_table() (and note that > > >> RESERVED_MEMBASE == FC000000). > > > > > > Yes, FC000000-FFFFFFFF has already an E820_RESERVED entry. Within that > > > range the kernel finds the IOAPIC, LAPIC and the HPET, perhaps because > > > they are listed in the ACPI table or because they are found by other > > > ways. > > > > Yes they are all listed in various ACPI tables. > > > > > To make the location of the of the shared pages configurable from the > > > tools, does tools/firmware/hvmloader/acpi/dsdt.asl have a way to > > > describe such special region? Maybe the kernel parses that table early > > Isn''t there also a magic string of said structure? If so we should also > check for the magic string to make sure we are mapping the proper > "thing." It can be done similary to how iBFT or EBDA is found.I will have a look if ACPI can be used that early, for the time being a hardcoded address is used. This is the proposed kernel change, against 3.7-rc2. The open question is when to switch from early_ioremap to ioremap. Then the change to drivers/xen/platform-pci.c can be removed. Olaf xen PVonHVM: move shared_info to reserved memory area --- arch/x86/include/asm/xen/hypervisor.h | 2 ++ arch/x86/xen/enlighten.c | 53 ++++++++++++++++++++++++++--------- arch/x86/xen/suspend.c | 2 +- arch/x86/xen/xen-ops.h | 2 +- drivers/xen/platform-pci.c | 2 ++ include/xen/events.h | 2 ++ 6 files changed, 48 insertions(+), 15 deletions(-) diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h index 66d0fff..e29a02b 100644 --- a/arch/x86/include/asm/xen/hypervisor.h +++ b/arch/x86/include/asm/xen/hypervisor.h @@ -34,6 +34,8 @@ #define _ASM_X86_XEN_HYPERVISOR_H /* arch/i386/kernel/setup.c */ +#define HVM_SHARED_INFO_ADDR 0xFE700000UL + extern struct shared_info *HYPERVISOR_shared_info; extern struct start_info *xen_start_info; diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index e3497f2..b051c3f 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -103,7 +103,6 @@ struct shared_info xen_dummy_shared_info; void *xen_initial_gdt; -RESERVE_BRK(shared_info_page_brk, PAGE_SIZE); __read_mostly int xen_have_vector_callback; EXPORT_SYMBOL_GPL(xen_have_vector_callback); @@ -1497,38 +1496,66 @@ asmlinkage void __init xen_start_kernel(void) #endif } -void __ref xen_hvm_init_shared_info(void) +#ifdef CONFIG_XEN_PVHVM +static struct shared_info *xen_hvm_shared_info; + +static void xen_hvm_connect_shared_info(unsigned long pfn) { - int cpu; struct xen_add_to_physmap xatp; - static struct shared_info *shared_info_page = 0; - if (!shared_info_page) - shared_info_page = (struct shared_info *) - extend_brk(PAGE_SIZE, PAGE_SIZE); xatp.domid = DOMID_SELF; xatp.idx = 0; xatp.space = XENMAPSPACE_shared_info; - xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT; + xatp.gpfn = pfn; if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp)) BUG(); - HYPERVISOR_shared_info = (struct shared_info *)shared_info_page; +} +static void xen_hvm_set_shared_info(struct shared_info *sip) +{ + int cpu; + + HYPERVISOR_shared_info = sip; /* xen_vcpu is a pointer to the vcpu_info struct in the shared_info * page, we use it in the event channel upcall and in some pvclock * related functions. We don''t need the vcpu_info placement * optimizations because we don''t use any pv_mmu or pv_irq op on * HVM. - * When xen_hvm_init_shared_info is run at boot time only vcpu 0 is - * online but xen_hvm_init_shared_info is run at resume time too and + * When xen_hvm_set_shared_info is run at boot time only vcpu 0 is + * online but xen_hvm_set_shared_info is run at resume time too and * in that case multiple vcpus might be online. */ for_each_online_cpu(cpu) { per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu]; } } -#ifdef CONFIG_XEN_PVHVM +/* Reconnect the shared_info pfn to a (new) mfn */ +void xen_hvm_resume_shared_info(void) +{ + xen_hvm_connect_shared_info(HVM_SHARED_INFO_ADDR >> PAGE_SHIFT); +} + +/* Update shared_info address */ +void __devinit xen_hvm_init_shared_info(void) +{ + struct shared_info *early; + + early = xen_hvm_shared_info; + xen_hvm_shared_info = ioremap(HVM_SHARED_INFO_ADDR, PAGE_SIZE); + xen_hvm_set_shared_info(xen_hvm_shared_info); + wmb(); + early_iounmap(early, PAGE_SIZE); +} + +/* Obtain an address to the pfn in reserved area */ +static void __init xen_hvm_early_init_shared_info(void) +{ + xen_hvm_shared_info = early_ioremap(HVM_SHARED_INFO_ADDR, PAGE_SIZE); + xen_hvm_connect_shared_info(HVM_SHARED_INFO_ADDR >> PAGE_SHIFT); + xen_hvm_set_shared_info(xen_hvm_shared_info); +} + static void __init init_hvm_pv_info(void) { int major, minor; @@ -1578,7 +1605,7 @@ static void __init xen_hvm_guest_init(void) { init_hvm_pv_info(); - xen_hvm_init_shared_info(); + xen_hvm_early_init_shared_info(); if (xen_feature(XENFEAT_hvm_callback_vector)) xen_have_vector_callback = 1; diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c index 45329c8..ae8a00c 100644 --- a/arch/x86/xen/suspend.c +++ b/arch/x86/xen/suspend.c @@ -30,7 +30,7 @@ void xen_arch_hvm_post_suspend(int suspend_cancelled) { #ifdef CONFIG_XEN_PVHVM int cpu; - xen_hvm_init_shared_info(); + xen_hvm_resume_shared_info(); xen_callback_vector(); xen_unplug_emulated_devices(); if (xen_feature(XENFEAT_hvm_safe_pvclock)) { diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h index a95b417..d2e73d1 100644 --- a/arch/x86/xen/xen-ops.h +++ b/arch/x86/xen/xen-ops.h @@ -40,7 +40,7 @@ void xen_enable_syscall(void); void xen_vcpu_restore(void); void xen_callback_vector(void); -void xen_hvm_init_shared_info(void); +void xen_hvm_resume_shared_info(void); void xen_unplug_emulated_devices(void); void __init xen_build_dynamic_phys_to_machine(void); diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c index 97ca359..989365f 100644 --- a/drivers/xen/platform-pci.c +++ b/drivers/xen/platform-pci.c @@ -138,6 +138,8 @@ static int __devinit platform_pci_init(struct pci_dev *pdev, platform_mmio = mmio_addr; platform_mmiolen = mmio_len; + xen_hvm_init_shared_info(); + if (!xen_have_vector_callback) { ret = xen_allocate_irq(pdev); if (ret) { diff --git a/include/xen/events.h b/include/xen/events.h index c6bfe01..58c8431 100644 --- a/include/xen/events.h +++ b/include/xen/events.h @@ -58,6 +58,8 @@ void notify_remote_via_irq(int irq); void xen_irq_resume(void); +void xen_hvm_init_shared_info(void); + /* Clear an irq''s pending state, in preparation for polling on it */ void xen_clear_irq_pending(int irq); void xen_set_irq_pending(int irq); -- 1.7.12.4
On Tue, Oct 23, Jan Beulich wrote:> >>> On 22.10.12 at 20:50, Olaf Hering <olaf@aepfle.de> wrote: > > I came up with this (perhaps whitespace damaged) change, the guest still > > boots ok. > > The asl part is just copy&paste from the HPET part. > > ... and, pending a response to Keir''s question on its need in the > first place, need to be fixed:Until the kernel can really use such an ACPI entry I will remove this part of the patch. Olaf
>>> On 23.10.12 at 21:49, Olaf Hering <olaf@aepfle.de> wrote: > The open question is when to switch from early_ioremap to > ioremap. Then the change to drivers/xen/platform-pci.c can be removed.Wouldn''t you be better of using early_set_fixmap(FIX_PARAVIRT_BOOTMAP, ...) anyway? If not, mm_init() (due to its calling of vmalloc_init()) would be the apparent boundary between using early and normal ioremaps.> xen PVonHVM: move shared_info to reserved memory area > > --- > arch/x86/include/asm/xen/hypervisor.h | 2 ++ > arch/x86/xen/enlighten.c | 53 ++++++++++++++++++++++++++--------- > arch/x86/xen/suspend.c | 2 +- > arch/x86/xen/xen-ops.h | 2 +- > drivers/xen/platform-pci.c | 2 ++ > include/xen/events.h | 2 ++ > 6 files changed, 48 insertions(+), 15 deletions(-) > > diff --git a/arch/x86/include/asm/xen/hypervisor.h > b/arch/x86/include/asm/xen/hypervisor.h > index 66d0fff..e29a02b 100644 > --- a/arch/x86/include/asm/xen/hypervisor.h > +++ b/arch/x86/include/asm/xen/hypervisor.h > @@ -34,6 +34,8 @@ > #define _ASM_X86_XEN_HYPERVISOR_H > > /* arch/i386/kernel/setup.c */ > +#define HVM_SHARED_INFO_ADDR 0xFE700000UL > + > extern struct shared_info *HYPERVISOR_shared_info; > extern struct start_info *xen_start_info; > > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c > index e3497f2..b051c3f 100644 > --- a/arch/x86/xen/enlighten.c > +++ b/arch/x86/xen/enlighten.c > @@ -103,7 +103,6 @@ struct shared_info xen_dummy_shared_info; > > void *xen_initial_gdt; > > -RESERVE_BRK(shared_info_page_brk, PAGE_SIZE); > __read_mostly int xen_have_vector_callback; > EXPORT_SYMBOL_GPL(xen_have_vector_callback); > > @@ -1497,38 +1496,66 @@ asmlinkage void __init xen_start_kernel(void) > #endif > } > > -void __ref xen_hvm_init_shared_info(void) > +#ifdef CONFIG_XEN_PVHVM > +static struct shared_info *xen_hvm_shared_info; > + > +static void xen_hvm_connect_shared_info(unsigned long pfn) > { > - int cpu; > struct xen_add_to_physmap xatp; > - static struct shared_info *shared_info_page = 0; > > - if (!shared_info_page) > - shared_info_page = (struct shared_info *) > - extend_brk(PAGE_SIZE, PAGE_SIZE); > xatp.domid = DOMID_SELF; > xatp.idx = 0; > xatp.space = XENMAPSPACE_shared_info; > - xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT; > + xatp.gpfn = pfn; > if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp)) > BUG(); > > - HYPERVISOR_shared_info = (struct shared_info *)shared_info_page; > +} > +static void xen_hvm_set_shared_info(struct shared_info *sip) > +{ > + int cpu; > + > + HYPERVISOR_shared_info = sip; > > /* xen_vcpu is a pointer to the vcpu_info struct in the shared_info > * page, we use it in the event channel upcall and in some pvclock > * related functions. We don''t need the vcpu_info placement > * optimizations because we don''t use any pv_mmu or pv_irq op on > * HVM. > - * When xen_hvm_init_shared_info is run at boot time only vcpu 0 is > - * online but xen_hvm_init_shared_info is run at resume time too and > + * When xen_hvm_set_shared_info is run at boot time only vcpu 0 is > + * online but xen_hvm_set_shared_info is run at resume time too and > * in that case multiple vcpus might be online. */ > for_each_online_cpu(cpu) { > per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu]; > } > } > > -#ifdef CONFIG_XEN_PVHVM > +/* Reconnect the shared_info pfn to a (new) mfn */ > +void xen_hvm_resume_shared_info(void) > +{ > + xen_hvm_connect_shared_info(HVM_SHARED_INFO_ADDR >> PAGE_SHIFT); > +} > + > +/* Update shared_info address */ > +void __devinit xen_hvm_init_shared_info(void) > +{ > + struct shared_info *early; > + > + early = xen_hvm_shared_info; > + xen_hvm_shared_info = ioremap(HVM_SHARED_INFO_ADDR, PAGE_SIZE); > + xen_hvm_set_shared_info(xen_hvm_shared_info); > + wmb(); > + early_iounmap(early, PAGE_SIZE);This is a call from __devinit to __init, which isn''t allowed. And I''m pretty sure it''s too late to do this here anyway. Jan> +} > + > +/* Obtain an address to the pfn in reserved area */ > +static void __init xen_hvm_early_init_shared_info(void) > +{ > + xen_hvm_shared_info = early_ioremap(HVM_SHARED_INFO_ADDR, PAGE_SIZE); > + xen_hvm_connect_shared_info(HVM_SHARED_INFO_ADDR >> PAGE_SHIFT); > + xen_hvm_set_shared_info(xen_hvm_shared_info); > +} > + > static void __init init_hvm_pv_info(void) > { > int major, minor; > @@ -1578,7 +1605,7 @@ static void __init xen_hvm_guest_init(void) > { > init_hvm_pv_info(); > > - xen_hvm_init_shared_info(); > + xen_hvm_early_init_shared_info(); > > if (xen_feature(XENFEAT_hvm_callback_vector)) > xen_have_vector_callback = 1; > diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c > index 45329c8..ae8a00c 100644 > --- a/arch/x86/xen/suspend.c > +++ b/arch/x86/xen/suspend.c > @@ -30,7 +30,7 @@ void xen_arch_hvm_post_suspend(int suspend_cancelled) > { > #ifdef CONFIG_XEN_PVHVM > int cpu; > - xen_hvm_init_shared_info(); > + xen_hvm_resume_shared_info(); > xen_callback_vector(); > xen_unplug_emulated_devices(); > if (xen_feature(XENFEAT_hvm_safe_pvclock)) { > diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h > index a95b417..d2e73d1 100644 > --- a/arch/x86/xen/xen-ops.h > +++ b/arch/x86/xen/xen-ops.h > @@ -40,7 +40,7 @@ void xen_enable_syscall(void); > void xen_vcpu_restore(void); > > void xen_callback_vector(void); > -void xen_hvm_init_shared_info(void); > +void xen_hvm_resume_shared_info(void); > void xen_unplug_emulated_devices(void); > > void __init xen_build_dynamic_phys_to_machine(void); > diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c > index 97ca359..989365f 100644 > --- a/drivers/xen/platform-pci.c > +++ b/drivers/xen/platform-pci.c > @@ -138,6 +138,8 @@ static int __devinit platform_pci_init(struct pci_dev > *pdev, > platform_mmio = mmio_addr; > platform_mmiolen = mmio_len; > > + xen_hvm_init_shared_info(); > + > if (!xen_have_vector_callback) { > ret = xen_allocate_irq(pdev); > if (ret) { > diff --git a/include/xen/events.h b/include/xen/events.h > index c6bfe01..58c8431 100644 > --- a/include/xen/events.h > +++ b/include/xen/events.h > @@ -58,6 +58,8 @@ void notify_remote_via_irq(int irq); > > void xen_irq_resume(void); > > +void xen_hvm_init_shared_info(void); > + > /* Clear an irq''s pending state, in preparation for polling on it */ > void xen_clear_irq_pending(int irq); > void xen_set_irq_pending(int irq); > -- > 1.7.12.4 > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
On Wed, Oct 24, Jan Beulich wrote:> >>> On 23.10.12 at 21:49, Olaf Hering <olaf@aepfle.de> wrote: > > The open question is when to switch from early_ioremap to > > ioremap. Then the change to drivers/xen/platform-pci.c can be removed. > > Wouldn''t you be better of using > early_set_fixmap(FIX_PARAVIRT_BOOTMAP, ...) anyway? If not, > mm_init() (due to its calling of vmalloc_init()) would be the apparent > boundary between using early and normal ioremaps.set_fixmap seems to work. This patch works for me, with kexec and save/restore. Olaf --- arch/x86/include/asm/xen/hypervisor.h | 2 ++ arch/x86/xen/enlighten.c | 41 +++++++++++++++++++++++++---------- arch/x86/xen/suspend.c | 2 +- arch/x86/xen/xen-ops.h | 2 +- 4 files changed, 33 insertions(+), 14 deletions(-) diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h index 66d0fff..e29a02b 100644 --- a/arch/x86/include/asm/xen/hypervisor.h +++ b/arch/x86/include/asm/xen/hypervisor.h @@ -34,6 +34,8 @@ #define _ASM_X86_XEN_HYPERVISOR_H /* arch/i386/kernel/setup.c */ +#define HVM_SHARED_INFO_ADDR 0xFE700000UL + extern struct shared_info *HYPERVISOR_shared_info; extern struct start_info *xen_start_info; diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index e3497f2..2cf40c9 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -103,7 +103,6 @@ struct shared_info xen_dummy_shared_info; void *xen_initial_gdt; -RESERVE_BRK(shared_info_page_brk, PAGE_SIZE); __read_mostly int xen_have_vector_callback; EXPORT_SYMBOL_GPL(xen_have_vector_callback); @@ -1497,38 +1496,56 @@ asmlinkage void __init xen_start_kernel(void) #endif } -void __ref xen_hvm_init_shared_info(void) +#ifdef CONFIG_XEN_PVHVM +static struct shared_info *xen_hvm_shared_info; + +static void xen_hvm_connect_shared_info(unsigned long pfn) { - int cpu; struct xen_add_to_physmap xatp; - static struct shared_info *shared_info_page = 0; - if (!shared_info_page) - shared_info_page = (struct shared_info *) - extend_brk(PAGE_SIZE, PAGE_SIZE); xatp.domid = DOMID_SELF; xatp.idx = 0; xatp.space = XENMAPSPACE_shared_info; - xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT; + xatp.gpfn = pfn; if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp)) BUG(); - HYPERVISOR_shared_info = (struct shared_info *)shared_info_page; +} +static void xen_hvm_set_shared_info(struct shared_info *sip) +{ + int cpu; + + HYPERVISOR_shared_info = sip; /* xen_vcpu is a pointer to the vcpu_info struct in the shared_info * page, we use it in the event channel upcall and in some pvclock * related functions. We don''t need the vcpu_info placement * optimizations because we don''t use any pv_mmu or pv_irq op on * HVM. - * When xen_hvm_init_shared_info is run at boot time only vcpu 0 is - * online but xen_hvm_init_shared_info is run at resume time too and + * When xen_hvm_set_shared_info is run at boot time only vcpu 0 is + * online but xen_hvm_set_shared_info is run at resume time too and * in that case multiple vcpus might be online. */ for_each_online_cpu(cpu) { per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu]; } } -#ifdef CONFIG_XEN_PVHVM +/* Reconnect the shared_info pfn to a (new) mfn */ +void xen_hvm_resume_shared_info(void) +{ + xen_hvm_connect_shared_info(HVM_SHARED_INFO_ADDR >> PAGE_SHIFT); +} + +/* Obtain an address to the pfn in reserved area */ +static void __init xen_hvm_init_shared_info(void) +{ + set_fixmap(FIX_PARAVIRT_BOOTMAP, HVM_SHARED_INFO_ADDR); + xen_hvm_shared_info + (struct shared_info *)fix_to_virt(FIX_PARAVIRT_BOOTMAP); + xen_hvm_connect_shared_info(HVM_SHARED_INFO_ADDR >> PAGE_SHIFT); + xen_hvm_set_shared_info(xen_hvm_shared_info); +} + static void __init init_hvm_pv_info(void) { int major, minor; diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c index 45329c8..ae8a00c 100644 --- a/arch/x86/xen/suspend.c +++ b/arch/x86/xen/suspend.c @@ -30,7 +30,7 @@ void xen_arch_hvm_post_suspend(int suspend_cancelled) { #ifdef CONFIG_XEN_PVHVM int cpu; - xen_hvm_init_shared_info(); + xen_hvm_resume_shared_info(); xen_callback_vector(); xen_unplug_emulated_devices(); if (xen_feature(XENFEAT_hvm_safe_pvclock)) { diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h index a95b417..d2e73d1 100644 --- a/arch/x86/xen/xen-ops.h +++ b/arch/x86/xen/xen-ops.h @@ -40,7 +40,7 @@ void xen_enable_syscall(void); void xen_vcpu_restore(void); void xen_callback_vector(void); -void xen_hvm_init_shared_info(void); +void xen_hvm_resume_shared_info(void); void xen_unplug_emulated_devices(void); void __init xen_build_dynamic_phys_to_machine(void); -- 1.7.12.4