Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 00/50] ia64/xen take 3: ia64/xen domU paravirtualization
Hi. This patchset implements xen/ia64 domU support.

Qing He and Eddie Dong have also been working on pv_ops, so I want to discuss this before going further, to avoid duplicated work. I suppose Eddie will post his own patch as well; by reviewing both patchsets we can reach a better pv_ops interface.

- I didn't change the ia64 intrinsic paravirtualization ABI from the last post. Presumably it would be better to discuss it alongside Eddie's patch.
- I implemented the basic portion of the domU pv_ops. The interfaces may need refinement; probably Eddie has his own opinion.
- This time I dropped the patches which haven't been pv_ops'ified yet, because they are unchanged from the last post.

You can also get the full source from
http://people.valinux.co.jp/~yamahata/xen-ia64/linux-2.6-xen-ia64.git/
branch: xen-ia64-2008mar06

The patchset is organized as follows:
- xen arch portability: generalize the x86 xen patches for ia64 support.
- some preliminary patches: make the kernel paravirtualization friendly.
- introduce pv_ops and the definitions for native.
- basic helper functions for xen ia64 support.
- introduce the pv_ops instance for xen/ia64.

TODO:
- discuss and define the intrinsic paravirtualization ABI.
- discuss pv_ops.
- more pv_ops for domU
  - mca/sal call
  - timer
  - gate page
  - fsys
- support save/restore/live migration
- more clean ups
  - remove unnecessary if (is_running_on_xen()).
  - free the xen_ivt areas somehow so that no kernel space is wasted
    (from Keith Owens' idea; probably after defining the ABI, since this
    is just an optimization).
- dom0: consider after finishing the domU/ia64 merge.

Changes from take 2:
- many clean ups following review comments.
- clean up: assembly instruction macros.
- introduced pv_ops: pv_info, pv_init_ops, pv_iosapic_ops, pv_irq_ops.

Changes from take 1:
- single IVT source code, compiled multiple times using assembler macros.
thanks,

Diffstat:
 arch/ia64/Kconfig                                  |   72 +++
 arch/ia64/kernel/Makefile                          |   30 +-
 arch/ia64/kernel/acpi.c                            |    4 +
 arch/ia64/kernel/asm-offsets.c                     |   25 +
 arch/ia64/kernel/entry.S                           |  568 +------------------
 arch/ia64/kernel/head.S                            |    6 +
 arch/ia64/kernel/inst_native.h                     |  183 ++++++
 arch/ia64/kernel/inst_paravirt.h                   |   28 +
 arch/ia64/kernel/iosapic.c                         |   43 +-
 arch/ia64/kernel/irq_ia64.c                        |   21 +-
 arch/ia64/kernel/ivt.S                             |  153 +++---
 arch/ia64/kernel/minstate.h                        |   10 +-
 arch/ia64/kernel/module.c                          |   32 +
 arch/ia64/kernel/pal.S                             |    5 +-
 arch/ia64/kernel/paravirt.c                        |   94 +++
 arch/ia64/kernel/paravirt_alt.c                    |  118 ++++
 arch/ia64/kernel/paravirt_core.c                   |  201 +++++++
 arch/ia64/kernel/paravirt_entry.c                  |   99 ++++
 arch/ia64/kernel/paravirt_nop.c                    |   49 ++
 arch/ia64/kernel/paravirtentry.S                   |   37 ++
 arch/ia64/kernel/setup.c                           |   14 +
 arch/ia64/kernel/smpboot.c                         |    2 +
 arch/ia64/kernel/switch_leave.S                    |  603 +++++++++++++++++++
 arch/ia64/kernel/vmlinux.lds.S                     |   35 ++
 arch/ia64/xen/Makefile                             |    9 +
 arch/ia64/xen/hypercall.S                          |  141 +++++
 arch/ia64/xen/hypervisor.c                         |  235 ++++++++
 arch/ia64/xen/inst_xen.h                           |  503 ++++++++++++++++
 arch/ia64/xen/irq_xen.c                            |  435 ++++++++++++++
 arch/ia64/xen/irq_xen.h                            |    8 +
 arch/ia64/xen/machvec.c                            |    4 +
 arch/ia64/xen/paravirt_xen.c                       |  242 ++++++++
 arch/ia64/xen/privops_asm.S                        |  221 +++++++
 arch/ia64/xen/privops_c.c                          |  279 +++++++++
 arch/ia64/xen/util.c                               |  101 ++++
 arch/ia64/xen/xcom_asm.S                           |   27 +
 arch/ia64/xen/xcom_hcall.c                         |  458 +++++++++++++++
 arch/ia64/xen/xen_pv_ops.c                         |  319 ++++++++++
 arch/ia64/xen/xencomm.c                            |  108 ++++
 arch/ia64/xen/xenivt.S                             |   59 ++
 arch/ia64/{kernel/minstate.h => xen/xenminstate.h} |   96 +---
 arch/ia64/xen/xenpal.S                             |   76 +++
 arch/ia64/xen/xensetup.S                           |   60 ++
 arch/x86/xen/Makefile                              |    4 +-
 arch/x86/xen/grant-table.c                         |   91 +++
 arch/x86/xen/xen-ops.h                             |    2 +-
 drivers/xen/Makefile                               |    3 +-
 {arch/x86 => drivers}/xen/events.c                 |   33 +-
 {arch/x86 => drivers}/xen/features.c               |    0
 drivers/xen/grant-table.c                          |   37 +--
 drivers/xen/xenbus/xenbus_client.c                 |    6 +-
 drivers/xen/xencomm.c                              |  232 ++++++++
 include/asm-ia64/gcc_intrin.h                      |   58 +-
 include/asm-ia64/hw_irq.h                          |   24 +-
 include/asm-ia64/intel_intrin.h                    |   64 +-
 include/asm-ia64/intrinsics.h                      |   12 +
 include/asm-ia64/iosapic.h                         |   18 +-
 include/asm-ia64/irq.h                             |   33 ++
 include/asm-ia64/machvec.h                         |    2 +
 include/asm-ia64/machvec_xen.h                     |   22 +
 include/asm-ia64/meminit.h                         |    3 +-
 include/asm-ia64/mmu_context.h                     |    6 +-
 include/asm-ia64/module.h                          |    6 +
 include/asm-ia64/page.h                            |    8 +
 include/asm-ia64/paravirt.h                        |  284 +++++++++
 include/asm-ia64/paravirt_alt.h                    |   82 +++
 include/asm-ia64/paravirt_core.h                   |   54 ++
 include/asm-ia64/paravirt_entry.h                  |   62 ++
 include/asm-ia64/paravirt_nop.h                    |   46 ++
 include/asm-ia64/privop.h                          |   67 +++
 include/asm-ia64/privop_paravirt.h                 |  587 +++++++++++++++++++
 include/asm-ia64/sync_bitops.h                     |   59 ++
 include/asm-ia64/system.h                          |    4 +-
 include/asm-ia64/xen/hypercall.h                   |  426 ++++++++++++++
 include/asm-ia64/xen/hypervisor.h                  |  249 ++++++++
 include/asm-ia64/xen/interface.h                   |  585 +++++++++++++++++++
 include/asm-ia64/xen/page.h                        |   41 ++
 include/asm-ia64/xen/privop.h                      |  609 ++++++++++++++++++++
 include/asm-ia64/xen/xcom_hcall.h                  |   55 ++
 include/asm-ia64/xen/xencomm.h                     |   33 ++
 include/asm-x86/xen/hypervisor.h                   |   10 +
 include/asm-x86/xen/interface.h                    |   24 +
 include/{ => asm-x86}/xen/page.h                   |    0
 include/xen/events.h                               |    1 +
 include/xen/grant_table.h                          |    6 +
 include/xen/interface/callback.h                   |  119 ++++
 include/xen/interface/grant_table.h                |   11 +-
 include/xen/interface/vcpu.h                       |    5 +
 include/xen/interface/xen.h                        |   22 +-
 include/xen/interface/xencomm.h                    |   41 ++
 include/xen/page.h                                 |  181 +------
 include/xen/xen-ops.h                              |    6 +
 include/xen/xencomm.h                              |   77 +++
 93 files changed, 9174 insertions(+), 1049 deletions(-)
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 01/50] xen: add missing __HYPERVISOR_arch_[0-7] definitions which ia64 needs.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 include/xen/interface/xen.h |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h
index 518a5bf..87ad143 100644
--- a/include/xen/interface/xen.h
+++ b/include/xen/interface/xen.h
@@ -58,6 +58,16 @@
 #define __HYPERVISOR_physdev_op           33
 #define __HYPERVISOR_hvm_op               34
 
+/* Architecture-specific hypercall definitions. */
+#define __HYPERVISOR_arch_0               48
+#define __HYPERVISOR_arch_1               49
+#define __HYPERVISOR_arch_2               50
+#define __HYPERVISOR_arch_3               51
+#define __HYPERVISOR_arch_4               52
+#define __HYPERVISOR_arch_5               53
+#define __HYPERVISOR_arch_6               54
+#define __HYPERVISOR_arch_7               55
+
 /*
  * VIRTUAL INTERRUPTS
  *
-- 
1.5.3
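These slots only reserve numbering; each architecture assigns its own meaning. For illustration, xen/ia64 is expected to claim the first slot for its dom0vp operation along these lines (a sketch; the exact name comes from the ia64 hypercall headers later in the series):

    /* ia64-specific hypercall, built on one of the reserved arch slots */
    #define __HYPERVISOR_ia64_dom0vp_op   __HYPERVISOR_arch_0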
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 02/50] xen: add missing VIRQ_ARCH_[0-7] definitions which ia64/xen needs.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 include/xen/interface/xen.h |   12 +++++++++++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h
index 87ad143..9b018da 100644
--- a/include/xen/interface/xen.h
+++ b/include/xen/interface/xen.h
@@ -78,8 +78,18 @@
 #define VIRQ_CONSOLE    2  /* (DOM0) Bytes received on emergency console. */
 #define VIRQ_DOM_EXC    3  /* (DOM0) Exceptional event for some domain.   */
 #define VIRQ_DEBUGGER   6  /* (DOM0) A domain has paused for debugging.   */
-#define NR_VIRQS        8
+/* Architecture-specific VIRQ definitions. */
+#define VIRQ_ARCH_0    16
+#define VIRQ_ARCH_1    17
+#define VIRQ_ARCH_2    18
+#define VIRQ_ARCH_3    19
+#define VIRQ_ARCH_4    20
+#define VIRQ_ARCH_5    21
+#define VIRQ_ARCH_6    22
+#define VIRQ_ARCH_7    23
+
+#define NR_VIRQS       24
 
 /*
  * MMU-UPDATE REQUESTS
  *
-- 
1.5.3
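Again, the numbers are only reserved here; an architecture gives them meaning and then binds them like any other VIRQ. A sketch of the expected ia64 use (VIRQ_ITC and the handler name are illustrative, not part of this patch):

    #define VIRQ_ITC   VIRQ_ARCH_0   /* ia64: virtual itc timer */

    /* bound exactly like a generic VIRQ, once per CPU */
    irq = bind_virq_to_irqhandler(VIRQ_ITC, cpu, xen_timer_interrupt,
                                  IRQF_DISABLED, "timer", NULL);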
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 03/50] xen: add missing definitions for xen grant table which ia64/xen needs.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 drivers/xen/grant-table.c           |    2 +-
 include/asm-x86/xen/interface.h     |   24 ++++++++++++++++++++++++
 include/xen/interface/grant_table.h |   11 ++++++++---
 3 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index ea94dba..95016fd 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -466,7 +466,7 @@ static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
 
 	setup.dom        = DOMID_SELF;
 	setup.nr_frames  = nr_gframes;
-	setup.frame_list = frames;
+	set_xen_guest_handle(setup.frame_list, frames);
 
 	rc = HYPERVISOR_grant_table_op(GNTTABOP_setup_table, &setup, 1);
 	if (rc == -ENOSYS) {
diff --git a/include/asm-x86/xen/interface.h b/include/asm-x86/xen/interface.h
index 165c396..49993dd 100644
--- a/include/asm-x86/xen/interface.h
+++ b/include/asm-x86/xen/interface.h
@@ -22,6 +22,30 @@
 #define DEFINE_GUEST_HANDLE(name) __DEFINE_GUEST_HANDLE(name, name)
 #define GUEST_HANDLE(name)        __guest_handle_ ## name
 
+#ifdef __XEN__
+#if defined(__i386__)
+#define set_xen_guest_handle(hnd, val)			\
+	do {						\
+		if (sizeof(hnd) == 8)			\
+			*(uint64_t *)&(hnd) = 0;	\
+		(hnd).p = val;				\
+	} while (0)
+#elif defined(__x86_64__)
+#define set_xen_guest_handle(hnd, val)	do { (hnd).p = val; } while (0)
+#endif
+#else
+#if defined(__i386__)
+#define set_xen_guest_handle(hnd, val)			\
+	do {						\
+		if (sizeof(hnd) == 8)			\
+			*(uint64_t *)&(hnd) = 0;	\
+		(hnd) = val;				\
+	} while (0)
+#elif defined(__x86_64__)
+#define set_xen_guest_handle(hnd, val)	do { (hnd) = val; } while (0)
+#endif
+#endif
+
 #ifndef __ASSEMBLY__
 /* Guest handles for primitive C types. */
 __DEFINE_GUEST_HANDLE(uchar, unsigned char);
diff --git a/include/xen/interface/grant_table.h b/include/xen/interface/grant_table.h
index 2190498..39da93c 100644
--- a/include/xen/interface/grant_table.h
+++ b/include/xen/interface/grant_table.h
@@ -185,6 +185,7 @@ struct gnttab_map_grant_ref {
 	grant_handle_t handle;
 	uint64_t dev_bus_addr;
 };
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_map_grant_ref);
 
 /*
  * GNTTABOP_unmap_grant_ref: Destroy one or more grant-reference mappings
@@ -206,6 +207,7 @@ struct gnttab_unmap_grant_ref {
 	/* OUT parameters. */
 	int16_t  status;              /* GNTST_* */
 };
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_unmap_grant_ref);
 
 /*
  * GNTTABOP_setup_table: Set up a grant table for <dom> comprising at least
@@ -223,8 +225,9 @@ struct gnttab_setup_table {
 	uint32_t nr_frames;
 	/* OUT parameters. */
 	int16_t  status;              /* GNTST_* */
-	ulong *frame_list;
+	GUEST_HANDLE(ulong) frame_list;
 };
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_setup_table);
 
 /*
  * GNTTABOP_dump_table: Dump the contents of the grant table to the
@@ -237,6 +240,7 @@ struct gnttab_dump_table {
 	/* OUT parameters. */
 	int16_t status;               /* GNTST_* */
 };
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_dump_table);
 
 /*
  * GNTTABOP_transfer_grant_ref: Transfer <frame> to a foreign domain.  The
@@ -255,7 +259,7 @@ struct gnttab_transfer {
 	/* OUT parameters. */
 	int16_t status;
 };
-
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_transfer);
 
 /*
  * GNTTABOP_copy: Hypervisor based copy
@@ -296,6 +300,7 @@ struct gnttab_copy {
 	/* OUT parameters. */
 	int16_t status;
 };
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_copy);
 
 /*
  * GNTTABOP_query_size: Query the current and maximum sizes of the shared
@@ -313,7 +318,7 @@ struct gnttab_query_size {
 	uint32_t max_nr_frames;
 	int16_t  status;              /* GNTST_* */
 };
-
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_query_size);
 
 /*
  * Bitfield values for update_pin_status.flags.
-- 
1.5.3
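The point of set_xen_guest_handle() is that callers stop assuming a handle is a bare pointer, which is exactly what a non-x86 port needs. An ia64 flavor can then be as simple as (a sketch, assuming ia64 wraps the pointer in a struct the same way x86 does):

    /* ia64: the handle is a struct-wrapped pointer; it is translated
     * to a pseudo-physical address via xencomm at hypercall time */
    #define set_xen_guest_handle(hnd, val)  do { (hnd).p = val; } while (0)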
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 04/50] xen: add missing definitions in include/xen/interface/vcpu.h which ia64/xen needs
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 include/xen/interface/vcpu.h |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/include/xen/interface/vcpu.h b/include/xen/interface/vcpu.h
index b05d8a6..87e6f8a 100644
--- a/include/xen/interface/vcpu.h
+++ b/include/xen/interface/vcpu.h
@@ -85,6 +85,7 @@ struct vcpu_runstate_info {
 	 */
 	uint64_t time[4];
 };
+DEFINE_GUEST_HANDLE_STRUCT(vcpu_runstate_info);
 
 /* VCPU is currently running on a physical CPU. */
 #define RUNSTATE_running  0
@@ -119,6 +120,7 @@
 #define VCPUOP_register_runstate_memory_area 5
 struct vcpu_register_runstate_memory_area {
 	union {
+		GUEST_HANDLE(vcpu_runstate_info) h;
 		struct vcpu_runstate_info *v;
 		uint64_t p;
 	} addr;
@@ -134,6 +136,7 @@ struct vcpu_register_runstate_memory_area {
 struct vcpu_set_periodic_timer {
 	uint64_t period_ns;
 };
+DEFINE_GUEST_HANDLE_STRUCT(vcpu_set_periodic_timer);
 
 /*
  * Set or stop a VCPU's single-shot timer. Every VCPU has one single-shot
@@ -145,6 +148,7 @@ struct vcpu_set_periodic_timer {
 struct vcpu_set_singleshot_timer {
 	uint64_t timeout_abs_ns;
 	uint32_t flags;            /* VCPU_SSHOTTMR_??? */
 };
+DEFINE_GUEST_HANDLE_STRUCT(vcpu_set_singleshot_timer);
 
 /* Flags to VCPUOP_set_singleshot_timer. */
 /* Require the timeout to be in the future (return -ETIME if it's passed). */
@@ -164,5 +168,6 @@ struct vcpu_register_vcpu_info {
 	uint32_t offset;    /* offset within page */
 	uint32_t rsvd;      /* unused */
 };
+DEFINE_GUEST_HANDLE_STRUCT(vcpu_register_vcpu_info);
 
 #endif /* __XEN_PUBLIC_VCPU_H__ */
-- 
1.5.3
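With the handles in place, generic code can register, e.g., the runstate area without caring how the architecture represents the pointer. A hedged sketch of a caller (the per-cpu variable name is made up for illustration):

    struct vcpu_register_runstate_memory_area area;

    set_xen_guest_handle(area.addr.h, &per_cpu(xen_runstate, cpu));
    if (HYPERVISOR_vcpu_op(VCPUOP_register_runstate_memory_area,
                           cpu, &area))
            BUG();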
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 05/50] xen: move arch/x86/xen/features.c to drivers/xen.
ia64/xen uses it too, so move it to a common place.

Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 arch/x86/xen/Makefile                |    2 +-
 drivers/xen/Makefile                 |    2 +-
 {arch/x86 => drivers}/xen/features.c |    0
 3 files changed, 2 insertions(+), 2 deletions(-)
 rename {arch/x86 => drivers}/xen/features.c (100%)

diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index 343df24..c5e9aa4 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -1,4 +1,4 @@
-obj-y		:= enlighten.o setup.o features.o multicalls.o mmu.o \
+obj-y		:= enlighten.o setup.o multicalls.o mmu.o \
 			events.o time.o manage.o xen-asm.o
 
 obj-$(CONFIG_SMP)	+= smp.o
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 56592f0..609fdda 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -1,2 +1,2 @@
-obj-y	+= grant-table.o
+obj-y	+= grant-table.o features.o
 obj-y	+= xenbus/
diff --git a/arch/x86/xen/features.c b/drivers/xen/features.c
similarity index 100%
rename from arch/x86/xen/features.c
rename to drivers/xen/features.c
-- 
1.5.3
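features.c itself is a small arch-neutral cache of the XENVER_get_features results, which is why the move is safe. Callers on every architecture use it the same way, for example:

    /* ia64 runs auto-translated, so tests like this short-circuit there */
    if (xen_feature(XENFEAT_auto_translated_physmap))
            return pfn;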
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 06/50] xen: move arch/x86/xen/events.c under drivers/xen and split out the arch-specific part.
ia64/xen also uses events.c, so clean it up so that ia64/xen can use it:
- make ipi_to_irq globally visible; ia64/xen needs to reference it from
  another file.
- introduce resend_irq_on_evtchn(), which ia64 needs.
- introduce xen_do_IRQ() to split out the arch-specific code.

Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 arch/x86/xen/Makefile              |    2 +-
 arch/x86/xen/xen-ops.h             |    2 +-
 drivers/xen/Makefile               |    2 +-
 {arch/x86 => drivers}/xen/events.c |   33 ++++++++++++++++++++++++---------
 include/asm-x86/xen/hypervisor.h   |    7 +++++++
 include/xen/events.h               |    1 +
 include/xen/xen-ops.h              |    6 ++++++
 7 files changed, 41 insertions(+), 12 deletions(-)
 rename {arch/x86 => drivers}/xen/events.c (95%)
 create mode 100644 include/xen/xen-ops.h

diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index c5e9aa4..95c5926 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -1,4 +1,4 @@
 obj-y		:= enlighten.o setup.o multicalls.o mmu.o \
-			events.o time.o manage.o xen-asm.o
+			time.o manage.o xen-asm.o
 
 obj-$(CONFIG_SMP)	+= smp.o
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index b02a909..caaabf3 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -2,6 +2,7 @@
 #define XEN_OPS_H
 
 #include <linux/init.h>
+#include <xen/xen-ops.h>
 
 /* These are code, but not functions.  Defined in entry.S */
 extern const char xen_hypervisor_callback[];
@@ -9,7 +10,6 @@ extern const char xen_failsafe_callback[];
 
 void xen_copy_trap_info(struct trap_info *traps);
 
-DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
 DECLARE_PER_CPU(unsigned long, xen_cr3);
 DECLARE_PER_CPU(unsigned long, xen_current_cr3);
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 609fdda..823ce78 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -1,2 +1,2 @@
-obj-y	+= grant-table.o features.o
+obj-y	+= grant-table.o features.o events.o
 obj-y	+= xenbus/
diff --git a/arch/x86/xen/events.c b/drivers/xen/events.c
similarity index 95%
rename from arch/x86/xen/events.c
rename to drivers/xen/events.c
index dcf613e..dce2dfc 100644
--- a/arch/x86/xen/events.c
+++ b/drivers/xen/events.c
@@ -33,12 +33,11 @@
 #include <asm/xen/hypercall.h>
 #include <asm/xen/hypervisor.h>
 
+#include <xen/xen-ops.h>
 #include <xen/events.h>
 #include <xen/interface/xen.h>
 #include <xen/interface/event_channel.h>
 
-#include "xen-ops.h"
-
 /*
  * This lock protects updates to the following mapping and reference-count
  * arrays. The lock does not need to be acquired to read the mapping tables.
@@ -49,7 +48,7 @@ static DEFINE_SPINLOCK(irq_mapping_update_lock);
 static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] = -1};
 
 /* IRQ <-> IPI mapping */
-static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1};
+DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1};
 
 /* Packed IRQ information: binding type, sub-type index, and event channel. */
 struct packed_irq
@@ -455,7 +454,6 @@ void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector)
 	notify_remote_via_irq(irq);
 }
 
-
 /*
  * Search the CPUs pending events bitmasks.  For each one found, map
  * the event number to an irq, and feed it into do_IRQ() for
@@ -474,7 +472,10 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
 
 		vcpu_info->evtchn_upcall_pending = 0;
 
-		/* NB. No need for a barrier here -- XCHG is a barrier on x86. */
+#ifndef CONFIG_X86 /* No need for a barrier -- XCHG is a barrier on x86. */
+		/* Clear master flag /before/ clearing selector flag. */
+		rmb();
+#endif
 		pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
 		while (pending_words != 0) {
 			unsigned long pending_bits;
@@ -486,10 +487,8 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
 				int port = (word_idx * BITS_PER_LONG) + bit_idx;
 				int irq = evtchn_to_irq[port];
 
-				if (irq != -1) {
-					regs->orig_ax = ~irq;
-					do_IRQ(regs);
-				}
+				if (irq != -1)
+					xen_do_IRQ(irq, regs);
 			}
 		}
@@ -525,6 +524,22 @@ static void set_affinity_irq(unsigned irq, cpumask_t dest)
 	rebind_irq_to_cpu(irq, tcpu);
 }
 
+int resend_irq_on_evtchn(unsigned int irq)
+{
+	int masked, evtchn = evtchn_from_irq(irq);
+	struct shared_info *s = HYPERVISOR_shared_info;
+
+	if (!VALID_EVTCHN(evtchn))
+		return 1;
+
+	masked = sync_test_and_set_bit(evtchn, s->evtchn_mask);
+	sync_set_bit(evtchn, s->evtchn_pending);
+	if (!masked)
+		unmask_evtchn(evtchn);
+
+	return 1;
+}
+
 static void enable_dynirq(unsigned int irq)
 {
 	int evtchn = evtchn_from_irq(irq);
diff --git a/include/asm-x86/xen/hypervisor.h b/include/asm-x86/xen/hypervisor.h
index 8e15dd2..138ee8a 100644
--- a/include/asm-x86/xen/hypervisor.h
+++ b/include/asm-x86/xen/hypervisor.h
@@ -61,6 +61,13 @@ extern struct start_info *xen_start_info;
 /* Force a proper event-channel callback from Xen. */
 extern void force_evtchn_callback(void);
 
+/* macro to avoid header inclusion dependency hell */
+#define xen_do_IRQ(irq, regs)			\
+	do {					\
+		(regs)->orig_ax = ~(irq);	\
+		do_IRQ(regs);			\
+	} while (0)
+
 /* Turn jiffies into Xen system time. */
 u64 jiffies_to_st(unsigned long jiffies);
diff --git a/include/xen/events.h b/include/xen/events.h
index 2bde54d..574cfa4 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -37,6 +37,7 @@ int bind_ipi_to_irqhandler(enum ipi_vector ipi,
 void unbind_from_irqhandler(unsigned int irq, void *dev_id);
 
 void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector);
+int resend_irq_on_evtchn(unsigned int irq);
 
 static inline void notify_remote_via_evtchn(int port)
 {
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
new file mode 100644
index 0000000..003a7c9
--- /dev/null
+++ b/include/xen/xen-ops.h
@@ -0,0 +1,6 @@
+#ifndef INCLUDE_XEN_OPS_H
+#define INCLUDE_XEN_OPS_H
+
+DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
+
+#endif /* INCLUDE_XEN_OPS_H */
-- 
1.5.3
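resend_irq_on_evtchn() exists so an architecture can implement its irq_chip retrigger hook without knowing event-channel internals; a sketch of the ia64 side (the function name is assumed):

    static int xen_retrigger_irq(unsigned int irq)
    {
            /* mark the evtchn pending again and unmask it; the normal
             * upcall path then redelivers the interrupt */
            return resend_irq_on_evtchn(irq);
    }

Likewise, xen_do_IRQ() hides the x86-ism of passing the vector through regs->orig_ax; the ia64 instance can route the irq into its own dispatcher instead.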
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 07/50] xen: make include/xen/page.h portable by moving its definitions under the asm dir.
Those definitions in include/asm/xen/page.h are arch specific. ia64/xen wants to define its own version. So move them to arch specific directory and keep include/xen/page.h in order not to break compilation. Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- include/{ => asm-x86}/xen/page.h | 0 include/xen/page.h | 181 +------------------------------------- 2 files changed, 1 insertions(+), 180 deletions(-) copy include/{ => asm-x86}/xen/page.h (100%) diff --git a/include/xen/page.h b/include/asm-x86/xen/page.h similarity index 100% copy from include/xen/page.h copy to include/asm-x86/xen/page.h diff --git a/include/xen/page.h b/include/xen/page.h index 031ef22..eaf85fa 100644 --- a/include/xen/page.h +++ b/include/xen/page.h @@ -1,180 +1 @@ -#ifndef __XEN_PAGE_H -#define __XEN_PAGE_H - -#include <linux/pfn.h> - -#include <asm/uaccess.h> -#include <asm/pgtable.h> - -#include <xen/features.h> - -#ifdef CONFIG_X86_PAE -/* Xen machine address */ -typedef struct xmaddr { - unsigned long long maddr; -} xmaddr_t; - -/* Xen pseudo-physical address */ -typedef struct xpaddr { - unsigned long long paddr; -} xpaddr_t; -#else -/* Xen machine address */ -typedef struct xmaddr { - unsigned long maddr; -} xmaddr_t; - -/* Xen pseudo-physical address */ -typedef struct xpaddr { - unsigned long paddr; -} xpaddr_t; -#endif - -#define XMADDR(x) ((xmaddr_t) { .maddr = (x) }) -#define XPADDR(x) ((xpaddr_t) { .paddr = (x) }) - -/**** MACHINE <-> PHYSICAL CONVERSION MACROS ****/ -#define INVALID_P2M_ENTRY (~0UL) -#define FOREIGN_FRAME_BIT (1UL<<31) -#define FOREIGN_FRAME(m) ((m) | FOREIGN_FRAME_BIT) - -extern unsigned long *phys_to_machine_mapping; - -static inline unsigned long pfn_to_mfn(unsigned long pfn) -{ - if (xen_feature(XENFEAT_auto_translated_physmap)) - return pfn; - - return phys_to_machine_mapping[(unsigned int)(pfn)] & - ~FOREIGN_FRAME_BIT; -} - -static inline int phys_to_machine_mapping_valid(unsigned long pfn) -{ - if (xen_feature(XENFEAT_auto_translated_physmap)) - return 1; - - return (phys_to_machine_mapping[pfn] != INVALID_P2M_ENTRY); -} - -static inline unsigned long mfn_to_pfn(unsigned long mfn) -{ - unsigned long pfn; - - if (xen_feature(XENFEAT_auto_translated_physmap)) - return mfn; - -#if 0 - if (unlikely((mfn >> machine_to_phys_order) != 0)) - return max_mapnr; -#endif - - pfn = 0; - /* - * The array access can fail (e.g., device space beyond end of RAM). - * In such cases it doesn't matter what we return (we return garbage), - * but we must handle the fault without crashing! - */ - __get_user(pfn, &machine_to_phys_mapping[mfn]); - - return pfn; -} - -static inline xmaddr_t phys_to_machine(xpaddr_t phys) -{ - unsigned offset = phys.paddr & ~PAGE_MASK; - return XMADDR(PFN_PHYS((u64)pfn_to_mfn(PFN_DOWN(phys.paddr))) | offset); -} - -static inline xpaddr_t machine_to_phys(xmaddr_t machine) -{ - unsigned offset = machine.maddr & ~PAGE_MASK; - return XPADDR(PFN_PHYS((u64)mfn_to_pfn(PFN_DOWN(machine.maddr))) | offset); -} - -/* - * We detect special mappings in one of two ways: - * 1. If the MFN is an I/O page then Xen will set the m2p entry - * to be outside our maximum possible pseudophys range. - * 2. If the MFN belongs to a different domain then we will certainly - * not have MFN in our p2m table. Conversely, if the page is ours, - * then we'll have p2m(m2p(MFN))==MFN. - * If we detect a special mapping then it doesn't have a 'struct page'. - * We force !pfn_valid() by returning an out-of-range pointer. - * - * NB. 
These checks require that, for any MFN that is not in our reservation, - * there is no PFN such that p2m(PFN) == MFN. Otherwise we can get confused if - * we are foreign-mapping the MFN, and the other domain as m2p(MFN) == PFN. - * Yikes! Various places must poke in INVALID_P2M_ENTRY for safety. - * - * NB2. When deliberately mapping foreign pages into the p2m table, you *must* - * use FOREIGN_FRAME(). This will cause pte_pfn() to choke on it, as we - * require. In all the cases we care about, the FOREIGN_FRAME bit is - * masked (e.g., pfn_to_mfn()) so behaviour there is correct. - */ -static inline unsigned long mfn_to_local_pfn(unsigned long mfn) -{ - extern unsigned long max_mapnr; - unsigned long pfn = mfn_to_pfn(mfn); - if ((pfn < max_mapnr) - && !xen_feature(XENFEAT_auto_translated_physmap) - && (phys_to_machine_mapping[pfn] != mfn)) - return max_mapnr; /* force !pfn_valid() */ - return pfn; -} - -static inline void set_phys_to_machine(unsigned long pfn, unsigned long mfn) -{ - if (xen_feature(XENFEAT_auto_translated_physmap)) { - BUG_ON(pfn != mfn && mfn != INVALID_P2M_ENTRY); - return; - } - phys_to_machine_mapping[pfn] = mfn; -} - -/* VIRT <-> MACHINE conversion */ -#define virt_to_machine(v) (phys_to_machine(XPADDR(__pa(v)))) -#define virt_to_mfn(v) (pfn_to_mfn(PFN_DOWN(__pa(v)))) -#define mfn_to_virt(m) (__va(mfn_to_pfn(m) << PAGE_SHIFT)) - -#ifdef CONFIG_X86_PAE -#define pte_mfn(_pte) (((_pte).pte_low >> PAGE_SHIFT) | \ - (((_pte).pte_high & 0xfff) << (32-PAGE_SHIFT))) - -static inline pte_t mfn_pte(unsigned long page_nr, pgprot_t pgprot) -{ - pte_t pte; - - pte.pte_high = (page_nr >> (32 - PAGE_SHIFT)) | - (pgprot_val(pgprot) >> 32); - pte.pte_high &= (__supported_pte_mask >> 32); - pte.pte_low = ((page_nr << PAGE_SHIFT) | pgprot_val(pgprot)); - pte.pte_low &= __supported_pte_mask; - - return pte; -} - -static inline unsigned long long pte_val_ma(pte_t x) -{ - return x.pte; -} -#define pmd_val_ma(v) ((v).pmd) -#define pud_val_ma(v) ((v).pgd.pgd) -#define __pte_ma(x) ((pte_t) { .pte = (x) }) -#define __pmd_ma(x) ((pmd_t) { (x) } ) -#else /* !X86_PAE */ -#define pte_mfn(_pte) ((_pte).pte_low >> PAGE_SHIFT) -#define mfn_pte(pfn, prot) __pte_ma(((pfn) << PAGE_SHIFT) | pgprot_val(prot)) -#define pte_val_ma(x) ((x).pte) -#define pmd_val_ma(v) ((v).pud.pgd.pgd) -#define __pte_ma(x) ((pte_t) { (x) } ) -#endif /* CONFIG_X86_PAE */ - -#define pgd_val_ma(x) ((x).pgd) - - -xmaddr_t arbitrary_virt_to_machine(unsigned long address); -void make_lowmem_page_readonly(void *vaddr); -void make_lowmem_page_readwrite(void *vaddr); - -#endif /* __XEN_PAGE_H */ +#include <asm/xen/page.h> -- 1.5.3
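For xen/ia64, which runs with the auto-translated physmap, most of this file collapses; a minimal asm-ia64/xen/page.h could look roughly like this (an assumption about the ia64 version, not part of this patch):

    /* auto-translated physmap: guest pfns are already the frames Xen
     * expects, so the conversions are identities */
    static inline unsigned long pfn_to_mfn(unsigned long pfn)
    {
            return pfn;
    }

    static inline unsigned long mfn_to_pfn(unsigned long mfn)
    {
            return mfn;
    }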
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 08/50] xen: replace callers of alloc_vm_area()/free_vm_area() with xen_ prefixed ones.
Don't use alloc_vm_area()/free_vm_area() directly; instead define xen_alloc_vm_area()/xen_free_vm_area() and use them.

alloc_vm_area()/free_vm_area() are used to allocate/free the area used for the grant table mapping. The Xen/x86 grant table is based on virtual addresses, so alloc_vm_area()/free_vm_area() are suitable. On the other hand, the Xen/ia64 (and Xen/powerpc) grant table is based on pseudo-physical addresses (guest physical addresses), so the allocation must be done differently. The original xenified Linux/IA64 had its own allocate_vm_area()/free_vm_area() definitions which, contrary to their names, don't allocate a vm area. Vanilla Linux now has its own definitions of those functions, so IA64-specific ones are no longer possible. Instead, introduce xen_alloc_vm_area()/xen_free_vm_area() and use them.

Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 drivers/xen/grant-table.c          |    2 +-
 drivers/xen/xenbus/xenbus_client.c |    6 +++---
 include/asm-x86/xen/hypervisor.h   |    3 +++
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 95016fd..9fcde20 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -478,7 +478,7 @@ static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
 
 	if (shared == NULL) {
 		struct vm_struct *area;
-		area = alloc_vm_area(PAGE_SIZE * max_nr_grant_frames());
+		area = xen_alloc_vm_area(PAGE_SIZE * max_nr_grant_frames());
 		BUG_ON(area == NULL);
 		shared = area->addr;
 	}
diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
index 9fd2f70..0f86b0f 100644
--- a/drivers/xen/xenbus/xenbus_client.c
+++ b/drivers/xen/xenbus/xenbus_client.c
@@ -399,7 +399,7 @@ int xenbus_map_ring_valloc(struct xenbus_device *dev, int gnt_ref, void **vaddr)
 
 	*vaddr = NULL;
 
-	area = alloc_vm_area(PAGE_SIZE);
+	area = xen_alloc_vm_area(PAGE_SIZE);
 	if (!area)
 		return -ENOMEM;
 
@@ -409,7 +409,7 @@ int xenbus_map_ring_valloc(struct xenbus_device *dev, int gnt_ref, void **vaddr)
 		BUG();
 
 	if (op.status != GNTST_okay) {
-		free_vm_area(area);
+		xen_free_vm_area(area);
 		xenbus_dev_fatal(dev, op.status,
 				 "mapping in shared page %d from domain %d",
 				 gnt_ref, dev->otherend_id);
@@ -508,7 +508,7 @@ int xenbus_unmap_ring_vfree(struct xenbus_device *dev, void *vaddr)
 		BUG();
 
 	if (op.status == GNTST_okay)
-		free_vm_area(area);
+		xen_free_vm_area(area);
 	else
 		xenbus_dev_error(dev, op.status,
 				 "unmapping page at handle %d error %d",
diff --git a/include/asm-x86/xen/hypervisor.h b/include/asm-x86/xen/hypervisor.h
index 138ee8a..31836ad 100644
--- a/include/asm-x86/xen/hypervisor.h
+++ b/include/asm-x86/xen/hypervisor.h
@@ -57,6 +57,9 @@ extern struct shared_info *HYPERVISOR_shared_info;
 extern struct start_info *xen_start_info;
 #define is_initial_xendomain()	(xen_start_info->flags & SIF_INITDOMAIN)
 
+#define xen_alloc_vm_area(size)	alloc_vm_area(size)
+#define xen_free_vm_area(area)	free_vm_area(area)
+
 /* arch/i386/mach-xen/evtchn.c */
 /* Force a proper event-channel callback from Xen. */
 extern void force_evtchn_callback(void);
-- 
1.5.3
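On ia64 the same hooks can then hand back pseudo-physically contiguous memory rather than carving out kernel VA space. A rough sketch of such an instance (simplified; real code needs proper error unwinding):

    struct vm_struct *xen_alloc_vm_area(unsigned long size)
    {
            struct vm_struct *area = kzalloc(sizeof(*area), GFP_KERNEL);
            unsigned long virt = __get_free_pages(GFP_KERNEL,
                                                  get_order(size));

            if (area == NULL || virt == 0)
                    return NULL;    /* sketch only: leaks on this path */

            area->flags = VM_IOREMAP;
            area->addr = (void *)virt;  /* contiguous pseudo-physical pages */
            area->size = size;
            return area;
    }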
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 09/50] xen: split out x86 specific part from grant-table.c.
Split out the x86-specific part from grant-table.c.

Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 arch/x86/xen/Makefile      |    2 +-
 arch/x86/xen/grant-table.c |   91 ++++++++++++++++++++++++++++++++++++++
 drivers/xen/grant-table.c  |   35 +++-------------
 include/xen/grant_table.h  |    6 +++
 4 files changed, 101 insertions(+), 33 deletions(-)
 create mode 100644 arch/x86/xen/grant-table.c

diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index 95c5926..3d8df98 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -1,4 +1,4 @@
 obj-y		:= enlighten.o setup.o multicalls.o mmu.o \
-			time.o manage.o xen-asm.o
+			time.o manage.o xen-asm.o grant-table.o
 
 obj-$(CONFIG_SMP)	+= smp.o
diff --git a/arch/x86/xen/grant-table.c b/arch/x86/xen/grant-table.c
new file mode 100644
index 0000000..49ba9b5
--- /dev/null
+++ b/arch/x86/xen/grant-table.c
@@ -0,0 +1,91 @@
+/******************************************************************************
+ * grant_table.c
+ * x86 specific part
+ *
+ * Granting foreign access to our memory reservation.
+ *
+ * Copyright (c) 2005-2006, Christopher Clark
+ * Copyright (c) 2004-2005, K A Fraser
+ * Copyright (c) 2008 Isaku Yamahata <yamahata at valinux co jp>
+ *                    VA Linux Systems Japan. Split out x86 specific part.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation; or, when distributed
+ * separately from the Linux kernel or incorporated into other
+ * software packages, subject to the following license:
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <linux/vmalloc.h>
+
+#include <xen/interface/xen.h>
+#include <xen/page.h>
+#include <xen/grant_table.h>
+
+#include <asm/pgtable.h>
+
+static int map_pte_fn(pte_t *pte, struct page *pmd_page,
+		      unsigned long addr, void *data)
+{
+	unsigned long **frames = (unsigned long **)data;
+
+	set_pte_at(&init_mm, addr, pte, mfn_pte((*frames)[0], PAGE_KERNEL));
+	(*frames)++;
+	return 0;
+}
+
+static int unmap_pte_fn(pte_t *pte, struct page *pmd_page,
+			unsigned long addr, void *data)
+{
+
+	set_pte_at(&init_mm, addr, pte, __pte(0));
+	return 0;
+}
+
+int arch_gnttab_map_shared(unsigned long *frames, unsigned long nr_gframes,
+			   unsigned long max_nr_gframes,
+			   struct grant_entry **__shared)
+{
+	int rc;
+	struct grant_entry *shared = *__shared;
+
+	if (shared == NULL) {
+		struct vm_struct *area =
+			xen_alloc_vm_area(PAGE_SIZE * max_nr_gframes);
+		BUG_ON(area == NULL);
+		shared = area->addr;
+		*__shared = shared;
+	}
+
+	rc = apply_to_page_range(&init_mm, (unsigned long)shared,
+				 PAGE_SIZE * nr_gframes,
+				 map_pte_fn, &frames);
+	return rc;
+}
+
+void arch_gnttab_unmap_shared(struct grant_entry *shared,
+			      unsigned long nr_gframes)
+{
+	apply_to_page_range(&init_mm, (unsigned long)shared,
+			    PAGE_SIZE * nr_gframes, unmap_pte_fn, NULL);
+}
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 9fcde20..22f5104 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -435,24 +435,6 @@ static inline unsigned int max_nr_grant_frames(void)
 	return xen_max;
 }
 
-static int map_pte_fn(pte_t *pte, struct page *pmd_page,
-		      unsigned long addr, void *data)
-{
-	unsigned long **frames = (unsigned long **)data;
-
-	set_pte_at(&init_mm, addr, pte, mfn_pte((*frames)[0], PAGE_KERNEL));
-	(*frames)++;
-	return 0;
-}
-
-static int unmap_pte_fn(pte_t *pte, struct page *pmd_page,
-			unsigned long addr, void *data)
-{
-
-	set_pte_at(&init_mm, addr, pte, __pte(0));
-	return 0;
-}
-
 static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
 {
 	struct gnttab_setup_table setup;
@@ -476,17 +458,9 @@ static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
 
 	BUG_ON(rc || setup.status);
 
-	if (shared == NULL) {
-		struct vm_struct *area;
-		area = xen_alloc_vm_area(PAGE_SIZE * max_nr_grant_frames());
-		BUG_ON(area == NULL);
-		shared = area->addr;
-	}
-	rc = apply_to_page_range(&init_mm, (unsigned long)shared,
-				 PAGE_SIZE * nr_gframes,
-				 map_pte_fn, &frames);
+	rc = arch_gnttab_map_shared(frames, nr_gframes, max_nr_grant_frames(),
+				    &shared);
 	BUG_ON(rc);
-	frames -= nr_gframes; /* adjust after map_pte_fn() */
 
 	kfree(frames);
@@ -502,10 +476,7 @@ static int gnttab_resume(void)
 
 static int gnttab_suspend(void)
 {
-	apply_to_page_range(&init_mm, (unsigned long)shared,
-			    PAGE_SIZE * nr_grant_frames,
-			    unmap_pte_fn, NULL);
-
+	arch_gnttab_unmap_shared(shared, nr_grant_frames);
 	return 0;
 }
diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index 761c834..50ca16e 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -102,6 +102,12 @@ void gnttab_grant_foreign_access_ref(grant_ref_t ref, domid_t domid,
 void gnttab_grant_foreign_transfer_ref(grant_ref_t, domid_t domid,
 				       unsigned long pfn);
 
+int arch_gnttab_map_shared(unsigned long *frames, unsigned long nr_gframes,
+			   unsigned long max_nr_gframes,
+			   struct grant_entry **__shared);
+void arch_gnttab_unmap_shared(struct grant_entry *shared,
+			      unsigned long nr_gframes);
+
 #define gnttab_map_vaddr(map) ((void *)(map.host_virt_addr))
 
 #endif /* __ASM_GNTTAB_H__ */
-- 
1.5.3
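A pseudo-physical port can then satisfy the same contract without any PTE manipulation; for instance (a hypothetical sketch for an auto-translated guest, not this patch's code):

    int arch_gnttab_map_shared(unsigned long *frames, unsigned long nr_gframes,
                               unsigned long max_nr_gframes,
                               struct grant_entry **__shared)
    {
            /* frames[] already holds guest-physical frame numbers,
             * so just point at the first one */
            *__shared = __va(frames[0] << PAGE_SHIFT);
            return 0;
    }

    void arch_gnttab_unmap_shared(struct grant_entry *shared,
                                  unsigned long nr_gframes)
    {
            /* nothing to tear down */
    }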
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 10/50] xen: import include/xen/interface/callback.h which ia64/xen needs.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- include/xen/interface/callback.h | 119 ++++++++++++++++++++++++++++++++++++++ 1 files changed, 119 insertions(+), 0 deletions(-) create mode 100644 include/xen/interface/callback.h diff --git a/include/xen/interface/callback.h b/include/xen/interface/callback.h new file mode 100644 index 0000000..04c8b5d --- /dev/null +++ b/include/xen/interface/callback.h @@ -0,0 +1,119 @@ +/****************************************************************************** + * callback.h + * + * Register guest OS callbacks with Xen. + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to + * deal in the Software without restriction, including without limitation the + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + * + * Copyright (c) 2006, Ian Campbell + */ + +#ifndef __XEN_PUBLIC_CALLBACK_H__ +#define __XEN_PUBLIC_CALLBACK_H__ + +#include "xen.h" + +/* + * Prototype for this hypercall is: + * long callback_op(int cmd, void *extra_args) + * @cmd == CALLBACKOP_??? (callback operation). + * @extra_args == Operation-specific extra arguments (NULL if none). + */ + +/* ia64, x86: Callback for event delivery. */ +#define CALLBACKTYPE_event 0 + +/* x86: Failsafe callback when guest state cannot be restored by Xen. */ +#define CALLBACKTYPE_failsafe 1 + +/* x86/64 hypervisor: Syscall by 64-bit guest app ('64-on-64-on-64'). */ +#define CALLBACKTYPE_syscall 2 + +/* + * x86/32 hypervisor: Only available on x86/32 when supervisor_mode_kernel + * feature is enabled. Do not use this callback type in new code. + */ +#define CALLBACKTYPE_sysenter_deprecated 3 + +/* x86: Callback for NMI delivery. */ +#define CALLBACKTYPE_nmi 4 + +/* + * x86: sysenter is only available as follows: + * - 32-bit hypervisor: with the supervisor_mode_kernel feature enabled + * - 64-bit hypervisor: 32-bit guest applications on Intel CPUs + * ('32-on-32-on-64', '32-on-64-on-64') + * [nb. also 64-bit guest applications on Intel CPUs + * ('64-on-64-on-64'), but syscall is preferred] + */ +#define CALLBACKTYPE_sysenter 5 + +/* + * x86/64 hypervisor: Syscall by 32-bit guest app on AMD CPUs + * ('32-on-32-on-64', '32-on-64-on-64') + */ +#define CALLBACKTYPE_syscall32 7 + +/* + * Disable event deliver during callback? This flag is ignored for event and + * NMI callbacks: event delivery is unconditionally disabled. + */ +#define _CALLBACKF_mask_events 0 +#define CALLBACKF_mask_events (1U << _CALLBACKF_mask_events) + +/* + * Register a callback. 
+ */ +#define CALLBACKOP_register 0 +struct callback_register { + uint16_t type; + uint16_t flags; + xen_callback_t address; +}; +DEFINE_GUEST_HANDLE_STRUCT(callback_register); + +/* + * Unregister a callback. + * + * Not all callbacks can be unregistered. -EINVAL will be returned if + * you attempt to unregister such a callback. + */ +#define CALLBACKOP_unregister 1 +struct callback_unregister { + uint16_t type; + uint16_t _unused; +}; +DEFINE_GUEST_HANDLE_STRUCT(callback_unregister); + +#if __XEN_INTERFACE_VERSION__ < 0x00030207 +#undef CALLBACKTYPE_sysenter +#define CALLBACKTYPE_sysenter CALLBACKTYPE_sysenter_deprecated +#endif + +#endif /* __XEN_PUBLIC_CALLBACK_H__ */ + +/* + * Local variables: + * mode: C + * c-set-style: "BSD" + * c-basic-offset: 4 + * tab-width: 4 + * indent-tabs-mode: nil + * End: + */ -- 1.5.3
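The registration pattern the ia64 code will use is the generic one (a sketch; HYPERVISOR_callback_op comes from the arch hypercall wrappers, and the xen_callback_t contents are arch-defined, so the address field is left symbolic here):

    struct callback_register event = {
            .type = CALLBACKTYPE_event,
            /* .address is an arch-defined xen_callback_t; on ia64 it
             * names the event entry of the paravirtualized ivt */
    };

    if (HYPERVISOR_callback_op(CALLBACKOP_register, &event))
            BUG();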
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 11/50] xen: import arch generic part of xencomm.
On xen/ia64 and xen/powerpc hypercall arguments are passed by pseudo physical address (guest physical address) so that it's necessary to convert from virtual address into pseudo physical address. The frame work is called xencomm. Import arch generic part of xencomm. Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- drivers/xen/Makefile | 1 + drivers/xen/xencomm.c | 232 +++++++++++++++++++++++++++++++++++++++ include/xen/interface/xencomm.h | 41 +++++++ include/xen/xencomm.h | 77 +++++++++++++ 4 files changed, 351 insertions(+), 0 deletions(-) create mode 100644 drivers/xen/xencomm.c create mode 100644 include/xen/interface/xencomm.h create mode 100644 include/xen/xencomm.h diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile index 823ce78..43f014c 100644 --- a/drivers/xen/Makefile +++ b/drivers/xen/Makefile @@ -1,2 +1,3 @@ obj-y += grant-table.o features.o events.o obj-y += xenbus/ +obj-$(CONFIG_XEN_XENCOMM) += xencomm.o diff --git a/drivers/xen/xencomm.c b/drivers/xen/xencomm.c new file mode 100644 index 0000000..797cb4e --- /dev/null +++ b/drivers/xen/xencomm.c @@ -0,0 +1,232 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + * Copyright (C) IBM Corp. 
2006 + * + * Authors: Hollis Blanchard <hollisb at us.ibm.com> + */ + +#include <linux/gfp.h> +#include <linux/mm.h> +#include <asm/page.h> +#include <xen/xencomm.h> +#include <xen/interface/xen.h> +#ifdef __ia64__ +#include <asm/xen/xencomm.h> /* for is_kern_addr() */ +#endif + +#ifdef HAVE_XEN_PLATFORM_COMPAT_H +#include <xen/platform-compat.h> +#endif + +static int xencomm_init(struct xencomm_desc *desc, + void *buffer, unsigned long bytes) +{ + unsigned long recorded = 0; + int i = 0; + + while ((recorded < bytes) && (i < desc->nr_addrs)) { + unsigned long vaddr = (unsigned long)buffer + recorded; + unsigned long paddr; + int offset; + int chunksz; + + offset = vaddr % PAGE_SIZE; /* handle partial pages */ + chunksz = min(PAGE_SIZE - offset, bytes - recorded); + + paddr = xencomm_vtop(vaddr); + if (paddr == ~0UL) { + printk(KERN_DEBUG "%s: couldn't translate vaddr %lx\n", + __func__, vaddr); + return -EINVAL; + } + + desc->address[i++] = paddr; + recorded += chunksz; + } + + if (recorded < bytes) { + printk(KERN_DEBUG + "%s: could only translate %ld of %ld bytes\n", + __func__, recorded, bytes); + return -ENOSPC; + } + + /* mark remaining addresses invalid (just for safety) */ + while (i < desc->nr_addrs) + desc->address[i++] = XENCOMM_INVALID; + + desc->magic = XENCOMM_MAGIC; + + return 0; +} + +static struct xencomm_desc *xencomm_alloc(gfp_t gfp_mask, + void *buffer, unsigned long bytes) +{ + struct xencomm_desc *desc; + unsigned long buffer_ulong = (unsigned long)buffer; + unsigned long start = buffer_ulong & PAGE_MASK; + unsigned long end = (buffer_ulong + bytes) | ~PAGE_MASK; + unsigned long nr_addrs = (end - start + 1) >> PAGE_SHIFT; + unsigned long size = sizeof(*desc) + + sizeof(desc->address[0]) * nr_addrs; + + /* + * slab allocator returns at least sizeof(void*) aligned pointer. + * When sizeof(*desc) > sizeof(void*), struct xencomm_desc might + * cross page boundary. + */ + if (sizeof(*desc) > sizeof(void *)) { + unsigned long order = get_order(size); + desc = (struct xencomm_desc *)__get_free_pages(gfp_mask, + order); + if (desc == NULL) + return NULL; + + desc->nr_addrs + ((PAGE_SIZE << order) - sizeof(struct xencomm_desc)) / + sizeof(*desc->address); + } else { + desc = kmalloc(size, gfp_mask); + if (desc == NULL) + return NULL; + + desc->nr_addrs = nr_addrs; + } + return desc; +} + +void xencomm_free(struct xencomm_handle *desc) +{ + if (desc && !((ulong)desc & XENCOMM_INLINE_FLAG)) { + struct xencomm_desc *desc__ = (struct xencomm_desc *)desc; + if (sizeof(*desc__) > sizeof(void *)) { + unsigned long size = sizeof(*desc__) + + sizeof(desc__->address[0]) * desc__->nr_addrs; + unsigned long order = get_order(size); + free_pages((unsigned long)__va(desc), order); + } else + kfree(__va(desc)); + } +} + +static int xencomm_create(void *buffer, unsigned long bytes, + struct xencomm_desc **ret, gfp_t gfp_mask) +{ + struct xencomm_desc *desc; + int rc; + + pr_debug("%s: %p[%ld]\n", __func__, buffer, bytes); + + if (bytes == 0) { + /* don't create a descriptor; Xen recognizes NULL. 
*/ + BUG_ON(buffer != NULL); + *ret = NULL; + return 0; + } + + BUG_ON(buffer == NULL); /* 'bytes' is non-zero */ + + desc = xencomm_alloc(gfp_mask, buffer, bytes); + if (!desc) { + printk(KERN_DEBUG "%s failure\n", "xencomm_alloc"); + return -ENOMEM; + } + + rc = xencomm_init(desc, buffer, bytes); + if (rc) { + printk(KERN_DEBUG "%s failure: %d\n", "xencomm_init", rc); + xencomm_free((struct xencomm_handle *)__pa(desc)); + return rc; + } + + *ret = desc; + return 0; +} + +/* check if memory address is within VMALLOC region */ +static int is_phys_contiguous(unsigned long addr) +{ + if (!is_kernel_addr(addr)) + return 0; + + return (addr < VMALLOC_START) || (addr >= VMALLOC_END); +} + +static struct xencomm_handle *xencomm_create_inline(void *ptr) +{ + unsigned long paddr; + + BUG_ON(!is_phys_contiguous((unsigned long)ptr)); + + paddr = (unsigned long)xencomm_pa(ptr); + BUG_ON(paddr & XENCOMM_INLINE_FLAG); + return (struct xencomm_handle *)(paddr | XENCOMM_INLINE_FLAG); +} + +/* "mini" routine, for stack-based communications: */ +static int xencomm_create_mini(void *buffer, + unsigned long bytes, struct xencomm_mini *xc_desc, + struct xencomm_desc **ret) +{ + int rc = 0; + struct xencomm_desc *desc; + BUG_ON(((unsigned long)xc_desc) % sizeof(*xc_desc) != 0); + + desc = (void *)xc_desc; + + desc->nr_addrs = XENCOMM_MINI_ADDRS; + + rc = xencomm_init(desc, buffer, bytes); + if (!rc) + *ret = desc; + + return rc; +} + +struct xencomm_handle *xencomm_map(void *ptr, unsigned long bytes) +{ + int rc; + struct xencomm_desc *desc; + + if (is_phys_contiguous((unsigned long)ptr)) + return xencomm_create_inline(ptr); + + rc = xencomm_create(ptr, bytes, &desc, GFP_KERNEL); + + if (rc || desc == NULL) + return NULL; + + return xencomm_pa(desc); +} + +struct xencomm_handle *__xencomm_map_no_alloc(void *ptr, unsigned long bytes, + struct xencomm_mini *xc_desc) +{ + int rc; + struct xencomm_desc *desc = NULL; + + if (is_phys_contiguous((unsigned long)ptr)) + return xencomm_create_inline(ptr); + + rc = xencomm_create_mini(ptr, bytes, xc_desc, + &desc); + + if (rc) + return NULL; + + return xencomm_pa(desc); +} diff --git a/include/xen/interface/xencomm.h b/include/xen/interface/xencomm.h new file mode 100644 index 0000000..ac45e07 --- /dev/null +++ b/include/xen/interface/xencomm.h @@ -0,0 +1,41 @@ +/* + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to + * deal in the Software without restriction, including without limitation the + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + * + * Copyright (C) IBM Corp. 
2006 + */ + +#ifndef _XEN_XENCOMM_H_ +#define _XEN_XENCOMM_H_ + +/* A xencomm descriptor is a scatter/gather list containing physical + * addresses corresponding to a virtually contiguous memory area. The + * hypervisor translates these physical addresses to machine addresses to copy + * to and from the virtually contiguous area. + */ + +#define XENCOMM_MAGIC 0x58434F4D /* 'XCOM' */ +#define XENCOMM_INVALID (~0UL) + +struct xencomm_desc { + uint32_t magic; + uint32_t nr_addrs; /* the number of entries in address[] */ + uint64_t address[0]; +}; + +#endif /* _XEN_XENCOMM_H_ */ diff --git a/include/xen/xencomm.h b/include/xen/xencomm.h new file mode 100644 index 0000000..e43b039 --- /dev/null +++ b/include/xen/xencomm.h @@ -0,0 +1,77 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + * Copyright (C) IBM Corp. 2006 + * + * Authors: Hollis Blanchard <hollisb at us.ibm.com> + * Jerone Young <jyoung5 at us.ibm.com> + */ + +#ifndef _LINUX_XENCOMM_H_ +#define _LINUX_XENCOMM_H_ + +#include <xen/interface/xencomm.h> + +#define XENCOMM_MINI_ADDRS 3 +struct xencomm_mini { + struct xencomm_desc _desc; + uint64_t address[XENCOMM_MINI_ADDRS]; +}; + +/* To avoid additionnal virt to phys conversion, an opaque structure is + presented. */ +struct xencomm_handle; + +extern void xencomm_free(struct xencomm_handle *desc); +extern struct xencomm_handle *xencomm_map(void *ptr, unsigned long bytes); +extern struct xencomm_handle *__xencomm_map_no_alloc(void *ptr, + unsigned long bytes, struct xencomm_mini *xc_area); + +#if 0 +#define XENCOMM_MINI_ALIGNED(xc_desc, n) \ + struct xencomm_mini xc_desc ## _base[(n)] \ + __attribute__((__aligned__(sizeof(struct xencomm_mini)))); \ + struct xencomm_mini *xc_desc = &xc_desc ## _base[0]; +#else +/* + * gcc bug workaround: + * http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16660 + * gcc doesn't handle properly stack variable with + * __attribute__((__align__(sizeof(struct xencomm_mini)))) + */ +#define XENCOMM_MINI_ALIGNED(xc_desc, n) \ + unsigned char xc_desc ## _base[((n) + 1 ) * \ + sizeof(struct xencomm_mini)]; \ + struct xencomm_mini *xc_desc = (struct xencomm_mini *) \ + ((unsigned long)xc_desc ## _base + \ + (sizeof(struct xencomm_mini) - \ + ((unsigned long)xc_desc ## _base) % \ + sizeof(struct xencomm_mini))); +#endif +#define xencomm_map_no_alloc(ptr, bytes) \ + ({ XENCOMM_MINI_ALIGNED(xc_desc, 1); \ + __xencomm_map_no_alloc(ptr, bytes, xc_desc); }) + +/* provided by architecture code: */ +extern unsigned long xencomm_vtop(unsigned long vaddr); + +static inline void *xencomm_pa(void *ptr) +{ + return (void *)xencomm_vtop((unsigned long)ptr); +} + +#define xen_guest_handle(hnd) ((hnd).p) + +#endif /* _LINUX_XENCOMM_H_ */ -- 1.5.3
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 12/50] ia64/pv_ops: introduce ia64_set_rr0_to_rr4() to make kernel paravirtualization friendly.
ia64/Xen will replace setting rr[0-4] with single hypercall later. Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- include/asm-ia64/intrinsics.h | 10 ++++++++++ include/asm-ia64/mmu_context.h | 6 +----- 2 files changed, 11 insertions(+), 5 deletions(-) diff --git a/include/asm-ia64/intrinsics.h b/include/asm-ia64/intrinsics.h index f1135b5..c206755 100644 --- a/include/asm-ia64/intrinsics.h +++ b/include/asm-ia64/intrinsics.h @@ -18,6 +18,15 @@ # include <asm/gcc_intrin.h> #endif +#define ia64_set_rr0_to_rr4(val0, val1, val2, val3, val4) \ +do { \ + ia64_set_rr(0x0000000000000000UL, (val0)); \ + ia64_set_rr(0x2000000000000000UL, (val1)); \ + ia64_set_rr(0x4000000000000000UL, (val2)); \ + ia64_set_rr(0x6000000000000000UL, (val3)); \ + ia64_set_rr(0x8000000000000000UL, (val4)); \ +} while (0) + /* * Force an unresolved reference if someone tries to use * ia64_fetch_and_add() with a bad value. @@ -183,4 +192,5 @@ extern long ia64_cmpxchg_called_with_bad_pointer (void); #endif /* !CONFIG_IA64_DEBUG_CMPXCHG */ #endif +#include <asm/privop.h> #endif /* _ASM_IA64_INTRINSICS_H */ diff --git a/include/asm-ia64/mmu_context.h b/include/asm-ia64/mmu_context.h index cef2400..040bc87 100644 --- a/include/asm-ia64/mmu_context.h +++ b/include/asm-ia64/mmu_context.h @@ -152,11 +152,7 @@ reload_context (nv_mm_context_t context) # endif #endif - ia64_set_rr(0x0000000000000000UL, rr0); - ia64_set_rr(0x2000000000000000UL, rr1); - ia64_set_rr(0x4000000000000000UL, rr2); - ia64_set_rr(0x6000000000000000UL, rr3); - ia64_set_rr(0x8000000000000000UL, rr4); + ia64_set_rr0_to_rr4(rr0, rr1, rr2, rr3, rr4); ia64_srlz_i(); /* srlz.i implies srlz.d */ } -- 1.5.3
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 13/50] ia64/pv_ops: introduce ia64_get_psr_i() to make kernel paravirtualization friendly.
__local_irq_save() and local_save_flags() are used to mask interruptions. They read the whole psr, which requires emulating every psr bit. On the other hand, reading only psr.i, a single bit, can be virtualized cheaply.

Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 include/asm-ia64/intrinsics.h |    2 ++
 include/asm-ia64/system.h     |    4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/asm-ia64/intrinsics.h b/include/asm-ia64/intrinsics.h
index c206755..5800ad0 100644
--- a/include/asm-ia64/intrinsics.h
+++ b/include/asm-ia64/intrinsics.h
@@ -18,6 +18,8 @@
 # include <asm/gcc_intrin.h>
 #endif
 
+#define ia64_get_psr_i()	(ia64_getreg(_IA64_REG_PSR) & IA64_PSR_I)
+
 #define ia64_set_rr0_to_rr4(val0, val1, val2, val3, val4)	\
 do {								\
 	ia64_set_rr(0x0000000000000000UL, (val0));		\
diff --git a/include/asm-ia64/system.h b/include/asm-ia64/system.h
index 595112b..2bca73e 100644
--- a/include/asm-ia64/system.h
+++ b/include/asm-ia64/system.h
@@ -125,7 +125,7 @@ extern struct ia64_boot_param {
 #define __local_irq_save(x)			\
 do {						\
 	ia64_stop();				\
-	(x) = ia64_getreg(_IA64_REG_PSR);	\
+	(x) = ia64_get_psr_i();			\
 	ia64_stop();				\
 	ia64_rsm(IA64_PSR_I);			\
 } while (0)
@@ -173,7 +173,7 @@ do {								\
 #endif /* !CONFIG_IA64_DEBUG_IRQ */
 
 #define local_irq_enable()	({ ia64_stop(); ia64_ssm(IA64_PSR_I); ia64_srlz_d(); })
-#define local_save_flags(flags)	({ ia64_stop(); (flags) = ia64_getreg(_IA64_REG_PSR); })
+#define local_save_flags(flags)	({ ia64_stop(); (flags) = ia64_get_psr_i(); })
 
 #define irqs_disabled()				\
 ({						\
-- 
1.5.3
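On Xen the virtual psr.i lives in memory shared with the hypervisor, so the single-bit read really is just a load. A sketch of the xen instance (the mapped flag and its polarity are assumptions here, not taken from this patch):

    /* per-vcpu interrupt-mask byte mapped from Xen; 0 means enabled */
    extern uint8_t *xen_psr_i_addr;

    static inline unsigned long xen_get_psr_i(void)
    {
            return *xen_psr_i_addr ? 0 : IA64_PSR_I;  /* no trap taken */
    }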
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 14/50] ia64/pv_ops: split out ia64_switch_to(), ia64_leave_syscall() and ia64_leave_kernel() from entry.S into switch_leave.S for paravirtualization.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/Makefile | 2 +- arch/ia64/kernel/entry.S | 564 +------------------------------------ arch/ia64/kernel/switch_leave.S | 594 +++++++++++++++++++++++++++++++++++++++ 3 files changed, 603 insertions(+), 557 deletions(-) create mode 100644 arch/ia64/kernel/switch_leave.S diff --git a/arch/ia64/kernel/Makefile b/arch/ia64/kernel/Makefile index 33e5a59..f9bc3c4 100644 --- a/arch/ia64/kernel/Makefile +++ b/arch/ia64/kernel/Makefile @@ -4,7 +4,7 @@ extra-y := head.o init_task.o vmlinux.lds -obj-y := acpi.o entry.o efi.o efi_stub.o gate-data.o fsys.o ia64_ksyms.o irq.o irq_ia64.o \ +obj-y := acpi.o entry.o switch_leave.o efi.o efi_stub.o gate-data.o fsys.o ia64_ksyms.o irq.o irq_ia64.o \ irq_lsapic.o ivt.o machvec.o pal.o patch.o process.o perfmon.o ptrace.o sal.o \ salinfo.o semaphore.o setup.o signal.o sys_ia64.o time.o traps.o unaligned.o \ unwind.o mca.o mca_asm.o topology.o diff --git a/arch/ia64/kernel/entry.S b/arch/ia64/kernel/entry.S index 3c331c4..df8dcc9 100644 --- a/arch/ia64/kernel/entry.S +++ b/arch/ia64/kernel/entry.S @@ -14,15 +14,6 @@ * Copyright (C) 1999 Walt Drummond <drummond at valinux.com> */ /* - * ia64_switch_to now places correct virtual mapping in in TR2 for - * kernel stack. This allows us to handle interrupts without changing - * to physical mode. - * - * Jonathan Nicklin <nicklin at missioncriticallinux.com> - * Patrick O'Rourke <orourke at missioncriticallinux.com> - * 11/07/2000 - */ -/* * Global (preserved) predicate usage on syscall entry/exit path: * * pKStk: See entry.h. @@ -175,68 +166,6 @@ GLOBAL_ENTRY(sys_clone) END(sys_clone) /* - * prev_task <- ia64_switch_to(struct task_struct *next) - * With Ingo's new scheduler, interrupts are disabled when this routine gets - * called. The code starting at .map relies on this. The rest of the code - * doesn't care about the interrupt masking status. - */ -GLOBAL_ENTRY(ia64_switch_to) - .prologue - alloc r16=ar.pfs,1,0,0,0 - DO_SAVE_SWITCH_STACK - .body - - adds r22=IA64_TASK_THREAD_KSP_OFFSET,r13 - movl r25=init_task - mov r27=IA64_KR(CURRENT_STACK) - adds r21=IA64_TASK_THREAD_KSP_OFFSET,in0 - dep r20=0,in0,61,3 // physical address of "next" - ;; - st8 [r22]=sp // save kernel stack pointer of old task - shr.u r26=r20,IA64_GRANULE_SHIFT - cmp.eq p7,p6=r25,in0 - ;; - /* - * If we've already mapped this task's page, we can skip doing it again. - */ -(p6) cmp.eq p7,p6=r26,r27 -(p6) br.cond.dpnt .map - ;; -.done: - ld8 sp=[r21] // load kernel stack pointer of new task - mov IA64_KR(CURRENT)=in0 // update "current" application register - mov r8=r13 // return pointer to previously running task - mov r13=in0 // set "current" pointer - ;; - DO_LOAD_SWITCH_STACK - -#ifdef CONFIG_SMP - sync.i // ensure "fc"s done by this CPU are visible on other CPUs -#endif - br.ret.sptk.many rp // boogie on out in new context - -.map: - rsm psr.ic // interrupts (psr.i) are already disabled here - movl r25=PAGE_KERNEL - ;; - srlz.d - or r23=r25,r20 // construct PA | page properties - mov r25=IA64_GRANULE_SHIFT<<2 - ;; - mov cr.itir=r25 - mov cr.ifa=in0 // VA of next task... - ;; - mov r25=IA64_TR_CURRENT_STACK - mov IA64_KR(CURRENT_STACK)=r26 // remember last page we mapped... - ;; - itr.d dtr[r25]=r23 // wire in new mapping... - ssm psr.ic // reenable the psr.ic bit - ;; - srlz.d - br.cond.sptk .done -END(ia64_switch_to) - -/* * Note that interrupts are enabled during save_switch_stack and load_switch_stack. 
This * means that we may get an interrupt with "sp" pointing to the new kernel stack while * ar.bspstore is still pointing to the old kernel backing store area. Since ar.rsc, @@ -570,7 +499,7 @@ GLOBAL_ENTRY(ia64_trace_syscall) br.call.sptk.many rp=syscall_trace_leave // give parent a chance to catch return value .ret3: (pUStk) cmp.eq.unc p6,p0=r0,r0 // p6 <- pUStk - br.cond.sptk .work_pending_syscall_end + br.cond.sptk ia64_work_pending_syscall_end strace_error: ld8 r3=[r2] // load pt_regs.r8 @@ -635,160 +564,10 @@ GLOBAL_ENTRY(ia64_ret_from_syscall) adds r2=PT(R8)+16,sp // r2 = &pt_regs.r8 mov r10=r0 // clear error indication in r10 (p7) br.cond.spnt handle_syscall_error // handle potential syscall failure -END(ia64_ret_from_syscall) - // fall through -/* - * ia64_leave_syscall(): Same as ia64_leave_kernel, except that it doesn't - * need to switch to bank 0 and doesn't restore the scratch registers. - * To avoid leaking kernel bits, the scratch registers are set to - * the following known-to-be-safe values: - * - * r1: restored (global pointer) - * r2: cleared - * r3: 1 (when returning to user-level) - * r8-r11: restored (syscall return value(s)) - * r12: restored (user-level stack pointer) - * r13: restored (user-level thread pointer) - * r14: set to __kernel_syscall_via_epc - * r15: restored (syscall #) - * r16-r17: cleared - * r18: user-level b6 - * r19: cleared - * r20: user-level ar.fpsr - * r21: user-level b0 - * r22: cleared - * r23: user-level ar.bspstore - * r24: user-level ar.rnat - * r25: user-level ar.unat - * r26: user-level ar.pfs - * r27: user-level ar.rsc - * r28: user-level ip - * r29: user-level psr - * r30: user-level cfm - * r31: user-level pr - * f6-f11: cleared - * pr: restored (user-level pr) - * b0: restored (user-level rp) - * b6: restored - * b7: set to __kernel_syscall_via_epc - * ar.unat: restored (user-level ar.unat) - * ar.pfs: restored (user-level ar.pfs) - * ar.rsc: restored (user-level ar.rsc) - * ar.rnat: restored (user-level ar.rnat) - * ar.bspstore: restored (user-level ar.bspstore) - * ar.fpsr: restored (user-level ar.fpsr) - * ar.ccv: cleared - * ar.csd: cleared - * ar.ssd: cleared - */ -ENTRY(ia64_leave_syscall) - PT_REGS_UNWIND_INFO(0) - /* - * work.need_resched etc. mustn't get changed by this CPU before it returns to - * user- or fsys-mode, hence we disable interrupts early on. - * - * p6 controls whether current_thread_info()->flags needs to be check for - * extra work. We always check for extra work when returning to user-level. - * With CONFIG_PREEMPT, we also check for extra work when the preempt_count - * is 0. After extra work processing has been completed, execution - * resumes at .work_processed_syscall with p6 set to 1 if the extra-work-check - * needs to be redone. 
- */ -#ifdef CONFIG_PREEMPT - rsm psr.i // disable interrupts - cmp.eq pLvSys,p0=r0,r0 // pLvSys=1: leave from syscall -(pKStk) adds r20=TI_PRE_COUNT+IA64_TASK_SIZE,r13 - ;; - .pred.rel.mutex pUStk,pKStk -(pKStk) ld4 r21=[r20] // r21 <- preempt_count -(pUStk) mov r21=0 // r21 <- 0 - ;; - cmp.eq p6,p0=r21,r0 // p6 <- pUStk || (preempt_count == 0) -#else /* !CONFIG_PREEMPT */ -(pUStk) rsm psr.i - cmp.eq pLvSys,p0=r0,r0 // pLvSys=1: leave from syscall -(pUStk) cmp.eq.unc p6,p0=r0,r0 // p6 <- pUStk -#endif -.work_processed_syscall: - adds r2=PT(LOADRS)+16,r12 - adds r3=PT(AR_BSPSTORE)+16,r12 - adds r18=TI_FLAGS+IA64_TASK_SIZE,r13 ;; -(p6) ld4 r31=[r18] // load current_thread_info()->flags - ld8 r19=[r2],PT(B6)-PT(LOADRS) // load ar.rsc value for "loadrs" - nop.i 0 - ;; - mov r16=ar.bsp // M2 get existing backing store pointer - ld8 r18=[r2],PT(R9)-PT(B6) // load b6 -(p6) and r15=TIF_WORK_MASK,r31 // any work other than TIF_SYSCALL_TRACE? - ;; - ld8 r23=[r3],PT(R11)-PT(AR_BSPSTORE) // load ar.bspstore (may be garbage) -(p6) cmp4.ne.unc p6,p0=r15, r0 // any special work pending? -(p6) br.cond.spnt .work_pending_syscall - ;; - // start restoring the state saved on the kernel stack (struct pt_regs): - ld8 r9=[r2],PT(CR_IPSR)-PT(R9) - ld8 r11=[r3],PT(CR_IIP)-PT(R11) -(pNonSys) break 0 // bug check: we shouldn't be here if pNonSys is TRUE! - ;; - invala // M0|1 invalidate ALAT - rsm psr.i | psr.ic // M2 turn off interrupts and interruption collection - cmp.eq p9,p0=r0,r0 // A set p9 to indicate that we should restore cr.ifs - - ld8 r29=[r2],16 // M0|1 load cr.ipsr - ld8 r28=[r3],16 // M0|1 load cr.iip - mov r22=r0 // A clear r22 - ;; - ld8 r30=[r2],16 // M0|1 load cr.ifs - ld8 r25=[r3],16 // M0|1 load ar.unat -(pUStk) add r14=IA64_TASK_THREAD_ON_USTACK_OFFSET,r13 - ;; - ld8 r26=[r2],PT(B0)-PT(AR_PFS) // M0|1 load ar.pfs -(pKStk) mov r22=psr // M2 read PSR now that interrupts are disabled - nop 0 - ;; - ld8 r21=[r2],PT(AR_RNAT)-PT(B0) // M0|1 load b0 - ld8 r27=[r3],PT(PR)-PT(AR_RSC) // M0|1 load ar.rsc - mov f6=f0 // F clear f6 + br.cond.sptk.few ia64_leave_syscall ;; - ld8 r24=[r2],PT(AR_FPSR)-PT(AR_RNAT) // M0|1 load ar.rnat (may be garbage) - ld8 r31=[r3],PT(R1)-PT(PR) // M0|1 load predicates - mov f7=f0 // F clear f7 - ;; - ld8 r20=[r2],PT(R12)-PT(AR_FPSR) // M0|1 load ar.fpsr - ld8.fill r1=[r3],16 // M0|1 load r1 -(pUStk) mov r17=1 // A - ;; -(pUStk) st1 [r14]=r17 // M2|3 - ld8.fill r13=[r3],16 // M0|1 - mov f8=f0 // F clear f8 - ;; - ld8.fill r12=[r2] // M0|1 restore r12 (sp) - ld8.fill r15=[r3] // M0|1 restore r15 - mov b6=r18 // I0 restore b6 - - LOAD_PHYS_STACK_REG_SIZE(r17) - mov f9=f0 // F clear f9 -(pKStk) br.cond.dpnt.many skip_rbs_switch // B - - srlz.d // M0 ensure interruption collection is off (for cover) - shr.u r18=r19,16 // I0|1 get byte size of existing "dirty" partition - cover // B add current frame into dirty partition & set cr.ifs - ;; - mov r19=ar.bsp // M2 get new backing store pointer - mov f10=f0 // F clear f10 - - nop.m 0 - movl r14=__kernel_syscall_via_epc // X - ;; - mov.m ar.csd=r0 // M2 clear ar.csd - mov.m ar.ccv=r0 // M2 clear ar.ccv - mov b7=r14 // I0 clear b7 (hint with __kernel_syscall_via_epc) - - mov.m ar.ssd=r0 // M2 clear ar.ssd - mov f11=f0 // F clear f11 - br.cond.sptk.many rbs_switch // B -END(ia64_leave_syscall) +END(ia64_ret_from_syscall) #ifdef CONFIG_IA32_SUPPORT GLOBAL_ENTRY(ia64_ret_from_ia32_execve) @@ -800,339 +579,12 @@ GLOBAL_ENTRY(ia64_ret_from_ia32_execve) st8.spill [r2]=r8 // store return value in slot for r8 and set unat bit .mem.offset 8,0 
st8.spill [r3]=r0 // clear error indication in slot for r10 and set unat bit -END(ia64_ret_from_ia32_execve) - // fall through -#endif /* CONFIG_IA32_SUPPORT */ -GLOBAL_ENTRY(ia64_leave_kernel) - PT_REGS_UNWIND_INFO(0) - /* - * work.need_resched etc. mustn't get changed by this CPU before it returns to - * user- or fsys-mode, hence we disable interrupts early on. - * - * p6 controls whether current_thread_info()->flags needs to be check for - * extra work. We always check for extra work when returning to user-level. - * With CONFIG_PREEMPT, we also check for extra work when the preempt_count - * is 0. After extra work processing has been completed, execution - * resumes at .work_processed_syscall with p6 set to 1 if the extra-work-check - * needs to be redone. - */ -#ifdef CONFIG_PREEMPT - rsm psr.i // disable interrupts - cmp.eq p0,pLvSys=r0,r0 // pLvSys=0: leave from kernel -(pKStk) adds r20=TI_PRE_COUNT+IA64_TASK_SIZE,r13 - ;; - .pred.rel.mutex pUStk,pKStk -(pKStk) ld4 r21=[r20] // r21 <- preempt_count -(pUStk) mov r21=0 // r21 <- 0 - ;; - cmp.eq p6,p0=r21,r0 // p6 <- pUStk || (preempt_count == 0) -#else -(pUStk) rsm psr.i - cmp.eq p0,pLvSys=r0,r0 // pLvSys=0: leave from kernel -(pUStk) cmp.eq.unc p6,p0=r0,r0 // p6 <- pUStk -#endif -.work_processed_kernel: - adds r17=TI_FLAGS+IA64_TASK_SIZE,r13 - ;; -(p6) ld4 r31=[r17] // load current_thread_info()->flags - adds r21=PT(PR)+16,r12 - ;; - - lfetch [r21],PT(CR_IPSR)-PT(PR) - adds r2=PT(B6)+16,r12 - adds r3=PT(R16)+16,r12 ;; - lfetch [r21] - ld8 r28=[r2],8 // load b6 - adds r29=PT(R24)+16,r12 - - ld8.fill r16=[r3],PT(AR_CSD)-PT(R16) - adds r30=PT(AR_CCV)+16,r12 -(p6) and r19=TIF_WORK_MASK,r31 // any work other than TIF_SYSCALL_TRACE? - ;; - ld8.fill r24=[r29] - ld8 r15=[r30] // load ar.ccv -(p6) cmp4.ne.unc p6,p0=r19, r0 // any special work pending? - ;; - ld8 r29=[r2],16 // load b7 - ld8 r30=[r3],16 // load ar.csd -(p6) br.cond.spnt .work_pending - ;; - ld8 r31=[r2],16 // load ar.ssd - ld8.fill r8=[r3],16 - ;; - ld8.fill r9=[r2],16 - ld8.fill r10=[r3],PT(R17)-PT(R10) - ;; - ld8.fill r11=[r2],PT(R18)-PT(R11) - ld8.fill r17=[r3],16 - ;; - ld8.fill r18=[r2],16 - ld8.fill r19=[r3],16 - ;; - ld8.fill r20=[r2],16 - ld8.fill r21=[r3],16 - mov ar.csd=r30 - mov ar.ssd=r31 - ;; - rsm psr.i | psr.ic // initiate turning off of interrupt and interruption collection - invala // invalidate ALAT - ;; - ld8.fill r22=[r2],24 - ld8.fill r23=[r3],24 - mov b6=r28 - ;; - ld8.fill r25=[r2],16 - ld8.fill r26=[r3],16 - mov b7=r29 - ;; - ld8.fill r27=[r2],16 - ld8.fill r28=[r3],16 - ;; - ld8.fill r29=[r2],16 - ld8.fill r30=[r3],24 - ;; - ld8.fill r31=[r2],PT(F9)-PT(R31) - adds r3=PT(F10)-PT(F6),r3 - ;; - ldf.fill f9=[r2],PT(F6)-PT(F9) - ldf.fill f10=[r3],PT(F8)-PT(F10) - ;; - ldf.fill f6=[r2],PT(F7)-PT(F6) - ;; - ldf.fill f7=[r2],PT(F11)-PT(F7) - ldf.fill f8=[r3],32 + // don't fall through, ia64_leave_kernel may be #define'd + br.cond.sptk.few ia64_leave_kernel ;; - srlz.d // ensure that inter. collection is off (VHPT is don't care, since text is pinned) - mov ar.ccv=r15 - ;; - ldf.fill f11=[r2] - bsw.0 // switch back to bank 0 (no stop bit required beforehand...) 
- ;; -(pUStk) mov r18=IA64_KR(CURRENT)// M2 (12 cycle read latency) - adds r16=PT(CR_IPSR)+16,r12 - adds r17=PT(CR_IIP)+16,r12 - -(pKStk) mov r22=psr // M2 read PSR now that interrupts are disabled - nop.i 0 - nop.i 0 - ;; - ld8 r29=[r16],16 // load cr.ipsr - ld8 r28=[r17],16 // load cr.iip - ;; - ld8 r30=[r16],16 // load cr.ifs - ld8 r25=[r17],16 // load ar.unat - ;; - ld8 r26=[r16],16 // load ar.pfs - ld8 r27=[r17],16 // load ar.rsc - cmp.eq p9,p0=r0,r0 // set p9 to indicate that we should restore cr.ifs - ;; - ld8 r24=[r16],16 // load ar.rnat (may be garbage) - ld8 r23=[r17],16 // load ar.bspstore (may be garbage) - ;; - ld8 r31=[r16],16 // load predicates - ld8 r21=[r17],16 // load b0 - ;; - ld8 r19=[r16],16 // load ar.rsc value for "loadrs" - ld8.fill r1=[r17],16 // load r1 - ;; - ld8.fill r12=[r16],16 - ld8.fill r13=[r17],16 -(pUStk) adds r18=IA64_TASK_THREAD_ON_USTACK_OFFSET,r18 - ;; - ld8 r20=[r16],16 // ar.fpsr - ld8.fill r15=[r17],16 - ;; - ld8.fill r14=[r16],16 - ld8.fill r2=[r17] -(pUStk) mov r17=1 - ;; - ld8.fill r3=[r16] -(pUStk) st1 [r18]=r17 // restore current->thread.on_ustack - shr.u r18=r19,16 // get byte size of existing "dirty" partition - ;; - mov r16=ar.bsp // get existing backing store pointer - LOAD_PHYS_STACK_REG_SIZE(r17) -(pKStk) br.cond.dpnt skip_rbs_switch - - /* - * Restore user backing store. - * - * NOTE: alloc, loadrs, and cover can't be predicated. - */ -(pNonSys) br.cond.dpnt dont_preserve_current_frame - cover // add current frame into dirty partition and set cr.ifs - ;; - mov r19=ar.bsp // get new backing store pointer -rbs_switch: - sub r16=r16,r18 // krbs = old bsp - size of dirty partition - cmp.ne p9,p0=r0,r0 // clear p9 to skip restore of cr.ifs - ;; - sub r19=r19,r16 // calculate total byte size of dirty partition - add r18=64,r18 // don't force in0-in7 into memory... - ;; - shl r19=r19,16 // shift size of dirty partition into loadrs position - ;; -dont_preserve_current_frame: - /* - * To prevent leaking bits between the kernel and user-space, - * we must clear the stacked registers in the "invalid" partition here. - * Not pretty, but at least it's fast (3.34 registers/cycle on Itanium, - * 5 registers/cycle on McKinley). 
- */ -# define pRecurse p6 -# define pReturn p7 -#ifdef CONFIG_ITANIUM -# define Nregs 10 -#else -# define Nregs 14 -#endif - alloc loc0=ar.pfs,2,Nregs-2,2,0 - shr.u loc1=r18,9 // RNaTslots <= floor(dirtySize / (64*8)) - sub r17=r17,r18 // r17 = (physStackedSize + 8) - dirtySize - ;; - mov ar.rsc=r19 // load ar.rsc to be used for "loadrs" - shladd in0=loc1,3,r17 - mov in1=0 - ;; - TEXT_ALIGN(32) -rse_clear_invalid: -#ifdef CONFIG_ITANIUM - // cycle 0 - { .mii - alloc loc0=ar.pfs,2,Nregs-2,2,0 - cmp.lt pRecurse,p0=Nregs*8,in0 // if more than Nregs regs left to clear, (re)curse - add out0=-Nregs*8,in0 -}{ .mfb - add out1=1,in1 // increment recursion count - nop.f 0 - nop.b 0 // can't do br.call here because of alloc (WAW on CFM) - ;; -}{ .mfi // cycle 1 - mov loc1=0 - nop.f 0 - mov loc2=0 -}{ .mib - mov loc3=0 - mov loc4=0 -(pRecurse) br.call.sptk.many b0=rse_clear_invalid - -}{ .mfi // cycle 2 - mov loc5=0 - nop.f 0 - cmp.ne pReturn,p0=r0,in1 // if recursion count != 0, we need to do a br.ret -}{ .mib - mov loc6=0 - mov loc7=0 -(pReturn) br.ret.sptk.many b0 -} -#else /* !CONFIG_ITANIUM */ - alloc loc0=ar.pfs,2,Nregs-2,2,0 - cmp.lt pRecurse,p0=Nregs*8,in0 // if more than Nregs regs left to clear, (re)curse - add out0=-Nregs*8,in0 - add out1=1,in1 // increment recursion count - mov loc1=0 - mov loc2=0 - ;; - mov loc3=0 - mov loc4=0 - mov loc5=0 - mov loc6=0 - mov loc7=0 -(pRecurse) br.call.dptk.few b0=rse_clear_invalid - ;; - mov loc8=0 - mov loc9=0 - cmp.ne pReturn,p0=r0,in1 // if recursion count != 0, we need to do a br.ret - mov loc10=0 - mov loc11=0 -(pReturn) br.ret.dptk.many b0 -#endif /* !CONFIG_ITANIUM */ -# undef pRecurse -# undef pReturn - ;; - alloc r17=ar.pfs,0,0,0,0 // drop current register frame - ;; - loadrs - ;; -skip_rbs_switch: - mov ar.unat=r25 // M2 -(pKStk) extr.u r22=r22,21,1 // I0 extract current value of psr.pp from r22 -(pLvSys)mov r19=r0 // A clear r19 for leave_syscall, no-op otherwise - ;; -(pUStk) mov ar.bspstore=r23 // M2 -(pKStk) dep r29=r22,r29,21,1 // I0 update ipsr.pp with psr.pp -(pLvSys)mov r16=r0 // A clear r16 for leave_syscall, no-op otherwise - ;; - mov cr.ipsr=r29 // M2 - mov ar.pfs=r26 // I0 -(pLvSys)mov r17=r0 // A clear r17 for leave_syscall, no-op otherwise - -(p9) mov cr.ifs=r30 // M2 - mov b0=r21 // I0 -(pLvSys)mov r18=r0 // A clear r18 for leave_syscall, no-op otherwise - - mov ar.fpsr=r20 // M2 - mov cr.iip=r28 // M2 - nop 0 - ;; -(pUStk) mov ar.rnat=r24 // M2 must happen with RSE in lazy mode - nop 0 -(pLvSys)mov r2=r0 - - mov ar.rsc=r27 // M2 - mov pr=r31,-1 // I0 - rfi // B - - /* - * On entry: - * r20 = ¤t->thread_info->pre_count (if CONFIG_PREEMPT) - * r31 = current->thread_info->flags - * On exit: - * p6 = TRUE if work-pending-check needs to be redone - */ -.work_pending_syscall: - add r2=-8,r2 - add r3=-8,r3 - ;; - st8 [r2]=r8 - st8 [r3]=r10 -.work_pending: - tbit.z p6,p0=r31,TIF_NEED_RESCHED // current_thread_info()->need_resched==0? 
-(p6) br.cond.sptk.few .notify -#ifdef CONFIG_PREEMPT -(pKStk) dep r21=-1,r0,PREEMPT_ACTIVE_BIT,1 - ;; -(pKStk) st4 [r20]=r21 - ssm psr.i // enable interrupts -#endif - br.call.spnt.many rp=schedule -.ret9: cmp.eq p6,p0=r0,r0 // p6 <- 1 - rsm psr.i // disable interrupts - ;; -#ifdef CONFIG_PREEMPT -(pKStk) adds r20=TI_PRE_COUNT+IA64_TASK_SIZE,r13 - ;; -(pKStk) st4 [r20]=r0 // preempt_count() <- 0 -#endif -(pLvSys)br.cond.sptk.few .work_pending_syscall_end - br.cond.sptk.many .work_processed_kernel // re-check - -.notify: -(pUStk) br.call.spnt.many rp=notify_resume_user -.ret10: cmp.ne p6,p0=r0,r0 // p6 <- 0 -(pLvSys)br.cond.sptk.few .work_pending_syscall_end - br.cond.sptk.many .work_processed_kernel // don't re-check - -.work_pending_syscall_end: - adds r2=PT(R8)+16,r12 - adds r3=PT(R10)+16,r12 - ;; - ld8 r8=[r2] - ld8 r10=[r3] - br.cond.sptk.many .work_processed_syscall // re-check - -END(ia64_leave_kernel) +END(ia64_ret_from_ia32_execve) +#endif /* CONFIG_IA32_SUPPORT */ ENTRY(handle_syscall_error) /* @@ -1234,7 +686,7 @@ ENTRY(sys_rt_sigreturn) adds sp=16,sp ;; ld8 r9=[sp] // load new ar.unat - mov.sptk b7=r8,ia64_leave_kernel + mov b7=r8 ;; mov ar.unat=r9 br.many b7 diff --git a/arch/ia64/kernel/switch_leave.S b/arch/ia64/kernel/switch_leave.S new file mode 100644 index 0000000..5ca5b84 --- /dev/null +++ b/arch/ia64/kernel/switch_leave.S @@ -0,0 +1,594 @@ +/* + * arch/ia64/kernel/switch_leave.S + * Kernel entry points. + * ia64_switch_to(), ia64_leave_syscall() and ia64_leave_kernel() + * split from arch/ia64/kernel/entry.S for paravirtualization + * + * Copyright (C) 1998-2003, 2005 Hewlett-Packard Co + * David Mosberger-Tang <davidm at hpl.hp.com> + * Copyright (C) 1999, 2002-2003 + * Asit Mallick <Asit.K.Mallick at intel.com> + * Don Dugger <Don.Dugger at intel.com> + * Suresh Siddha <suresh.b.siddha at intel.com> + * Fenghua Yu <fenghua.yu at intel.com> + * Copyright (C) 1999 VA Linux Systems + * Copyright (C) 1999 Walt Drummond <drummond at valinux.com> + */ +/* + * ia64_switch_to now places correct virtual mapping in in TR2 for + * kernel stack. This allows us to handle interrupts without changing + * to physical mode. + * + * Jonathan Nicklin <nicklin at missioncriticallinux.com> + * Patrick O'Rourke <orourke at missioncriticallinux.com> + * 11/07/2000 + */ +/* + * Copyright (c) 2008 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * pv_ops. + */ +/* + * Global (preserved) predicate usage on syscall entry/exit path: + * + * pKStk: See entry.h. + * pUStk: See entry.h. + * pSys: See entry.h. + * pNonSys: !pSys + */ + + +#include <asm/asmmacro.h> +#include <asm/kregs.h> +#include <asm/asm-offsets.h> +#include <asm/pgtable.h> +#include <asm/thread_info.h> + +#include "minstate.h" + + +/* + * prev_task <- ia64_switch_to(struct task_struct *next) + * With Ingo's new scheduler, interrupts are disabled when this routine gets + * called. The code starting at .map relies on this. The rest of the code + * doesn't care about the interrupt masking status. + */ +GLOBAL_ENTRY(ia64_switch_to) + .prologue + alloc r16=ar.pfs,1,0,0,0 + DO_SAVE_SWITCH_STACK + .body + + adds r22=IA64_TASK_THREAD_KSP_OFFSET,r13 + movl r25=init_task + mov r27=IA64_KR(CURRENT_STACK) + adds r21=IA64_TASK_THREAD_KSP_OFFSET,in0 + dep r20=0,in0,61,3 // physical address of "next" + ;; + st8 [r22]=sp // save kernel stack pointer of old task + shr.u r26=r20,IA64_GRANULE_SHIFT + cmp.eq p7,p6=r25,in0 + ;; + /* + * If we've already mapped this task's page, we can skip doing it again. 
+ */ +(p6) cmp.eq p7,p6=r26,r27 +(p6) br.cond.dpnt .map + ;; +.done: + ld8 sp=[r21] // load kernel stack pointer of new task + mov IA64_KR(CURRENT)=in0 // update "current" application register + mov r8=r13 // return pointer to previously running task + mov r13=in0 // set "current" pointer + ;; + DO_LOAD_SWITCH_STACK + +#ifdef CONFIG_SMP + sync.i // ensure "fc"s done by this CPU are visible on other CPUs +#endif + br.ret.sptk.many rp // boogie on out in new context + +.map: + rsm psr.ic // interrupts (psr.i) are already disabled here + movl r25=PAGE_KERNEL + ;; + srlz.d + or r23=r25,r20 // construct PA | page properties + mov r25=IA64_GRANULE_SHIFT<<2 + ;; + mov cr.itir=r25 + mov cr.ifa=in0 // VA of next task... + ;; + mov r25=IA64_TR_CURRENT_STACK + mov IA64_KR(CURRENT_STACK)=r26 // remember last page we mapped... + ;; + itr.d dtr[r25]=r23 // wire in new mapping... + ssm psr.ic // reenable the psr.ic bit + ;; + srlz.d + br.cond.sptk .done +END(ia64_switch_to) + +/* + * ia64_leave_syscall(): Same as ia64_leave_kernel, except that it doesn't + * need to switch to bank 0 and doesn't restore the scratch registers. + * To avoid leaking kernel bits, the scratch registers are set to + * the following known-to-be-safe values: + * + * r1: restored (global pointer) + * r2: cleared + * r3: 1 (when returning to user-level) + * r8-r11: restored (syscall return value(s)) + * r12: restored (user-level stack pointer) + * r13: restored (user-level thread pointer) + * r14: set to __kernel_syscall_via_epc + * r15: restored (syscall #) + * r16-r17: cleared + * r18: user-level b6 + * r19: cleared + * r20: user-level ar.fpsr + * r21: user-level b0 + * r22: cleared + * r23: user-level ar.bspstore + * r24: user-level ar.rnat + * r25: user-level ar.unat + * r26: user-level ar.pfs + * r27: user-level ar.rsc + * r28: user-level ip + * r29: user-level psr + * r30: user-level cfm + * r31: user-level pr + * f6-f11: cleared + * pr: restored (user-level pr) + * b0: restored (user-level rp) + * b6: restored + * b7: set to __kernel_syscall_via_epc + * ar.unat: restored (user-level ar.unat) + * ar.pfs: restored (user-level ar.pfs) + * ar.rsc: restored (user-level ar.rsc) + * ar.rnat: restored (user-level ar.rnat) + * ar.bspstore: restored (user-level ar.bspstore) + * ar.fpsr: restored (user-level ar.fpsr) + * ar.ccv: cleared + * ar.csd: cleared + * ar.ssd: cleared + */ +ENTRY(ia64_leave_syscall) + PT_REGS_UNWIND_INFO(0) + /* + * work.need_resched etc. mustn't get changed by this CPU before it returns to + * user- or fsys-mode, hence we disable interrupts early on. + * + * p6 controls whether current_thread_info()->flags needs to be check for + * extra work. We always check for extra work when returning to user-level. + * With CONFIG_PREEMPT, we also check for extra work when the preempt_count + * is 0. After extra work processing has been completed, execution + * resumes at .work_processed_syscall with p6 set to 1 if the extra-work-check + * needs to be redone. 
+ */ +#ifdef CONFIG_PREEMPT + rsm psr.i // disable interrupts + cmp.eq pLvSys,p0=r0,r0 // pLvSys=1: leave from syscall +(pKStk) adds r20=TI_PRE_COUNT+IA64_TASK_SIZE,r13 + ;; + .pred.rel.mutex pUStk,pKStk +(pKStk) ld4 r21=[r20] // r21 <- preempt_count +(pUStk) mov r21=0 // r21 <- 0 + ;; + cmp.eq p6,p0=r21,r0 // p6 <- pUStk || (preempt_count == 0) +#else /* !CONFIG_PREEMPT */ +(pUStk) rsm psr.i + cmp.eq pLvSys,p0=r0,r0 // pLvSys=1: leave from syscall +(pUStk) cmp.eq.unc p6,p0=r0,r0 // p6 <- pUStk +#endif +.work_processed_syscall: + adds r2=PT(LOADRS)+16,r12 + adds r3=PT(AR_BSPSTORE)+16,r12 + adds r18=TI_FLAGS+IA64_TASK_SIZE,r13 + ;; +(p6) ld4 r31=[r18] // load current_thread_info()->flags + ld8 r19=[r2],PT(B6)-PT(LOADRS) // load ar.rsc value for "loadrs" + nop.i 0 + ;; + mov r16=ar.bsp // M2 get existing backing store pointer + ld8 r18=[r2],PT(R9)-PT(B6) // load b6 +(p6) and r15=TIF_WORK_MASK,r31 // any work other than TIF_SYSCALL_TRACE? + ;; + ld8 r23=[r3],PT(R11)-PT(AR_BSPSTORE) // load ar.bspstore (may be garbage) +(p6) cmp4.ne.unc p6,p0=r15, r0 // any special work pending? +(p6) br.cond.spnt .work_pending_syscall + ;; + // start restoring the state saved on the kernel stack (struct pt_regs): + ld8 r9=[r2],PT(CR_IPSR)-PT(R9) + ld8 r11=[r3],PT(CR_IIP)-PT(R11) +(pNonSys) break 0 // bug check: we shouldn't be here if pNonSys is TRUE! + ;; + invala // M0|1 invalidate ALAT + rsm psr.i | psr.ic // M2 turn off interrupts and interruption collection + cmp.eq p9,p0=r0,r0 // A set p9 to indicate that we should restore cr.ifs + + ld8 r29=[r2],16 // M0|1 load cr.ipsr + ld8 r28=[r3],16 // M0|1 load cr.iip + mov r22=r0 // A clear r22 + ;; + ld8 r30=[r2],16 // M0|1 load cr.ifs + ld8 r25=[r3],16 // M0|1 load ar.unat +(pUStk) add r14=IA64_TASK_THREAD_ON_USTACK_OFFSET,r13 + ;; + ld8 r26=[r2],PT(B0)-PT(AR_PFS) // M0|1 load ar.pfs +(pKStk) mov r22=psr // M2 read PSR now that interrupts are disabled + nop 0 + ;; + ld8 r21=[r2],PT(AR_RNAT)-PT(B0) // M0|1 load b0 + ld8 r27=[r3],PT(PR)-PT(AR_RSC) // M0|1 load ar.rsc + mov f6=f0 // F clear f6 + ;; + ld8 r24=[r2],PT(AR_FPSR)-PT(AR_RNAT) // M0|1 load ar.rnat (may be garbage) + ld8 r31=[r3],PT(R1)-PT(PR) // M0|1 load predicates + mov f7=f0 // F clear f7 + ;; + ld8 r20=[r2],PT(R12)-PT(AR_FPSR) // M0|1 load ar.fpsr + ld8.fill r1=[r3],16 // M0|1 load r1 +(pUStk) mov r17=1 // A + ;; +(pUStk) st1 [r14]=r17 // M2|3 + ld8.fill r13=[r3],16 // M0|1 + mov f8=f0 // F clear f8 + ;; + ld8.fill r12=[r2] // M0|1 restore r12 (sp) + ld8.fill r15=[r3] // M0|1 restore r15 + mov b6=r18 // I0 restore b6 + + LOAD_PHYS_STACK_REG_SIZE(r17) + mov f9=f0 // F clear f9 +(pKStk) br.cond.dpnt.many skip_rbs_switch // B + + srlz.d // M0 ensure interruption collection is off (for cover) + shr.u r18=r19,16 // I0|1 get byte size of existing "dirty" partition + cover // B add current frame into dirty partition & set cr.ifs + ;; + mov r19=ar.bsp // M2 get new backing store pointer + mov f10=f0 // F clear f10 + + nop.m 0 + movl r14=__kernel_syscall_via_epc // X + ;; + mov.m ar.csd=r0 // M2 clear ar.csd + mov.m ar.ccv=r0 // M2 clear ar.ccv + mov b7=r14 // I0 clear b7 (hint with __kernel_syscall_via_epc) + + mov.m ar.ssd=r0 // M2 clear ar.ssd + mov f11=f0 // F clear f11 + br.cond.sptk.many rbs_switch // B +END(ia64_leave_syscall) + +GLOBAL_ENTRY(ia64_leave_kernel) + PT_REGS_UNWIND_INFO(0) + /* + * work.need_resched etc. mustn't get changed by this CPU before it returns to + * user- or fsys-mode, hence we disable interrupts early on. 
+ * + * p6 controls whether current_thread_info()->flags needs to be check for + * extra work. We always check for extra work when returning to user-level. + * With CONFIG_PREEMPT, we also check for extra work when the preempt_count + * is 0. After extra work processing has been completed, execution + * resumes at .work_processed_syscall with p6 set to 1 if the extra-work-check + * needs to be redone. + */ +#ifdef CONFIG_PREEMPT + rsm psr.i // disable interrupts + cmp.eq p0,pLvSys=r0,r0 // pLvSys=0: leave from kernel +(pKStk) adds r20=TI_PRE_COUNT+IA64_TASK_SIZE,r13 + ;; + .pred.rel.mutex pUStk,pKStk +(pKStk) ld4 r21=[r20] // r21 <- preempt_count +(pUStk) mov r21=0 // r21 <- 0 + ;; + cmp.eq p6,p0=r21,r0 // p6 <- pUStk || (preempt_count == 0) +#else +(pUStk) rsm psr.i + cmp.eq p0,pLvSys=r0,r0 // pLvSys=0: leave from kernel +(pUStk) cmp.eq.unc p6,p0=r0,r0 // p6 <- pUStk +#endif +.work_processed_kernel: + adds r17=TI_FLAGS+IA64_TASK_SIZE,r13 + ;; +(p6) ld4 r31=[r17] // load current_thread_info()->flags + adds r21=PT(PR)+16,r12 + ;; + + lfetch [r21],PT(CR_IPSR)-PT(PR) + adds r2=PT(B6)+16,r12 + adds r3=PT(R16)+16,r12 + ;; + lfetch [r21] + ld8 r28=[r2],8 // load b6 + adds r29=PT(R24)+16,r12 + + ld8.fill r16=[r3],PT(AR_CSD)-PT(R16) + adds r30=PT(AR_CCV)+16,r12 +(p6) and r19=TIF_WORK_MASK,r31 // any work other than TIF_SYSCALL_TRACE? + ;; + ld8.fill r24=[r29] + ld8 r15=[r30] // load ar.ccv +(p6) cmp4.ne.unc p6,p0=r19, r0 // any special work pending? + ;; + ld8 r29=[r2],16 // load b7 + ld8 r30=[r3],16 // load ar.csd +(p6) br.cond.spnt .work_pending + ;; + ld8 r31=[r2],16 // load ar.ssd + ld8.fill r8=[r3],16 + ;; + ld8.fill r9=[r2],16 + ld8.fill r10=[r3],PT(R17)-PT(R10) + ;; + ld8.fill r11=[r2],PT(R18)-PT(R11) + ld8.fill r17=[r3],16 + ;; + ld8.fill r18=[r2],16 + ld8.fill r19=[r3],16 + ;; + ld8.fill r20=[r2],16 + ld8.fill r21=[r3],16 + mov ar.csd=r30 + mov ar.ssd=r31 + ;; + rsm psr.i | psr.ic // initiate turning off of interrupt and interruption collection + invala // invalidate ALAT + ;; + ld8.fill r22=[r2],24 + ld8.fill r23=[r3],24 + mov b6=r28 + ;; + ld8.fill r25=[r2],16 + ld8.fill r26=[r3],16 + mov b7=r29 + ;; + ld8.fill r27=[r2],16 + ld8.fill r28=[r3],16 + ;; + ld8.fill r29=[r2],16 + ld8.fill r30=[r3],24 + ;; + ld8.fill r31=[r2],PT(F9)-PT(R31) + adds r3=PT(F10)-PT(F6),r3 + ;; + ldf.fill f9=[r2],PT(F6)-PT(F9) + ldf.fill f10=[r3],PT(F8)-PT(F10) + ;; + ldf.fill f6=[r2],PT(F7)-PT(F6) + ;; + ldf.fill f7=[r2],PT(F11)-PT(F7) + ldf.fill f8=[r3],32 + ;; + srlz.d // ensure that inter. collection is off (VHPT is don't care, since text is pinned) + mov ar.ccv=r15 + ;; + ldf.fill f11=[r2] + bsw.0 // switch back to bank 0 (no stop bit required beforehand...) 
+ ;; +(pUStk) mov r18=IA64_KR(CURRENT)// M2 (12 cycle read latency) + adds r16=PT(CR_IPSR)+16,r12 + adds r17=PT(CR_IIP)+16,r12 + +(pKStk) mov r22=psr // M2 read PSR now that interrupts are disabled + nop.i 0 + nop.i 0 + ;; + ld8 r29=[r16],16 // load cr.ipsr + ld8 r28=[r17],16 // load cr.iip + ;; + ld8 r30=[r16],16 // load cr.ifs + ld8 r25=[r17],16 // load ar.unat + ;; + ld8 r26=[r16],16 // load ar.pfs + ld8 r27=[r17],16 // load ar.rsc + cmp.eq p9,p0=r0,r0 // set p9 to indicate that we should restore cr.ifs + ;; + ld8 r24=[r16],16 // load ar.rnat (may be garbage) + ld8 r23=[r17],16 // load ar.bspstore (may be garbage) + ;; + ld8 r31=[r16],16 // load predicates + ld8 r21=[r17],16 // load b0 + ;; + ld8 r19=[r16],16 // load ar.rsc value for "loadrs" + ld8.fill r1=[r17],16 // load r1 + ;; + ld8.fill r12=[r16],16 + ld8.fill r13=[r17],16 +(pUStk) adds r18=IA64_TASK_THREAD_ON_USTACK_OFFSET,r18 + ;; + ld8 r20=[r16],16 // ar.fpsr + ld8.fill r15=[r17],16 + ;; + ld8.fill r14=[r16],16 + ld8.fill r2=[r17] +(pUStk) mov r17=1 + ;; + ld8.fill r3=[r16] +(pUStk) st1 [r18]=r17 // restore current->thread.on_ustack + shr.u r18=r19,16 // get byte size of existing "dirty" partition + ;; + mov r16=ar.bsp // get existing backing store pointer + LOAD_PHYS_STACK_REG_SIZE(r17) +(pKStk) br.cond.dpnt skip_rbs_switch + + /* + * Restore user backing store. + * + * NOTE: alloc, loadrs, and cover can't be predicated. + */ +(pNonSys) br.cond.dpnt dont_preserve_current_frame + cover // add current frame into dirty partition and set cr.ifs + ;; + mov r19=ar.bsp // get new backing store pointer +rbs_switch: + sub r16=r16,r18 // krbs = old bsp - size of dirty partition + cmp.ne p9,p0=r0,r0 // clear p9 to skip restore of cr.ifs + ;; + sub r19=r19,r16 // calculate total byte size of dirty partition + add r18=64,r18 // don't force in0-in7 into memory... + ;; + shl r19=r19,16 // shift size of dirty partition into loadrs position + ;; +dont_preserve_current_frame: + /* + * To prevent leaking bits between the kernel and user-space, + * we must clear the stacked registers in the "invalid" partition here. + * Not pretty, but at least it's fast (3.34 registers/cycle on Itanium, + * 5 registers/cycle on McKinley). 
+ */ +# define pRecurse p6 +# define pReturn p7 +#ifdef CONFIG_ITANIUM +# define Nregs 10 +#else +# define Nregs 14 +#endif + alloc loc0=ar.pfs,2,Nregs-2,2,0 + shr.u loc1=r18,9 // RNaTslots <= floor(dirtySize / (64*8)) + sub r17=r17,r18 // r17 = (physStackedSize + 8) - dirtySize + ;; + mov ar.rsc=r19 // load ar.rsc to be used for "loadrs" + shladd in0=loc1,3,r17 + mov in1=0 + ;; + TEXT_ALIGN(32) +rse_clear_invalid: +#ifdef CONFIG_ITANIUM + // cycle 0 + { .mii + alloc loc0=ar.pfs,2,Nregs-2,2,0 + cmp.lt pRecurse,p0=Nregs*8,in0 // if more than Nregs regs left to clear, (re)curse + add out0=-Nregs*8,in0 +}{ .mfb + add out1=1,in1 // increment recursion count + nop.f 0 + nop.b 0 // can't do br.call here because of alloc (WAW on CFM) + ;; +}{ .mfi // cycle 1 + mov loc1=0 + nop.f 0 + mov loc2=0 +}{ .mib + mov loc3=0 + mov loc4=0 +(pRecurse) br.call.sptk.many b0=rse_clear_invalid + +}{ .mfi // cycle 2 + mov loc5=0 + nop.f 0 + cmp.ne pReturn,p0=r0,in1 // if recursion count != 0, we need to do a br.ret +}{ .mib + mov loc6=0 + mov loc7=0 +(pReturn) br.ret.sptk.many b0 +} +#else /* !CONFIG_ITANIUM */ + alloc loc0=ar.pfs,2,Nregs-2,2,0 + cmp.lt pRecurse,p0=Nregs*8,in0 // if more than Nregs regs left to clear, (re)curse + add out0=-Nregs*8,in0 + add out1=1,in1 // increment recursion count + mov loc1=0 + mov loc2=0 + ;; + mov loc3=0 + mov loc4=0 + mov loc5=0 + mov loc6=0 + mov loc7=0 +(pRecurse) br.call.dptk.few b0=rse_clear_invalid + ;; + mov loc8=0 + mov loc9=0 + cmp.ne pReturn,p0=r0,in1 // if recursion count != 0, we need to do a br.ret + mov loc10=0 + mov loc11=0 +(pReturn) br.ret.dptk.many b0 +#endif /* !CONFIG_ITANIUM */ +# undef pRecurse +# undef pReturn + ;; + alloc r17=ar.pfs,0,0,0,0 // drop current register frame + ;; + loadrs + ;; +skip_rbs_switch: + mov ar.unat=r25 // M2 +(pKStk) extr.u r22=r22,21,1 // I0 extract current value of psr.pp from r22 +(pLvSys)mov r19=r0 // A clear r19 for leave_syscall, no-op otherwise + ;; +(pUStk) mov ar.bspstore=r23 // M2 +(pKStk) dep r29=r22,r29,21,1 // I0 update ipsr.pp with psr.pp +(pLvSys)mov r16=r0 // A clear r16 for leave_syscall, no-op otherwise + ;; + mov cr.ipsr=r29 // M2 + mov ar.pfs=r26 // I0 +(pLvSys)mov r17=r0 // A clear r17 for leave_syscall, no-op otherwise + +(p9) mov cr.ifs=r30 // M2 + mov b0=r21 // I0 +(pLvSys)mov r18=r0 // A clear r18 for leave_syscall, no-op otherwise + + mov ar.fpsr=r20 // M2 + mov cr.iip=r28 // M2 + nop 0 + ;; +(pUStk) mov ar.rnat=r24 // M2 must happen with RSE in lazy mode + nop 0 +(pLvSys)mov r2=r0 + + mov ar.rsc=r27 // M2 + mov pr=r31,-1 // I0 + rfi // B + + /* + * On entry: + * r20 = ¤t->thread_info->pre_count (if CONFIG_PREEMPT) + * r31 = current->thread_info->flags + * On exit: + * p6 = TRUE if work-pending-check needs to be redone + */ +.work_pending_syscall: + add r2=-8,r2 + add r3=-8,r3 + ;; + st8 [r2]=r8 + st8 [r3]=r10 +.work_pending: + tbit.z p6,p0=r31,TIF_NEED_RESCHED // current_thread_info()->need_resched==0? 
+(p6) br.cond.sptk.few .notify +#ifdef CONFIG_PREEMPT +(pKStk) dep r21=-1,r0,PREEMPT_ACTIVE_BIT,1 + ;; +(pKStk) st4 [r20]=r21 + ssm psr.i // enable interrupts +#endif + br.call.spnt.many rp=schedule +.ret9: cmp.eq p6,p0=r0,r0 // p6 <- 1 + rsm psr.i // disable interrupts + ;; +#ifdef CONFIG_PREEMPT +(pKStk) adds r20=TI_PRE_COUNT+IA64_TASK_SIZE,r13 + ;; +(pKStk) st4 [r20]=r0 // preempt_count() <- 0 +#endif +(pLvSys)br.cond.sptk.few ia64_work_pending_syscall_end + br.cond.sptk.many .work_processed_kernel // re-check + +.notify: +(pUStk) br.call.spnt.many rp=notify_resume_user +.ret10: cmp.ne p6,p0=r0,r0 // p6 <- 0 +(pLvSys)br.cond.sptk.few ia64_work_pending_syscall_end + br.cond.sptk.many .work_processed_kernel // don't re-check + +.global ia64_work_pending_syscall_end; +ia64_work_pending_syscall_end: + adds r2=PT(R8)+16,r12 + adds r3=PT(R10)+16,r12 + ;; + ld8 r8=[r2] + ld8 r10=[r3] + br.cond.sptk.many .work_processed_syscall // re-check +END(ia64_leave_kernel) -- 1.5.3
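The one behavioral subtlety in this move: entry.S can no longer fall through into the leave paths, so each former fall-through becomes an explicit branch, and local labels such as .work_pending_syscall_end become global (ia64_work_pending_syscall_end) to stay reachable across files. A minimal sketch of the pattern, not the literal diff:

	// before the split: ia64_ret_from_syscall simply fell
	// through into ia64_leave_syscall in the same file
	END(ia64_ret_from_syscall)

	// after the split: the fall-through becomes an explicit branch,
	// since ia64_leave_syscall now lives in switch_leave.S and may
	// later be #define'd to a paravirtualized entry point
	br.cond.sptk.few ia64_leave_syscall
	END(ia64_ret_from_syscall)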
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 15/50] ia64/pv_ops: preparation for paravirtualization of switch_leave.S and ivt.S
Make some symbols global and add some hooks. Define __IA64_ASM_PARAVIRTUALIZED_NATIVE to mark the native compilation of ivt.S and switch_leave.S. Replace COVER with __COVER to avoid a name conflict. Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/Makefile | 6 ++++++ arch/ia64/kernel/entry.S | 4 ++-- arch/ia64/kernel/minstate.h | 6 ++++-- arch/ia64/kernel/switch_leave.S | 19 ++++++++++--------- include/asm-ia64/privop.h | 26 ++++++++++++++++++++++++++ 5 files changed, 48 insertions(+), 13 deletions(-) create mode 100644 include/asm-ia64/privop.h diff --git a/arch/ia64/kernel/Makefile b/arch/ia64/kernel/Makefile index f9bc3c4..9281bf6 100644 --- a/arch/ia64/kernel/Makefile +++ b/arch/ia64/kernel/Makefile @@ -70,3 +70,9 @@ $(obj)/gate-syms.o: $(obj)/gate.lds $(obj)/gate.o FORCE # We must build gate.so before we can assemble it. # Note: kbuild does not track this dependency due to usage of .incbin $(obj)/gate-data.o: $(obj)/gate.so + +# +# native ivt.S and switch_leave.S +# +AFLAGS_ivt.o += -D__IA64_ASM_PARAVIRTUALIZED_NATIVE +AFLAGS_switch_leave.o += -D__IA64_ASM_PARAVIRTUALIZED_NATIVE diff --git a/arch/ia64/kernel/entry.S b/arch/ia64/kernel/entry.S index df8dcc9..de91f61 100644 --- a/arch/ia64/kernel/entry.S +++ b/arch/ia64/kernel/entry.S @@ -304,7 +304,7 @@ END(save_switch_stack) * - b7 holds address to return to * - must not touch r8-r11 */ -ENTRY(load_switch_stack) +GLOBAL_ENTRY(load_switch_stack) .prologue .altrp b7 @@ -624,7 +624,7 @@ END(ia64_invoke_schedule_tail) * be set up by the caller. We declare 8 input registers so the system call * args get preserved, in case we need to restart a system call. */ -ENTRY(notify_resume_user) +GLOBAL_ENTRY(notify_resume_user) .prologue ASM_UNW_PRLG_RP|ASM_UNW_PRLG_PFS, ASM_UNW_PRLG_GRSAVE(8) alloc loc1=ar.pfs,8,2,3,0 // preserve all eight input regs in case of syscall restart! mov r9=ar.unat diff --git a/arch/ia64/kernel/minstate.h b/arch/ia64/kernel/minstate.h index c9ac8ba..fc99141 100644 --- a/arch/ia64/kernel/minstate.h +++ b/arch/ia64/kernel/minstate.h @@ -3,6 +3,7 @@ #include "entry.h" +#ifdef __IA64_ASM_PARAVIRTUALIZED_NATIVE /* * DO_SAVE_MIN switches to the kernel stacks (if necessary) and saves * the minimum state necessary that allows us to turn psr.ic back @@ -28,7 +29,7 @@ * Note that psr.ic is NOT turned on by this macro. This is so that * we can pass interruption state as arguments to a handler. */ -#define DO_SAVE_MIN(COVER,SAVE_IFS,EXTRA) \ +#define DO_SAVE_MIN(__COVER,SAVE_IFS,EXTRA) \ mov r16=IA64_KR(CURRENT); /* M */ \ mov r27=ar.rsc; /* M */ \ mov r20=r1; /* A */ \ @@ -37,7 +38,7 @@ mov r26=ar.pfs; /* I */ \ mov r28=cr.iip; /* M */ \ mov r21=ar.fpsr; /* M */ \ - COVER; /* B;; (or nothing) */ \ + __COVER; /* B;; (or nothing) */ \ ;; \ adds r16=IA64_TASK_THREAD_ON_USTACK_OFFSET,r16; \ ;; \ @@ -129,6 +130,7 @@ ;; \ bsw.1; /* switch back to bank 1 (must be last in insn group) */ \ ;; +#endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */ /* * SAVE_REST saves the remainder of pt_regs (with psr.ic on). diff --git a/arch/ia64/kernel/switch_leave.S b/arch/ia64/kernel/switch_leave.S index 5ca5b84..9918160 100644 --- a/arch/ia64/kernel/switch_leave.S +++ b/arch/ia64/kernel/switch_leave.S @@ -53,7 +53,7 @@ * called. The code starting at .map relies on this. The rest of the code * doesn't care about the interrupt masking status.
*/ -GLOBAL_ENTRY(ia64_switch_to) +GLOBAL_ENTRY(native_switch_to) .prologue alloc r16=ar.pfs,1,0,0,0 DO_SAVE_SWITCH_STACK @@ -107,7 +107,7 @@ GLOBAL_ENTRY(ia64_switch_to) ;; srlz.d br.cond.sptk .done -END(ia64_switch_to) +END(native_switch_to) /* * ia64_leave_syscall(): Same as ia64_leave_kernel, except that it doesn't @@ -153,7 +153,7 @@ END(ia64_switch_to) * ar.csd: cleared * ar.ssd: cleared */ -ENTRY(ia64_leave_syscall) +GLOBAL_ENTRY(native_leave_syscall) PT_REGS_UNWIND_INFO(0) /* * work.need_resched etc. mustn't get changed by this CPU before it returns to @@ -163,7 +163,7 @@ ENTRY(ia64_leave_syscall) * extra work. We always check for extra work when returning to user-level. * With CONFIG_PREEMPT, we also check for extra work when the preempt_count * is 0. After extra work processing has been completed, execution - * resumes at .work_processed_syscall with p6 set to 1 if the extra-work-check + * resumes at ia64_work_processed_syscall with p6 set to 1 if the extra-work-check * needs to be redone. */ #ifdef CONFIG_PREEMPT @@ -181,7 +181,8 @@ ENTRY(ia64_leave_syscall) cmp.eq pLvSys,p0=r0,r0 // pLvSys=1: leave from syscall (pUStk) cmp.eq.unc p6,p0=r0,r0 // p6 <- pUStk #endif -.work_processed_syscall: +.global native_work_processed_syscall; +native_work_processed_syscall: adds r2=PT(LOADRS)+16,r12 adds r3=PT(AR_BSPSTORE)+16,r12 adds r18=TI_FLAGS+IA64_TASK_SIZE,r13 @@ -260,9 +261,9 @@ ENTRY(ia64_leave_syscall) mov.m ar.ssd=r0 // M2 clear ar.ssd mov f11=f0 // F clear f11 br.cond.sptk.many rbs_switch // B -END(ia64_leave_syscall) +END(native_leave_syscall) -GLOBAL_ENTRY(ia64_leave_kernel) +GLOBAL_ENTRY(native_leave_kernel) PT_REGS_UNWIND_INFO(0) /* * work.need_resched etc. mustn't get changed by this CPU before it returns to @@ -590,5 +591,5 @@ ia64_work_pending_syscall_end: ;; ld8 r8=[r2] ld8 r10=[r3] - br.cond.sptk.many .work_processed_syscall // re-check -END(ia64_leave_kernel) + br.cond.sptk.many ia64_work_processed_syscall // re-check +END(native_leave_kernel) diff --git a/include/asm-ia64/privop.h b/include/asm-ia64/privop.h new file mode 100644 index 0000000..11d26f7 --- /dev/null +++ b/include/asm-ia64/privop.h @@ -0,0 +1,26 @@ +#ifndef _ASM_IA64_PRIVOP_H +#define _ASM_IA64_PRIVOP_H + +#ifndef _ASM_IA64_INTRINSICS_H +#error "don't include privop.h directly. instead include intrinsics.h" +#endif +/* + * Copyright (C) 2005 Hewlett-Packard Co + * Dan Magenheimer <dan.magenheimer at hp.com> + * + */ + +#ifdef CONFIG_XEN +#include <asm/xen/privop.h> +#endif + +/* fallback for native case */ + +#ifndef IA64_PARAVIRTUALIZED_ENTRY +#define ia64_switch_to native_switch_to +#define ia64_leave_syscall native_leave_syscall +#define ia64_work_processed_syscall native_work_processed_syscall +#define ia64_leave_kernel native_leave_kernel +#endif /* !IA64_PARAVIRTUALIZED_ENTRY */ + +#endif /* _ASM_IA64_PRIVOP_H */ -- 1.5.3
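The privop.h fallback above is one half of a two-sided contract: generic code keeps using the ia64_* names, and a hypervisor port claims them before the native fallback applies. A hedged sketch of the other half, assuming an asm/xen/privop.h along the lines this series introduces later (the xen_* targets are illustrative here, not part of this patch):

	/* hypothetical asm/xen/privop.h fragment */
	#define IA64_PARAVIRTUALIZED_ENTRY
	#define ia64_switch_to			xen_switch_to
	#define ia64_leave_syscall		xen_leave_syscall
	#define ia64_work_processed_syscall	xen_work_processed_syscall
	#define ia64_leave_kernel		xen_leave_kernel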
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 16/50] ia64/pv_ops: hook pal_call_static() for paravirtualization.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/pal.S | 5 +++-- include/asm-ia64/privop.h | 1 + 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/ia64/kernel/pal.S b/arch/ia64/kernel/pal.S index 0b53344..d52fd70 100644 --- a/arch/ia64/kernel/pal.S +++ b/arch/ia64/kernel/pal.S @@ -16,6 +16,7 @@ #include <asm/processor.h> .data + .globl pal_entry_point pal_entry_point: data8 ia64_pal_default_handler .text @@ -52,7 +53,7 @@ END(ia64_pal_default_handler) * in0 Index of PAL service * in1 - in3 Remaining PAL arguments */ -GLOBAL_ENTRY(ia64_pal_call_static) +GLOBAL_ENTRY(native_pal_call_static) .prologue ASM_UNW_PRLG_RP|ASM_UNW_PRLG_PFS, ASM_UNW_PRLG_GRSAVE(4) alloc loc1 = ar.pfs,4,5,0,0 movl loc2 = pal_entry_point @@ -86,7 +87,7 @@ GLOBAL_ENTRY(ia64_pal_call_static) ;; srlz.d // seralize restoration of psr.l br.ret.sptk.many b0 -END(ia64_pal_call_static) +END(native_pal_call_static) /* * Make a PAL call using the stacked registers calling convention. diff --git a/include/asm-ia64/privop.h b/include/asm-ia64/privop.h index 11d26f7..7b9de4f 100644 --- a/include/asm-ia64/privop.h +++ b/include/asm-ia64/privop.h @@ -21,6 +21,7 @@ #define ia64_leave_syscall native_leave_syscall #define ia64_work_processed_syscall native_work_processed_syscall #define ia64_leave_kernel native_leave_kernel +#define ia64_pal_call_static native_pal_call_static #endif /* !IA64_PARAVIRTUALIZED_ENTRY */ #endif /* _ASM_IA64_PRIVOP_H */ -- 1.5.3
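With the rename, a transparently paravirtualized wrapper can take over the ia64_pal_call_static name and still reach the native path at run time. A sketch following the branch-if-native pattern quoted in the next patch's Kconfig help; xen_pal_call_static and the running_on_xen flag are assumptions here, not part of this patch:

	GLOBAL_ENTRY(xen_pal_call_static)
		movl r14=running_on_xen		// address of boot-time flag
		;;
		ld4 r14=[r14]			// 0 when on bare metal
		;;
		cmp.eq p7,p0=r14,r0		// p7 <- running natively
	(p7)	br.cond.sptk.many native_pal_call_static
		;;
		// hypervisor-specific PAL call handling would follow here
	END(xen_pal_call_static)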
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 17/50] ia64/pv_ops: introduce basic facilities for binary patching.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/Kconfig | 72 +++++++++++++ arch/ia64/kernel/Makefile | 5 + arch/ia64/kernel/paravirt_alt.c | 118 ++++++++++++++++++++++ arch/ia64/kernel/paravirt_core.c | 201 +++++++++++++++++++++++++++++++++++++ arch/ia64/kernel/paravirt_entry.c | 99 ++++++++++++++++++ arch/ia64/kernel/paravirt_nop.c | 49 +++++++++ arch/ia64/kernel/vmlinux.lds.S | 35 +++++++ include/asm-ia64/module.h | 6 + include/asm-ia64/paravirt_alt.h | 82 +++++++++++++++ include/asm-ia64/paravirt_core.h | 54 ++++++++++ include/asm-ia64/paravirt_entry.h | 62 +++++++++++ include/asm-ia64/paravirt_nop.h | 46 +++++++++ 12 files changed, 829 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/kernel/paravirt_alt.c create mode 100644 arch/ia64/kernel/paravirt_core.c create mode 100644 arch/ia64/kernel/paravirt_entry.c create mode 100644 arch/ia64/kernel/paravirt_nop.c create mode 100644 include/asm-ia64/paravirt_alt.h create mode 100644 include/asm-ia64/paravirt_core.h create mode 100644 include/asm-ia64/paravirt_entry.h create mode 100644 include/asm-ia64/paravirt_nop.h diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig index 8fa3faf..e7302ee 100644 --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@ -111,6 +111,78 @@ config AUDIT_ARCH bool default y +menuconfig PARAVIRT_GUEST + bool "Paravirtualized guest support" + help + Say Y here to get to see options related to running Linux under + various hypervisors. This option alone does not add any kernel code. + + If you say N, all options in this submenu will be skipped and disabled. + +if PARAVIRT_GUEST + +config PARAVIRT + bool + default y + help + This changes the kernel so it can modify itself when it is run + under a hypervisor, potentially improving performance significantly + over full virtualization. However, when run without a hypervisor + the kernel is theoretically slower and slightly larger. + +config PARAVIRT_ALT + bool "paravirt_alt binary patching infrastructure" + depends on PARAVIRT + default y + help + The binary patching infratstructure to replace some privileged + instructions with hypervisor specific instrutions. + There are several sensitive(i.e. non-virtualizable) instructions and + performance critical privileged instructions which Xen + paravirtualize as hyperprivops. + For transparent paravirtualization (i.e. single binary should run + on both baremetal and xen environment), xenLinux/IA64 needs + something like "if (is_running_on_xen()) {} else {}" where + is_running_on_xen() is determined at boot time. + This configuration tries to eliminate the overheads for hyperprivops + by annotating such instructions and replacing them with hyperprivops + at boot time. + +config PARAVIRT_ENTRY + bool "paravirt entry" + depends on PARAVIRT + default y + help + The entry point hooking infrastructure to change the execution path + at the boot time. + There are several paravirtualized paths in hand coded assembly code + which isn't binary patched easily by the paravirt_alt infrastructure. + E.g. ia64_switch_to, ia64_leave_syscall, ia64_leave_kernel and + ia64_pal_call_static. + For those hand written assembly code, change the execution path + by hooking them and jumping to hand paravirtualized code. + +config PARAVIRT_NOP_B_PATCH + bool "paravirt branch if native" + depends on PARAVIRT + default y + help + paravirt branch if native + There are several paravirtualized paths in hand coded assembly code. 
+ For transparent paravirtualization, there are codes like + GLOBAL_ENTRY(xen_xxx) + 'movl reg=running_on_xen;;' + 'ld4 reg=[reg];;' + 'cmp.e1 pred,p0=reg,r0' + '(pred) br.cond.sptk.many <native_xxx>;;' + To reduce overhead when running on bare metal, just + "br.cond.sptk.many <native_xxx>" and replace it with 'nop.b 0' + when running on xen. + +#source "arch/ia64/xen/Kconfig" + +endif + choice prompt "System type" default IA64_GENERIC diff --git a/arch/ia64/kernel/Makefile b/arch/ia64/kernel/Makefile index 9281bf6..185e0e2 100644 --- a/arch/ia64/kernel/Makefile +++ b/arch/ia64/kernel/Makefile @@ -36,6 +36,11 @@ obj-$(CONFIG_PCI_MSI) += msi_ia64.o mca_recovery-y += mca_drv.o mca_drv_asm.o obj-$(CONFIG_IA64_MC_ERR_INJECT)+= err_inject.o +obj-$(CONFIG_PARAVIRT) += paravirt_core.o +obj-$(CONFIG_PARAVIRT_ALT) += paravirt_alt.o +obj-$(CONFIG_PARAVIRT_ENTRY) += paravirt_entry.o paravirtentry.o +obj-$(CONFIG_PARAVIRT_NOP_B_PATCH) += paravirt_nop.o + obj-$(CONFIG_IA64_ESI) += esi.o ifneq ($(CONFIG_IA64_ESI),) obj-y += esi_stub.o # must be in kernel proper diff --git a/arch/ia64/kernel/paravirt_alt.c b/arch/ia64/kernel/paravirt_alt.c new file mode 100644 index 0000000..d0a34a7 --- /dev/null +++ b/arch/ia64/kernel/paravirt_alt.c @@ -0,0 +1,118 @@ +/****************************************************************************** + * linux/arch/ia64/xen/paravirt_alt.c + * + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include <asm/paravirt_core.h> + +extern const char nop_bundle[]; +extern const unsigned long nop_bundle_size; + +static void __init_or_module +fill_nop(void *sbundle, void *ebundle) +{ + void *bundle = sbundle; + BUG_ON((((unsigned long)sbundle) % sizeof(bundle_t)) != 0); + BUG_ON((((unsigned long)ebundle) % sizeof(bundle_t)) != 0); + + while (bundle < ebundle) { + memcpy(bundle, nop_bundle, nop_bundle_size); + + bundle += nop_bundle_size; + } +} + +void __init_or_module +paravirt_alt_bundle_patch_apply(struct paravirt_alt_bundle_patch *start, + struct paravirt_alt_bundle_patch *end, + unsigned long(*patch)(void *sbundle, + void *ebundle, + unsigned long type)) +{ + struct paravirt_alt_bundle_patch *p; + + for (p = start; p < end; p++) { + unsigned long used; + + used = (*patch)(p->sbundle, p->ebundle, p->type); + if (used == 0) + continue; + + fill_nop(p->sbundle + used, p->ebundle); + paravirt_flush_i_cache_range(p->sbundle, + p->ebundle - p->sbundle); + } + ia64_sync_i(); + ia64_srlz_i(); +} + +/* + * nop.i, nop.m, nop.f instruction are same format. + * but nop.b has differennt format. + * This doesn't support nop.b for now. 
+ */ +static void __init_or_module +fill_nop_inst(unsigned long stag, unsigned long etag) +{ + extern const bundle_t nop_mfi_inst_bundle[]; + unsigned long tag; + const cmp_inst_t nop_inst = paravirt_read_slot0(nop_mfi_inst_bundle); + + for (tag = stag; tag < etag; tag = paravirt_get_next_tag(tag)) + paravirt_write_inst(tag, nop_inst); +} + +void __init_or_module +paravirt_alt_inst_patch_apply(struct paravirt_alt_inst_patch *start, + struct paravirt_alt_inst_patch *end, + unsigned long (*patch)(unsigned long stag, + unsigned long etag, + unsigned long type)) +{ + struct paravirt_alt_inst_patch *p; + + for (p = start; p < end; p++) { + unsigned long tag; + bundle_t *sbundle; + bundle_t *ebundle; + + tag = (*patch)(p->stag, p->etag, p->type); + if (tag == p->stag) + continue; + + fill_nop_inst(tag, p->etag); + sbundle = paravirt_get_bundle(p->stag); + ebundle = paravirt_get_bundle(p->etag) + 1; + paravirt_flush_i_cache_range(sbundle, (ebundle - sbundle) * + sizeof(bundle_t)); + } + ia64_sync_i(); + ia64_srlz_i(); +} + +/* + * Local variables: + * mode: C + * c-set-style: "linux" + * c-basic-offset: 8 + * tab-width: 8 + * indent-tabs-mode: t + * End: + */ diff --git a/arch/ia64/kernel/paravirt_core.c b/arch/ia64/kernel/paravirt_core.c new file mode 100644 index 0000000..6b7c70f --- /dev/null +++ b/arch/ia64/kernel/paravirt_core.c @@ -0,0 +1,201 @@ +/****************************************************************************** + * linux/arch/ia64/xen/paravirt_core.c + * + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include <asm/paravirt_core.h> + +/* + * flush_icache_range() can't be used here. + * we are here before cpu_init() which initializes + * ia64_i_cache_stride_shift. flush_icache_range() uses it. 
+ */ +void __init_or_module +paravirt_flush_i_cache_range(const void *instr, unsigned long size) +{ + unsigned long i; + + for (i = 0; i < size; i += sizeof(bundle_t)) + asm volatile ("fc.i %0":: "r"(instr + i): "memory"); +} + +bundle_t* __init_or_module +paravirt_get_bundle(unsigned long tag) +{ + return (bundle_t *)(tag & ~3UL); +} + +unsigned long __init_or_module +paravirt_get_slot(unsigned long tag) +{ + return tag & 3UL; +} + +#if 0 +unsigned long __init_or_module +paravirt_get_num_inst(unsigned long stag, unsigned long etag) +{ + bundle_t *sbundle = paravirt_get_bundle(stag); + unsigned long sslot = paravirt_get_slot(stag); + bundle_t *ebundle = paravirt_get_bundle(etag); + unsigned long eslot = paravirt_get_slot(etag); + + return (ebundle - sbundle) * 3 + eslot - sslot + 1; +} +#endif + +unsigned long __init_or_module +paravirt_get_next_tag(unsigned long tag) +{ + unsigned long slot = paravirt_get_slot(tag); + + switch (slot) { + case 0: + case 1: + return tag + 1; + case 2: { + bundle_t *bundle = paravirt_get_bundle(tag); + return (unsigned long)(bundle + 1); + } + default: + BUG(); + } + /* NOTREACHED */ +} + +cmp_inst_t __init_or_module +paravirt_read_slot0(const bundle_t *bundle) +{ + cmp_inst_t inst; + inst.l = bundle->quad0.slot0; + return inst; +} + +cmp_inst_t __init_or_module +paravirt_read_slot1(const bundle_t *bundle) +{ + cmp_inst_t inst; + inst.l = bundle->quad0.slot1_p0 | + ((unsigned long long)bundle->quad1.slot1_p1 << 18UL); + return inst; +} + +cmp_inst_t __init_or_module +paravirt_read_slot2(const bundle_t *bundle) +{ + cmp_inst_t inst; + inst.l = bundle->quad1.slot2; + return inst; +} + +cmp_inst_t __init_or_module +paravirt_read_inst(unsigned long tag) +{ + bundle_t *bundle = paravirt_get_bundle(tag); + unsigned long slot = paravirt_get_slot(tag); + + switch (slot) { + case 0: + return paravirt_read_slot0(bundle); + case 1: + return paravirt_read_slot1(bundle); + case 2: + return paravirt_read_slot2(bundle); + default: + BUG(); + } + /* NOTREACHED */ +} + +void __init_or_module +paravirt_write_slot0(bundle_t *bundle, cmp_inst_t inst) +{ + bundle->quad0.slot0 = inst.l; +} + +void __init_or_module +paravirt_write_slot1(bundle_t *bundle, cmp_inst_t inst) +{ + bundle->quad0.slot1_p0 = inst.l; + bundle->quad1.slot1_p1 = inst.l >> 18UL; +} + +void __init_or_module +paravirt_write_slot2(bundle_t *bundle, cmp_inst_t inst) +{ + bundle->quad1.slot2 = inst.l; +} + +void __init_or_module +paravirt_write_inst(unsigned long tag, cmp_inst_t inst) +{ + bundle_t *bundle = paravirt_get_bundle(tag); + unsigned long slot = paravirt_get_slot(tag); + + switch (slot) { + case 0: + paravirt_write_slot0(bundle, inst); + break; + case 1: + paravirt_write_slot1(bundle, inst); + break; + case 2: + paravirt_write_slot2(bundle, inst); + break; + default: + BUG(); + } + paravirt_flush_i_cache_range(bundle, sizeof(*bundle)); +} + +/* for debug */ +void +print_bundle(const bundle_t *bundle) +{ + const unsigned long *quad = (const unsigned long *)bundle; + cmp_inst_t slot0 = paravirt_read_slot0(bundle); + cmp_inst_t slot1 = paravirt_read_slot1(bundle); + cmp_inst_t slot2 = paravirt_read_slot2(bundle); + + printk(KERN_DEBUG + "bundle 0x%p 0x%016lx 0x%016lx\n", bundle, quad[0], quad[1]); + printk(KERN_DEBUG + "bundle template 0x%x\n", + bundle->quad0.template); + printk(KERN_DEBUG + "slot0 0x%lx slot1_p0 0x%lx slot1_p1 0x%lx slot2 0x%lx\n", + (unsigned long)bundle->quad0.slot0, + (unsigned long)bundle->quad0.slot1_p0, + (unsigned long)bundle->quad1.slot1_p1, + (unsigned long)bundle->quad1.slot2); 
+ printk(KERN_DEBUG + "slot0 0x%016llx slot1 0x%016llx slot2 0x%016llx\n", + slot0.l, slot1.l, slot2.l); +} + +/* + * Local variables: + * mode: C + * c-set-style: "linux" + * c-basic-offset: 8 + * tab-width: 8 + * indent-tabs-mode: t + * End: + */ diff --git a/arch/ia64/kernel/paravirt_entry.c b/arch/ia64/kernel/paravirt_entry.c new file mode 100644 index 0000000..708287a --- /dev/null +++ b/arch/ia64/kernel/paravirt_entry.c @@ -0,0 +1,99 @@ +/****************************************************************************** + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include <asm/paravirt_core.h> +#include <asm/paravirt_entry.h> + +/* br.cond.sptk.many <target25> B1 */ +typedef union inst_b1 { + cmp_inst_t inst; + struct { + unsigned long qp: 6; + unsigned long btype: 3; + unsigned long unused: 3; + unsigned long p: 1; + unsigned long imm20b: 20; + unsigned long wh: 2; + unsigned long d: 1; + unsigned long s: 1; + unsigned long opcode: 4; + }; + unsigned long l; +} inst_b1_t; + +static void __init +__paravirt_entry_apply(unsigned long tag, const void *target) +{ + bundle_t *bundle = paravirt_get_bundle(tag); + cmp_inst_t inst = paravirt_read_inst(tag); + unsigned long target25 = (unsigned long)target - (unsigned long)bundle; + inst_b1_t inst_b1; + + inst_b1.l = inst.l; + if (target25 & (1UL << 63)) + inst_b1.s = 1; + else + inst_b1.s = 0; + + inst_b1.imm20b = target25 >> 4; + inst.l = inst_b1.l; + + paravirt_write_inst(tag, inst); + paravirt_flush_i_cache_range(bundle, sizeof(*bundle)); +} + +static void __init +paravirt_entry_apply(const struct paravirt_entry_patch *entry_patch, + const struct paravirt_entry *entries, + unsigned int nr_entries) +{ + unsigned int i; + for (i = 0; i < nr_entries; i++) { + if (entry_patch->type == entries[i].type) { + __paravirt_entry_apply(entry_patch->tag, + entries[i].entry); + break; + } + } +} + +void __init +paravirt_entry_patch_apply(const struct paravirt_entry_patch *start, + const struct paravirt_entry_patch *end, + const struct paravirt_entry *entries, + unsigned int nr_entries) +{ + const struct paravirt_entry_patch *p; + for (p = start; p < end; p++) + paravirt_entry_apply(p, entries, nr_entries); + + ia64_sync_i(); + ia64_srlz_i(); +} + +/* + * Local variables: + * mode: C + * c-set-style: "linux" + * c-basic-offset: 8 + * tab-width: 8 + * indent-tabs-mode: t + * End: + */ diff --git a/arch/ia64/kernel/paravirt_nop.c b/arch/ia64/kernel/paravirt_nop.c new file mode 100644 index 0000000..ee5a204 --- /dev/null +++ b/arch/ia64/kernel/paravirt_nop.c @@ -0,0 +1,49 @@ +/****************************************************************************** + * linux/arch/ia64/xen/paravirt_nop.c + * + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems 
Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include <asm/paravirt_core.h> +#include <asm/paravirt_nop.h> + +void __init_or_module +paravirt_nop_b_patch_apply(const struct paravirt_nop_patch *start, + const struct paravirt_nop_patch *end) +{ + extern const bundle_t nop_b_inst_bundle; + const cmp_inst_t nop_b_inst = paravirt_read_slot0(&nop_b_inst_bundle); + const struct paravirt_nop_patch *p; + + for (p = start; p < end; p++) + paravirt_write_inst(p->tag, nop_b_inst); + + ia64_sync_i(); + ia64_srlz_i(); +} + +/* + * Local variables: + * mode: C + * c-set-style: "linux" + * c-basic-offset: 8 + * tab-width: 8 + * indent-tabs-mode: t + * End: + */ diff --git a/arch/ia64/kernel/vmlinux.lds.S b/arch/ia64/kernel/vmlinux.lds.S index 80622ac..0cbe0a1 100644 --- a/arch/ia64/kernel/vmlinux.lds.S +++ b/arch/ia64/kernel/vmlinux.lds.S @@ -163,6 +163,41 @@ SECTIONS __end___mckinley_e9_bundles = .; } +#if defined(CONFIG_PARAVIRT_ALT) + . = ALIGN(16); + .paravirt_bundles : AT(ADDR(.paravirt_bundles) - LOAD_OFFSET) + { + __start_paravirt_bundles = .; + *(.paravirt_bundles) + __stop_paravirt_bundles = .; + } + . = ALIGN(16); + .paravirt_insts : AT(ADDR(.paravirt_insts) - LOAD_OFFSET) + { + __start_paravirt_insts = .; + *(.paravirt_insts) + __stop_paravirt_insts = .; + } +#endif +#if defined(CONFIG_PARAVIRT_NOP_B_PATCH) + . = ALIGN(16); + .paravirt_nop_b : AT(ADDR(.paravirt_nop_b) - LOAD_OFFSET) + { + __start_paravirt_nop_b = .; + *(.paravirt_nop_b) + __stop_paravirt_nop_b = .; + } +#endif +#if defined(CONFIG_PARAVIRT_ENTRY) + . = ALIGN(16); + .paravirt_entry : AT(ADDR(.paravirt_entry) - LOAD_OFFSET) + { + __start_paravirt_entry = .; + *(.paravirt_entry) + __stop_paravirt_entry = .; + } +#endif + #if defined(CONFIG_IA64_GENERIC) /* Machine Vector */ . = ALIGN(16); diff --git a/include/asm-ia64/module.h b/include/asm-ia64/module.h index d2da61e..44f63ff 100644 --- a/include/asm-ia64/module.h +++ b/include/asm-ia64/module.h @@ -16,6 +16,12 @@ struct mod_arch_specific { struct elf64_shdr *got; /* global offset table */ struct elf64_shdr *opd; /* official procedure descriptors */ struct elf64_shdr *unwind; /* unwind-table section */ +#ifdef CONFIG_PARAVIRT_ALT + struct elf64_shdr *paravirt_bundles; + /* paravirt_alt_bundle_patch table */ + struct elf64_shdr *paravirt_insts; + /* paravirt_alt_inst_patch table */ +#endif unsigned long gp; /* global-pointer for module */ void *core_unw_table; /* core unwind-table cookie returned by unwinder */ diff --git a/include/asm-ia64/paravirt_alt.h b/include/asm-ia64/paravirt_alt.h new file mode 100644 index 0000000..34c5473 --- /dev/null +++ b/include/asm-ia64/paravirt_alt.h @@ -0,0 +1,82 @@ +/****************************************************************************** + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. 
+ * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifndef __ASM_PARAVIRT_ALT_H +#define __ASM_PARAVIRT_ALT_H + +#ifndef __ASSEMBLER__ +/* for binary patch */ +struct paravirt_alt_bundle_patch { + void *sbundle; + void *ebundle; + unsigned long type; +}; + +/* label means the beginning of new bundle */ +#define paravirt_alt_bundle(instr, privop) \ + "\t1:\n" \ + "\t" instr "\n" \ + "\t2:\n" \ + "\t.section .paravirt_bundles, \"a\"\n" \ + "\t.previous\n" \ + "\t.xdata8 \".paravirt_bundles\", 1b, 2b, " \ + __stringify(privop) "\n" + +struct paravirt_alt_inst_patch { + unsigned long stag; + unsigned long etag; + unsigned long type; +}; + +#define paravirt_alt_inst(instr, privop) \ + "\t[1:]\n" \ + "\t" instr "\n" \ + "\t[2:]\n" \ + "\t.section .paravirt_insts, \"a\"\n" \ + "\t.previous\n" \ + "\t.xdata8 \".paravirt_insts\", 1b, 2b, " \ + __stringify(privop) "\n" + +void +paravirt_alt_bundle_patch_apply(struct paravirt_alt_bundle_patch *start, + struct paravirt_alt_bundle_patch *end, + unsigned long(*patch)(void *sbundle, + void *ebundle, + unsigned long type)); + +void +paravirt_alt_inst_patch_apply(struct paravirt_alt_inst_patch *start, + struct paravirt_alt_inst_patch *end, + unsigned long (*patch)(unsigned long stag, + unsigned long etag, + unsigned long type)); +#endif /* __ASSEMBLER__ */ + +#endif /* __ASM_PARAVIRT_ALT_H */ + +/* + * Local variables: + * mode: C + * c-set-style: "linux" + * c-basic-offset: 8 + * tab-width: 8 + * indent-tabs-mode: t + * End: + */ diff --git a/include/asm-ia64/paravirt_core.h b/include/asm-ia64/paravirt_core.h new file mode 100644 index 0000000..9979740 --- /dev/null +++ b/include/asm-ia64/paravirt_core.h @@ -0,0 +1,54 @@ +/****************************************************************************** + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifndef __ASM_PARAVIRT_CORE_H +#define __ASM_PARAVIRT_CORE_H + +#include <asm/kprobes.h> + +void paravirt_flush_i_cache_range(const void *instr, unsigned long size); + +bundle_t *paravirt_get_bundle(unsigned long tag); +unsigned long paravirt_get_slot(unsigned long tag); +unsigned long paravirt_get_next_tag(unsigned long tag); + +cmp_inst_t paravirt_read_slot0(const bundle_t *bundle); +cmp_inst_t paravirt_read_slot1(const bundle_t *bundle); +cmp_inst_t paravirt_read_slot2(const bundle_t *bundle); +cmp_inst_t paravirt_read_inst(unsigned long tag); + +void paravirt_write_slot0(bundle_t *bundle, cmp_inst_t inst); +void paravirt_write_slot1(bundle_t *bundle, cmp_inst_t inst); +void paravirt_write_slot2(bundle_t *bundle, cmp_inst_t inst); +void paravirt_write_inst(unsigned long tag, cmp_inst_t inst); + +void print_bundle(const bundle_t *bundle); + +#endif /* __ASM_PARAVIRT_CORE_H */ + +/* + * Local variables: + * mode: C + * c-set-style: "linux" + * c-basic-offset: 8 + * tab-width: 8 + * indent-tabs-mode: t + * End: + */ diff --git a/include/asm-ia64/paravirt_entry.h b/include/asm-ia64/paravirt_entry.h new file mode 100644 index 0000000..857fd37 --- /dev/null +++ b/include/asm-ia64/paravirt_entry.h @@ -0,0 +1,62 @@ +/****************************************************************************** + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ */
+
+#ifndef __ASM_PARAVIRT_ENTRY_H
+#define __ASM_PARAVIRT_ENTRY_H
+
+#ifdef __ASSEMBLY__
+
+#define BR_COND_SPTK_MANY(target, type) \
+ [1:] ; \
+ br.cond.sptk.many target;; ; \
+ .section .paravirt_entry, "a" ; \
+ .previous ; \
+ .xdata8 ".paravirt_entry", 1b, type
+
+#else /* __ASSEMBLY__ */
+
+struct paravirt_entry_patch {
+ unsigned long tag;
+ unsigned long type;
+};
+
+struct paravirt_entry {
+ void *entry;
+ unsigned long type;
+};
+
+void
+paravirt_entry_patch_apply(const struct paravirt_entry_patch *start,
+ const struct paravirt_entry_patch *end,
+ const struct paravirt_entry *entries,
+ unsigned int nr_entries);
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* __ASM_PARAVIRT_ENTRY_H */
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "linux"
+ * c-basic-offset: 8
+ * tab-width: 8
+ * indent-tabs-mode: t
+ * End:
+ */ diff --git a/include/asm-ia64/paravirt_nop.h b/include/asm-ia64/paravirt_nop.h new file mode 100644 index 0000000..2b05430 --- /dev/null +++ b/include/asm-ia64/paravirt_nop.h @@ -0,0 +1,46 @@ +/******************************************************************************
+ * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp>
+ * VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ */
+
+#ifndef __ASM_PARAVIRT_OPS_H
+#define __ASM_PARAVIRT_OPS_H
+
+#ifndef __ASSEMBLY__
+
+struct paravirt_nop_patch {
+ unsigned long tag;
+};
+
+void
+paravirt_nop_b_patch_apply(const struct paravirt_nop_patch *start,
+ const struct paravirt_nop_patch *end);
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* __ASM_PARAVIRT_OPS_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "linux"
+ * c-basic-offset: 8
+ * tab-width: 8
+ * indent-tabs-mode: t
+ * End:
+ */ -- 1.5.3
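The infrastructure above always has the same shape: every annotated site records its address (tag) and a type constant into a linker-collected table, and a pv instance walks that table at boot time and rewrites the recorded instructions. A minimal sketch of the consuming side, assuming hypothetical xen_switch_to/xen_leave_syscall targets and the PARAVIRT_ENTRY_* type constants that patch 19 of this series introduces (the section symbols come from the vmlinux.lds.S hunk above):

	#include <linux/init.h>
	#include <linux/kernel.h>	/* ARRAY_SIZE */
	#include <asm/paravirt_entry.h>

	extern const struct paravirt_entry_patch __start_paravirt_entry[];
	extern const struct paravirt_entry_patch __stop_paravirt_entry[];

	/* hypothetical pv entry points, stand-ins for the real xen ones */
	extern void xen_switch_to(void);
	extern void xen_leave_syscall(void);

	static const struct paravirt_entry xen_entries[] __initdata = {
		{ (void *)xen_switch_to,	PARAVIRT_ENTRY_SWITCH_TO },
		{ (void *)xen_leave_syscall,	PARAVIRT_ENTRY_LEAVE_SYSCALL },
	};

	void __init
	xen_patch_branch_targets(void)	/* hypothetical boot-time caller */
	{
		/* retargets every tagged br.cond.sptk.many whose type matches */
		paravirt_entry_patch_apply(__start_paravirt_entry,
					   __stop_paravirt_entry,
					   xen_entries,
					   ARRAY_SIZE(xen_entries));
	}

The nop_b table is the degenerate case of the same idea: paravirt_nop_b_patch_apply() simply overwrites each tagged slot with a nop.b, so a flavor can disable branch sites it does not need.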
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 18/50] ia64/pv_ops: preparation for ia64 intrinsics operations paravirtualization
To make them cleanly overridable, change their prefix from ia64_ to native_ and define the ia64_ names to the native_ ones. Later, each ia64_xxx will be redefined to its pv_ops'ed counterpart; a condensed sketch of the resulting indirection follows this patch. Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- include/asm-ia64/gcc_intrin.h | 58 +++++++++++++++++----------------- include/asm-ia64/intel_intrin.h | 64 +++++++++++++++++++------------------- include/asm-ia64/intrinsics.h | 14 ++++---- include/asm-ia64/privop.h | 36 ++++++++++++++++++++++ 4 files changed, 104 insertions(+), 68 deletions(-) diff --git a/include/asm-ia64/gcc_intrin.h b/include/asm-ia64/gcc_intrin.h index de2ed2c..31db638 100644 --- a/include/asm-ia64/gcc_intrin.h +++ b/include/asm-ia64/gcc_intrin.h @@ -28,7 +28,7 @@ extern void ia64_bad_param_for_getreg (void); register unsigned long ia64_r13 asm ("r13") __used; #endif -#define ia64_setreg(regnum, val) \ +#define native_setreg(regnum, val) \ ({ \ switch (regnum) { \ case _IA64_REG_PSR_L: \ @@ -57,7 +57,7 @@ register unsigned long ia64_r13 asm ("r13") __used; } \ }) -#define ia64_getreg(regnum) \ +#define native_getreg(regnum) \ ({ \ __u64 ia64_intri_res; \ \ @@ -94,7 +94,7 @@ register unsigned long ia64_r13 asm ("r13") __used; #define ia64_hint_pause 0 -#define ia64_hint(mode) \ +#define native_hint(mode) \ ({ \ switch (mode) { \ case ia64_hint_pause: \ @@ -381,7 +381,7 @@ register unsigned long ia64_r13 asm ("r13") __used; #define ia64_invala() asm volatile ("invala" ::: "memory") -#define ia64_thash(addr) \ +#define native_thash(addr) \ ({ \ __u64 ia64_intri_res; \ asm volatile ("thash %0=%1" : "=r"(ia64_intri_res) : "r" (addr)); \ @@ -401,18 +401,18 @@ register unsigned long ia64_r13 asm ("r13") __used; #define ia64_nop(x) asm volatile ("nop %0"::"i"(x)); -#define ia64_itci(addr) asm volatile ("itc.i %0;;" :: "r"(addr) : "memory") +#define native_itci(addr) asm volatile ("itc.i %0;;" :: "r"(addr) : "memory") -#define ia64_itcd(addr) asm volatile ("itc.d %0;;" :: "r"(addr) : "memory") +#define native_itcd(addr) asm volatile ("itc.d %0;;" :: "r"(addr) : "memory") -#define ia64_itri(trnum, addr) asm volatile ("itr.i itr[%0]=%1" \ +#define native_itri(trnum, addr) asm volatile ("itr.i itr[%0]=%1" \ :: "r"(trnum), "r"(addr) : "memory") -#define ia64_itrd(trnum, addr) asm volatile ("itr.d dtr[%0]=%1" \ +#define native_itrd(trnum, addr) asm volatile ("itr.d dtr[%0]=%1" \ :: "r"(trnum), "r"(addr) : "memory") -#define ia64_tpa(addr) \ +#define native_tpa(addr) \ ({ \ __u64 ia64_pa; \ asm volatile ("tpa %0 = %1" : "=r"(ia64_pa) : "r"(addr) : "memory"); \ @@ -422,22 +422,22 @@ register unsigned long ia64_r13 asm ("r13") __used; #define __ia64_set_dbr(index, val) \ asm volatile ("mov dbr[%0]=%1" :: "r"(index), "r"(val) : "memory") -#define ia64_set_ibr(index, val) \ +#define native_set_ibr(index, val) \ asm volatile ("mov ibr[%0]=%1" :: "r"(index), "r"(val) : "memory") -#define ia64_set_pkr(index, val) \ +#define native_set_pkr(index, val) \ asm volatile ("mov pkr[%0]=%1" :: "r"(index), "r"(val) : "memory") -#define ia64_set_pmc(index, val) \ +#define native_set_pmc(index, val) \ asm volatile ("mov pmc[%0]=%1" :: "r"(index), "r"(val) : "memory") -#define ia64_set_pmd(index, val) \ +#define native_set_pmd(index, val) \ asm volatile ("mov pmd[%0]=%1" :: "r"(index), "r"(val) : "memory") -#define ia64_set_rr(index, val) \ +#define native_set_rr(index, val) \ asm volatile ("mov rr[%0]=%1" :: "r"(index), "r"(val) : "memory"); -#define ia64_get_cpuid(index) \ +#define native_get_cpuid(index) \ ({ \ __u64 ia64_intri_res; \ asm volatile ("mov %0=cpuid[%r1]" :
"=r"(ia64_intri_res) : "rO"(index)); \ @@ -451,21 +451,21 @@ register unsigned long ia64_r13 asm ("r13") __used; ia64_intri_res; \ }) -#define ia64_get_ibr(index) \ +#define native_get_ibr(index) \ ({ \ __u64 ia64_intri_res; \ asm volatile ("mov %0=ibr[%1]" : "=r"(ia64_intri_res) : "r"(index)); \ ia64_intri_res; \ }) -#define ia64_get_pkr(index) \ +#define native_get_pkr(index) \ ({ \ __u64 ia64_intri_res; \ asm volatile ("mov %0=pkr[%1]" : "=r"(ia64_intri_res) : "r"(index)); \ ia64_intri_res; \ }) -#define ia64_get_pmc(index) \ +#define native_get_pmc(index) \ ({ \ __u64 ia64_intri_res; \ asm volatile ("mov %0=pmc[%1]" : "=r"(ia64_intri_res) : "r"(index)); \ @@ -473,48 +473,48 @@ register unsigned long ia64_r13 asm ("r13") __used; }) -#define ia64_get_pmd(index) \ +#define native_get_pmd(index) \ ({ \ __u64 ia64_intri_res; \ asm volatile ("mov %0=pmd[%1]" : "=r"(ia64_intri_res) : "r"(index)); \ ia64_intri_res; \ }) -#define ia64_get_rr(index) \ +#define native_get_rr(index) \ ({ \ __u64 ia64_intri_res; \ asm volatile ("mov %0=rr[%1]" : "=r"(ia64_intri_res) : "r" (index)); \ ia64_intri_res; \ }) -#define ia64_fc(addr) asm volatile ("fc %0" :: "r"(addr) : "memory") +#define native_fc(addr) asm volatile ("fc %0" :: "r"(addr) : "memory") #define ia64_sync_i() asm volatile (";; sync.i" ::: "memory") -#define ia64_ssm(mask) asm volatile ("ssm %0":: "i"((mask)) : "memory") -#define ia64_rsm(mask) asm volatile ("rsm %0":: "i"((mask)) : "memory") +#define native_ssm(mask) asm volatile ("ssm %0":: "i"((mask)) : "memory") +#define native_rsm(mask) asm volatile ("rsm %0":: "i"((mask)) : "memory") #define ia64_sum(mask) asm volatile ("sum %0":: "i"((mask)) : "memory") #define ia64_rum(mask) asm volatile ("rum %0":: "i"((mask)) : "memory") -#define ia64_ptce(addr) asm volatile ("ptc.e %0" :: "r"(addr)) +#define native_ptce(addr) asm volatile ("ptc.e %0" :: "r"(addr)) -#define ia64_ptcga(addr, size) \ +#define native_ptcga(addr, size) \ do { \ asm volatile ("ptc.ga %0,%1" :: "r"(addr), "r"(size) : "memory"); \ ia64_dv_serialize_data(); \ } while (0) -#define ia64_ptcl(addr, size) \ +#define native_ptcl(addr, size) \ do { \ asm volatile ("ptc.l %0,%1" :: "r"(addr), "r"(size) : "memory"); \ ia64_dv_serialize_data(); \ } while (0) -#define ia64_ptri(addr, size) \ +#define native_ptri(addr, size) \ asm volatile ("ptr.i %0,%1" :: "r"(addr), "r"(size) : "memory") -#define ia64_ptrd(addr, size) \ +#define native_ptrd(addr, size) \ asm volatile ("ptr.d %0,%1" :: "r"(addr), "r"(size) : "memory") /* Values for lfhint in ia64_lfetch and ia64_lfetch_fault */ @@ -596,7 +596,7 @@ do { \ } \ }) -#define ia64_intrin_local_irq_restore(x) \ +#define native_intrin_local_irq_restore(x) \ do { \ asm volatile (";; cmp.ne p6,p7=%0,r0;;" \ "(p6) ssm psr.i;" \ diff --git a/include/asm-ia64/intel_intrin.h b/include/asm-ia64/intel_intrin.h index a520d10..ab3c8a3 100644 --- a/include/asm-ia64/intel_intrin.h +++ b/include/asm-ia64/intel_intrin.h @@ -16,8 +16,8 @@ * intrinsic */ -#define ia64_getreg __getReg -#define ia64_setreg __setReg +#define native_getreg __getReg +#define native_setreg __setReg #define ia64_hint __hint #define ia64_hint_pause __hint_pause @@ -33,16 +33,16 @@ #define ia64_getf_exp __getf_exp #define ia64_shrp _m64_shrp -#define ia64_tpa __tpa +#define native_tpa __tpa #define ia64_invala __invala #define ia64_invala_gr __invala_gr #define ia64_invala_fr __invala_fr #define ia64_nop __nop #define ia64_sum __sum -#define ia64_ssm __ssm +#define native_ssm __ssm #define ia64_rum __rum -#define ia64_rsm __rsm 
-#define ia64_fc __fc +#define native_rsm __rsm +#define native_fc __fc #define ia64_ldfs __ldfs #define ia64_ldfd __ldfd @@ -80,24 +80,24 @@ #define __ia64_set_dbr(index, val) \ __setIndReg(_IA64_REG_INDR_DBR, index, val) -#define ia64_set_ibr(index, val) \ +#define native_set_ibr(index, val) \ __setIndReg(_IA64_REG_INDR_IBR, index, val) -#define ia64_set_pkr(index, val) \ +#define native_set_pkr(index, val) \ __setIndReg(_IA64_REG_INDR_PKR, index, val) -#define ia64_set_pmc(index, val) \ +#define native_set_pmc(index, val) \ __setIndReg(_IA64_REG_INDR_PMC, index, val) -#define ia64_set_pmd(index, val) \ +#define native_set_pmd(index, val) \ __setIndReg(_IA64_REG_INDR_PMD, index, val) -#define ia64_set_rr(index, val) \ +#define native_set_rr(index, val) \ __setIndReg(_IA64_REG_INDR_RR, index, val) -#define ia64_get_cpuid(index) __getIndReg(_IA64_REG_INDR_CPUID, index) +#define native_get_cpuid(index) __getIndReg(_IA64_REG_INDR_CPUID, index) #define __ia64_get_dbr(index) __getIndReg(_IA64_REG_INDR_DBR, index) -#define ia64_get_ibr(index) __getIndReg(_IA64_REG_INDR_IBR, index) -#define ia64_get_pkr(index) __getIndReg(_IA64_REG_INDR_PKR, index) -#define ia64_get_pmc(index) __getIndReg(_IA64_REG_INDR_PMC, index) -#define ia64_get_pmd(index) __getIndReg(_IA64_REG_INDR_PMD, index) -#define ia64_get_rr(index) __getIndReg(_IA64_REG_INDR_RR, index) +#define native_get_ibr(index) __getIndReg(_IA64_REG_INDR_IBR, index) +#define native_get_pkr(index) __getIndReg(_IA64_REG_INDR_PKR, index) +#define native_get_pmc(index) __getIndReg(_IA64_REG_INDR_PMC, index) +#define native_get_pmd(index) __getIndReg(_IA64_REG_INDR_PMD, index) +#define native_get_rr(index) __getIndReg(_IA64_REG_INDR_RR, index) #define ia64_srlz_d __dsrlz #define ia64_srlz_i __isrlz @@ -119,18 +119,18 @@ #define ia64_ld8_acq __ld8_acq #define ia64_sync_i __synci -#define ia64_thash __thash -#define ia64_ttag __ttag -#define ia64_itcd __itcd -#define ia64_itci __itci -#define ia64_itrd __itrd -#define ia64_itri __itri -#define ia64_ptce __ptce -#define ia64_ptcl __ptcl -#define ia64_ptcg __ptcg -#define ia64_ptcga __ptcga -#define ia64_ptri __ptri -#define ia64_ptrd __ptrd +#define native_thash __thash +#define native_ttag __ttag +#define native_itcd __itcd +#define native_itci __itci +#define native_itrd __itrd +#define native_itri __itri +#define native_ptce __ptce +#define native_ptcl __ptcl +#define native_ptcg __ptcg +#define native_ptcga __ptcga +#define native_ptri __ptri +#define native_ptrd __ptrd #define ia64_dep_mi _m64_dep_mi /* Values for lfhint in __lfetch and __lfetch_fault */ @@ -145,13 +145,13 @@ #define ia64_lfetch_fault __lfetch_fault #define ia64_lfetch_fault_excl __lfetch_fault_excl -#define ia64_intrin_local_irq_restore(x) \ +#define native_intrin_local_irq_restore(x) \ do { \ if ((x) != 0) { \ - ia64_ssm(IA64_PSR_I); \ + native_ssm(IA64_PSR_I); \ ia64_srlz_d(); \ } else { \ - ia64_rsm(IA64_PSR_I); \ + native_rsm(IA64_PSR_I); \ } \ } while (0) diff --git a/include/asm-ia64/intrinsics.h b/include/asm-ia64/intrinsics.h index 5800ad0..3a58069 100644 --- a/include/asm-ia64/intrinsics.h +++ b/include/asm-ia64/intrinsics.h @@ -18,15 +18,15 @@ # include <asm/gcc_intrin.h> #endif -#define ia64_get_psr_i() (ia64_getreg(_IA64_REG_PSR) & IA64_PSR_I) +#define native_get_psr_i() (native_getreg(_IA64_REG_PSR) & IA64_PSR_I) -#define ia64_set_rr0_to_rr4(val0, val1, val2, val3, val4) \ +#define native_set_rr0_to_rr4(val0, val1, val2, val3, val4) \ do { \ - ia64_set_rr(0x0000000000000000UL, (val0)); \ - 
ia64_set_rr(0x2000000000000000UL, (val1)); \ - ia64_set_rr(0x4000000000000000UL, (val2)); \ - ia64_set_rr(0x6000000000000000UL, (val3)); \ - ia64_set_rr(0x8000000000000000UL, (val4)); \ + native_set_rr(0x0000000000000000UL, (val0)); \ + native_set_rr(0x2000000000000000UL, (val1)); \ + native_set_rr(0x4000000000000000UL, (val2)); \ + native_set_rr(0x6000000000000000UL, (val3)); \ + native_set_rr(0x8000000000000000UL, (val4)); \ } while (0) /* diff --git a/include/asm-ia64/privop.h b/include/asm-ia64/privop.h index 7b9de4f..b0b74fd 100644 --- a/include/asm-ia64/privop.h +++ b/include/asm-ia64/privop.h @@ -16,6 +16,42 @@ /* fallback for native case */ +#ifndef IA64_PARAVIRTUALIZED_PRIVOP +#ifndef __ASSEMBLY +#define ia64_getreg native_getreg +#define ia64_setreg native_setreg +#define ia64_hint native_hint +#define ia64_thash native_thash +#define ia64_itci native_itci +#define ia64_itcd native_itcd +#define ia64_itri native_itri +#define ia64_itrd native_itrd +#define ia64_tpa native_tpa +#define ia64_set_ibr native_set_ibr +#define ia64_set_pkr native_set_pkr +#define ia64_set_pmc native_set_pmc +#define ia64_set_pmd native_set_pmd +#define ia64_set_rr native_set_rr +#define ia64_get_cpuid native_get_cpuid +#define ia64_get_ibr native_get_ibr +#define ia64_get_pkr native_get_pkr +#define ia64_get_pmc native_get_pmc +#define ia64_get_pmd native_get_pmd +#define ia64_get_rr native_get_rr +#define ia64_fc native_fc +#define ia64_ssm native_ssm +#define ia64_rsm native_rsm +#define ia64_ptce native_ptce +#define ia64_ptcga native_ptcga +#define ia64_ptcl native_ptcl +#define ia64_ptri native_ptri +#define ia64_ptrd native_ptrd +#define ia64_get_psr_i native_get_psr_i +#define ia64_intrin_local_irq_restore native_intrin_local_irq_restore +#define ia64_set_rr0_to_rr4 native_set_rr0_to_rr4 +#endif /* !__ASSEMBLY */ +#endif /* !IA64_PARAVIRTUALIZED_PRIVOP */ + #ifndef IA64_PARAVIRTUALIZED_ENTRY #define ia64_switch_to native_switch_to #define ia64_leave_syscall native_leave_syscall -- 1.5.3
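The mechanical rename reads most easily as a three-layer pattern, condensed here for one representative intrinsic (layers 1 and 3 are verbatim from this patch; the paravirt_get_rr line only previews how patch 19 claims the name):

	/* layer 1: the raw instruction now lives under the native_ name */
	#define native_get_rr(index)						\
	({									\
		__u64 ia64_intri_res;						\
		asm volatile ("mov %0=rr[%1]"					\
			      : "=r"(ia64_intri_res) : "r" (index));		\
		ia64_intri_res;							\
	})

	/* layer 2: a pv header may claim the generic name first, e.g.
	 *	#define IA64_PARAVIRTUALIZED_PRIVOP
	 *	#define ia64_get_rr(index)	paravirt_get_rr(index)
	 */

	/* layer 3: otherwise privop.h maps ia64_ to native_ at zero cost */
	#ifndef IA64_PARAVIRTUALIZED_PRIVOP
	#define ia64_get_rr	native_get_rr
	#endif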
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 19/50] ia64/pv_ops: define ia64 privileged instruction intrinsics for paravirtualized guest kernel.
Make the ia64 privileged-instruction intrinsics paravirtualizable via binary patching, allowing each pv instance to override each intrinsic. Mark the privileged instructions that need paravirtualization so that a pv instance can binary-patch them at early boot time; a sketch of such a patcher follows this patch. Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/paravirtentry.S | 37 +++ include/asm-ia64/privop.h | 4 + include/asm-ia64/privop_paravirt.h | 587 ++++++++++++++++++++++++++++++++++++ 3 files changed, 628 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/kernel/paravirtentry.S create mode 100644 include/asm-ia64/privop_paravirt.h diff --git a/arch/ia64/kernel/paravirtentry.S b/arch/ia64/kernel/paravirtentry.S new file mode 100644 index 0000000..013511f --- /dev/null +++ b/arch/ia64/kernel/paravirtentry.S @@ -0,0 +1,37 @@ +/******************************************************************************
+ * linux/arch/ia64/xen/paravirtentry.S
+ *
+ * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp>
+ * VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ */
+
+#include <asm/types.h>
+#include <asm/asmmacro.h>
+#include <asm/paravirt_entry.h>
+#include <asm/privop_paravirt.h>
+
+#define BRANCH(sym, type) \
+ GLOBAL_ENTRY(paravirt_ ## sym) ; \
+ BR_COND_SPTK_MANY(native_ ## sym, type) ; \
+ END(paravirt_ ## sym)
+
+ BRANCH(switch_to, PARAVIRT_ENTRY_SWITCH_TO)
+ BRANCH(leave_syscall, PARAVIRT_ENTRY_LEAVE_SYSCALL)
+ BRANCH(work_processed_syscall, PARAVIRT_ENTRY_WORK_PROCESSED_SYSCALL)
+ BRANCH(leave_kernel, PARAVIRT_ENTRY_LEAVE_KERNEL)
+ BRANCH(pal_call_static, PARAVIRT_ENTRY_PAL_CALL_STATIC) diff --git a/include/asm-ia64/privop.h b/include/asm-ia64/privop.h index b0b74fd..69591e0 100644 --- a/include/asm-ia64/privop.h +++ b/include/asm-ia64/privop.h @@ -10,6 +10,10 @@ * */ +#ifdef CONFIG_PARAVIRT +#include <asm/privop_paravirt.h> +#endif + #ifdef CONFIG_XEN #include <asm/xen/privop.h> #endif diff --git a/include/asm-ia64/privop_paravirt.h b/include/asm-ia64/privop_paravirt.h new file mode 100644 index 0000000..bd7de70 --- /dev/null +++ b/include/asm-ia64/privop_paravirt.h @@ -0,0 +1,587 @@ +/******************************************************************************
+ * privops_paravirt.h
+ *
+ * Copyright (c) 2008 Isaku Yamahata <yamahata at valinux co jp>
+ * VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifndef _ASM_IA64_PRIVOP_PARAVIRT_H +#define _ASM_IA64_PRIVOP_PARAVIRT_H + +#define PARAVIRT_INST_START 0x1 +#define PARAVIRT_INST_RFI (PARAVIRT_INST_START + 0x0) +#define PARAVIRT_INST_RSM_DT (PARAVIRT_INST_START + 0x1) +#define PARAVIRT_INST_SSM_DT (PARAVIRT_INST_START + 0x2) +#define PARAVIRT_INST_COVER (PARAVIRT_INST_START + 0x3) +#define PARAVIRT_INST_ITC_D (PARAVIRT_INST_START + 0x4) +#define PARAVIRT_INST_ITC_I (PARAVIRT_INST_START + 0x5) +#define PARAVIRT_INST_SSM_I (PARAVIRT_INST_START + 0x6) +#define PARAVIRT_INST_GET_IVR (PARAVIRT_INST_START + 0x7) +#define PARAVIRT_INST_GET_TPR (PARAVIRT_INST_START + 0x8) +#define PARAVIRT_INST_SET_TPR (PARAVIRT_INST_START + 0x9) +#define PARAVIRT_INST_EOI (PARAVIRT_INST_START + 0xa) +#define PARAVIRT_INST_SET_ITM (PARAVIRT_INST_START + 0xb) +#define PARAVIRT_INST_THASH (PARAVIRT_INST_START + 0xc) +#define PARAVIRT_INST_PTC_GA (PARAVIRT_INST_START + 0xd) +#define PARAVIRT_INST_ITR_D (PARAVIRT_INST_START + 0xe) +#define PARAVIRT_INST_GET_RR (PARAVIRT_INST_START + 0xf) +#define PARAVIRT_INST_SET_RR (PARAVIRT_INST_START + 0x10) +#define PARAVIRT_INST_SET_KR (PARAVIRT_INST_START + 0x11) +#define PARAVIRT_INST_FC (PARAVIRT_INST_START + 0x12) +#define PARAVIRT_INST_GET_CPUID (PARAVIRT_INST_START + 0x13) +#define PARAVIRT_INST_GET_PMD (PARAVIRT_INST_START + 0x14) +#define PARAVIRT_INST_GET_EFLAG (PARAVIRT_INST_START + 0x15) +#define PARAVIRT_INST_SET_EFLAG (PARAVIRT_INST_START + 0x16) +#define PARAVIRT_INST_RSM_BE (PARAVIRT_INST_START + 0x17) +#define PARAVIRT_INST_GET_PSR (PARAVIRT_INST_START + 0x18) +#define PARAVIRT_INST_SET_RR0_TO_RR4 (PARAVIRT_INST_START + 0x19) + +#define PARAVIRT_BNDL_START 0x10000000 +#define PARAVIRT_BNDL_SSM_I (PARAVIRT_BNDL_START + 0x0) +#define PARAVIRT_BNDL_RSM_I (PARAVIRT_BNDL_START + 0x1) +#define PARAVIRT_BNDL_GET_PSR_I (PARAVIRT_BNDL_START + 0x2) +#define PARAVIRT_BNDL_INTRIN_LOCAL_IRQ_RESTORE (PARAVIRT_BNDL_START + 0x3) + +/* + * struct task_struct* (*ia64_switch_to)(void* next_task); + * void *ia64_leave_syscall; + * void *ia64_work_processed_syscall + * void *ia64_leave_kernel; + * struct ia64_pal_retval (*pal_call_static)(u64, u64, u64, u64, u64); + */ + +#define PARAVIRT_ENTRY_START 0x20000000 +#define PARAVIRT_ENTRY_SWITCH_TO (PARAVIRT_ENTRY_START + 0) +#define PARAVIRT_ENTRY_LEAVE_SYSCALL (PARAVIRT_ENTRY_START + 1) +#define PARAVIRT_ENTRY_WORK_PROCESSED_SYSCALL (PARAVIRT_ENTRY_START + 2) +#define PARAVIRT_ENTRY_LEAVE_KERNEL (PARAVIRT_ENTRY_START + 3) +#define PARAVIRT_ENTRY_PAL_CALL_STATIC (PARAVIRT_ENTRY_START + 4) + + +#ifndef __ASSEMBLER__ + +#include <linux/stringify.h> +#include <linux/types.h> +#include <asm/paravirt_alt.h> +#include <asm/kregs.h> /* for IA64_PSR_I */ +#include <asm/xen/interface.h> + +/************************************************/ +/* Instructions paravirtualized for correctness */ +/************************************************/ +/* Note that "ttag" and "cover" are also privilege-sensitive; "ttag" + * is not currently used (though it may be in a long-format VHPT system!) 
*/
+#ifdef ASM_SUPPORTED
+static inline unsigned long
+paravirt_fc(unsigned long addr)
+{
+ register __u64 ia64_intri_res asm ("r8");
+ register __u64 __addr asm ("r8") = addr;
+ asm volatile (paravirt_alt_inst("fc %1", PARAVIRT_INST_FC):
+ "=r"(ia64_intri_res): "0"(__addr): "memory");
+ return ia64_intri_res;
+}
+#define paravirt_fc(addr) paravirt_fc((unsigned long)addr)
+
+static inline unsigned long
+paravirt_thash(unsigned long addr)
+{
+ register __u64 ia64_intri_res asm ("r8");
+ register __u64 __addr asm ("r8") = addr;
+ asm volatile (paravirt_alt_inst("thash %0=%1", PARAVIRT_INST_THASH):
+ "=r"(ia64_intri_res): "0"(__addr));
+ return ia64_intri_res;
+}
+
+static inline unsigned long
+paravirt_get_cpuid(int index)
+{
+ register __u64 ia64_intri_res asm ("r8");
+ register __u64 __index asm ("r8") = index;
+ asm volatile (paravirt_alt_inst("mov %0=cpuid[%r1]",
+ PARAVIRT_INST_GET_CPUID):
+ "=r"(ia64_intri_res): "0O"(__index));
+ return ia64_intri_res;
+}
+
+static inline unsigned long
+paravirt_get_pmd(int index)
+{
+ register __u64 ia64_intri_res asm ("r8");
+ register __u64 __index asm ("r8") = index;
+ asm volatile (paravirt_alt_inst("mov %0=pmd[%1]",
+ PARAVIRT_INST_GET_PMD):
+ "=r"(ia64_intri_res): "0"(__index));
+ return ia64_intri_res;
+}
+
+static inline unsigned long
+paravirt_get_eflag(void)
+{
+ register __u64 ia64_intri_res asm ("r8");
+ asm volatile (paravirt_alt_inst("mov %0=ar%1",
+ PARAVIRT_INST_GET_EFLAG):
+ "=r"(ia64_intri_res):
+ "i"(_IA64_REG_AR_EFLAG - _IA64_REG_AR_KR0): "memory");
+ return ia64_intri_res;
+}
+
+static inline void
+paravirt_set_eflag(unsigned long val)
+{
+ register __u64 __val asm ("r8") = val;
+ asm volatile (paravirt_alt_inst("mov ar%0=%1",
+ PARAVIRT_INST_SET_EFLAG)::
+ "i"(_IA64_REG_AR_EFLAG - _IA64_REG_AR_KR0), "r"(__val):
+ "memory");
+}
+
+/************************************************/
+/* Instructions paravirtualized for performance */
+/************************************************/
+
+static inline unsigned long
+paravirt_get_psr(void)
+{
+ register __u64 ia64_intri_res asm ("r8");
+ asm volatile (paravirt_alt_inst("mov %0=psr", PARAVIRT_INST_GET_PSR):
+ "=r"(ia64_intri_res));
+ return ia64_intri_res;
+}
+
+static inline unsigned long
+paravirt_get_ivr(void)
+{
+ register __u64 ia64_intri_res asm ("r8");
+ asm volatile (paravirt_alt_inst("mov %0=cr%1", PARAVIRT_INST_GET_IVR):
+ "=r"(ia64_intri_res):
+ "i" (_IA64_REG_CR_IVR - _IA64_REG_CR_DCR));
+ return ia64_intri_res;
+}
+
+static inline unsigned long
+paravirt_get_tpr(void)
+{
+ register __u64 ia64_intri_res asm ("r8");
+ asm volatile (paravirt_alt_inst("mov %0=cr%1", PARAVIRT_INST_GET_TPR):
+ "=r"(ia64_intri_res):
+ "i" (_IA64_REG_CR_TPR - _IA64_REG_CR_DCR));
+ return ia64_intri_res;
+}
+
+static inline void
+paravirt_set_tpr(unsigned long val)
+{
+ register __u64 __val asm ("r8") = val;
+ asm volatile (paravirt_alt_inst("mov cr%0=%1", PARAVIRT_INST_SET_TPR)::
+ "i" (_IA64_REG_CR_TPR - _IA64_REG_CR_DCR), "r"(__val):
+ "memory");
+}
+
+static inline void
+paravirt_eoi(unsigned long val)
+{
+ register __u64 __val asm ("r8") = val;
+ asm volatile (paravirt_alt_inst("mov cr%0=%1", PARAVIRT_INST_EOI)::
+ "i" (_IA64_REG_CR_EOI - _IA64_REG_CR_DCR), "r"(__val):
+ "memory");
+}
+
+static inline void
+paravirt_set_itm(unsigned long val)
+{
+ register __u64 __val asm ("r8") = val;
+ asm volatile (paravirt_alt_inst("mov cr%0=%1", PARAVIRT_INST_SET_ITM)::
+ "i" (_IA64_REG_CR_ITM - _IA64_REG_CR_DCR), "r"(__val):
+ "memory");
+}
+
+static inline void
+paravirt_ptcga(unsigned long addr,
unsigned long size) +{ + register __u64 __addr asm ("r8") = addr; + register __u64 __size asm ("r9") = size; + asm volatile (paravirt_alt_inst("ptc.ga %0,%1", PARAVIRT_INST_PTC_GA):: + "r"(__addr), "r"(__size): "memory"); + ia64_dv_serialize_data(); +} + +static inline unsigned long +paravirt_get_rr(unsigned long index) +{ + register __u64 ia64_intri_res asm ("r8"); + register __u64 __index asm ("r8") = index; + asm volatile (paravirt_alt_inst("mov %0=rr[%1]", PARAVIRT_INST_GET_RR): + "=r"(ia64_intri_res) : "0" (__index)); + return ia64_intri_res; +} + +static inline void +paravirt_set_rr(unsigned long index, unsigned long val) +{ + register __u64 __index asm ("r8") = index; + register __u64 __val asm ("r9") = val; + asm volatile (paravirt_alt_inst("mov rr[%0]=%1", PARAVIRT_INST_SET_RR):: + "r"(__index), "r"(__val): "memory"); +} + +static inline void +paravirt_set_rr0_to_rr4(unsigned long val0, unsigned long val1, + unsigned long val2, unsigned long val3, + unsigned long val4) +{ + register __u64 __val0 asm ("r8") = val0; + register __u64 __val1 asm ("r9") = val1; + register __u64 __val2 asm ("r10") = val2; + register __u64 __val3 asm ("r11") = val3; + register __u64 __val4 asm ("r14") = val4; + asm volatile (paravirt_alt_inst("\t;;\n" + "\t{.mmi\n" + "\tmov rr[%0]=%1\n" + /* + * without this stop bit + * assembler complains. + */ + "\t;;\n" + "\tmov rr[%2]=%3\n" + "\tnop.i 0\n" + "\t}\n" + "\t{.mmi\n" + "\tmov rr[%4]=%5\n" + "\tmov rr[%6]=%7\n" + "\tnop.i 0\n" + "\t}\n" + "\tmov rr[%8]=%9;;\n", + PARAVIRT_INST_SET_RR0_TO_RR4):: + "r"(0x0000000000000000UL), "r"(__val0), + "r"(0x2000000000000000UL), "r"(__val1), + "r"(0x4000000000000000UL), "r"(__val2), + "r"(0x6000000000000000UL), "r"(__val3), + "r"(0x8000000000000000UL), "r"(__val4) : + "memory"); +} + +static inline void +paravirt_set_kr(unsigned long index, unsigned long val) +{ + register __u64 __index asm ("r8") = index - _IA64_REG_AR_KR0; + register __u64 __val asm ("r9") = val; + + /* + * asm volatile ("break %0":: + * "i"(PARAVIRT_INST_SET_KR), "r"(__index), "r"(__val)); + */ +#ifndef BUILD_BUG_ON +#define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)])) +#endif + BUILD_BUG_ON(!__builtin_constant_p(__index)); + switch (index) { + case _IA64_REG_AR_KR0: + asm volatile (paravirt_alt_inst("mov ar%0=%2", + PARAVIRT_INST_SET_KR):: + "i" (_IA64_REG_AR_KR0 - _IA64_REG_AR_KR0), + "r"(__index), "r"(__val): + "memory"); + break; + case _IA64_REG_AR_KR1: + asm volatile (paravirt_alt_inst("mov ar%0=%2", + PARAVIRT_INST_SET_KR):: + "i" (_IA64_REG_AR_KR1 - _IA64_REG_AR_KR0), + "r"(__index), "r"(__val): + "memory"); + break; + case _IA64_REG_AR_KR2: + asm volatile (paravirt_alt_inst("mov ar%0=%2", + PARAVIRT_INST_SET_KR):: + "i" (_IA64_REG_AR_KR2 - _IA64_REG_AR_KR0), + "r"(__index), "r"(__val): + "memory"); + break; + case _IA64_REG_AR_KR3: + asm volatile (paravirt_alt_inst("mov ar%0=%2", + PARAVIRT_INST_SET_KR):: + "i" (_IA64_REG_AR_KR3 - _IA64_REG_AR_KR0), + "r"(__index), "r"(__val): + "memory"); + break; + case _IA64_REG_AR_KR4: + asm volatile (paravirt_alt_inst("mov ar%0=%2", + PARAVIRT_INST_SET_KR):: + "i" (_IA64_REG_AR_KR4 - _IA64_REG_AR_KR0), + "r"(__index), "r"(__val): + "memory"); + break; + case _IA64_REG_AR_KR5: + asm volatile (paravirt_alt_inst("mov ar%0=%2", + PARAVIRT_INST_SET_KR):: + "i" (_IA64_REG_AR_KR5 - _IA64_REG_AR_KR0), + "r"(__index), "r"(__val): + "memory"); + break; + case _IA64_REG_AR_KR6: + asm volatile (paravirt_alt_inst("mov ar%0=%2", + PARAVIRT_INST_SET_KR):: + "i" (_IA64_REG_AR_KR6 - 
_IA64_REG_AR_KR0),
+ "r"(__index), "r"(__val):
+ "memory");
+ break;
+ case _IA64_REG_AR_KR7:
+ asm volatile (paravirt_alt_inst("mov ar%0=%2",
+ PARAVIRT_INST_SET_KR)::
+ "i" (_IA64_REG_AR_KR7 - _IA64_REG_AR_KR0),
+ "r"(__index), "r"(__val):
+ "memory");
+ break;
+ default: {
+ extern void compile_error_ar_kr_index_must_be_compile_time_constant(void);
+ compile_error_ar_kr_index_must_be_compile_time_constant();
+ break;
+ }
+ }
+}
+#endif /* ASM_SUPPORTED */
+
+static inline unsigned long
+paravirt_getreg(unsigned long regnum)
+{
+ __u64 ia64_intri_res;
+
+ switch (regnum) {
+ case _IA64_REG_PSR:
+ ia64_intri_res = paravirt_get_psr();
+ break;
+ case _IA64_REG_CR_IVR:
+ ia64_intri_res = paravirt_get_ivr();
+ break;
+ case _IA64_REG_CR_TPR:
+ ia64_intri_res = paravirt_get_tpr();
+ break;
+ case _IA64_REG_AR_EFLAG:
+ ia64_intri_res = paravirt_get_eflag();
+ break;
+ default:
+ ia64_intri_res = native_getreg(regnum);
+ break;
+ }
+ return ia64_intri_res;
+}
+
+static inline void
+paravirt_setreg(unsigned long regnum, unsigned long val)
+{
+ switch (regnum) {
+ case _IA64_REG_AR_KR0 ... _IA64_REG_AR_KR7:
+ paravirt_set_kr(regnum, val);
+ break;
+ case _IA64_REG_CR_ITM:
+ paravirt_set_itm(val);
+ break;
+ case _IA64_REG_CR_TPR:
+ paravirt_set_tpr(val);
+ break;
+ case _IA64_REG_CR_EOI:
+ paravirt_eoi(val);
+ break;
+ case _IA64_REG_AR_EFLAG:
+ paravirt_set_eflag(val);
+ break;
+ default:
+ native_setreg(regnum, val);
+ break;
+ }
+}
+
+#ifdef ASM_SUPPORTED
+
+#define NOP_BUNDLE \
+ "{\n\t" \
+ "nop 0\n\t" \
+ "nop 0\n\t" \
+ "nop 0\n\t" \
+ "}\n\t"
+
+static inline void
+paravirt_ssm_i(void)
+{
+ /* five bundles */
+ asm volatile (paravirt_alt_bundle("{\n\t"
+ "ssm psr.i\n\t"
+ "nop 0\n\t"
+ "nop 0\n\t"
+ "}\n\t"
+ NOP_BUNDLE
+ NOP_BUNDLE
+ NOP_BUNDLE
+ NOP_BUNDLE,
+ PARAVIRT_BNDL_SSM_I):::
+ "r8", "r9", "r10",
+ "p6", "p7",
+ "memory");
+}
+
+static inline void
+paravirt_rsm_i(void)
+{
+ /* two bundles */
+ asm volatile (paravirt_alt_bundle("{\n\t"
+ "rsm psr.i\n\t"
+ "nop 0\n\t"
+ "nop 0\n\t"
+ "}\n\t"
+ NOP_BUNDLE,
+ PARAVIRT_BNDL_RSM_I):::
+ "r8", "r9",
+ "memory");
+}
+
+static inline unsigned long
+paravirt_get_psr_i(void)
+{
+ register unsigned long psr_i asm ("r8");
+ register unsigned long mask asm ("r9");
+
+ /* three bundles */
+ asm volatile (paravirt_alt_bundle("{\n\t"
+ "mov %0=psr\n\t"
+ "mov %1=%2\n\t"
+ ";;\n\t"
+ "and %0=%0,%1\n\t"
+ "}\n\t"
+ NOP_BUNDLE
+ NOP_BUNDLE,
+ PARAVIRT_BNDL_GET_PSR_I):
+ "=r"(psr_i),
+ "=r"(mask)
+ :
+ "i"(IA64_PSR_I)
+ :
+ /* "r8", "r9", */
+ "p6");
+ return psr_i;
+}
+
+static inline void
+paravirt_intrin_local_irq_restore(unsigned long flags)
+{
+ register unsigned long __flags asm ("r8") = flags;
+
+ /* six bundles */
+ asm volatile (paravirt_alt_bundle(";;\n\t"
+ "{\n\t"
+ "cmp.ne p6,p7=%0,r0;;\n\t"
+ "(p6) ssm psr.i;\n\t"
+ "nop 0\n\t"
+ "}\n\t"
+ "{\n\t"
+ "(p7) rsm psr.i;;\n\t"
+ "(p6) srlz.d\n\t"
+ "nop 0\n\t"
+ "}\n\t"
+ NOP_BUNDLE
+ NOP_BUNDLE
+ NOP_BUNDLE
+ NOP_BUNDLE,
+ PARAVIRT_BNDL_INTRIN_LOCAL_IRQ_RESTORE)::
+ "r"(__flags) :
+ /* "r8",*/ "r9", "r10", "r11",
+ "p6", "p7", "p8", "p9",
+ "memory");
+
+}
+
+#undef NOP_BUNDLE
+
+#endif /* ASM_SUPPORTED */
+
+static inline void
+paravirt_ssm(unsigned long mask)
+{
+ if (mask == IA64_PSR_I)
+ paravirt_ssm_i();
+ else
+ native_ssm(mask);
+}
+
+static inline void
+paravirt_rsm(unsigned long mask)
+{
+ if (mask == IA64_PSR_I)
+ paravirt_rsm_i();
+ else
+ native_rsm(mask);
+}
+
+#if defined(ASM_SUPPORTED) && defined(CONFIG_PARAVIRT_ALT)
+
+#define IA64_PARAVIRTUALIZED_PRIVOP
+
+#define
ia64_fc(addr) paravirt_fc(addr)
+#define ia64_thash(addr) paravirt_thash(addr)
+#define ia64_get_cpuid(i) paravirt_get_cpuid(i)
+#define ia64_get_pmd(i) paravirt_get_pmd(i)
+#define ia64_ptcga(addr, size) paravirt_ptcga((addr), (size))
+#define ia64_set_rr(index, val) paravirt_set_rr((index), (val))
+#define ia64_get_rr(index) paravirt_get_rr(index)
+#define ia64_getreg(regnum) paravirt_getreg(regnum)
+#define ia64_setreg(regnum, val) paravirt_setreg((regnum), (val))
+#define ia64_set_rr0_to_rr4(val0, val1, val2, val3, val4) \
+ paravirt_set_rr0_to_rr4((val0), (val1), (val2), (val3), (val4))
+
+#define ia64_ssm(mask) paravirt_ssm(mask)
+#define ia64_rsm(mask) paravirt_rsm(mask)
+#define ia64_get_psr_i() paravirt_get_psr_i()
+#define ia64_intrin_local_irq_restore(x) \
+ paravirt_intrin_local_irq_restore(x)
+
+/* the remainder of these are not performance-sensitive, so it's
+ * OK not to paravirtualize them and just take a privop trap and emulate */
+#define ia64_hint native_hint
+#define ia64_set_pmd native_set_pmd
+#define ia64_itci native_itci
+#define ia64_itcd native_itcd
+#define ia64_itri native_itri
+#define ia64_itrd native_itrd
+#define ia64_tpa native_tpa
+#define ia64_set_ibr native_set_ibr
+#define ia64_set_pkr native_set_pkr
+#define ia64_set_pmc native_set_pmc
+#define ia64_get_ibr native_get_ibr
+#define ia64_get_pkr native_get_pkr
+#define ia64_get_pmc native_get_pmc
+#define ia64_ptce native_ptce
+#define ia64_ptcl native_ptcl
+#define ia64_ptri native_ptri
+#define ia64_ptrd native_ptrd
+
+#endif /* ASM_SUPPORTED && CONFIG_PARAVIRT_ALT */
+
+#endif /* __ASSEMBLER__ */
+
+/* these routines utilize privilege-sensitive or performance-sensitive
+ * privileged instructions so the code must be replaced with
+ * paravirtualized versions */
+#ifdef CONFIG_PARAVIRT_ENTRY
+#define IA64_PARAVIRTUALIZED_ENTRY
+#define ia64_switch_to paravirt_switch_to
+#define ia64_work_processed_syscall paravirt_work_processed_syscall
+#define ia64_leave_syscall paravirt_leave_syscall
+#define ia64_leave_kernel paravirt_leave_kernel
+#define ia64_pal_call_static paravirt_pal_call_static
+#endif /* CONFIG_PARAVIRT_ENTRY */
+
+#endif /* _ASM_IA64_PRIVOP_PARAVIRT_H */ -- 1.5.3
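As promised above, a rough sketch of a pv-side patcher for the tagged intrinsics. Everything xen_-prefixed here is a hypothetical stand-in (the real patcher belongs to the xen-specific patches later in the series); the point is the shape: because every intrinsic above pins its operands to fixed registers (r8, r9, ...), the callback can substitute instructions in place without any register re-allocation:

	#include <asm/paravirt_core.h>
	#include <asm/paravirt_alt.h>
	#include <asm/privop_paravirt.h>

	/* hypothetical pre-encoded replacement slot for "mov r8=psr" */
	extern const cmp_inst_t xen_get_psr_inst;

	static unsigned long __init
	xen_alt_inst_patch(unsigned long stag, unsigned long etag,
			   unsigned long type)
	{
		unsigned long tag;

		for (tag = stag; tag < etag; tag = paravirt_get_next_tag(tag)) {
			if (type == PARAVIRT_INST_GET_PSR)
				/* the result lands in r8 either way, so the
				 * slot can simply be overwritten */
				paravirt_write_inst(tag, xen_get_psr_inst);
			/* other PARAVIRT_INST_* cases and the i-cache flush
			 * (as done in paravirt_entry.c) elided */
		}
		return tag;
	}

	void __init
	xen_patch_insts(void)		/* hypothetical boot-time caller */
	{
		extern struct paravirt_alt_inst_patch __start_paravirt_insts[];
		extern struct paravirt_alt_inst_patch __stop_paravirt_insts[];

		paravirt_alt_inst_patch_apply(__start_paravirt_insts,
					      __stop_paravirt_insts,
					      &xen_alt_inst_patch);
	}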
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 20/50] ia64/pv_ops: paravirtualized instructions for hand written assembly code on native.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/inst_native.h | 183 ++++++++++++++++++++++++++++++++++++++++ 1 files changed, 183 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/kernel/inst_native.h diff --git a/arch/ia64/kernel/inst_native.h b/arch/ia64/kernel/inst_native.h new file mode 100644 index 0000000..9453083 --- /dev/null +++ b/arch/ia64/kernel/inst_native.h @@ -0,0 +1,183 @@ +/****************************************************************************** + * inst_native.h + * + * Copyright (c) 2008 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#define IA64_ASM_PARAVIRTUALIZED_NATIVE + +#undef BR_IF_NATIVE +#define BR_IF_NATIVE(targ, reg, pred) /* nothing */ + +#ifdef CONFIG_PARAVIRT_GUEST_ASM_CLOBBER_CHECK +# define PARAVIRT_POISON 0xdeadbeefbaadf00d +# define CLOBBER(clob) \ + ;; \ + movl clob = PARAVIRT_POISON; \ + ;; +#else +# define CLOBBER(clob) /* nothing */ +#endif + +#define __paravirt_switch_to native_switch_to +#define __paravirt_leave_syscall native_leave_syscall +#define __paravirt_work_processed_syscall native_work_processed_syscall +#define __paravirt_leave_kernel native_leave_kernel +#define __paravirt_pending_syscall_end ia64_work_pending_syscall_end +#define __paravirt_work_processed_syscall_target \ + ia64_work_processed_syscall + +#define MOV_FROM_IFA(reg) \ + mov reg = cr.ifa + +#define MOV_FROM_ITIR(reg) \ + mov reg = cr.itir + +#define MOV_FROM_ISR(reg) \ + mov reg = cr.isr + +#define MOV_FROM_IHA(reg) \ + mov reg = cr.iha + +#define MOV_FROM_IPSR(reg) \ + mov reg = cr.ipsr + +#define MOV_FROM_IIM(reg) \ + mov reg = cr.iim + +#define MOV_FROM_IIP(reg) \ + mov reg = cr.iip + +#define MOV_FROM_IVR(reg, clob) \ + mov reg = cr.ivr \ + CLOBBER(clob) + +#define MOV_FROM_PSR(pred, reg, clob) \ +(pred) mov reg = psr \ + CLOBBER(clob) + +#define MOV_TO_IFA(reg, clob) \ + mov cr.ifa = reg \ + CLOBBER(clob) + +#define MOV_TO_ITIR(pred, reg, clob) \ +(pred) mov cr.itir = reg \ + CLOBBER(clob) + +#define MOV_TO_IHA(pred, reg, clob) \ +(pred) mov cr.iha = reg \ + CLOBBER(clob) + +#define MOV_TO_IPSR(reg, clob) \ + mov cr.ipsr = reg \ + CLOBBER(clob) + +#define MOV_TO_IFS(pred, reg, clob) \ +(pred) mov cr.ifs = reg \ + CLOBBER(clob) + +#define MOV_TO_IIP(reg, clob) \ + mov cr.iip = reg \ + CLOBBER(clob) + +#define MOV_TO_KR(kr, reg, clob0, clob1) \ + mov IA64_KR(kr) = reg \ + CLOBBER(clob0) \ + CLOBBER(clob1) + +#define ITC_I(pred, reg, clob) \ +(pred) itc.i reg \ + CLOBBER(clob) + +#define ITC_D(pred, reg, clob) \ +(pred) itc.d reg \ + CLOBBER(clob) + +#define ITC_I_AND_D(pred_i, pred_d, reg, clob) \ +(pred_i) itc.i reg; \ +(pred_d) itc.d reg \ + CLOBBER(clob) + +#define THASH(pred, reg0, reg1, clob) \ +(pred) thash reg0 = reg1 \ + CLOBBER(clob) + +#define 
SSM_PSR_IC_AND_DEFAULT_BITS(clob0, clob1) \
+ ssm psr.ic | PSR_DEFAULT_BITS \
+ CLOBBER(clob0) \
+ CLOBBER(clob1) \
+ ;; \
+ srlz.i /* guarantee that interruption collection is on */ \
+ ;;
+
+#define SSM_PSR_IC_AND_SRLZ_D(clob0, clob1) \
+ ssm psr.ic \
+ CLOBBER(clob0) \
+ CLOBBER(clob1) \
+ ;; \
+ srlz.d
+
+#define RSM_PSR_IC(clob) \
+ rsm psr.ic \
+ CLOBBER(clob)
+
+#define SSM_PSR_I(pred, clob) \
+(pred) ssm psr.i \
+ CLOBBER(clob)
+
+#define RSM_PSR_I(pred, clob0, clob1) \
+(pred) rsm psr.i \
+ CLOBBER(clob0) \
+ CLOBBER(clob1)
+
+#define RSM_PSR_I_IC(clob0, clob1, clob2) \
+ rsm psr.i | psr.ic \
+ CLOBBER(clob0) \
+ CLOBBER(clob1) \
+ CLOBBER(clob2)
+
+#define RSM_PSR_DT \
+ rsm psr.dt
+
+#define RSM_PSR_DT_AND_SRLZ_I \
+ rsm psr.dt \
+ ;; \
+ srlz.i
+
+#define SSM_PSR_DT_AND_SRLZ_I \
+ ssm psr.dt \
+ ;; \
+ srlz.i
+
+#define BSW_0(clob0, clob1, clob2) \
+ bsw.0 \
+ CLOBBER(clob0) \
+ CLOBBER(clob1) \
+ CLOBBER(clob2)
+
+#define BSW_1(clob0, clob1) \
+ bsw.1 \
+ CLOBBER(clob0) \
+ CLOBBER(clob1)
+
+#define COVER \
+ cover
+
+#define RFI \
+ rfi -- 1.5.3
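A note on the CLOBBER() hook above: the extra clob arguments that MOV_FROM_IVR() and friends take name the scratch registers a pv flavor of the same macro is allowed to destroy. On native they normally expand to nothing, but with CONFIG_PARAVIRT_GUEST_ASM_CLOBBER_CHECK the native build poisons them instead, so a caller that wrongly relies on a "clobberable" register surviving breaks on native too, not only under a pv instance. For example, MOV_FROM_IVR(r8, r9) then assembles to roughly:

	mov r8 = cr.ivr
	;;
	movl r9 = 0xdeadbeefbaadf00d	// PARAVIRT_POISON
	;;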
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 21/50] ia64/pv_ops: header file to switch paravirtualized assembly instructions.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/inst_paravirt.h | 28 ++++++++++++++++++++++++++++ 1 files changed, 28 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/kernel/inst_paravirt.h diff --git a/arch/ia64/kernel/inst_paravirt.h b/arch/ia64/kernel/inst_paravirt.h new file mode 100644 index 0000000..689c343 --- /dev/null +++ b/arch/ia64/kernel/inst_paravirt.h @@ -0,0 +1,28 @@ +/****************************************************************************** + * linux/arch/ia64/xen/inst_paravirt.h + * + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifdef __IA64_ASM_PARAVIRTUALIZED_XEN +#include "../xen/inst_xen.h" +#include "../xen/xenminstate.h" +#else +#include "inst_native.h" +#endif -- 1.5.3
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 22/50] ia64/pv_ops: paravirtualize minstate.h.
This isn't strictly necessary, because xen defines its own DO_SAVE_MIN, but it makes the difference between the two smaller; the per-flavor expansion of the new macros is sketched after this patch. Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/minstate.h | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/ia64/kernel/minstate.h b/arch/ia64/kernel/minstate.h index fc99141..10a412c 100644 --- a/arch/ia64/kernel/minstate.h +++ b/arch/ia64/kernel/minstate.h @@ -34,9 +34,9 @@ mov r27=ar.rsc; /* M */ \ mov r20=r1; /* A */ \ mov r25=ar.unat; /* M */ \ - mov r29=cr.ipsr; /* M */ \ + MOV_FROM_IPSR(r29); /* M */ \ mov r26=ar.pfs; /* I */ \ - mov r28=cr.iip; /* M */ \ + MOV_FROM_IIP(r28); /* M */ \ mov r21=ar.fpsr; /* M */ \ __COVER; /* B;; (or nothing) */ \ ;; \ -- 1.5.3
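Per flavor, the same DO_SAVE_MIN line now expands differently. The native side below is verbatim from inst_native.h (patch 20); the xen side is only an assumption here, since inst_xen.h is not part of this excerpt:

	/* shared source in minstate.h: */
	MOV_FROM_IPSR(r29)

	/* native object, via inst_native.h: */
	#define MOV_FROM_IPSR(reg)	mov reg = cr.ipsr

	/* xen object, via inst_xen.h (assumed form): a non-privileged
	 * read of the psr image maintained by the hypervisor, instead
	 * of the privileged cr.ipsr access */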
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 23/50] ia64/pv_ops: paravirtualize arch/ia64/kernel/switch_leave.S
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/switch_leave.S | 80 +++++++++++++++++++++----------------- 1 files changed, 44 insertions(+), 36 deletions(-) diff --git a/arch/ia64/kernel/switch_leave.S b/arch/ia64/kernel/switch_leave.S index 9918160..d6d0f08 100644 --- a/arch/ia64/kernel/switch_leave.S +++ b/arch/ia64/kernel/switch_leave.S @@ -44,16 +44,17 @@ #include <asm/pgtable.h> #include <asm/thread_info.h> +#include "inst_paravirt.h" #include "minstate.h" - /* * prev_task <- ia64_switch_to(struct task_struct *next) * With Ingo's new scheduler, interrupts are disabled when this routine gets * called. The code starting at .map relies on this. The rest of the code * doesn't care about the interrupt masking status. */ -GLOBAL_ENTRY(native_switch_to) +GLOBAL_ENTRY(__paravirt_switch_to) + BR_IF_NATIVE(native_switch_to, r22, p7) .prologue alloc r16=ar.pfs,1,0,0,0 DO_SAVE_SWITCH_STACK @@ -77,7 +78,7 @@ GLOBAL_ENTRY(native_switch_to) ;; .done: ld8 sp=[r21] // load kernel stack pointer of new task - mov IA64_KR(CURRENT)=in0 // update "current" application register + MOV_TO_KR(CURRENT, in0, r8, r9) // update "current" application register mov r8=r13 // return pointer to previously running task mov r13=in0 // set "current" pointer ;; @@ -89,25 +90,30 @@ GLOBAL_ENTRY(native_switch_to) br.ret.sptk.many rp // boogie on out in new context .map: - rsm psr.ic // interrupts (psr.i) are already disabled here + RSM_PSR_IC(r25) // interrupts (psr.i) are already disabled here movl r25=PAGE_KERNEL ;; srlz.d or r23=r25,r20 // construct PA | page properties mov r25=IA64_GRANULE_SHIFT<<2 ;; - mov cr.itir=r25 - mov cr.ifa=in0 // VA of next task... + MOV_TO_ITIR(p0, r25, r8) + MOV_TO_IFA(in0, r8) // VA of next task... ;; mov r25=IA64_TR_CURRENT_STACK - mov IA64_KR(CURRENT_STACK)=r26 // remember last page we mapped... + MOV_TO_KR(CURRENT_STACK, r26, r8, r9) // remember last page we mapped... ;; itr.d dtr[r25]=r23 // wire in new mapping... - ssm psr.ic // reenable the psr.ic bit - ;; - srlz.d + SSM_PSR_IC_AND_SRLZ_D(r8, r9) // reenable the psr.ic bit br.cond.sptk .done -END(native_switch_to) +END(__paravirt_switch_to) + +#ifdef IA64_ASM_PARAVIRTUALIZED_XEN +GLOBAL_ENTRY(xen_work_processed_syscall_with_check) + BR_IF_NATIVE(native_work_processed_syscall, r2, p7) + br.cond.sptk xen_work_processed_syscall +END(xen_work_processed_syscall_with_check) +#endif /* IA64_ASM_PARAVIRTUALIZED_XEN */ /* * ia64_leave_syscall(): Same as ia64_leave_kernel, except that it doesn't @@ -153,8 +159,9 @@ END(native_switch_to) * ar.csd: cleared * ar.ssd: cleared */ -GLOBAL_ENTRY(native_leave_syscall) +GLOBAL_ENTRY(__paravirt_leave_syscall) PT_REGS_UNWIND_INFO(0) + BR_IF_NATIVE(native_leave_syscall, r22, p7) /* * work.need_resched etc. mustn't get changed by this CPU before it returns to * user- or fsys-mode, hence we disable interrupts early on. @@ -177,12 +184,12 @@ GLOBAL_ENTRY(native_leave_syscall) ;; cmp.eq p6,p0=r21,r0 // p6 <- pUStk || (preempt_count == 0) #else /* !CONFIG_PREEMPT */ -(pUStk) rsm psr.i + RSM_PSR_I(pUStk, r2, r18) cmp.eq pLvSys,p0=r0,r0 // pLvSys=1: leave from syscall (pUStk) cmp.eq.unc p6,p0=r0,r0 // p6 <- pUStk #endif -.global native_work_processed_syscall; -native_work_processed_syscall: +.global __paravirt_work_processed_syscall; +__paravirt_work_processed_syscall: adds r2=PT(LOADRS)+16,r12 adds r3=PT(AR_BSPSTORE)+16,r12 adds r18=TI_FLAGS+IA64_TASK_SIZE,r13 @@ -205,7 +212,7 @@ native_work_processed_syscall: (pNonSys) break 0 // bug check: we shouldn't be here if pNonSys is TRUE! 
;; invala // M0|1 invalidate ALAT - rsm psr.i | psr.ic // M2 turn off interrupts and interruption collection + RSM_PSR_I_IC(r28, r29, r30) // M2 turn off interrupts and interruption collection cmp.eq p9,p0=r0,r0 // A set p9 to indicate that we should restore cr.ifs ld8 r29=[r2],16 // M0|1 load cr.ipsr @@ -217,7 +224,7 @@ native_work_processed_syscall: (pUStk) add r14=IA64_TASK_THREAD_ON_USTACK_OFFSET,r13 ;; ld8 r26=[r2],PT(B0)-PT(AR_PFS) // M0|1 load ar.pfs -(pKStk) mov r22=psr // M2 read PSR now that interrupts are disabled + MOV_FROM_PSR(pKStk, r22, r21) // M2 read PSR now that interrupts are disabled nop 0 ;; ld8 r21=[r2],PT(AR_RNAT)-PT(B0) // M0|1 load b0 @@ -246,7 +253,7 @@ native_work_processed_syscall: srlz.d // M0 ensure interruption collection is off (for cover) shr.u r18=r19,16 // I0|1 get byte size of existing "dirty" partition - cover // B add current frame into dirty partition & set cr.ifs + COVER // B add current frame into dirty partition & set cr.ifs ;; mov r19=ar.bsp // M2 get new backing store pointer mov f10=f0 // F clear f10 @@ -261,10 +268,11 @@ native_work_processed_syscall: mov.m ar.ssd=r0 // M2 clear ar.ssd mov f11=f0 // F clear f11 br.cond.sptk.many rbs_switch // B -END(native_leave_syscall) +END(__paravirt_leave_syscall) -GLOBAL_ENTRY(native_leave_kernel) +GLOBAL_ENTRY(__paravirt_leave_kernel) PT_REGS_UNWIND_INFO(0) + BR_IF_NATIVE(native_leave_kernel, r22, p7) /* * work.need_resched etc. mustn't get changed by this CPU before it returns to * user- or fsys-mode, hence we disable interrupts early on. @@ -287,7 +295,7 @@ GLOBAL_ENTRY(native_leave_kernel) ;; cmp.eq p6,p0=r21,r0 // p6 <- pUStk || (preempt_count == 0) #else -(pUStk) rsm psr.i + RSM_PSR_I(pUStk, r17, r31) cmp.eq p0,pLvSys=r0,r0 // pLvSys=0: leave from kernel (pUStk) cmp.eq.unc p6,p0=r0,r0 // p6 <- pUStk #endif @@ -335,7 +343,7 @@ GLOBAL_ENTRY(native_leave_kernel) mov ar.csd=r30 mov ar.ssd=r31 ;; - rsm psr.i | psr.ic // initiate turning off of interrupt and interruption collection + RSM_PSR_I_IC(r23, r22, r25) // initiate turning off of interrupt and interruption collection invala // invalidate ALAT ;; ld8.fill r22=[r2],24 @@ -367,13 +375,13 @@ GLOBAL_ENTRY(native_leave_kernel) mov ar.ccv=r15 ;; ldf.fill f11=[r2] - bsw.0 // switch back to bank 0 (no stop bit required beforehand...) + BSW_0(r2, r3, r15) // switch back to bank 0 (no stop bit required beforehand...) ;; (pUStk) mov r18=IA64_KR(CURRENT)// M2 (12 cycle read latency) adds r16=PT(CR_IPSR)+16,r12 adds r17=PT(CR_IIP)+16,r12 -(pKStk) mov r22=psr // M2 read PSR now that interrupts are disabled + MOV_FROM_PSR(pKStk, r22, r29) // M2 read PSR now that interrupts are disabled nop.i 0 nop.i 0 ;; @@ -421,7 +429,7 @@ GLOBAL_ENTRY(native_leave_kernel) * NOTE: alloc, loadrs, and cover can't be predicated. 
*/ (pNonSys) br.cond.dpnt dont_preserve_current_frame - cover // add current frame into dirty partition and set cr.ifs + COVER // add current frame into dirty partition and set cr.ifs ;; mov r19=ar.bsp // get new backing store pointer rbs_switch: @@ -524,16 +532,16 @@ skip_rbs_switch: (pKStk) dep r29=r22,r29,21,1 // I0 update ipsr.pp with psr.pp (pLvSys)mov r16=r0 // A clear r16 for leave_syscall, no-op otherwise ;; - mov cr.ipsr=r29 // M2 + MOV_TO_IPSR(r29, r25) // M2 mov ar.pfs=r26 // I0 (pLvSys)mov r17=r0 // A clear r17 for leave_syscall, no-op otherwise -(p9) mov cr.ifs=r30 // M2 + MOV_TO_IFS(p9, r30, r25)// M2 mov b0=r21 // I0 (pLvSys)mov r18=r0 // A clear r18 for leave_syscall, no-op otherwise mov ar.fpsr=r20 // M2 - mov cr.iip=r28 // M2 + MOV_TO_IIP(r28, r25) // M2 nop 0 ;; (pUStk) mov ar.rnat=r24 // M2 must happen with RSE in lazy mode @@ -542,7 +550,7 @@ skip_rbs_switch: mov ar.rsc=r27 // M2 mov pr=r31,-1 // I0 - rfi // B + RFI // B /* * On entry: @@ -568,28 +576,28 @@ skip_rbs_switch: #endif br.call.spnt.many rp=schedule .ret9: cmp.eq p6,p0=r0,r0 // p6 <- 1 - rsm psr.i // disable interrupts + RSM_PSR_I(p0, r2, r20) // disable interrupts ;; #ifdef CONFIG_PREEMPT (pKStk) adds r20=TI_PRE_COUNT+IA64_TASK_SIZE,r13 ;; (pKStk) st4 [r20]=r0 // preempt_count() <- 0 #endif -(pLvSys)br.cond.sptk.few ia64_work_pending_syscall_end +(pLvSys)br.cond.sptk.few __paravirt_pending_syscall_end br.cond.sptk.many .work_processed_kernel // re-check .notify: (pUStk) br.call.spnt.many rp=notify_resume_user .ret10: cmp.ne p6,p0=r0,r0 // p6 <- 0 -(pLvSys)br.cond.sptk.few ia64_work_pending_syscall_end +(pLvSys)br.cond.sptk.few __paravirt_pending_syscall_end br.cond.sptk.many .work_processed_kernel // don't re-check -.global ia64_work_pending_syscall_end; -ia64_work_pending_syscall_end: +.global __paravirt_pending_syscall_end; +__paravirt_pending_syscall_end: adds r2=PT(R8)+16,r12 adds r3=PT(R10)+16,r12 ;; ld8 r8=[r2] ld8 r10=[r3] - br.cond.sptk.many ia64_work_processed_syscall // re-check -END(native_leave_kernel) + br.cond.sptk.many __paravirt_work_processed_syscall_target // re-check +END(__paravirt_leave_kernel) -- 1.5.3
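The pattern this patch establishes is worth spelling out: the common assembly body is now compiled once per flavor, and the flavor built for paravirtualized kernels opens each entry point with BR_IF_NATIVE, so the same kernel image can still boot on bare metal by tailing off to the native_* copy. The C sketch below is only an analogy for that control flow, not code from the series; xen_switch_to_body() is a hypothetical stand-in for the paravirtualized body, and is_running_on_xen() is the flavor predicate used elsewhere in the series.

/* C analogy of the BR_IF_NATIVE prologue above -- sketch only;
 * the real test is an assembler macro, not a function call. */
struct task_struct;
extern int is_running_on_xen(void);
extern struct task_struct *native_switch_to(struct task_struct *next);
extern struct task_struct *xen_switch_to_body(struct task_struct *next); /* hypothetical */

struct task_struct *__paravirt_switch_to(struct task_struct *next)
{
	if (!is_running_on_xen())	/* BR_IF_NATIVE(native_switch_to, r22, p7) */
		return native_switch_to(next);
	return xen_switch_to_body(next);	/* paravirtualized body */
}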
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 24/50] ia64/pv_ops: paravirtualize arch/ia64/kernel/ivt.S.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/ivt.S | 153 ++++++++++++++++++++++++----------------------- 1 files changed, 78 insertions(+), 75 deletions(-) diff --git a/arch/ia64/kernel/ivt.S b/arch/ia64/kernel/ivt.S index 34f44d8..d1cebe5 100644 --- a/arch/ia64/kernel/ivt.S +++ b/arch/ia64/kernel/ivt.S @@ -12,6 +12,13 @@ * * 00/08/23 Asit Mallick <asit.k.mallick at intel.com> TLB handling for SMP * 00/12/20 David Mosberger-Tang <davidm at hpl.hp.com> DTLB/ITLB handler now uses virtual PT. + * + * Copyright (C) 2005 Hewlett-Packard Co + * Dan Magenheimer <dan.magenheimer at hp.com> + * Xen paravirtualization + * Copyright (c) 2008 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * pv_ops. */ /* * This file defines the interruption vector table used by the CPU. @@ -68,6 +75,7 @@ # define DBG_FAULT(i) #endif +#include "inst_paravirt.h" #include "minstate.h" #define FAULT(n) \ @@ -102,13 +110,13 @@ ENTRY(vhpt_miss) * - the faulting virtual address uses unimplemented address bits * - the faulting virtual address has no valid page table mapping */ - mov r16=cr.ifa // get address that caused the TLB miss + MOV_FROM_IFA(r16) // get address that caused the TLB miss #ifdef CONFIG_HUGETLB_PAGE movl r18=PAGE_SHIFT - mov r25=cr.itir + MOV_FROM_ITIR(r25) #endif ;; - rsm psr.dt // use physical addressing for data + RSM_PSR_DT // use physical addressing for data mov r31=pr // save the predicate registers mov r19=IA64_KR(PT_BASE) // get page table base address shl r21=r16,3 // shift bit 60 into sign bit @@ -168,21 +176,20 @@ ENTRY(vhpt_miss) dep r21=r19,r20,3,(PAGE_SHIFT-3) // r21=pte_offset(pmd,addr) ;; (p7) ld8 r18=[r21] // read *pte - mov r19=cr.isr // cr.isr bit 32 tells us if this is an insn miss + MOV_FROM_ISR(r19) // cr.isr bit 32 tells us if this is an insn miss ;; (p7) tbit.z p6,p7=r18,_PAGE_P_BIT // page present bit cleared? - mov r22=cr.iha // get the VHPT address that caused the TLB miss + MOV_FROM_IHA(r22) // get the VHPT address that caused the TLB miss ;; // avoid RAW on p7 (p7) tbit.nz.unc p10,p11=r19,32 // is it an instruction TLB miss? dep r23=0,r20,0,PAGE_SHIFT // clear low bits to get page address ;; -(p10) itc.i r18 // insert the instruction TLB entry -(p11) itc.d r18 // insert the data TLB entry + ITC_I_AND_D(p10, p11, r18, r24) // insert the instruction TLB entry and + // insert the data TLB entry (p6) br.cond.spnt.many page_fault // handle bad address/page not present (page fault) - mov cr.ifa=r22 - + MOV_TO_IFA(r22, r24) #ifdef CONFIG_HUGETLB_PAGE -(p8) mov cr.itir=r25 // change to default page-size for VHPT + MOV_TO_ITIR(p8, r25, r24) // change to default page-size for VHPT #endif /* @@ -192,7 +199,7 @@ ENTRY(vhpt_miss) */ adds r24=__DIRTY_BITS_NO_ED|_PAGE_PL_0|_PAGE_AR_RW,r23 ;; -(p7) itc.d r24 + ITC_D(p7, r24, r25) ;; #ifdef CONFIG_SMP /* @@ -234,7 +241,7 @@ ENTRY(vhpt_miss) #endif mov pr=r31,-1 // restore predicate registers - rfi + RFI END(vhpt_miss) .org ia64_ivt+0x400 @@ -248,11 +255,11 @@ ENTRY(itlb_miss) * mode, walk the page table, and then re-execute the PTE read and * go on normally after that. 
*/ - mov r16=cr.ifa // get virtual address + MOV_FROM_IFA(r16) // get virtual address mov r29=b0 // save b0 mov r31=pr // save predicates .itlb_fault: - mov r17=cr.iha // get virtual address of PTE + MOV_FROM_IHA(r17) // get virtual address of PTE movl r30=1f // load nested fault continuation point ;; 1: ld8 r18=[r17] // read *pte @@ -261,7 +268,7 @@ ENTRY(itlb_miss) tbit.z p6,p0=r18,_PAGE_P_BIT // page present bit cleared? (p6) br.cond.spnt page_fault ;; - itc.i r18 + ITC_I(p0, r18, r19) ;; #ifdef CONFIG_SMP /* @@ -278,7 +285,7 @@ ENTRY(itlb_miss) (p7) ptc.l r16,r20 #endif mov pr=r31,-1 - rfi + RFI END(itlb_miss) .org ia64_ivt+0x0800 @@ -292,11 +299,11 @@ ENTRY(dtlb_miss) * mode, walk the page table, and then re-execute the PTE read and * go on normally after that. */ - mov r16=cr.ifa // get virtual address + MOV_FROM_IFA(r16) // get virtual address mov r29=b0 // save b0 mov r31=pr // save predicates dtlb_fault: - mov r17=cr.iha // get virtual address of PTE + MOV_FROM_IHA(r17) // get virtual address of PTE movl r30=1f // load nested fault continuation point ;; 1: ld8 r18=[r17] // read *pte @@ -305,7 +312,7 @@ dtlb_fault: tbit.z p6,p0=r18,_PAGE_P_BIT // page present bit cleared? (p6) br.cond.spnt page_fault ;; - itc.d r18 + ITC_D(p0, r18, r19) ;; #ifdef CONFIG_SMP /* @@ -322,7 +329,7 @@ dtlb_fault: (p7) ptc.l r16,r20 #endif mov pr=r31,-1 - rfi + RFI END(dtlb_miss) .org ia64_ivt+0x0c00 @@ -330,9 +337,9 @@ END(dtlb_miss) // 0x0c00 Entry 3 (size 64 bundles) Alt ITLB (19) ENTRY(alt_itlb_miss) DBG_FAULT(3) - mov r16=cr.ifa // get address that caused the TLB miss + MOV_FROM_IFA(r16) // get address that caused the TLB miss movl r17=PAGE_KERNEL - mov r21=cr.ipsr + MOV_FROM_IPSR(r21) movl r19=(((1 << IA64_MAX_PHYS_BITS) - 1) & ~0xfff) mov r31=pr ;; @@ -341,9 +348,9 @@ ENTRY(alt_itlb_miss) ;; cmp.gt p8,p0=6,r22 // user mode ;; -(p8) thash r17=r16 + THASH(p8, r17, r16, r23) ;; -(p8) mov cr.iha=r17 + MOV_TO_IHA(p8, r17, r23) (p8) mov r29=b0 // save b0 (p8) br.cond.dptk .itlb_fault #endif @@ -358,9 +365,9 @@ ENTRY(alt_itlb_miss) or r19=r19,r18 // set bit 4 (uncached) if the access was to region 6 (p8) br.cond.spnt page_fault ;; - itc.i r19 // insert the TLB entry + ITC_I(p0, r19, r18) // insert the TLB entry mov pr=r31,-1 - rfi + RFI END(alt_itlb_miss) .org ia64_ivt+0x1000 @@ -368,11 +375,11 @@ END(alt_itlb_miss) // 0x1000 Entry 4 (size 64 bundles) Alt DTLB (7,46) ENTRY(alt_dtlb_miss) DBG_FAULT(4) - mov r16=cr.ifa // get address that caused the TLB miss + MOV_FROM_IFA(r16) // get address that caused the TLB miss movl r17=PAGE_KERNEL - mov r20=cr.isr + MOV_FROM_ISR(r20) movl r19=(((1 << IA64_MAX_PHYS_BITS) - 1) & ~0xfff) - mov r21=cr.ipsr + MOV_FROM_IPSR(r21) mov r31=pr mov r24=PERCPU_ADDR ;; @@ -381,9 +388,9 @@ ENTRY(alt_dtlb_miss) ;; cmp.gt p8,p0=6,r22 // access to region 0-5 ;; -(p8) thash r17=r16 + THASH(p8, r17, r16, r25) ;; -(p8) mov cr.iha=r17 + MOV_TO_IHA(r17, r25) (p8) mov r29=b0 // save b0 (p8) br.cond.dptk dtlb_fault #endif @@ -402,7 +409,7 @@ ENTRY(alt_dtlb_miss) tbit.nz p9,p0=r20,IA64_ISR_NA_BIT // is non-access bit on? 
;; (p10) sub r19=r19,r26 -(p10) mov cr.itir=r25 + MOV_TO_ITIR(p10, r25, r24) cmp.ne p8,p0=r0,r23 (p9) cmp.eq.or.andcm p6,p7=IA64_ISR_CODE_LFETCH,r22 // check isr.code field (p12) dep r17=-1,r17,4,1 // set ma=UC for region 6 addr @@ -413,9 +420,9 @@ ENTRY(alt_dtlb_miss) or r19=r19,r17 // insert PTE control bits into r19 (p6) mov cr.ipsr=r21 ;; -(p7) itc.d r19 // insert the TLB entry + ITC_D(p7, r19, r18) // insert the TLB entry mov pr=r31,-1 - rfi + RFI END(alt_dtlb_miss) .org ia64_ivt+0x1400 @@ -444,10 +451,10 @@ ENTRY(nested_dtlb_miss) * * Clobbered: b0, r18, r19, r21, r22, psr.dt (cleared) */ - rsm psr.dt // switch to using physical data addressing + RSM_PSR_DT_AND_SRLZ_I // switch to using physical data addressing mov r19=IA64_KR(PT_BASE) // get the page table base address shl r21=r16,3 // shift bit 60 into sign bit - mov r18=cr.itir + MOV_FROM_ITIR(r18) ;; shr.u r17=r16,61 // get the region number into r17 extr.u r18=r18,2,6 // get the faulting page size @@ -510,21 +517,15 @@ END(ikey_miss) //----------------------------------------------------------------------------------- // call do_page_fault (predicates are in r31, psr.dt may be off, r16 is faulting address) ENTRY(page_fault) - ssm psr.dt - ;; - srlz.i + SSM_PSR_DT_AND_SRLZ_I ;; SAVE_MIN_WITH_COVER alloc r15=ar.pfs,0,0,3,0 - mov out0=cr.ifa - mov out1=cr.isr + MOV_FROM_IFA(out0) + MOV_FROM_ISR(out1) + SSM_PSR_IC_AND_DEFAULT_BITS(r14, r3) adds r3=8,r2 // set up second base pointer - ;; - ssm psr.ic | PSR_DEFAULT_BITS - ;; - srlz.i // guarantee that interruption collectin is on - ;; -(p15) ssm psr.i // restore psr.i + SSM_PSR_I(p15, r14) // restore psr.i movl r14=ia64_leave_kernel ;; SAVE_REST @@ -556,10 +557,10 @@ ENTRY(dirty_bit) * page table TLB entry isn't present, we take a nested TLB miss hit where we look * up the physical address of the L3 PTE and then continue at label 1 below. 
*/ - mov r16=cr.ifa // get the address that caused the fault + MOV_FROM_IFA(r16) // get the address that caused the fault movl r30=1f // load continuation point in case of nested fault ;; - thash r17=r16 // compute virtual address of L3 PTE + THASH(p0, r17, r16, r18) // compute virtual address of L3 PTE mov r29=b0 // save b0 in case of nested fault mov r31=pr // save pr #ifdef CONFIG_SMP @@ -576,7 +577,7 @@ ENTRY(dirty_bit) ;; (p6) cmp.eq p6,p7=r26,r18 // Only compare if page is present ;; -(p6) itc.d r25 // install updated PTE + ITC_D(p6, r25, r18) // install updated PTE ;; /* * Tell the assemblers dependency-violation checker that the above "itc" instructions @@ -602,7 +603,7 @@ ENTRY(dirty_bit) itc.d r18 // install updated PTE #endif mov pr=r31,-1 // restore pr - rfi + RFI END(dirty_bit) .org ia64_ivt+0x2400 @@ -611,7 +612,7 @@ END(dirty_bit) ENTRY(iaccess_bit) DBG_FAULT(9) // Like Entry 8, except for instruction access - mov r16=cr.ifa // get the address that caused the fault + MOV_FROM_IFA(r16) // get the address that caused the fault movl r30=1f // load continuation point in case of nested fault mov r31=pr // save predicates #ifdef CONFIG_ITANIUM @@ -626,7 +627,7 @@ ENTRY(iaccess_bit) (p6) mov r16=r18 // if so, use cr.iip instead of cr.ifa #endif /* CONFIG_ITANIUM */ ;; - thash r17=r16 // compute virtual address of L3 PTE + THASH(p0, r17, r16, r18) // compute virtual address of L3 PTE mov r29=b0 // save b0 in case of nested fault) #ifdef CONFIG_SMP mov r28=ar.ccv // save ar.ccv @@ -642,7 +643,7 @@ ENTRY(iaccess_bit) ;; (p6) cmp.eq p6,p7=r26,r18 // Only if page present ;; -(p6) itc.i r25 // install updated PTE + ITC_I(p6, r25, r26) // install updated PTE ;; /* * Tell the assemblers dependency-violation checker that the above "itc" instructions @@ -668,7 +669,7 @@ ENTRY(iaccess_bit) itc.i r18 // install updated PTE #endif /* !CONFIG_SMP */ mov pr=r31,-1 - rfi + RFI END(iaccess_bit) .org ia64_ivt+0x2800 @@ -677,10 +678,10 @@ END(iaccess_bit) ENTRY(daccess_bit) DBG_FAULT(10) // Like Entry 8, except for data access - mov r16=cr.ifa // get the address that caused the fault + MOV_FROM_IFA(r16) // get the address that caused the fault movl r30=1f // load continuation point in case of nested fault ;; - thash r17=r16 // compute virtual address of L3 PTE + THASH(p0, r17, r16, r18) // compute virtual address of L3 PTE mov r31=pr mov r29=b0 // save b0 in case of nested fault) #ifdef CONFIG_SMP @@ -697,7 +698,7 @@ ENTRY(daccess_bit) ;; (p6) cmp.eq p6,p7=r26,r18 // Only if page is present ;; -(p6) itc.d r25 // install updated PTE + ITC_D(p6, r25, r26) // install updated PTE /* * Tell the assemblers dependency-violation checker that the above "itc" instructions * cannot possibly affect the following loads: @@ -721,7 +722,7 @@ ENTRY(daccess_bit) #endif mov b0=r29 // restore b0 mov pr=r31,-1 - rfi + RFI END(daccess_bit) .org ia64_ivt+0x2c00 @@ -745,10 +746,10 @@ ENTRY(break_fault) */ DBG_FAULT(11) mov.m r16=IA64_KR(CURRENT) // M2 r16 <- current task (12 cyc) - mov r29=cr.ipsr // M2 (12 cyc) + MOV_FROM_IPSR(r29) // M2 (12 cyc) mov r31=pr // I0 (2 cyc) - mov r17=cr.iim // M2 (2 cyc) + MOV_FROM_IIM(r17) // M2 (2 cyc) mov.m r27=ar.rsc // M2 (12 cyc) mov r18=__IA64_BREAK_SYSCALL // A @@ -767,7 +768,7 @@ ENTRY(break_fault) nop.m 0 movl r30=sys_call_table // X - mov r28=cr.iip // M2 (2 cyc) + MOV_FROM_IIP(r28) // M2 (2 cyc) cmp.eq p0,p7=r18,r17 // I0 is this a system call? 
(p7) br.cond.spnt non_syscall // B no -> // @@ -831,10 +832,10 @@ ENTRY(break_fault) 1: mov ar.rsc=0x3 // M2 set eager mode, pl 0, LE, loadrs=0 nop 0 - bsw.1 // B (6 cyc) regs are saved, switch to bank 1 + BSW_1(r2, r14) // B (6 cyc) regs are saved, switch to bank 1 ;; - ssm psr.ic | PSR_DEFAULT_BITS // M2 now it's safe to re-enable intr.-collection + SSM_PSR_IC_AND_DEFAULT_BITS(r3, r16) // M2 now it's safe to re-enable intr.-collection movl r3=ia64_ret_from_syscall // X ;; @@ -842,7 +843,7 @@ ENTRY(break_fault) mov rp=r3 // I0 set the real return addr (p10) br.cond.spnt.many ia64_ret_from_syscall // B return if bad call-frame or r15 is a NaT -(p15) ssm psr.i // M2 restore psr.i + SSM_PSR_I(p15, r16) // M2 restore psr.i (p14) br.call.sptk.many b6=b6 // B invoke syscall-handker (ignore return addr) br.cond.spnt.many ia64_trace_syscall // B do syscall-tracing thingamagic // NOT REACHED @@ -866,7 +867,7 @@ ENTRY(interrupt) mov r31=pr // prepare to save predicates ;; SAVE_MIN_WITH_COVER // uses r31; defines r2 and r3 - ssm psr.ic | PSR_DEFAULT_BITS + SSM_PSR_IC_AND_DEFAULT_BITS(r3, r14) ;; adds r3=8,r2 // set up second base pointer for SAVE_REST srlz.i // ensure everybody knows psr.ic is back on @@ -875,7 +876,7 @@ ENTRY(interrupt) ;; MCA_RECOVER_RANGE(interrupt) alloc r14=ar.pfs,0,0,2,0 // must be first in an insn group - mov out0=cr.ivr // pass cr.ivr as first arg + MOV_FROM_IVR(out0, r8) // pass cr.ivr as first arg add out1=16,sp // pass pointer to pt_regs as second arg ;; srlz.d // make sure we see the effect of cr.ivr @@ -944,6 +945,7 @@ END(interrupt) * - ar.fpsr: set to kernel settings * - b6: preserved (same as on entry) */ +#ifdef __IA64_ASM_PARAVIRTUALIZED_NATIVE GLOBAL_ENTRY(ia64_syscall_setup) #if PT(B6) != 0 # error This code assumes that b6 is the first field in pt_regs. @@ -1035,6 +1037,7 @@ GLOBAL_ENTRY(ia64_syscall_setup) (p10) mov r8=-EINVAL br.ret.sptk.many b7 END(ia64_syscall_setup) +#endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */ .org ia64_ivt+0x3c00 ///////////////////////////////////////////////////////////////////////////////////////// @@ -1181,10 +1184,10 @@ ENTRY(dispatch_to_fault_handler) SAVE_MIN_WITH_COVER_R19 alloc r14=ar.pfs,0,0,5,0 mov out0=r15 - mov out1=cr.isr - mov out2=cr.ifa - mov out3=cr.iim - mov out4=cr.itir + MOV_FROM_ISR(out1) + MOV_FROM_IFA(out2) + MOV_FROM_IIM(out3) + MOV_FROM_ITIR(out4) ;; ssm psr.ic | PSR_DEFAULT_BITS ;; @@ -1255,8 +1258,8 @@ END(iaccess_rights) // 0x5300 Entry 23 (size 16 bundles) Data Access Rights (14,53) ENTRY(daccess_rights) DBG_FAULT(23) - mov r16=cr.ifa - rsm psr.dt + MOV_FROM_IFA(r16) + RSM_PSR_DT mov r31=pr ;; srlz.d @@ -1352,7 +1355,7 @@ ENTRY(speculation_vector) mov cr.ipsr=r16 ;; - rfi // and go back + RFI END(speculation_vector) .org ia64_ivt+0x5800 @@ -1506,7 +1509,7 @@ ENTRY(ia32_intercept) (p6) br.cond.spnt 1f // eflags.ac bit didn't change ;; mov pr=r31,-1 // restore predicate registers - rfi + RFI 1: #endif // CONFIG_IA32_SUPPORT -- 1.5.3
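The conversions above are mechanical: every privileged or privilege-sensitive instruction is hidden behind a macro, and ivt.S is assembled once per flavor with a different expansion each time. The snippet below is illustrative only; the real definitions live in arch/ia64/kernel/inst_native.h and arch/ia64/xen/inst_xen.h and differ in detail, and XSI_IFA is assumed here to name the virtualized cr.ifa slot in the xen shared area.

/* Illustrative expansions, not the actual inst_*.h contents. */
#ifdef __IA64_ASM_PARAVIRTUALIZED_XEN
# define MOV_FROM_IFA(reg)	\
	movl reg = XSI_IFA;	\
	;;			\
	ld8 reg = [reg]
#else	/* native */
# define MOV_FROM_IFA(reg)	\
	mov reg = cr.ifa
#endif

Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 25/50] ia64/pv_ops: introduce pv_info.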
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/Makefile | 2 + arch/ia64/kernel/paravirt.c | 34 ++++++++++++++++++++++++ include/asm-ia64/paravirt.h | 61 +++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 97 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/kernel/paravirt.c create mode 100644 include/asm-ia64/paravirt.h diff --git a/arch/ia64/kernel/Makefile b/arch/ia64/kernel/Makefile index 185e0e2..7849bc3 100644 --- a/arch/ia64/kernel/Makefile +++ b/arch/ia64/kernel/Makefile @@ -41,6 +41,8 @@ obj-$(CONFIG_PARAVIRT_ALT) += paravirt_alt.o obj-$(CONFIG_PARAVIRT_ENTRY) += paravirt_entry.o paravirtentry.o obj-$(CONFIG_PARAVIRT_NOP_B_PATCH) += paravirt_nop.o +obj-$(CONFIG_PARAVIRT_GUEST) += paravirt.o + obj-$(CONFIG_IA64_ESI) += esi.o ifneq ($(CONFIG_IA64_ESI),) obj-y += esi_stub.o # must be in kernel proper diff --git a/arch/ia64/kernel/paravirt.c b/arch/ia64/kernel/paravirt.c new file mode 100644 index 0000000..b31fa91 --- /dev/null +++ b/arch/ia64/kernel/paravirt.c @@ -0,0 +1,34 @@ +/****************************************************************************** + * arch/ia64/kernel/paravirt.c + * + * Copyright (c) 2006 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include <linux/init.h> + +#include <asm/paravirt.h> + +/*************************************************************************** + * general info + */ +struct pv_info pv_info = { + .kernel_rpl = 0, + .paravirt_enabled = 0, + .name = "bare hardware" +}; diff --git a/include/asm-ia64/paravirt.h b/include/asm-ia64/paravirt.h new file mode 100644 index 0000000..c2d4809 --- /dev/null +++ b/include/asm-ia64/paravirt.h @@ -0,0 +1,61 @@ +/****************************************************************************** + * include/asm-ia64/paravirt.h + * + * Copyright (c) 2006 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + + +#ifndef __ASM_PARAVIRT_H +#define __ASM_PARAVIRT_H + +#ifdef CONFIG_PARAVIRT_GUEST + +#ifndef __ASSEMBLY__ + +/****************************************************************************** + * general info + */ +struct pv_info { + unsigned int kernel_rpl; + int paravirt_enabled; + const char *name; +}; + +extern struct pv_info pv_info; + +static inline int paravirt_enabled(void) +{ + return pv_info.paravirt_enabled; +} + +static inline unsigned int get_kernel_rpl(void) +{ + return pv_info.kernel_rpl; +} + +#endif /* __ASSEMBLY__ */ + +#else +/* fallback for native case */ + +/* XXX: TODO */ + +#endif /* CONFIG_PARAVIRT_GUEST */ + +#endif /* __ASM_PARAVIRT_H */ -- 1.5.3
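pv_info is plain data, so a guest flavor claims it simply by assignment during early boot, and generic code can then key off paravirt_enabled() instead of sprinkling flavor-specific predicates around. The sketch below shows the intended usage only -- the example_ function and the field values are assumptions, and the real xen instance appears later in the series (arch/ia64/xen/xen_pv_ops.c in the diffstat).

#include <linux/init.h>
#include <linux/kernel.h>

#include <asm/paravirt.h>

/* Sketch: a guest flavor taking over pv_info at early boot.
 * kernel_rpl = 2 is an assumed value for illustration only. */
static void __init example_setup_pv_info(void)
{
	pv_info.paravirt_enabled = 1;
	pv_info.kernel_rpl = 2;		/* kernel demoted from rpl 0 */
	pv_info.name = "example hypervisor";
}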
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 26/50] ia64/pv_ops: introduce pv_init_ops and its hooks.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 arch/ia64/kernel/module.c   |   32 +++++++++++
 arch/ia64/kernel/paravirt.c |    8 +++
 arch/ia64/kernel/setup.c    |   14 +++++
 arch/ia64/kernel/smpboot.c  |    2 +
 include/asm-ia64/paravirt.h |  122 ++++++++++++++++++++++++++++++++++++++++++-
 5 files changed, 177 insertions(+), 1 deletions(-)

diff --git a/arch/ia64/kernel/module.c b/arch/ia64/kernel/module.c
index e58f436..edf7cca 100644
--- a/arch/ia64/kernel/module.c
+++ b/arch/ia64/kernel/module.c
@@ -454,6 +454,14 @@ module_frob_arch_sections (Elf_Ehdr *ehdr, Elf_Shdr *sechdrs, char *secstrings,
 			mod->arch.opd = s;
 		else if (strcmp(".IA_64.unwind", secstrings + s->sh_name) == 0)
 			mod->arch.unwind = s;
+#ifdef CONFIG_PARAVIRT_ALT
+		else if (strcmp(".paravirt_bundles",
+				secstrings + s->sh_name) == 0)
+			mod->arch.paravirt_bundles = s;
+		else if (strcmp(".paravirt_insts",
+				secstrings + s->sh_name) == 0)
+			mod->arch.paravirt_insts = s;
+#endif
 
 	if (!mod->arch.core_plt || !mod->arch.init_plt || !mod->arch.got || !mod->arch.opd) {
 		printk(KERN_ERR "%s: sections missing\n", mod->name);
@@ -929,6 +937,30 @@ module_finalize (const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs, struct module *mo
 	DEBUGP("%s: init: entry=%p\n", __FUNCTION__, mod->init);
 	if (mod->arch.unwind)
 		register_unwind_table(mod);
+#ifdef CONFIG_PARAVIRT_ALT
+	if (mod->arch.paravirt_bundles) {
+		struct paravirt_alt_bundle_patch *start =
+			(struct paravirt_alt_bundle_patch *)
+			mod->arch.paravirt_bundles->sh_addr;
+		struct paravirt_alt_bundle_patch *end =
+			(struct paravirt_alt_bundle_patch *)
+			(mod->arch.paravirt_bundles->sh_addr +
+			 mod->arch.paravirt_bundles->sh_size);
+
+		paravirt_bundle_patch_module(start, end);
+	}
+	if (mod->arch.paravirt_insts) {
+		struct paravirt_alt_inst_patch *start =
+			(struct paravirt_alt_inst_patch *)
+			mod->arch.paravirt_insts->sh_addr;
+		struct paravirt_alt_inst_patch *end =
+			(struct paravirt_alt_inst_patch *)
+			(mod->arch.paravirt_insts->sh_addr +
+			 mod->arch.paravirt_insts->sh_size);
+
+		paravirt_inst_patch_module(start, end);
+	}
+#endif
 	return 0;
 }
 
diff --git a/arch/ia64/kernel/paravirt.c b/arch/ia64/kernel/paravirt.c
index b31fa91..4282b00 100644
--- a/arch/ia64/kernel/paravirt.c
+++ b/arch/ia64/kernel/paravirt.c
@@ -32,3 +32,11 @@ struct pv_info pv_info = {
 	.paravirt_enabled = 0,
 	.name = "bare hardware"
 };
+
+/***************************************************************************
+ * pv_init_ops
+ * initialization hooks.
+ */ + +struct pv_init_ops pv_init_ops; + diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c index ebd1a09..bfccf54 100644 --- a/arch/ia64/kernel/setup.c +++ b/arch/ia64/kernel/setup.c @@ -51,6 +51,7 @@ #include <asm/mca.h> #include <asm/meminit.h> #include <asm/page.h> +#include <asm/paravirt.h> #include <asm/patch.h> #include <asm/pgtable.h> #include <asm/processor.h> @@ -288,6 +289,8 @@ reserve_memory (void) rsvd_region[n].end = (unsigned long) ia64_imva(_end); n++; + n += paravirt_reserve_memory(&rsvd_region[n]); + #ifdef CONFIG_BLK_DEV_INITRD if (ia64_boot_param->initrd_start) { rsvd_region[n].start = (unsigned long)__va(ia64_boot_param->initrd_start); @@ -466,6 +469,8 @@ setup_arch (char **cmdline_p) { unw_init(); + paravirt_arch_setup_early(); + ia64_patch_vtop((u64) __start___vtop_patchlist, (u64) __end___vtop_patchlist); *cmdline_p = __va(ia64_boot_param->command_line); @@ -518,6 +523,9 @@ setup_arch (char **cmdline_p) acpi_boot_init(); #endif + paravirt_banner(); + paravirt_arch_setup_console(cmdline_p); + #ifdef CONFIG_VT if (!conswitchp) { # if defined(CONFIG_DUMMY_CONSOLE) @@ -537,11 +545,15 @@ setup_arch (char **cmdline_p) #endif /* enable IA-64 Machine Check Abort Handling unless disabled */ + if (paravirt_arch_setup_nomca()) + nomca = 1; if (!nomca) ia64_mca_init(); platform_setup(cmdline_p); + paravirt_post_platform_setup(); paging_init(); + paravirt_post_paging_init(); } /* @@ -969,6 +981,8 @@ cpu_init (void) max_num_phys_stacked = num_phys_stacked; } platform_cpu_init(); + paravirt_cpu_init(); + pm_idle = default_idle; } diff --git a/arch/ia64/kernel/smpboot.c b/arch/ia64/kernel/smpboot.c index 32ee597..e7ce751 100644 --- a/arch/ia64/kernel/smpboot.c +++ b/arch/ia64/kernel/smpboot.c @@ -50,6 +50,7 @@ #include <asm/machvec.h> #include <asm/mca.h> #include <asm/page.h> +#include <asm/paravirt.h> #include <asm/pgalloc.h> #include <asm/pgtable.h> #include <asm/processor.h> @@ -642,6 +643,7 @@ void __devinit smp_prepare_boot_cpu(void) cpu_set(smp_processor_id(), cpu_online_map); cpu_set(smp_processor_id(), cpu_callin_map); per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE; + paravirt_post_smp_prepare_boot_cpu(); } #ifdef CONFIG_HOTPLUG_CPU diff --git a/include/asm-ia64/paravirt.h b/include/asm-ia64/paravirt.h index c2d4809..dd585fc 100644 --- a/include/asm-ia64/paravirt.h +++ b/include/asm-ia64/paravirt.h @@ -28,6 +28,8 @@ #ifndef __ASSEMBLY__ +#include <asm/meminit.h> + /****************************************************************************** * general info */ @@ -49,12 +51,130 @@ static inline unsigned int get_kernel_rpl(void) return pv_info.kernel_rpl; } +/****************************************************************************** + * initialization hooks. 
+ */
+struct rsvd_region;
+
+struct pv_init_ops {
+	void (*banner)(void);
+
+	int (*reserve_memory)(struct rsvd_region *region);
+
+	void (*arch_setup_early)(void);
+	void (*arch_setup_console)(char **cmdline_p);
+	int (*arch_setup_nomca)(void);
+	void (*post_platform_setup)(void);
+	void (*post_paging_init)(void);
+
+	void (*cpu_init)(void);
+
+
+	void (*post_smp_prepare_boot_cpu)(void);
+
+	void (*bundle_patch_module)(struct paravirt_alt_bundle_patch *start,
+				    struct paravirt_alt_bundle_patch *end);
+	void (*inst_patch_module)(struct paravirt_alt_inst_patch *start,
+				  struct paravirt_alt_inst_patch *end);
+};
+
+extern struct pv_init_ops pv_init_ops;
+
+static inline void paravirt_banner(void)
+{
+	if (pv_init_ops.banner)
+		pv_init_ops.banner();
+}
+
+static inline int paravirt_reserve_memory(struct rsvd_region *region)
+{
+	if (pv_init_ops.reserve_memory)
+		return pv_init_ops.reserve_memory(region);
+	return 0;
+}
+
+static inline void paravirt_arch_setup_early(void)
+{
+	if (pv_init_ops.arch_setup_early)
+		pv_init_ops.arch_setup_early();
+}
+
+static inline void paravirt_arch_setup_console(char **cmdline_p)
+{
+	if (pv_init_ops.arch_setup_console)
+		pv_init_ops.arch_setup_console(cmdline_p);
+}
+
+static inline int paravirt_arch_setup_nomca(void)
+{
+	if (pv_init_ops.arch_setup_nomca)
+		return pv_init_ops.arch_setup_nomca();
+	return 0;
+}
+
+static inline void paravirt_post_platform_setup(void)
+{
+	if (pv_init_ops.post_platform_setup)
+		pv_init_ops.post_platform_setup();
+}
+
+static inline void paravirt_post_paging_init(void)
+{
+	if (pv_init_ops.post_paging_init)
+		pv_init_ops.post_paging_init();
+}
+
+static inline void paravirt_cpu_init(void)
+{
+	if (pv_init_ops.cpu_init)
+		pv_init_ops.cpu_init();
+}
+
+static inline void paravirt_post_smp_prepare_boot_cpu(void)
+{
+	if (pv_init_ops.post_smp_prepare_boot_cpu)
+		pv_init_ops.post_smp_prepare_boot_cpu();
+}
+
+static inline void
+paravirt_bundle_patch_module(struct paravirt_alt_bundle_patch *start,
+			     struct paravirt_alt_bundle_patch *end)
+{
+	if (pv_init_ops.bundle_patch_module)
+		pv_init_ops.bundle_patch_module(start, end);
+}
+
+static inline void
+paravirt_inst_patch_module(struct paravirt_alt_inst_patch *start,
+			   struct paravirt_alt_inst_patch *end)
+{
+	if (pv_init_ops.inst_patch_module)
+		pv_init_ops.inst_patch_module(start, end);
+}
+
 #endif /* __ASSEMBLY__ */
 
 #else
 /* fallback for native case */
 
-/* XXX: TODO */
+#ifndef __ASSEMBLY__
+
+#define paravirt_banner()				do { } while (0)
+#define paravirt_reserve_memory(region)			0
+
+#define paravirt_arch_setup_early()			do { } while (0)
+#define paravirt_arch_setup_console(cmdline_p)		do { } while (0)
+#define paravirt_arch_setup_nomca()			0
+#define paravirt_post_platform_setup()			do { } while (0)
+#define paravirt_post_paging_init()			do { } while (0)
+#define paravirt_cpu_init()				do { } while (0)
+#define paravirt_post_smp_prepare_boot_cpu()		do { } while (0)
+
+#define paravirt_bundle_patch_module(start, end)	do { } while (0)
+#define paravirt_inst_patch_module(start, end)		do { } while (0)
+
+#endif /* __ASSEMBLY__ */
+
 #endif /* CONFIG_PARAVIRT_GUEST */
-- 
1.5.3
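All of the pv_init_ops hooks are optional: each paravirt_*() wrapper above checks for NULL and falls back to doing nothing (or returning 0), so a minimal guest instance only sets what it needs. A sketch, with every example_ name hypothetical (the real xen instance comes later in the series):

#include <linux/init.h>
#include <linux/kernel.h>

#include <asm/paravirt.h>

static void __init example_banner(void)
{
	printk(KERN_INFO "booting as an example paravirtualized guest\n");
}

static int __init example_arch_setup_nomca(void)
{
	return 1;	/* MCA belongs to the hypervisor: force nomca */
}

void __init example_setup_pv_init_ops(void)
{
	pv_init_ops.banner = example_banner;
	pv_init_ops.arch_setup_nomca = example_arch_setup_nomca;
	/* every other hook stays NULL -> native behaviour */
}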
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 27/50] ia64/pv_ops: introduce pv_iosapic_ops and its hooks.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/iosapic.c | 43 +++++++++++++++++++++++++++---------------- arch/ia64/kernel/paravirt.c | 30 ++++++++++++++++++++++++++++++ include/asm-ia64/iosapic.h | 18 ++++++++++++++++-- include/asm-ia64/paravirt.h | 40 ++++++++++++++++++++++++++++++++++++++++ 4 files changed, 113 insertions(+), 18 deletions(-) diff --git a/arch/ia64/kernel/iosapic.c b/arch/ia64/kernel/iosapic.c index 7b32922..7380d6d 100644 --- a/arch/ia64/kernel/iosapic.c +++ b/arch/ia64/kernel/iosapic.c @@ -587,6 +587,15 @@ static inline int irq_is_shared (int irq) return (iosapic_intr_info[irq].count > 1); } +struct irq_chip* +native_iosapic_get_irq_chip(unsigned long trigger) +{ + if (trigger == IOSAPIC_EDGE) + return &irq_type_iosapic_edge; + else + return &irq_type_iosapic_level; +} + static int register_intr (unsigned int gsi, int irq, unsigned char delivery, unsigned long polarity, unsigned long trigger) @@ -637,13 +646,10 @@ register_intr (unsigned int gsi, int irq, unsigned char delivery, iosapic_intr_info[irq].dmode = delivery; iosapic_intr_info[irq].trigger = trigger; - if (trigger == IOSAPIC_EDGE) - irq_type = &irq_type_iosapic_edge; - else - irq_type = &irq_type_iosapic_level; + irq_type = iosapic_get_irq_chip(trigger); idesc = irq_desc + irq; - if (idesc->chip != irq_type) { + if (irq_type != NULL && idesc->chip != irq_type) { if (idesc->chip != &no_irq_type) printk(KERN_WARNING "%s: changing vector %d from %s to %s\n", @@ -976,6 +982,20 @@ iosapic_override_isa_irq (unsigned int isa_irq, unsigned int gsi, } void __init +native_iosapic_pcat_compat_init(void) +{ + /* + * Disable the compatibility mode interrupts (8259 style), + * needs IN/OUT support enabled. + */ + printk(KERN_INFO + "%s: Disabling PC-AT compatible 8259 interrupts\n", + __FUNCTION__); + outb(0xff, 0xA1); + outb(0xff, 0x21); +} + +void __init iosapic_system_init (int system_pcat_compat) { int irq; @@ -989,17 +1009,8 @@ iosapic_system_init (int system_pcat_compat) } pcat_compat = system_pcat_compat; - if (pcat_compat) { - /* - * Disable the compatibility mode interrupts (8259 style), - * needs IN/OUT support enabled. - */ - printk(KERN_INFO - "%s: Disabling PC-AT compatible 8259 interrupts\n", - __FUNCTION__); - outb(0xff, 0xA1); - outb(0xff, 0x21); - } + if (pcat_compat) + iosapic_pcat_compat_init(); } static inline int diff --git a/arch/ia64/kernel/paravirt.c b/arch/ia64/kernel/paravirt.c index 4282b00..7e6a2d0 100644 --- a/arch/ia64/kernel/paravirt.c +++ b/arch/ia64/kernel/paravirt.c @@ -22,6 +22,12 @@ #include <linux/init.h> +#include <linux/compiler.h> +#include <linux/io.h> +#include <linux/irq.h> +#include <linux/types.h> + +#include <asm/iosapic.h> #include <asm/paravirt.h> /*************************************************************************** @@ -40,3 +46,27 @@ struct pv_info pv_info = { struct pv_init_ops pv_init_ops; +/*************************************************************************** + * pv_iosapic_ops + * iosapic read/write hooks. 
+ */ + +static unsigned int +native_iosapic_read(char __iomem *iosapic, unsigned int reg) +{ + return __native_iosapic_read(iosapic, reg); +} + +static void +native_iosapic_write(char __iomem *iosapic, unsigned int reg, u32 val) +{ + __native_iosapic_write(iosapic, reg, val); +} + +struct pv_iosapic_ops pv_iosapic_ops = { + .pcat_compat_init = native_iosapic_pcat_compat_init, + .get_irq_chip = native_iosapic_get_irq_chip, + + .__read = native_iosapic_read, + .__write = native_iosapic_write, +}; diff --git a/include/asm-ia64/iosapic.h b/include/asm-ia64/iosapic.h index a3a4288..73ee754 100644 --- a/include/asm-ia64/iosapic.h +++ b/include/asm-ia64/iosapic.h @@ -55,13 +55,27 @@ #define NR_IOSAPICS 256 -static inline unsigned int __iosapic_read(char __iomem *iosapic, unsigned int reg) +#ifdef CONFIG_PARAVIRT_GUEST +#include <asm/paravirt.h> +#else +#define iosapic_pcat_compat_init native_iosapic_pcat_compat_init +#define __iosapic_read __native_iosapic_read +#define __iosapic_write __native_iosapic_write +#define iosapic_get_irq_chip native_iosapic_get_irq_chip +#endif + +extern void __init native_iosapic_pcat_compat_init(void); +extern struct irq_chip *native_iosapic_get_irq_chip(unsigned long trigger); + +static inline unsigned int +__native_iosapic_read(char __iomem *iosapic, unsigned int reg) { writel(reg, iosapic + IOSAPIC_REG_SELECT); return readl(iosapic + IOSAPIC_WINDOW); } -static inline void __iosapic_write(char __iomem *iosapic, unsigned int reg, u32 val) +static inline void +__native_iosapic_write(char __iomem *iosapic, unsigned int reg, u32 val) { writel(reg, iosapic + IOSAPIC_REG_SELECT); writel(val, iosapic + IOSAPIC_WINDOW); diff --git a/include/asm-ia64/paravirt.h b/include/asm-ia64/paravirt.h index dd585fc..9efeda9 100644 --- a/include/asm-ia64/paravirt.h +++ b/include/asm-ia64/paravirt.h @@ -152,6 +152,46 @@ paravirt_inst_patch_module(struct paravirt_alt_inst_patch *start, pv_init_ops.inst_patch_module(start, end); } +/****************************************************************************** + * replacement of iosapic operations. + */ + +struct pv_iosapic_ops { + void (*pcat_compat_init)(void); + + struct irq_chip *(*get_irq_chip)(unsigned long trigger); + + unsigned int (*__read)(char __iomem *iosapic, unsigned int reg); + void (*__write)(char __iomem *iosapic, unsigned int reg, u32 val); +}; + +extern struct pv_iosapic_ops pv_iosapic_ops; + +static inline void +iosapic_pcat_compat_init(void) +{ + if (pv_iosapic_ops.pcat_compat_init) + pv_iosapic_ops.pcat_compat_init(); +} + +static inline struct irq_chip* +iosapic_get_irq_chip(unsigned long trigger) +{ + return pv_iosapic_ops.get_irq_chip(trigger); +} + +static inline unsigned int +__iosapic_read(char __iomem *iosapic, unsigned int reg) +{ + return pv_iosapic_ops.__read(iosapic, reg); +} + +static inline void +__iosapic_write(char __iomem *iosapic, unsigned int reg, u32 val) +{ + return pv_iosapic_ops.__write(iosapic, reg, val); +} + #endif /* __ASSEMBLY__ */ #else -- 1.5.3
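Of these hooks, only pcat_compat_init is called through a NULL check; get_irq_chip, __read and __write are invoked unconditionally, so an instance must supply all three. The sketch below shows a flavor whose iosapic registers have to go through the hypervisor; the example_hv_* accessors are hypothetical stand-ins for such a mechanism.

#include <linux/types.h>

#include <asm/iosapic.h>
#include <asm/paravirt.h>

/* Hypothetical hypervisor-mediated register accessors. */
extern unsigned int example_hv_iosapic_read(char __iomem *iosapic,
					    unsigned int reg);
extern void example_hv_iosapic_write(char __iomem *iosapic,
				     unsigned int reg, u32 val);

struct pv_iosapic_ops example_iosapic_ops = {
	/* .pcat_compat_init left NULL: no 8259 to disable */
	.get_irq_chip = native_iosapic_get_irq_chip,
	.__read = example_hv_iosapic_read,
	.__write = example_hv_iosapic_write,
};

Note also that register_intr() now tolerates a NULL result from get_irq_chip(), so a flavor may decline to install an iosapic irq_chip at all.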
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 28/50] ia64/pv_ops: introduce pv_irq_ops and its hooks.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/irq_ia64.c | 21 ++++++++++---- arch/ia64/kernel/paravirt.c | 22 +++++++++++++++ include/asm-ia64/hw_irq.h | 20 ++++++++++--- include/asm-ia64/paravirt.h | 63 +++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 115 insertions(+), 11 deletions(-) diff --git a/arch/ia64/kernel/irq_ia64.c b/arch/ia64/kernel/irq_ia64.c index 2b8cf6e..5259faa 100644 --- a/arch/ia64/kernel/irq_ia64.c +++ b/arch/ia64/kernel/irq_ia64.c @@ -196,7 +196,7 @@ static void clear_irq_vector(int irq) } int -assign_irq_vector (int irq) +native_assign_irq_vector (int irq) { unsigned long flags; int vector, cpu; @@ -222,7 +222,7 @@ assign_irq_vector (int irq) } void -free_irq_vector (int vector) +native_free_irq_vector (int vector) { if (vector < IA64_FIRST_DEVICE_VECTOR || vector > IA64_LAST_DEVICE_VECTOR) @@ -623,7 +623,7 @@ static struct irqaction tlb_irqaction = { #endif void -register_percpu_irq (ia64_vector vec, struct irqaction *action) +native_register_percpu_irq (ia64_vector vec, struct irqaction *action) { irq_desc_t *desc; unsigned int irq; @@ -638,13 +638,21 @@ register_percpu_irq (ia64_vector vec, struct irqaction *action) } void __init +native_init_IRQ_early(void) +{ +#ifdef CONFIG_SMP + register_percpu_irq(IA64_IPI_RESCHEDULE, &resched_irqaction); + register_percpu_irq(IA64_IPI_LOCAL_TLB_FLUSH, &tlb_irqaction); +#endif +} + +void __init init_IRQ (void) { + paravirt_init_IRQ_early(); register_percpu_irq(IA64_SPURIOUS_INT_VECTOR, NULL); #ifdef CONFIG_SMP register_percpu_irq(IA64_IPI_VECTOR, &ipi_irqaction); - register_percpu_irq(IA64_IPI_RESCHEDULE, &resched_irqaction); - register_percpu_irq(IA64_IPI_LOCAL_TLB_FLUSH, &tlb_irqaction); #if defined(CONFIG_IA64_GENERIC) || defined(CONFIG_IA64_DIG) if (vector_domain_type != VECTOR_DOMAIN_NONE) { BUG_ON(IA64_FIRST_DEVICE_VECTOR != IA64_IRQ_MOVE_VECTOR); @@ -657,10 +665,11 @@ init_IRQ (void) pfm_init_percpu(); #endif platform_irq_init(); + paravirt_init_IRQ_late(); } void -ia64_send_ipi (int cpu, int vector, int delivery_mode, int redirect) +native_send_ipi (int cpu, int vector, int delivery_mode, int redirect) { void __iomem *ipi_addr; unsigned long ipi_data; diff --git a/arch/ia64/kernel/paravirt.c b/arch/ia64/kernel/paravirt.c index 7e6a2d0..ce0b23b 100644 --- a/arch/ia64/kernel/paravirt.c +++ b/arch/ia64/kernel/paravirt.c @@ -70,3 +70,25 @@ struct pv_iosapic_ops pv_iosapic_ops = { .__read = native_iosapic_read, .__write = native_iosapic_write, }; + +/*************************************************************************** + * pv_irq_ops + * irq operations + */ + +void +ia64_send_ipi(int cpu, int vector, int delivery_mode, int redirect) +{ + pv_irq_ops.send_ipi(cpu, vector, delivery_mode, redirect); +} + +struct pv_irq_ops pv_irq_ops = { + .init_IRQ_early = native_init_IRQ_early, + + .assign_irq_vector = native_assign_irq_vector, + .free_irq_vector = native_free_irq_vector, + .register_percpu_irq = native_register_percpu_irq, + + .send_ipi = native_send_ipi, + .resend_irq = native_resend_irq, +}; diff --git a/include/asm-ia64/hw_irq.h b/include/asm-ia64/hw_irq.h index 76366dc..678efec 100644 --- a/include/asm-ia64/hw_irq.h +++ b/include/asm-ia64/hw_irq.h @@ -104,13 +104,23 @@ DECLARE_PER_CPU(int[IA64_NUM_VECTORS], vector_irq); extern struct hw_interrupt_type irq_type_ia64_lsapic; /* CPU-internal interrupt controller */ +#ifdef CONFIG_PARAVIRT_GUEST +#include <asm/paravirt.h> +#else +#define assign_irq_vector native_assign_irq_vector +#define free_irq_vector 
native_free_irq_vector +#define ia64_send_ipi native_send_ipi +#define ia64_resend_irq native_resend_irq +#endif + +extern void native_init_IRQ_early(void); extern int bind_irq_vector(int irq, int vector, cpumask_t domain); -extern int assign_irq_vector (int irq); /* allocate a free vector */ -extern void free_irq_vector (int vector); +extern int native_assign_irq_vector (int irq); /* allocate a free vector */ +extern void native_free_irq_vector (int vector); extern int reserve_irq_vector (int vector); extern void __setup_vector_irq(int cpu); -extern void ia64_send_ipi (int cpu, int vector, int delivery_mode, int redirect); -extern void register_percpu_irq (ia64_vector vec, struct irqaction *action); +extern void native_send_ipi (int cpu, int vector, int delivery_mode, int redirect); +extern void native_register_percpu_irq (ia64_vector vec, struct irqaction *action); extern int check_irq_used (int irq); extern void destroy_and_reserve_irq (unsigned int irq); @@ -122,7 +132,7 @@ static inline int irq_prepare_move(int irq, int cpu) { return 0; } static inline void irq_complete_move(unsigned int irq) {} #endif -static inline void ia64_resend_irq(unsigned int vector) +static inline void native_resend_irq(unsigned int vector) { platform_send_ipi(smp_processor_id(), vector, IA64_IPI_DM_INT, 0); } diff --git a/include/asm-ia64/paravirt.h b/include/asm-ia64/paravirt.h index 9efeda9..ace6653 100644 --- a/include/asm-ia64/paravirt.h +++ b/include/asm-ia64/paravirt.h @@ -28,6 +28,7 @@ #ifndef __ASSEMBLY__ +#include <asm/hw_irq.h> #include <asm/meminit.h> /****************************************************************************** @@ -192,6 +193,65 @@ __iosapic_write(char __iomem *iosapic, unsigned int reg, u32 val) return pv_iosapic_ops.__write(iosapic, reg, val); } +/****************************************************************************** + * replacement of irq operations. + */ + +struct pv_irq_ops { + void (*init_IRQ_early)(void); + void (*init_IRQ_late)(void); + + int (*assign_irq_vector)(int irq); + void (*free_irq_vector)(int vector); + + void (*register_percpu_irq)(ia64_vector vec, + struct irqaction *action); + + void (*send_ipi)(int cpu, int vector, int delivery_mode, int redirect); + void (*resend_irq)(unsigned int vector); +}; + +extern struct pv_irq_ops pv_irq_ops; + +static inline void +paravirt_init_IRQ_early(void) +{ + pv_irq_ops.init_IRQ_early(); +} + +static inline void +paravirt_init_IRQ_late(void) +{ + if (pv_irq_ops.init_IRQ_late) + pv_irq_ops.init_IRQ_late(); +} + +static inline int +assign_irq_vector(int irq) +{ + return pv_irq_ops.assign_irq_vector(irq); +} + +static inline void +free_irq_vector(int vector) +{ + return pv_irq_ops.free_irq_vector(vector); +} + +static inline void +register_percpu_irq(ia64_vector vec, struct irqaction *action) +{ + pv_irq_ops.register_percpu_irq(vec, action); +} + +void ia64_send_ipi(int cpu, int vector, int delivery_mode, int redirect); + +static inline void +ia64_resend_irq(unsigned int vector) +{ + pv_irq_ops.resend_irq(vector); +} + #endif /* __ASSEMBLY__ */ #else @@ -213,6 +273,9 @@ __iosapic_write(char __iomem *iosapic, unsigned int reg, u32 val) #define paravirt_bundle_patch_module(start, end) do { } while (0) #define paravirt_inst_patch_module(start, end) do { } while (0) +#define paravirt_init_IRQ_early() do { } while (0) +#define paravirt_init_IRQ_late() do { } while (0) + #endif /* __ASSEMBLY__ */ -- 1.5.3
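pv_irq_ops is stricter than pv_iosapic_ops: only init_IRQ_late is called through a NULL check, so every other member must be populated. The sketch below assumes a flavor that reroutes IPI delivery but keeps native vector management; all example_ names are hypothetical (the real xen instance lives in arch/ia64/xen/irq_xen.c later in the series).

#include <linux/interrupt.h>

#include <asm/hw_irq.h>
#include <asm/paravirt.h>

/* Hypothetical hooks for a flavor that delivers IPIs through a
 * hypervisor mechanism such as event channels. */
extern void example_init_IRQ_early(void);
extern void example_register_percpu_irq(ia64_vector vec,
					struct irqaction *action);
extern void example_send_ipi(int cpu, int vector, int delivery_mode,
			     int redirect);

struct pv_irq_ops example_irq_ops = {
	.init_IRQ_early = example_init_IRQ_early,
	/* .init_IRQ_late left NULL: the only optional hook */
	.assign_irq_vector = native_assign_irq_vector,
	.free_irq_vector = native_free_irq_vector,
	.register_percpu_irq = example_register_percpu_irq,
	.send_ipi = example_send_ipi,
	.resend_irq = native_resend_irq,
};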
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 29/50] ia64/xen: increase IA64_MAX_RSVD_REGIONS.
Xenlinux/ia64 needs to reserve one more memory region for the start
info page that the xen hypervisor passes to the guest.

Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 include/asm-ia64/meminit.h |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/include/asm-ia64/meminit.h b/include/asm-ia64/meminit.h
index f93308f..8de94e2 100644
--- a/include/asm-ia64/meminit.h
+++ b/include/asm-ia64/meminit.h
@@ -18,10 +18,11 @@
  * - crash dumping code reserved region
  * - Kernel memory map built from EFI memory map
  * - ELF core header
+ * - xen start info if CONFIG_XEN
  *
  * More could be added if necessary
  */
-#define IA64_MAX_RSVD_REGIONS 8
+#define IA64_MAX_RSVD_REGIONS 9
 
 struct rsvd_region {
 	unsigned long start;	/* virtual address of beginning of element */
-- 
1.5.3
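The extra slot is consumed through the pv_init_ops.reserve_memory hook from patch 26, which reserve_memory() in setup.c calls with the next free rsvd_region entry and which reports back how many entries it used. A guest-side sketch; example_start_info_pfn is a hypothetical stand-in for wherever the flavor learns the start-info location:

#include <asm/meminit.h>
#include <asm/page.h>

extern unsigned long example_start_info_pfn;	/* hypothetical */

/* Sketch: claim one slot for the hypervisor-provided start-info page
 * and report one entry consumed, as reserve_memory() expects. */
static int example_reserve_memory(struct rsvd_region *region)
{
	region->start = (unsigned long)
		__va(example_start_info_pfn << PAGE_SHIFT);
	region->end = region->start + PAGE_SIZE;
	return 1;
}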
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 30/50] ia64/xen: introduce sync bitops, which are necessary for ia64/xen support.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- include/asm-ia64/sync_bitops.h | 59 ++++++++++++++++++++++++++++++++++++++++ 1 files changed, 59 insertions(+), 0 deletions(-) create mode 100644 include/asm-ia64/sync_bitops.h diff --git a/include/asm-ia64/sync_bitops.h b/include/asm-ia64/sync_bitops.h new file mode 100644 index 0000000..f56cd90 --- /dev/null +++ b/include/asm-ia64/sync_bitops.h @@ -0,0 +1,59 @@ +#ifndef _ASM_IA64_SYNC_BITOPS_H +#define _ASM_IA64_SYNC_BITOPS_H + +/* + * Copyright 1992, Linus Torvalds. + * Heavily modified to provide guaranteed strong synchronisation + * when communicating with Xen or other guest OSes running on other CPUs. + */ + +static inline void sync_set_bit(int nr, volatile void *addr) +{ + set_bit(nr, addr); +} + +static inline void sync_clear_bit(int nr, volatile void *addr) +{ + clear_bit(nr, addr); +} + +static inline void sync_change_bit(int nr, volatile void *addr) +{ + change_bit(nr, addr); +} + +static inline int sync_test_and_set_bit(int nr, volatile void *addr) +{ + return test_and_set_bit(nr, addr); +} + +static inline int sync_test_and_clear_bit(int nr, volatile void *addr) +{ + return test_and_clear_bit(nr, addr); +} + +static inline int sync_test_and_change_bit(int nr, volatile void *addr) +{ + return test_and_change_bit(nr, addr); +} + +static inline int sync_const_test_bit(int nr, const volatile void *addr) +{ + return test_bit(nr, addr); +} + +static inline int sync_var_test_bit(int nr, volatile void *addr) +{ + return test_bit(nr, addr); +} + +#define sync_cmpxchg ia64_cmpxchg4_acq + +#define sync_test_bit(nr,addr) \ + (__builtin_constant_p(nr) ? \ + sync_const_test_bit((nr), (addr)) : \ + sync_var_test_bit((nr), (addr))) + +#define sync_cmpxchg_subword sync_cmpxchg + +#endif /* _ASM_IA64_SYNC_BITOPS_H */ -- 1.5.3
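On ia64 the ordinary bitops are already fully atomic, so the sync_ variants can simply delegate; the separate namespace exists because x86 needs lock-prefixed versions for memory shared with the hypervisor or with other guests, and the common Xen code is written against the sync_ names. Typical use against a shared page looks like this sketch (the evtchn bitmap names are illustrative):

#include <linux/bitops.h>

#include <asm/sync_bitops.h>

/* Illustrative callers operating on bitmaps that live in a page
 * shared with the hypervisor (e.g. event channel words). */
static inline void example_mask_evtchn(unsigned long *evtchn_mask, int port)
{
	sync_set_bit(port, evtchn_mask);
}

static inline int example_take_pending(unsigned long *evtchn_pending, int port)
{
	return sync_test_and_clear_bit(port, evtchn_pending);
}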
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 31/50] ia64/xen: import the xen interface header file for domU.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- include/asm-ia64/xen/interface.h | 585 ++++++++++++++++++++++++++++++++++++++ 1 files changed, 585 insertions(+), 0 deletions(-) create mode 100644 include/asm-ia64/xen/interface.h diff --git a/include/asm-ia64/xen/interface.h b/include/asm-ia64/xen/interface.h new file mode 100644 index 0000000..4cb4515 --- /dev/null +++ b/include/asm-ia64/xen/interface.h @@ -0,0 +1,585 @@ +/****************************************************************************** + * arch-ia64/hypervisor-if.h + * + * Guest OS interface to IA64 Xen. + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to + * deal in the Software without restriction, including without limitation the + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + * + */ + +#ifndef _ASM_IA64_XEN_INTERFACE_H +#define _ASM_IA64_XEN_INTERFACE_H + +#define __DEFINE_GUEST_HANDLE(name, type) \ + typedef struct { type *p; } __guest_handle_ ## name + +#define DEFINE_GUEST_HANDLE_STRUCT(name) \ + __DEFINE_GUEST_HANDLE(name, struct name) +#define DEFINE_GUEST_HANDLE(name) __DEFINE_GUEST_HANDLE(name, name) +#define GUEST_HANDLE(name) __guest_handle_ ## name +#define GUEST_HANDLE_64(name) GUEST_HANDLE(name) +#define set_xen_guest_handle(hnd, val) do { (hnd).p = val; } while (0) + +#ifndef __ASSEMBLY__ +/* Guest handles for primitive C types. */ +__DEFINE_GUEST_HANDLE(uchar, unsigned char); +__DEFINE_GUEST_HANDLE(uint, unsigned int); +__DEFINE_GUEST_HANDLE(ulong, unsigned long); +__DEFINE_GUEST_HANDLE(u64, unsigned long); +DEFINE_GUEST_HANDLE(char); +DEFINE_GUEST_HANDLE(int); +DEFINE_GUEST_HANDLE(long); +DEFINE_GUEST_HANDLE(void); + +typedef unsigned long xen_pfn_t; +DEFINE_GUEST_HANDLE(xen_pfn_t); +#define PRI_xen_pfn "lx" +#endif + +/* Arch specific VIRQs definition */ +#define VIRQ_ITC VIRQ_ARCH_0 /* V. Virtual itc timer */ +#define VIRQ_MCA_CMC VIRQ_ARCH_1 /* MCA cmc interrupt */ +#define VIRQ_MCA_CPE VIRQ_ARCH_2 /* MCA cpe interrupt */ + +/* Maximum number of virtual CPUs in multi-processor guests. */ +/* keep sizeof(struct shared_page) <= PAGE_SIZE. + * this is checked in arch/ia64/xen/hypervisor.c. 
*/ +#define MAX_VIRT_CPUS 64 + +#ifndef __ASSEMBLY__ + +#define INVALID_MFN (~0UL) + +struct pt_fpreg { + union { + unsigned long bits[2]; + long double __dummy; /* force 16-byte alignment */ + } u; +}; + +union vac { + unsigned long value; + struct { + int a_int:1; + int a_from_int_cr:1; + int a_to_int_cr:1; + int a_from_psr:1; + int a_from_cpuid:1; + int a_cover:1; + int a_bsw:1; + long reserved:57; + }; +}; + +union vdc { + unsigned long value; + struct { + int d_vmsw:1; + int d_extint:1; + int d_ibr_dbr:1; + int d_pmc:1; + int d_to_pmd:1; + int d_itm:1; + long reserved:58; + }; +}; + +struct mapped_regs { + union vac vac; + union vdc vdc; + unsigned long virt_env_vaddr; + unsigned long reserved1[29]; + unsigned long vhpi; + unsigned long reserved2[95]; + union { + unsigned long vgr[16]; + unsigned long bank1_regs[16]; /* bank1 regs (r16-r31) when bank0 active */ + }; + union { + unsigned long vbgr[16]; + unsigned long bank0_regs[16]; /* bank0 regs (r16-r31) when bank1 active */ + }; + unsigned long vnat; + unsigned long vbnat; + unsigned long vcpuid[5]; + unsigned long reserved3[11]; + unsigned long vpsr; + unsigned long vpr; + unsigned long reserved4[76]; + union { + unsigned long vcr[128]; + struct { + unsigned long dcr; /* CR0 */ + unsigned long itm; + unsigned long iva; + unsigned long rsv1[5]; + unsigned long pta; /* CR8 */ + unsigned long rsv2[7]; + unsigned long ipsr; /* CR16 */ + unsigned long isr; + unsigned long rsv3; + unsigned long iip; + unsigned long ifa; + unsigned long itir; + unsigned long iipa; + unsigned long ifs; + unsigned long iim; /* CR24 */ + unsigned long iha; + unsigned long rsv4[38]; + unsigned long lid; /* CR64 */ + unsigned long ivr; + unsigned long tpr; + unsigned long eoi; + unsigned long irr[4]; + unsigned long itv; /* CR72 */ + unsigned long pmv; + unsigned long cmcv; + unsigned long rsv5[5]; + unsigned long lrr0; /* CR80 */ + unsigned long lrr1; + unsigned long rsv6[46]; + }; + }; + union { + unsigned long reserved5[128]; + struct { + unsigned long precover_ifs; + unsigned long unat; /* not sure if this is needed until + NaT arch is done */ + int interrupt_collection_enabled; /* virtual psr.ic */ + /* virtual interrupt deliverable flag is evtchn_upcall_mask in + * shared info area now. interrupt_mask_addr is the address + * of evtchn_upcall_mask for current vcpu + */ + unsigned char *interrupt_mask_addr; + int pending_interruption; + unsigned char vpsr_pp; + unsigned char vpsr_dfh; + unsigned char hpsr_dfh; + unsigned char hpsr_mfh; + unsigned long reserved5_1[4]; + int metaphysical_mode; /* 1 = use metaphys mapping + 0 = use virtual */ + int banknum; /* 0 or 1, which virtual register + bank is active */ + unsigned long rrs[8]; /* region registers */ + unsigned long krs[8]; /* kernel registers */ + unsigned long tmp[16]; /* temp registers + (e.g. for hyperprivops) */ + }; + }; +}; + +struct vpd { + struct mapped_regs vpd_low; + unsigned long reserved6[3456]; + unsigned long vmm_avail[128]; + unsigned long reserved7[4096]; +}; + +struct arch_vcpu_info { + /* nothing */ +}; + +/* + * This structure is used for magic page in domain pseudo physical address + * space and the result of XENMEM_machine_memory_map. + * As the XENMEM_machine_memory_map result, + * xen_memory_map::nr_entries indicates the size in bytes + * including struct xen_ia64_memmap_info. Not the number of entries. 
+ */ +struct xen_ia64_memmap_info { + uint64_t efi_memmap_size; /* size of EFI memory map */ + uint64_t efi_memdesc_size; /* size of an EFI memory map descriptor */ + uint32_t efi_memdesc_version; /* memory descriptor version */ + void *memdesc[0]; /* array of efi_memory_desc_t */ +}; + +struct arch_shared_info { + /* PFN of the start_info page. */ + unsigned long start_info_pfn; + + /* Interrupt vector for event channel. */ + int evtchn_vector; + + /* PFN of memmap_info page */ + unsigned int memmap_info_num_pages; /* currently only = 1 case is + supported. */ + unsigned long memmap_info_pfn; + + uint64_t pad[31]; +}; + +typedef unsigned long xen_callback_t; + +struct ia64_tr_entry { + unsigned long pte; + unsigned long itir; + unsigned long vadr; + unsigned long rid; +}; +DEFINE_GUEST_HANDLE_STRUCT(ia64_tr_entry); + +struct vcpu_tr_regs { + struct ia64_tr_entry itrs[12]; + struct ia64_tr_entry dtrs[12]; +}; + +union vcpu_ar_regs { + unsigned long ar[128]; + struct { + unsigned long kr[8]; + unsigned long rsv1[8]; + unsigned long rsc; + unsigned long bsp; + unsigned long bspstore; + unsigned long rnat; + unsigned long rsv2; + unsigned long fcr; + unsigned long rsv3[2]; + unsigned long eflag; + unsigned long csd; + unsigned long ssd; + unsigned long cflg; + unsigned long fsr; + unsigned long fir; + unsigned long fdr; + unsigned long rsv4; + unsigned long ccv; /* 32 */ + unsigned long rsv5[3]; + unsigned long unat; + unsigned long rsv6[3]; + unsigned long fpsr; + unsigned long rsv7[3]; + unsigned long itc; + unsigned long rsv8[3]; + unsigned long ign1[16]; + unsigned long pfs; /* 64 */ + unsigned long lc; + unsigned long ec; + unsigned long rsv9[45]; + unsigned long ign2[16]; + }; +}; + +union vcpu_cr_regs { + unsigned long cr[128]; + struct { + unsigned long dcr; /* CR0 */ + unsigned long itm; + unsigned long iva; + unsigned long rsv1[5]; + unsigned long pta; /* CR8 */ + unsigned long rsv2[7]; + unsigned long ipsr; /* CR16 */ + unsigned long isr; + unsigned long rsv3; + unsigned long iip; + unsigned long ifa; + unsigned long itir; + unsigned long iipa; + unsigned long ifs; + unsigned long iim; /* CR24 */ + unsigned long iha; + unsigned long rsv4[38]; + unsigned long lid; /* CR64 */ + unsigned long ivr; + unsigned long tpr; + unsigned long eoi; + unsigned long irr[4]; + unsigned long itv; /* CR72 */ + unsigned long pmv; + unsigned long cmcv; + unsigned long rsv5[5]; + unsigned long lrr0; /* CR80 */ + unsigned long lrr1; + unsigned long rsv6[46]; + }; +}; + +struct vcpu_guest_context_regs { + unsigned long r[32]; + unsigned long b[8]; + unsigned long bank[16]; + unsigned long ip; + unsigned long psr; + unsigned long cfm; + unsigned long pr; + unsigned int nats; /* NaT bits for r1-r31. */ + unsigned int bnats; /* Nat bits for banked registers. */ + union vcpu_ar_regs ar; + union vcpu_cr_regs cr; + struct pt_fpreg f[128]; + unsigned long dbr[8]; + unsigned long ibr[8]; + unsigned long rr[8]; + unsigned long pkr[16]; + + /* FIXME: cpuid,pmd,pmc */ + + unsigned long xip; + unsigned long xpsr; + unsigned long xfs; + unsigned long xr[4]; + + struct vcpu_tr_regs tr; + + /* Physical registers in case of debug event. */ + unsigned long excp_iipa; + unsigned long excp_ifa; + unsigned long excp_isr; + unsigned int excp_vector; + + /* + * The rbs is intended to be the image of the stacked registers still + * in the cpu (not yet stored in memory). It is laid out as if it + * were written in memory at a 512 (64*8) aligned address + offset. + * rbs_voff is (offset / 8). 
rbs_nat contains NaT bits for the + * remaining rbs registers. rbs_rnat contains NaT bits for in memory + * rbs registers. + * Note: loadrs is 2**14 bytes == 2**11 slots. + */ + unsigned int rbs_voff; + unsigned long rbs[2048]; + unsigned long rbs_rnat; + + /* + * RSE.N_STACKED_PHYS via PAL_RSE_INFO + * Strictly this isn't cpu context, but this value is necessary + * for domain save/restore. So is here. + */ + unsigned long num_phys_stacked; +}; + +struct vcpu_guest_context { +#define VGCF_EXTRA_REGS (1UL << 1) /* Set extra regs. */ +#define VGCF_SET_CR_IRR (1UL << 2) /* Set cr_irr[0:3]. */ + unsigned long flags; /* VGCF_* flags */ + + struct vcpu_guest_context_regs regs; + + unsigned long event_callback_ip; + + /* xen doesn't share privregs pages with hvm domain so that this member + * doesn't make sense for hvm domain. + * ~0UL is already used for INVALID_P2M_ENTRY. */ +#define VGC_PRIVREGS_HVM (~(-2UL)) + unsigned long privregs_pfn; +}; +DEFINE_GUEST_HANDLE_STRUCT(vcpu_guest_context); + +/* dom0 vp op */ +#define __HYPERVISOR_ia64_dom0vp_op __HYPERVISOR_arch_0 +/* Map io space in machine address to dom0 physical address space. + Currently physical assigned address equals to machine address. */ +#define IA64_DOM0VP_ioremap 0 + +/* Convert a pseudo physical page frame number to the corresponding + machine page frame number. If no page is assigned, INVALID_MFN or + GPFN_INV_MASK is returned depending on domain's non-vti/vti mode. */ +#define IA64_DOM0VP_phystomach 1 + +/* Convert a machine page frame number to the corresponding pseudo physical + page frame number of the caller domain. */ +#define IA64_DOM0VP_machtophys 3 + +/* Reserved for future use. */ +#define IA64_DOM0VP_iounmap 4 + +/* Unmap and free pages contained in the specified pseudo physical region. */ +#define IA64_DOM0VP_zap_physmap 5 + +/* Assign machine page frame to dom0's pseudo physical address space. */ +#define IA64_DOM0VP_add_physmap 6 + +/* expose the p2m table into domain */ +#define IA64_DOM0VP_expose_p2m 7 + +/* xen perfmon */ +#define IA64_DOM0VP_perfmon 8 + +/* gmfn version of IA64_DOM0VP_add_physmap */ +#define IA64_DOM0VP_add_physmap_with_gmfn 9 + +/* get fpswa revision */ +#define IA64_DOM0VP_fpswa_revision 10 + +/* Add an I/O port space range */ +#define IA64_DOM0VP_add_io_space 11 + +/* expose the foreign domain's p2m table into privileged domain */ +#define IA64_DOM0VP_expose_foreign_p2m 12 +#define IA64_DOM0VP_EFP_ALLOC_PTE 0x1 /* allocate p2m table */ + +/* unexpose the foreign domain's p2m table into privileged domain */ +#define IA64_DOM0VP_unexpose_foreign_p2m 13 + +/* replace this page with newly allocated one and track tlb insert on it. */ +#define IA64_DOM0VP_tlb_track_page 32 + +/* assign a page with newly allocated one and track tlb insert on it. + if page is already assigned to pseudo physical address it results + in error. */ +#define IA64_DOM0VP_tlb_add_track_page 33 + +/* disable tlb traking of this page */ +#define IA64_DOM0VP_tlb_untrack_page 34 + + +/* flags for page assignement to pseudo physical address space */ +#define _ASSIGN_readonly 0 +#define ASSIGN_readonly (1UL << _ASSIGN_readonly) +#define ASSIGN_writable (0UL << _ASSIGN_readonly) /* dummy flag */ +/* Internal only: memory attribute must be WC/UC/UCE. 
*/
+#define _ASSIGN_nocache 1
+#define ASSIGN_nocache (1UL << _ASSIGN_nocache)
+/* tlb tracking */
+#define _ASSIGN_tlb_track 2
+#define ASSIGN_tlb_track (1UL << _ASSIGN_tlb_track)
+/* Internal only: associated with PGC_allocated bit */
+#define _ASSIGN_pgc_allocated 3
+#define ASSIGN_pgc_allocated (1UL << _ASSIGN_pgc_allocated)
+
+/* This structure has the same layout as struct ia64_boot_param, defined in
   <asm/system.h>. It is redefined here to ease use. */
+struct xen_ia64_boot_param {
+ unsigned long command_line; /* physical address of cmd line args */
+ unsigned long efi_systab; /* physical address of EFI system table */
+ unsigned long efi_memmap; /* physical address of EFI memory map */
+ unsigned long efi_memmap_size; /* size of EFI memory map */
+ unsigned long efi_memdesc_size; /* size of an EFI memory map descriptor */
+ unsigned int efi_memdesc_version; /* memory descriptor version */
+ struct {
+ unsigned short num_cols; /* number of columns on console. */
+ unsigned short num_rows; /* number of rows on console. */
+ unsigned short orig_x; /* cursor's x position */
+ unsigned short orig_y; /* cursor's y position */
+ } console_info;
+ unsigned long fpswa; /* physical address of the fpswa interface */
+ unsigned long initrd_start;
+ unsigned long initrd_size;
+ unsigned long domain_start; /* va where the boot time domain begins */
+ unsigned long domain_size; /* how big is the boot domain */
+};
+
+#endif /* !__ASSEMBLY__ */
+
+/* Size of the shared_info area (this is not related to page size). */
+#define XSI_SHIFT 14
+#define XSI_SIZE (1 << XSI_SHIFT)
+/* Log size of mapped_regs area (64 KB - only 4KB is used). */
+#define XMAPPEDREGS_SHIFT 12
+#define XMAPPEDREGS_SIZE (1 << XMAPPEDREGS_SHIFT)
+/* Offset of XASI (Xen arch shared info) wrt XSI_BASE. */
+#define XMAPPEDREGS_OFS XSI_SIZE
+
+/* Hyperprivops. */
+#define HYPERPRIVOP_START 0x1
+#define HYPERPRIVOP_RFI (HYPERPRIVOP_START + 0x0)
+#define HYPERPRIVOP_RSM_DT (HYPERPRIVOP_START + 0x1)
+#define HYPERPRIVOP_SSM_DT (HYPERPRIVOP_START + 0x2)
+#define HYPERPRIVOP_COVER (HYPERPRIVOP_START + 0x3)
+#define HYPERPRIVOP_ITC_D (HYPERPRIVOP_START + 0x4)
+#define HYPERPRIVOP_ITC_I (HYPERPRIVOP_START + 0x5)
+#define HYPERPRIVOP_SSM_I (HYPERPRIVOP_START + 0x6)
+#define HYPERPRIVOP_GET_IVR (HYPERPRIVOP_START + 0x7)
+#define HYPERPRIVOP_GET_TPR (HYPERPRIVOP_START + 0x8)
+#define HYPERPRIVOP_SET_TPR (HYPERPRIVOP_START + 0x9)
+#define HYPERPRIVOP_EOI (HYPERPRIVOP_START + 0xa)
+#define HYPERPRIVOP_SET_ITM (HYPERPRIVOP_START + 0xb)
+#define HYPERPRIVOP_THASH (HYPERPRIVOP_START + 0xc)
+#define HYPERPRIVOP_PTC_GA (HYPERPRIVOP_START + 0xd)
+#define HYPERPRIVOP_ITR_D (HYPERPRIVOP_START + 0xe)
+#define HYPERPRIVOP_GET_RR (HYPERPRIVOP_START + 0xf)
+#define HYPERPRIVOP_SET_RR (HYPERPRIVOP_START + 0x10)
+#define HYPERPRIVOP_SET_KR (HYPERPRIVOP_START + 0x11)
+#define HYPERPRIVOP_FC (HYPERPRIVOP_START + 0x12)
+#define HYPERPRIVOP_GET_CPUID (HYPERPRIVOP_START + 0x13)
+#define HYPERPRIVOP_GET_PMD (HYPERPRIVOP_START + 0x14)
+#define HYPERPRIVOP_GET_EFLAG (HYPERPRIVOP_START + 0x15)
+#define HYPERPRIVOP_SET_EFLAG (HYPERPRIVOP_START + 0x16)
+#define HYPERPRIVOP_RSM_BE (HYPERPRIVOP_START + 0x17)
+#define HYPERPRIVOP_GET_PSR (HYPERPRIVOP_START + 0x18)
+#define HYPERPRIVOP_SET_RR0_TO_RR4 (HYPERPRIVOP_START + 0x19)
+#define HYPERPRIVOP_MAX (0x1a)
+
+/* Fast and light hypercalls. */
+#define __HYPERVISOR_ia64_fast_eoi __HYPERVISOR_arch_1
+
+/* Extra debug features. */
+#define __HYPERVISOR_ia64_debug_op __HYPERVISOR_arch_2
+
+/* Xencomm macros. */
+#define XENCOMM_INLINE_MASK 0xf800000000000000UL
+#define XENCOMM_INLINE_FLAG 0x8000000000000000UL
+
+#ifndef __ASSEMBLY__
+
+/*
+ * Optimization features.
+ * The hypervisor may do some special optimizations for guests. This hypercall
+ * can be used to switch on/off these special optimizations.
+ */
+#define __HYPERVISOR_opt_feature 0x700UL
+
+#define XEN_IA64_OPTF_OFF 0x0
+#define XEN_IA64_OPTF_ON 0x1
+
+/*
+ * If this feature is switched on, the hypervisor inserts the
+ * tlb entries without calling the guest's trap handler.
+ * This is useful in guests using region 7 for identity mapping
+ * like the linux kernel does.
+ */
+#define XEN_IA64_OPTF_IDENT_MAP_REG7 1
+
+/* Identity mapping of region 4 addresses in HVM. */
+#define XEN_IA64_OPTF_IDENT_MAP_REG4 2
+
+/* Identity mapping of region 5 addresses in HVM. */
+#define XEN_IA64_OPTF_IDENT_MAP_REG5 3
+
+#define XEN_IA64_OPTF_IDENT_MAP_NOT_SET (0)
+
+struct xen_ia64_opt_feature {
+ unsigned long cmd; /* Which feature */
+ unsigned char on; /* Switch feature on/off */
+ union {
+ struct {
+ /* The page protection bit mask of the pte.
+ * This will be or'ed with the pte. */
+ unsigned long pgprot;
+ unsigned long key; /* A protection key for itir. */
+ };
+ };
+};
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_IA64_XEN_INTERFACE_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */ -- 1.5.3
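To make the opt_feature interface concrete, here is a minimal usage sketch in C. It is not part of the patch: the HYPERVISOR_opt_feature() wrapper name and the pgprot value are assumptions for illustration only.

#include <linux/init.h>
#include <asm/pgtable.h>
#include <asm/xen/interface.h>

/* Assumed hypercall wrapper for __HYPERVISOR_opt_feature; not defined
 * at this point in the series. */
extern int HYPERVISOR_opt_feature(struct xen_ia64_opt_feature *optf);

static int __init xen_enable_reg7_ident_map_sketch(void)
{
	struct xen_ia64_opt_feature optf = {
		.cmd	= XEN_IA64_OPTF_IDENT_MAP_REG7,
		.on	= XEN_IA64_OPTF_ON,
		.pgprot	= pgprot_val(PAGE_KERNEL),	/* assumed; or'ed into the pte */
		.key	= 0,				/* protection key for itir */
	};

	/* Ask the hypervisor to insert region 7 tlb entries itself,
	 * without bouncing through the guest trap handler. */
	return HYPERVISOR_opt_feature(&optf);
}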
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 32/50] ia64/xen: define xen assembler constants which will be used later.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/asm-offsets.c | 25 ++++++++++++++ include/asm-ia64/xen/privop.h | 73 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 98 insertions(+), 0 deletions(-) create mode 100644 include/asm-ia64/xen/privop.h diff --git a/arch/ia64/kernel/asm-offsets.c b/arch/ia64/kernel/asm-offsets.c index 0aebc6f..1a81c64 100644 --- a/arch/ia64/kernel/asm-offsets.c +++ b/arch/ia64/kernel/asm-offsets.c @@ -278,4 +278,29 @@ void foo(void) offsetof (struct itc_jitter_data_t, itc_jitter)); DEFINE(IA64_ITC_LASTCYCLE_OFFSET, offsetof (struct itc_jitter_data_t, itc_lastcycle)); + +#ifdef CONFIG_XEN + BLANK(); + +#define DEFINE_MAPPED_REG_OFS(sym, field) \ + DEFINE(sym, (XMAPPEDREGS_OFS + offsetof(struct mapped_regs, field))) + + DEFINE_MAPPED_REG_OFS(XSI_PSR_I_ADDR_OFS, interrupt_mask_addr); + DEFINE_MAPPED_REG_OFS(XSI_IPSR_OFS, ipsr); + DEFINE_MAPPED_REG_OFS(XSI_IIP_OFS, iip); + DEFINE_MAPPED_REG_OFS(XSI_IFS_OFS, ifs); + DEFINE_MAPPED_REG_OFS(XSI_PRECOVER_IFS_OFS, precover_ifs); + DEFINE_MAPPED_REG_OFS(XSI_ISR_OFS, isr); + DEFINE_MAPPED_REG_OFS(XSI_IFA_OFS, ifa); + DEFINE_MAPPED_REG_OFS(XSI_IIPA_OFS, iipa); + DEFINE_MAPPED_REG_OFS(XSI_IIM_OFS, iim); + DEFINE_MAPPED_REG_OFS(XSI_IHA_OFS, iha); + DEFINE_MAPPED_REG_OFS(XSI_ITIR_OFS, itir); + DEFINE_MAPPED_REG_OFS(XSI_PSR_IC_OFS, interrupt_collection_enabled); + DEFINE_MAPPED_REG_OFS(XSI_BANKNUM_OFS, banknum); + DEFINE_MAPPED_REG_OFS(XSI_BANK0_R16_OFS, bank0_regs[0]); + DEFINE_MAPPED_REG_OFS(XSI_BANK1_R16_OFS, bank1_regs[0]); + DEFINE_MAPPED_REG_OFS(XSI_B0NATS_OFS, vbnat); + DEFINE_MAPPED_REG_OFS(XSI_B1NATS_OFS, vnat); +#endif /* CONFIG_XEN */ } diff --git a/include/asm-ia64/xen/privop.h b/include/asm-ia64/xen/privop.h new file mode 100644 index 0000000..dd3e5ec --- /dev/null +++ b/include/asm-ia64/xen/privop.h @@ -0,0 +1,73 @@ +#ifndef _ASM_IA64_XEN_PRIVOP_H +#define _ASM_IA64_XEN_PRIVOP_H + +/* + * Copyright (C) 2005 Hewlett-Packard Co + * Dan Magenheimer <dan.magenheimer at hp.com> + * + * Paravirtualizations of privileged operations for Xen/ia64 + * + * + * inline privop and paravirt_alt support + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + */ + +#ifndef __ASSEMBLY__ +#include <linux/types.h> /* arch-ia64.h requires uint64_t */ +#include <linux/stringify.h> +#endif +#include <asm/xen/interface.h> + +/* At 1 MB, before per-cpu space but still addressable using addl instead + of movl. */ +#define XSI_BASE 0xfffffffffff00000 + +/* Address of mapped regs. 
*/ +#define XMAPPEDREGS_BASE (XSI_BASE + XSI_SIZE) + +#ifdef __ASSEMBLY__ +#define XEN_HYPER_RFI break HYPERPRIVOP_RFI +#define XEN_HYPER_RSM_PSR_DT break HYPERPRIVOP_RSM_DT +#define XEN_HYPER_SSM_PSR_DT break HYPERPRIVOP_SSM_DT +#define XEN_HYPER_COVER break HYPERPRIVOP_COVER +#define XEN_HYPER_ITC_D break HYPERPRIVOP_ITC_D +#define XEN_HYPER_ITC_I break HYPERPRIVOP_ITC_I +#define XEN_HYPER_SSM_I break HYPERPRIVOP_SSM_I +#define XEN_HYPER_GET_IVR break HYPERPRIVOP_GET_IVR +#define XEN_HYPER_GET_TPR break HYPERPRIVOP_GET_TPR +#define XEN_HYPER_SET_TPR break HYPERPRIVOP_SET_TPR +#define XEN_HYPER_EOI break HYPERPRIVOP_EOI +#define XEN_HYPER_SET_ITM break HYPERPRIVOP_SET_ITM +#define XEN_HYPER_THASH break HYPERPRIVOP_THASH +#define XEN_HYPER_PTC_GA break HYPERPRIVOP_PTC_GA +#define XEN_HYPER_ITR_D break HYPERPRIVOP_ITR_D +#define XEN_HYPER_GET_RR break HYPERPRIVOP_GET_RR +#define XEN_HYPER_SET_RR break HYPERPRIVOP_SET_RR +#define XEN_HYPER_SET_KR break HYPERPRIVOP_SET_KR +#define XEN_HYPER_FC break HYPERPRIVOP_FC +#define XEN_HYPER_GET_CPUID break HYPERPRIVOP_GET_CPUID +#define XEN_HYPER_GET_PMD break HYPERPRIVOP_GET_PMD +#define XEN_HYPER_GET_EFLAG break HYPERPRIVOP_GET_EFLAG +#define XEN_HYPER_SET_EFLAG break HYPERPRIVOP_SET_EFLAG +#define XEN_HYPER_GET_PSR break HYPERPRIVOP_GET_PSR +#define XEN_HYPER_SET_RR0_TO_RR4 break HYPERPRIVOP_SET_RR0_TO_RR4 + +#define XSI_IFS (XSI_BASE + XSI_IFS_OFS) +#define XSI_PRECOVER_IFS (XSI_BASE + XSI_PRECOVER_IFS_OFS) +#define XSI_IFA (XSI_BASE + XSI_IFA_OFS) +#define XSI_ISR (XSI_BASE + XSI_ISR_OFS) +#define XSI_IIM (XSI_BASE + XSI_IIM_OFS) +#define XSI_ITIR (XSI_BASE + XSI_ITIR_OFS) +#define XSI_PSR_I_ADDR (XSI_BASE + XSI_PSR_I_ADDR_OFS) +#define XSI_PSR_IC (XSI_BASE + XSI_PSR_IC_OFS) +#define XSI_IPSR (XSI_BASE + XSI_IPSR_OFS) +#define XSI_IIP (XSI_BASE + XSI_IIP_OFS) +#define XSI_B1NAT (XSI_BASE + XSI_B1NATS_OFS) +#define XSI_BANK1_R16 (XSI_BASE + XSI_BANK1_R16_OFS) +#define XSI_BANKNUM (XSI_BASE + XSI_BANKNUM_OFS) +#define XSI_IHA (XSI_BASE + XSI_IHA_OFS) +#endif + +#endif /* _ASM_IA64_XEN_PRIVOP_H */ -- 1.5.3
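For reference, what these assembler constants encode, restated as a C sketch (assuming struct mapped_regs from the Xen interface headers): each XSI_<REG> is a fixed virtual address inside the shared info area, so a shadowed register is read with an ordinary load rather than a privileged mov.

#include <asm/xen/interface.h>	/* struct mapped_regs, XSI_SIZE */

/* The mapped_regs page sits at XSI_BASE + XSI_SIZE, which is exactly
 * what XMAPPEDREGS_BASE and the generated XSI_*_OFS offsets express. */
#define XEN_MAPPEDREGS_SKETCH	((struct mapped_regs *)(XSI_BASE + XSI_SIZE))

static inline unsigned long xen_read_shadow_ipsr_sketch(void)
{
	/* C equivalent of the assembly: movl r2 = XSI_IPSR ;; ld8 r8 = [r2] */
	return XEN_MAPPEDREGS_SKETCH->ipsr;
}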
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 33/50] ia64/xen: detect xen environment at early boot time and do minimal initialization.
Currently Xen is detected by checking psr.cpl != 0. This is OK for now, but more abstraction, as on x86, will be needed later. Presumably it will be necessary to extend the boot protocol (i.e. extend struct ia64_boot_param) or to provide multiple entry points, one per hypervisor. Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/head.S | 6 ++++ arch/ia64/xen/xensetup.S | 40 +++++++++++++++++++++++++++ include/asm-ia64/xen/hypervisor.h | 55 +++++++++++++++++++++++++++++++++++++ 3 files changed, 101 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/xen/xensetup.S create mode 100644 include/asm-ia64/xen/hypervisor.h diff --git a/arch/ia64/kernel/head.S b/arch/ia64/kernel/head.S index d3a41d5..2f8d770 100644 --- a/arch/ia64/kernel/head.S +++ b/arch/ia64/kernel/head.S @@ -367,6 +367,12 @@ start_ap: ;; (isBP) st8 [r2]=r28 // save the address of the boot param area passed by the bootloader +#ifdef CONFIG_XEN + // Note: isBP is used by the subprogram. + br.call.sptk.many rp=early_xen_setup + ;; +#endif + #ifdef CONFIG_SMP (isAP) br.call.sptk.many rp=start_secondary .ret0: diff --git a/arch/ia64/xen/xensetup.S b/arch/ia64/xen/xensetup.S new file mode 100644 index 0000000..17ad297 --- /dev/null +++ b/arch/ia64/xen/xensetup.S @@ -0,0 +1,40 @@ +/* + * Support routines for Xen + * + * Copyright (C) 2005 Dan Magenheimer <dan.magenheimer at hp.com> + */ + +#include <asm/processor.h> +#include <asm/asmmacro.h> + + .section .data.read_mostly + .align 8 + .global running_on_xen +running_on_xen: + data4 0 + .previous + +#define isBP p3 // are we the Bootstrap Processor? + + .text +GLOBAL_ENTRY(early_xen_setup) + mov r8=ar.rsc // Initialized in head.S +(isBP) movl r9=running_on_xen;; + extr.u r8=r8,2,2;; // Extract the pl field + cmp.eq p7,p0=r8,r0 // p7: !running on xen + mov r8=1 // booleanize. +(p7) br.ret.sptk.many rp;; +(isBP) st4 [r9]=r8 + movl r10=xen_ivt;; + + mov cr.iva=r10 + + /* Set xsi base. */ +#define FW_HYPERCALL_SET_SHARED_INFO_VA 0x600 +(isBP) mov r2=FW_HYPERCALL_SET_SHARED_INFO_VA +(isBP) movl r28=XSI_BASE;; +(isBP) break 0x1000;; + + br.ret.sptk.many rp + ;; +END(early_xen_setup) diff --git a/include/asm-ia64/xen/hypervisor.h b/include/asm-ia64/xen/hypervisor.h new file mode 100644 index 0000000..78c5635 --- /dev/null +++ b/include/asm-ia64/xen/hypervisor.h @@ -0,0 +1,55 @@ +/****************************************************************************** + * hypervisor.h + * + * Linux-specific hypervisor handling. + * + * Copyright (c) 2002-2004, K A Fraser + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version 2 + * as published by the Free Software Foundation; or, when distributed + * separately from the Linux kernel or incorporated into other + * software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#ifndef _ASM_IA64_XEN_HYPERVISOR_H +#define _ASM_IA64_XEN_HYPERVISOR_H + +#ifdef CONFIG_XEN +/* running_on_xen is set before executing any C code by early_xen_setup */ +extern const int running_on_xen; +#define is_running_on_xen() (running_on_xen) +#else /* CONFIG_XEN */ +# ifdef CONFIG_VMX_GUEST +# define is_running_on_xen() (1) +# else /* CONFIG_VMX_GUEST */ +# define is_running_on_xen() (0) +# endif /* CONFIG_VMX_GUEST */ +#endif /* CONFIG_XEN */ + +#ifdef CONFIG_XEN_PRIVILEGED_GUEST +#define is_initial_xendomain() \ + (is_running_on_xen() ? xen_start_info->flags & SIF_INITDOMAIN : 0) +#else +#define is_initial_xendomain() 0 +#endif + +#endif /* _ASM_IA64_XEN_HYPERVISOR_H */ -- 1.5.3
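A C rendering of the detection trick may help (illustrative only; the real test must run in assembly before any C code, and it reads ar.rsc because its pl field mirrors the current privilege level):

#include <asm/intrinsics.h>

static inline int xen_detect_by_privilege_level_sketch(void)
{
	/* The same check early_xen_setup does with extr.u r8=r8,2,2 */
	unsigned long rsc = ia64_getreg(_IA64_REG_AR_RSC);
	unsigned long pl = (rsc >> 2) & 0x3;	/* ar.rsc.pl, bits 2-3 */

	return pl != 0;	/* a paravirtualized guest never runs at pl 0 */
}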
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 34/50] ia64/xen: helper functions for xen fault handlers.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/xen/xenivt.S | 59 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 59 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/xen/xenivt.S diff --git a/arch/ia64/xen/xenivt.S b/arch/ia64/xen/xenivt.S new file mode 100644 index 0000000..99bb37a --- /dev/null +++ b/arch/ia64/xen/xenivt.S @@ -0,0 +1,59 @@ +/* + * arch/ia64/xen/ivt.S + * + * Copyright (C) 2005 Hewlett-Packard Co + * Dan Magenheimer <dan.magenheimer at hp.com> + * + * Copyright (c) 2008 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * pv_ops. + */ + +#include <asm/asmmacro.h> +#include <asm/kregs.h> +#include <asm/pgtable.h> + +#define __IA64_ASM_PARAVIRTUALIZED_XEN +#include "inst_xen.h" +#include "xenminstate.h" +#include "../kernel/minstate.h" + + .section .text,"ax" +GLOBAL_ENTRY(xen_event_callback) + mov r31=pr // prepare to save predicates + ;; + SAVE_MIN_WITH_COVER // uses r31; defines r2 and r3 + ;; + movl r3=XSI_PSR_IC + mov r14=1 + ;; + st4 [r3]=r14 + ;; + adds r3=8,r2 // set up second base pointer for SAVE_REST + srlz.i // ensure everybody knows psr.ic is back on + ;; + SAVE_REST + ;; +1: + alloc r14=ar.pfs,0,0,1,0 // must be first in an insn group + add out0=16,sp // pass pointer to pt_regs as first arg + ;; + br.call.sptk.many b0=xen_evtchn_do_upcall + ;; + movl r20=XSI_PSR_I_ADDR + ;; + ld8 r20=[r20] + ;; + adds r20=-1,r20 // vcpu_info->evtchn_upcall_pending + ;; + ld1 r20=[r20] + ;; + cmp.ne p6,p0=r20,r0 // if there are pending events, + (p6) br.spnt.few 1b // call evtchn_do_upcall again. + br.sptk.many ia64_leave_kernel +END(xen_event_callback) + +GLOBAL_ENTRY(xen_bsw1) + XEN_BSW_1(r14) + br.ret.sptk.many b0 +END(xen_bsw1) -- 1.5.3
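The loop structure of xen_event_callback, restated as a C sketch. Assumptions: XEN_MAPPEDREGS is the mapped-regs accessor a later patch in this series introduces, and evtchn_upcall_pending is the byte immediately below the mask byte that interrupt_mask_addr points at, matching the adds r20=-1,r20 above.

#include <asm/ptrace.h>

extern void xen_evtchn_do_upcall(struct pt_regs *regs);

static void xen_event_callback_sketch(struct pt_regs *regs)
{
	unsigned char *pending =
		(unsigned char *)XEN_MAPPEDREGS->interrupt_mask_addr - 1;

	do {
		xen_evtchn_do_upcall(regs);	/* drain pending event channels */
	} while (*pending);	/* new events may have arrived meanwhile */
}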
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 35/50] ia64/pv_ops/xen: paravirtualized instructions for hand written assembly code.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/xen/inst_xen.h | 503 ++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 503 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/xen/inst_xen.h diff --git a/arch/ia64/xen/inst_xen.h b/arch/ia64/xen/inst_xen.h new file mode 100644 index 0000000..51b4f82 --- /dev/null +++ b/arch/ia64/xen/inst_xen.h @@ -0,0 +1,503 @@ +/****************************************************************************** + * inst_xen.h + * + * Copyright (c) 2008 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#define IA64_ASM_PARAVIRTUALIZED_XEN + +#define ia64_ivt xen_ivt + +#define __paravirt_switch_to xen_switch_to +#define __paravirt_leave_syscall xen_leave_syscall +#define __paravirt_work_processed_syscall xen_work_processed_syscall +#define __paravirt_leave_kernel xen_leave_kernel +#define __paravirt_pending_syscall_end xen_work_pending_syscall_end +#define __paravirt_work_processed_syscall_target \ + xen_work_processed_syscall + +#define MOV_FROM_IFA(reg) \ + movl reg = XSI_IFA; \ + ;; \ + ld8 reg = [reg] + +#define MOV_FROM_ITIR(reg) \ + movl reg = XSI_ITIR; \ + ;; \ + ld8 reg = [reg] + +#define MOV_FROM_ISR(reg) \ + movl reg = XSI_ISR; \ + ;; \ + ld8 reg = [reg] + +#define MOV_FROM_IHA(reg) \ + movl reg = XSI_IHA; \ + ;; \ + ld8 reg = [reg] + +#define MOV_FROM_IPSR(reg) \ + movl reg = XSI_IPSR; \ + ;; \ + ld8 reg = [reg] + +#define MOV_FROM_IIM(reg) \ + movl reg = XSI_IIM; \ + ;; \ + ld8 reg = [reg] + +#define MOV_FROM_IIP(reg) \ + movl reg = XSI_IIP; \ + ;; \ + ld8 reg = [reg] + +.macro __MOV_FROM_IVR reg, clob + .ifc "\reg", "r8" + XEN_HYPER_GET_IVR + .exitm + .endif + .ifc "\clob", "r8" + XEN_HYPER_GET_IVR + ;; + mov \reg = r8 + .exitm + .endif + .ifc "\reg", "\clob" + .error "it should be reg \reg != clob \clob" + .endif + + mov \clob = r8 + ;; + XEN_HYPER_GET_IVR + ;; + mov \reg = r8 + ;; + mov r8 = \clob +.endm +#define MOV_FROM_IVR(reg, clob) __MOV_FROM_IVR reg, clob + +.macro __MOV_FROM_PSR pred, reg, clob + .ifc "\reg", "r8" + (\pred) XEN_HYPER_GET_PSR; + .exitm + .endif + .ifc "\clob", "r8" + (\pred) XEN_HYPER_GET_PSR + ;; + (\pred) mov \reg = r8 + .exitm + .endif + + (\pred) mov \clob = r8 + (\pred) XEN_HYPER_GET_PSR + ;; + (\pred) mov \reg = r8 + (\pred) mov r8 = \clob +.endm +#define MOV_FROM_PSR(pred, reg, clob) __MOV_FROM_PSR pred, reg, clob + + +#define MOV_TO_IFA(reg, clob) \ + movl clob = XSI_IFA; \ + ;; \ + st8 [clob] = reg \ + +#define MOV_TO_ITIR(pred, reg, clob) \ +(pred) movl clob = XSI_ITIR; \ + ;; \ +(pred) st8 [clob] = reg + +#define MOV_TO_IHA(pred, reg, clob) \ +(pred) movl clob = XSI_IHA; \ + ;; \ +(pred) st8 [clob] = reg + +#define MOV_TO_IPSR(reg, clob) \ + movl clob = XSI_IPSR; \ + ;; \ + st8 [clob] = reg; \ + ;; + +#define 
MOV_TO_IFS(pred, reg, clob) \
+(pred) movl clob = XSI_IFS; \
+ ;; \
+(pred) st8 [clob] = reg; \
+ ;;
+
+#define MOV_TO_IIP(reg, clob) \
+ movl clob = XSI_IIP; \
+ ;; \
+ st8 [clob] = reg
+
+.macro ____MOV_TO_KR kr, reg, clob0, clob1
+ .ifc "\clob0", "r9"
+ .error "clob0 \clob0 must not be r9"
+ .endif
+ .ifc "\clob1", "r8"
+ .error "clob1 \clob1 must not be r8"
+ .endif
+
+ .ifnc "\reg", "r9"
+ .ifnc "\clob1", "r9"
+ mov \clob1 = r9
+ .endif
+ mov r9 = \reg
+ .endif
+ .ifnc "\clob0", "r8"
+ mov \clob0 = r8
+ .endif
+ mov r8 = \kr
+ ;;
+ XEN_HYPER_SET_KR
+
+ .ifnc "\reg", "r9"
+ .ifnc "\clob1", "r9"
+ mov r9 = \clob1
+ .endif
+ .endif
+ .ifnc "\clob0", "r8"
+ mov r8 = \clob0
+ .endif
+.endm
+
+.macro __MOV_TO_KR kr, reg, clob0, clob1
+ .ifc "\clob0", "r9"
+ ____MOV_TO_KR \kr, \reg, \clob1, \clob0
+ .exitm
+ .endif
+ .ifc "\clob1", "r8"
+ ____MOV_TO_KR \kr, \reg, \clob1, \clob0
+ .exitm
+ .endif
+
+ ____MOV_TO_KR \kr, \reg, \clob0, \clob1
+.endm
+
+#define MOV_TO_KR(kr, reg, clob0, clob1) \
+ __MOV_TO_KR IA64_KR_ ## kr, reg, clob0, clob1
+
+
+.macro __ITC_I pred, reg, clob
+ .ifc "\reg", "r8"
+ (\pred) XEN_HYPER_ITC_I
+ .exitm
+ .endif
+ .ifc "\clob", "r8"
+ (\pred) mov r8 = \reg
+ ;;
+ (\pred) XEN_HYPER_ITC_I
+ .exitm
+ .endif
+
+ (\pred) mov \clob = r8
+ (\pred) mov r8 = \reg
+ ;;
+ (\pred) XEN_HYPER_ITC_I
+ ;;
+ (\pred) mov r8 = \clob
+ ;;
+.endm
+#define ITC_I(pred, reg, clob) __ITC_I pred, reg, clob
+
+.macro __ITC_D pred, reg, clob
+ .ifc "\reg", "r8"
+ (\pred) XEN_HYPER_ITC_D
+ ;;
+ .exitm
+ .endif
+ .ifc "\clob", "r8"
+ (\pred) mov r8 = \reg
+ ;;
+ (\pred) XEN_HYPER_ITC_D
+ ;;
+ .exitm
+ .endif
+
+ (\pred) mov \clob = r8
+ (\pred) mov r8 = \reg
+ ;;
+ (\pred) XEN_HYPER_ITC_D
+ ;;
+ (\pred) mov r8 = \clob
+ ;;
+.endm
+#define ITC_D(pred, reg, clob) __ITC_D pred, reg, clob
+
+.macro __ITC_I_AND_D pred_i, pred_d, reg, clob
+ .ifc "\reg", "r8"
+ (\pred_i)XEN_HYPER_ITC_I
+ ;;
+ (\pred_d)XEN_HYPER_ITC_D
+ ;;
+ .exitm
+ .endif
+ .ifc "\clob", "r8"
+ mov r8 = \reg
+ ;;
+ (\pred_i)XEN_HYPER_ITC_I
+ ;;
+ (\pred_d)XEN_HYPER_ITC_D
+ ;;
+ .exitm
+ .endif
+
+ mov \clob = r8
+ mov r8 = \reg
+ ;;
+ (\pred_i)XEN_HYPER_ITC_I
+ ;;
+ (\pred_d)XEN_HYPER_ITC_D
+ ;;
+ mov r8 = \clob
+ ;;
+.endm
+#define ITC_I_AND_D(pred_i, pred_d, reg, clob) \
+ __ITC_I_AND_D pred_i, pred_d, reg, clob
+
+.macro __THASH pred, reg0, reg1, clob
+ .ifc "\reg0", "r8"
+ (\pred) mov r8 = \reg1
+ (\pred) XEN_HYPER_THASH
+ .exitm
+ .endif
+ .ifc "\reg1", "r8"
+ (\pred) XEN_HYPER_THASH
+ ;;
+ (\pred) mov \reg0 = r8
+ ;;
+ .exitm
+ .endif
+ .ifc "\clob", "r8"
+ (\pred) mov r8 = \reg1
+ (\pred) XEN_HYPER_THASH
+ ;;
+ (\pred) mov \reg0 = r8
+ ;;
+ .exitm
+ .endif
+
+ (\pred) mov \clob = r8
+ (\pred) mov r8 = \reg1
+ (\pred) XEN_HYPER_THASH
+ ;;
+ (\pred) mov \reg0 = r8
+ (\pred) mov r8 = \clob
+ ;;
+.endm
+#define THASH(pred, reg0, reg1, clob) __THASH pred, reg0, reg1, clob
+
+#define SSM_PSR_IC_AND_DEFAULT_BITS(clob0, clob1) \
+ mov clob0 = 1; \
+ movl clob1 = XSI_PSR_IC; \
+ ;; \
+ st4 [clob1] = clob0 \
+ ;;
+
+#define SSM_PSR_IC_AND_SRLZ_D(clob0, clob1) \
+ ;; \
+ srlz.d; \
+ mov clob1 = 1; \
+ movl clob0=XSI_PSR_IC; \
+ ;; \
+ st4 [clob0] = clob1
+
+#define RSM_PSR_IC(clob) \
+ movl clob = XSI_PSR_IC; \
+ ;; \
+ st4 [clob] = r0; \
+ ;;
+
+/* pred will be clobbered */
+#define MASK_TO_PEND_OFS (-1)
+#define SSM_PSR_I(pred, clob) \
+(pred) movl clob = XSI_PSR_I_ADDR \
+ ;; \
+(pred) ld8 clob = [clob] \
+ ;; \
+ /* if (pred) vpsr.i = 1 */ \
+ /* if (pred) (vcpu->vcpu_info->evtchn_upcall_mask)=0 */ \
+(pred) st1 [clob] = r0, MASK_TO_PEND_OFS \
+ ;; \
+ /* if (vcpu->vcpu_info->evtchn_upcall_pending) */ \
+(pred) ld1 clob = [clob] \
+ ;; \
+(pred) cmp.ne pred, p0 = clob, r0 \
+ ;; \
+(pred) XEN_HYPER_SSM_I /* do a real ssm psr.i */
+
+#define RSM_PSR_I(pred, clob0, clob1) \
+ movl clob0 = XSI_PSR_I_ADDR; \
+ mov clob1 = 1; \
+ ;; \
+ ld8 clob0 = [clob0]; \
+ ;; \
+(pred) st1 [clob0] = clob1
+
+#define RSM_PSR_I_IC(clob0, clob1, clob2) \
+ movl clob0 = XSI_PSR_I_ADDR; \
+ movl clob1 = XSI_PSR_IC; \
+ ;; \
+ ld8 clob0 = [clob0]; \
+ mov clob2 = 1; \
+ ;; \
+ /* note: clears both vpsr.i and vpsr.ic! */ \
+ st1 [clob0] = clob2; \
+ st4 [clob1] = r0; \
+ ;;
+
+#define RSM_PSR_DT \
+ XEN_HYPER_RSM_PSR_DT
+
+#define RSM_PSR_DT_AND_SRLZ_I \
+ XEN_HYPER_RSM_PSR_DT
+
+#define SSM_PSR_DT_AND_SRLZ_I \
+ XEN_HYPER_SSM_PSR_DT
+
+#define BSW_0(clob0, clob1, clob2) \
+ ;; \
+ /* r16-r31 all now hold bank1 values */ \
+ mov clob2 = ar.unat; \
+ movl clob0 = XSI_BANK1_R16; \
+ movl clob1 = XSI_BANK1_R16 + 8; \
+ ;; \
+.mem.offset 0, 0; st8.spill [clob0] = r16, 16; \
+.mem.offset 8, 0; st8.spill [clob1] = r17, 16; \
+ ;; \
+.mem.offset 0, 0; st8.spill [clob0] = r18, 16; \
+.mem.offset 8, 0; st8.spill [clob1] = r19, 16; \
+ ;; \
+.mem.offset 0, 0; st8.spill [clob0] = r20, 16; \
+.mem.offset 8, 0; st8.spill [clob1] = r21, 16; \
+ ;; \
+.mem.offset 0, 0; st8.spill [clob0] = r22, 16; \
+.mem.offset 8, 0; st8.spill [clob1] = r23, 16; \
+ ;; \
+.mem.offset 0, 0; st8.spill [clob0] = r24, 16; \
+.mem.offset 8, 0; st8.spill [clob1] = r25, 16; \
+ ;; \
+.mem.offset 0, 0; st8.spill [clob0] = r26, 16; \
+.mem.offset 8, 0; st8.spill [clob1] = r27, 16; \
+ ;; \
+.mem.offset 0, 0; st8.spill [clob0] = r28, 16; \
+.mem.offset 8, 0; st8.spill [clob1] = r29, 16; \
+ ;; \
+.mem.offset 0, 0; st8.spill [clob0] = r30, 16; \
+.mem.offset 8, 0; st8.spill [clob1] = r31, 16; \
+ ;; \
+ mov clob1 = ar.unat; \
+ movl clob0 = XSI_B1NAT; \
+ ;; \
+ st8 [clob0] = clob1; \
+ mov ar.unat = clob2; \
+ movl clob0 = XSI_BANKNUM; \
+ ;; \
+ st4 [clob0] = r0
+
+
+ /* FIXME: THIS CODE IS NOT NaT SAFE! 
*/ +#define XEN_BSW_1(clob) \ + mov clob = ar.unat; \ + movl r30 = XSI_B1NAT; \ + ;; \ + ld8 r30 = [r30]; \ + ;; \ + mov ar.unat = r30; \ + movl r30 = XSI_BANKNUM; \ + mov r31 = 1; \ + ;; \ + st4 [r30] = r31; \ + movl r30 = XSI_BANK1_R16; \ + movl r31 = XSI_BANK1_R16+8; \ + ;; \ + ld8.fill r16 = [r30], 16; \ + ld8.fill r17 = [r31], 16; \ + ;; \ + ld8.fill r18 = [r30], 16; \ + ld8.fill r19 = [r31], 16; \ + ;; \ + ld8.fill r20 = [r30], 16; \ + ld8.fill r21 = [r31], 16; \ + ;; \ + ld8.fill r22 = [r30], 16; \ + ld8.fill r23 = [r31], 16; \ + ;; \ + ld8.fill r24 = [r30], 16; \ + ld8.fill r25 = [r31], 16; \ + ;; \ + ld8.fill r26 = [r30], 16; \ + ld8.fill r27 = [r31], 16; \ + ;; \ + ld8.fill r28 = [r30], 16; \ + ld8.fill r29 = [r31], 16; \ + ;; \ + ld8.fill r30 = [r30]; \ + ld8.fill r31 = [r31]; \ + ;; \ + mov ar.unat = clob + +/* xen_bsw1 clobbers clob1 = r14 */ +.macro ____BSW_1 clob0, clob1 + .ifc "\clob0", "r14" + .error "clob0 \clob0 must not be r14" + .endif + .ifnc "\clob1", "r14" + .error "clob1 \clob1 must be r14" + .endif + .ifc "\clob0", "\clob1" + .error "it must be clob0 \clob0 != clob1 \clob1" + .endif + + mov \clob0 = b0 + br.call.sptk b0 = xen_bsw1 + ;; + mov b0 = \clob0 + ;; +.endm + +.macro __BSW_1 clob0, clob1 + .ifc "\clob0", "r14" + ____BSW_1 \clob1, \clob0 + .exitm + .endif + .ifc "\clob1", "r14" + ____BSW_1 \clob0, \clob1 + .exitm + .endif + .ifc "\clob0", "\clob1" + .error "it must be clob0 \clob0 != clob1 \clob1" + .endif + + .warning "use r14 as second argument \clob0 \clob1" + mov \clob1 = r14 + ____BSW_1 \clob0, r14 + mov r14 = \clob1 +.endm + +/* in place code generating causes lack of space */ +/* #define BSW_1(clob0, clob1) XEN_BSW_1(clob1) */ +#define BSW_1(clob0, clob1) __BSW_1 clob0, clob1 + + +#define COVER \ + XEN_HYPER_COVER + +#define RFI \ + XEN_HYPER_RFI; \ + dv_serialize_data -- 1.5.3
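To see what the single-instruction substitution buys, compare the native and Xen forms of one privileged read; a C illustration only (XEN_MAPPEDREGS as defined by patch 39 of this series; the native side is the ordinary intrinsic):

#include <asm/intrinsics.h>

static inline unsigned long native_read_ifa_sketch(void)
{
	return ia64_getreg(_IA64_REG_CR_IFA);	/* privileged: mov r8 = cr.ifa */
}

static inline unsigned long xen_read_ifa_sketch(void)
{
	/* MOV_FROM_IFA(r8) expands to: movl r8 = XSI_IFA ;; ld8 r8 = [r8],
	 * a plain load from the shared info shadow, with no trap taken. */
	return XEN_MAPPEDREGS->ifa;
}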
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 36/50] ia64/pv_ops/xen: paravirtualize DO_SAVE_MIN.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/xen/xenminstate.h | 137 +++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 137 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/xen/xenminstate.h diff --git a/arch/ia64/xen/xenminstate.h b/arch/ia64/xen/xenminstate.h new file mode 100644 index 0000000..67bbf79 --- /dev/null +++ b/arch/ia64/xen/xenminstate.h @@ -0,0 +1,137 @@ +#ifdef __IA64_ASM_PARAVIRTUALIZED_XEN +/* + * DO_SAVE_MIN switches to the kernel stacks (if necessary) and saves + * the minimum state necessary that allows us to turn psr.ic back + * on. + * + * Assumed state upon entry: + * psr.ic: off + * r31: contains saved predicates (pr) + * + * Upon exit, the state is as follows: + * psr.ic: off + * r2 = points to &pt_regs.r16 + * r8 = contents of ar.ccv + * r9 = contents of ar.csd + * r10 = contents of ar.ssd + * r11 = FPSR_DEFAULT + * r12 = kernel sp (kernel virtual address) + * r13 = points to current task_struct (kernel virtual address) + * p15 = TRUE if psr.i is set in cr.ipsr + * predicate registers (other than p2, p3, and p15), b6, r3, r14, r15: + * preserved + * CONFIG_XEN note: p6/p7 are not preserved + * + * Note that psr.ic is NOT turned on by this macro. This is so that + * we can pass interruption state as arguments to a handler. + */ +#define DO_SAVE_MIN(__COVER,SAVE_IFS,EXTRA) \ + mov r16=IA64_KR(CURRENT); /* M */ \ + mov r27=ar.rsc; /* M */ \ + mov r20=r1; /* A */ \ + mov r25=ar.unat; /* M */ \ + MOV_FROM_IPSR(r29); /* M */ \ + mov r26=ar.pfs; /* I */ \ + MOV_FROM_IIP(r28); /* M */ \ + mov r21=ar.fpsr; /* M */ \ + __COVER; /* B;; (or nothing) */ \ + ;; \ + adds r16=IA64_TASK_THREAD_ON_USTACK_OFFSET,r16; \ + ;; \ + ld1 r17=[r16]; /* load current->thread.on_ustack flag */ \ + st1 [r16]=r0; /* clear current->thread.on_ustack flag */ \ + adds r1=-IA64_TASK_THREAD_ON_USTACK_OFFSET,r16 \ + /* switch from user to kernel RBS: */ \ + ;; \ + invala; /* M */ \ + /* SAVE_IFS;*/ /* see xen special handling below */ \ + cmp.eq pKStk,pUStk=r0,r17; /* are we in kernel mode already? 
*/ \ + ;; \ +(pUStk) mov ar.rsc=0; /* set enforced lazy mode, pl 0, little-endian, loadrs=0 */ \ + ;; \ +(pUStk) mov.m r24=ar.rnat; \ +(pUStk) addl r22=IA64_RBS_OFFSET,r1; /* compute base of RBS */ \ +(pKStk) mov r1=sp; /* get sp */ \ + ;; \ +(pUStk) lfetch.fault.excl.nt1 [r22]; \ +(pUStk) addl r1=IA64_STK_OFFSET-IA64_PT_REGS_SIZE,r1; /* compute base of memory stack */ \ +(pUStk) mov r23=ar.bspstore; /* save ar.bspstore */ \ + ;; \ +(pUStk) mov ar.bspstore=r22; /* switch to kernel RBS */ \ +(pKStk) addl r1=-IA64_PT_REGS_SIZE,r1; /* if in kernel mode, use sp (r12) */ \ + ;; \ +(pUStk) mov r18=ar.bsp; \ +(pUStk) mov ar.rsc=0x3; /* set eager mode, pl 0, little-endian, loadrs=0 */ \ + adds r17=2*L1_CACHE_BYTES,r1; /* really: biggest cache-line size */ \ + adds r16=PT(CR_IPSR),r1; \ + ;; \ + lfetch.fault.excl.nt1 [r17],L1_CACHE_BYTES; \ + st8 [r16]=r29; /* save cr.ipsr */ \ + ;; \ + lfetch.fault.excl.nt1 [r17]; \ + tbit.nz p15,p0=r29,IA64_PSR_I_BIT; \ + mov r29=b0 \ + ;; \ + adds r16=PT(R8),r1; /* initialize first base pointer */ \ + adds r17=PT(R9),r1; /* initialize second base pointer */ \ +(pKStk) mov r18=r0; /* make sure r18 isn't NaT */ \ + ;; \ +.mem.offset 0,0; st8.spill [r16]=r8,16; \ +.mem.offset 8,0; st8.spill [r17]=r9,16; \ + ;; \ +.mem.offset 0,0; st8.spill [r16]=r10,24; \ +.mem.offset 8,0; st8.spill [r17]=r11,24; \ + ;; \ + /* xen special handling for possibly lazy cover */ \ + /* XXX: SAVE_MIN case in dispatch_ia32_handler: mov r30=r0 */ \ + movl r8=XSI_PRECOVER_IFS; \ + ;; \ + ld8 r30=[r8]; \ + ;; \ + st8 [r16]=r28,16; /* save cr.iip */ \ + st8 [r17]=r30,16; /* save cr.ifs */ \ +(pUStk) sub r18=r18,r22; /* r18=RSE.ndirty*8 */ \ + mov r8=ar.ccv; \ + mov r9=ar.csd; \ + mov r10=ar.ssd; \ + movl r11=FPSR_DEFAULT; /* L-unit */ \ + ;; \ + st8 [r16]=r25,16; /* save ar.unat */ \ + st8 [r17]=r26,16; /* save ar.pfs */ \ + shl r18=r18,16; /* compute ar.rsc to be used for "loadrs" */ \ + ;; \ + st8 [r16]=r27,16; /* save ar.rsc */ \ +(pUStk) st8 [r17]=r24,16; /* save ar.rnat */ \ +(pKStk) adds r17=16,r17; /* skip over ar_rnat field */ \ + ;; /* avoid RAW on r16 & r17 */ \ +(pUStk) st8 [r16]=r23,16; /* save ar.bspstore */ \ + st8 [r17]=r31,16; /* save predicates */ \ +(pKStk) adds r16=16,r16; /* skip over ar_bspstore field */ \ + ;; \ + st8 [r16]=r29,16; /* save b0 */ \ + st8 [r17]=r18,16; /* save ar.rsc value for "loadrs" */ \ + cmp.eq pNonSys,pSys=r0,r0 /* initialize pSys=0, pNonSys=1 */ \ + ;; \ +.mem.offset 0,0; st8.spill [r16]=r20,16; /* save original r1 */ \ +.mem.offset 8,0; st8.spill [r17]=r12,16; \ + adds r12=-16,r1; /* switch to kernel memory stack (with 16 bytes of scratch) */ \ + ;; \ +.mem.offset 0,0; st8.spill [r16]=r13,16; \ +.mem.offset 8,0; st8.spill [r17]=r21,16; /* save ar.fpsr */ \ + mov r13=IA64_KR(CURRENT); /* establish `current' */ \ + ;; \ +.mem.offset 0,0; st8.spill [r16]=r15,16; \ +.mem.offset 8,0; st8.spill [r17]=r14,16; \ + ;; \ +.mem.offset 0,0; st8.spill [r16]=r2,16; \ +.mem.offset 8,0; st8.spill [r17]=r3,16; \ + ;; \ + EXTRA; \ + BSW_1(r2, r14); \ + adds r2=IA64_PT_REGS_R16_OFFSET,r1; \ + ;; \ + movl r1=__gp; /* establish kernel global pointer */ \ + ;; \ + /*bsw.1;*/ /* switch back to bank 1 (must be last in insn group) */ \ + ;; +#endif /* __IA64_ASM_PARAVIRTUALIZED_XEN */ -- 1.5.3
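The "possibly lazy cover" handling deserves a note: native DO_SAVE_MIN reads cr.ifs via SAVE_IFS, but under Xen the hypervisor may have covered the register frame lazily, so the value comes from the precover_ifs slot of the mapped regs instead. In C terms (a sketch only, using the XEN_MAPPEDREGS accessor from patch 39):

/* Native: SAVE_IFS does mov r30 = cr.ifs.
 * Xen: load the shadow that the hypervisor maintains on (lazy) cover. */
static inline unsigned long xen_saved_ifs_sketch(void)
{
	return XEN_MAPPEDREGS->precover_ifs;
}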
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 37/50] ia64/pv_ops/xen: multi compile switch_leave.S and ivt.S for xen.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/Makefile | 15 +++++++++++++++ 1 files changed, 15 insertions(+), 0 deletions(-) diff --git a/arch/ia64/kernel/Makefile b/arch/ia64/kernel/Makefile index 7849bc3..a80dd3f 100644 --- a/arch/ia64/kernel/Makefile +++ b/arch/ia64/kernel/Makefile @@ -83,3 +83,18 @@ $(obj)/gate-data.o: $(obj)/gate.so # AFLAGS_ivt.o += -D__IA64_ASM_PARAVIRTUALIZED_NATIVE AFLAGS_switch_leave.o += -D__IA64_ASM_PARAVIRTUALIZED_NATIVE + +# xen multi compile +$(obj)/xen_%.o: $(src)/%.S FORCE + $(call if_changed_dep,as_o_S) + +# +# xen_ivt.o, xen_switch_leave.o +# +obj-$(CONFIG_XEN) += xen_ivt.o xen_switch_leave.o +ifeq ($(CONFIG_XEN), y) +targets += xen_ivt.o xen_switch_leave.o +$(obj)/built-in.o: xen_ivt.o xen_switch_leave.o +endif +AFLAGS_xen_ivt.o += -D__IA64_ASM_PARAVIRTUALIZED_XEN +AFLAGS_xen_switch_leave.o += -D__IA64_ASM_PARAVIRTUALIZED_XEN -- 1.5.3
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 38/50] ia64/xen: paravirtualize pal_call_static().
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/xen/xenpal.S | 76 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 76 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/xen/xenpal.S diff --git a/arch/ia64/xen/xenpal.S b/arch/ia64/xen/xenpal.S new file mode 100644 index 0000000..0e05210 --- /dev/null +++ b/arch/ia64/xen/xenpal.S @@ -0,0 +1,76 @@ +/* + * ia64/xen/xenpal.S + * + * Alternate PAL routines for Xen. Heavily leveraged from + * ia64/kernel/pal.S + * + * Copyright (C) 2005 Hewlett-Packard Co + * Dan Magenheimer <dan.magenheimer at hp.com> + */ + +#include <asm/asmmacro.h> +#include <asm/processor.h> +#include <asm/paravirt_nop.h> + +GLOBAL_ENTRY(xen_pal_call_static) +#ifdef CONFIG_XEN + BR_IF_NATIVE(native_pal_call_static, r22, p7) +#endif + .prologue ASM_UNW_PRLG_RP|ASM_UNW_PRLG_PFS, ASM_UNW_PRLG_GRSAVE(5) + alloc loc1 = ar.pfs,4,5,0,0 + movl loc2 = pal_entry_point +1: { + mov r28 = in0 + mov r29 = in1 + mov r8 = ip + } + ;; + ld8 loc2 = [loc2] // loc2 <- entry point + adds r8 = 1f-1b,r8 + mov loc4=ar.rsc // save RSE configuration + ;; + mov ar.rsc=0 // put RSE in enforced lazy, LE mode +#ifdef CONFIG_XEN + mov r9 = r8 + XEN_HYPER_GET_PSR + ;; + mov loc3 = r8 + mov r8 = r9 + ;; +#else + mov loc3 = psr +#endif + mov loc0 = rp + .body + mov r30 = in2 + +#ifdef CONFIG_XEN + // this is low priority for paravirtualization, but is called + // from the idle loop so confuses privop counting + movl r31=XSI_PSR_I_ADDR + ;; + ld8 r31=[r31] + mov r22=1 + ;; + st1 [r31]=r22 + ;; + mov r31 = in3 + mov b7 = loc2 + ;; +#else + mov r31 = in3 + mov b7 = loc2 + +(p7) rsm psr.i + ;; +#endif + mov rp = r8 + br.cond.sptk.many b7 +1: mov psr.l = loc3 + mov ar.rsc = loc4 // restore RSE configuration + mov ar.pfs = loc1 + mov rp = loc0 + ;; + srlz.d // serialize restoration of psr.l + br.ret.sptk.many b0 +END(xen_pal_call_static) -- 1.5.3
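The BR_IF_NATIVE guard at the top makes xen_pal_call_static safe to install unconditionally; its effect, sketched in C (xen_do_pal_call is a made-up name standing in for the assembly that follows the guard, and the four-argument prototype is assumed from the asm calling convention):

#include <asm/pal.h>
#include <asm/xen/hypervisor.h>

extern struct ia64_pal_retval native_pal_call_static(u64 index, u64 arg1,
						     u64 arg2, u64 arg3);
/* Hypothetical helper representing the Xen path after BR_IF_NATIVE. */
extern struct ia64_pal_retval xen_do_pal_call(u64 index, u64 arg1,
					      u64 arg2, u64 arg3);

static struct ia64_pal_retval pal_call_static_sketch(u64 index, u64 arg1,
						     u64 arg2, u64 arg3)
{
	if (!is_running_on_xen())	/* running_on_xen == 0: take the native path */
		return native_pal_call_static(index, arg1, arg2, arg3);

	/* Xen path: fetch vpsr via HYPERPRIVOP_GET_PSR, mask the virtual
	 * psr.i byte through XSI_PSR_I_ADDR, then branch to PAL. */
	return xen_do_pal_call(index, arg1, arg2, arg3);
}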
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 39/50] ia64/xen: introduce xen paravirtualized intrinsic operations for privileged instruction.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/xen/hypercall.S | 124 ++++++++++ include/asm-ia64/xen/privop.h | 512 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 636 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/xen/hypercall.S diff --git a/arch/ia64/xen/hypercall.S b/arch/ia64/xen/hypercall.S new file mode 100644 index 0000000..a96f278 --- /dev/null +++ b/arch/ia64/xen/hypercall.S @@ -0,0 +1,124 @@ +/* + * Support routines for Xen hypercalls + * + * Copyright (C) 2005 Dan Magenheimer <dan.magenheimer at hp.com> + */ + +#include <asm/asmmacro.h> +#include <asm/intrinsics.h> + +#ifdef __INTEL_COMPILER +# undef ASM_SUPPORTED +#else +# define ASM_SUPPORTED +#endif + +#ifndef ASM_SUPPORTED +GLOBAL_ENTRY(xen_get_psr) + XEN_HYPER_GET_PSR + br.ret.sptk.many rp + ;; +END(xen_get_psr) + +GLOBAL_ENTRY(xen_get_ivr) + XEN_HYPER_GET_IVR + br.ret.sptk.many rp + ;; +END(xen_get_ivr) + +GLOBAL_ENTRY(xen_get_tpr) + XEN_HYPER_GET_TPR + br.ret.sptk.many rp + ;; +END(xen_get_tpr) + +GLOBAL_ENTRY(xen_set_tpr) + mov r8=r32 + XEN_HYPER_SET_TPR + br.ret.sptk.many rp + ;; +END(xen_set_tpr) + +GLOBAL_ENTRY(xen_eoi) + mov r8=r32 + XEN_HYPER_EOI + br.ret.sptk.many rp + ;; +END(xen_eoi) + +GLOBAL_ENTRY(xen_thash) + mov r8=r32 + XEN_HYPER_THASH + br.ret.sptk.many rp + ;; +END(xen_thash) + +GLOBAL_ENTRY(xen_set_itm) + mov r8=r32 + XEN_HYPER_SET_ITM + br.ret.sptk.many rp + ;; +END(xen_set_itm) + +GLOBAL_ENTRY(xen_ptcga) + mov r8=r32 + mov r9=r33 + XEN_HYPER_PTC_GA + br.ret.sptk.many rp + ;; +END(xen_ptcga) + +GLOBAL_ENTRY(xen_get_rr) + mov r8=r32 + XEN_HYPER_GET_RR + br.ret.sptk.many rp + ;; +END(xen_get_rr) + +GLOBAL_ENTRY(xen_set_rr) + mov r8=r32 + mov r9=r33 + XEN_HYPER_SET_RR + br.ret.sptk.many rp + ;; +END(xen_set_rr) + +GLOBAL_ENTRY(xen_set_kr) + mov r8=r32 + mov r9=r33 + XEN_HYPER_SET_KR + br.ret.sptk.many rp +END(xen_set_kr) + +GLOBAL_ENTRY(xen_fc) + mov r8=r32 + XEN_HYPER_FC + br.ret.sptk.many rp +END(xen_fc) + +GLOBAL_ENTRY(xen_get_cpuid) + mov r8=r32 + XEN_HYPER_GET_CPUID + br.ret.sptk.many rp +END(xen_get_cpuid) + +GLOBAL_ENTRY(xen_get_pmd) + mov r8=r32 + XEN_HYPER_GET_PMD + br.ret.sptk.many rp +END(xen_get_pmd) + +#ifdef CONFIG_IA32_SUPPORT +GLOBAL_ENTRY(xen_get_eflag) + XEN_HYPER_GET_EFLAG + br.ret.sptk.many rp +END(xen_get_eflag) + +// some bits aren't set if pl!=0, see SDM vol1 3.1.8 +GLOBAL_ENTRY(xen_set_eflag) + mov r8=r32 + XEN_HYPER_SET_EFLAG + br.ret.sptk.many rp +END(xen_set_eflag) +#endif /* CONFIG_IA32_SUPPORT */ +#endif /* ASM_SUPPORTED */ diff --git a/include/asm-ia64/xen/privop.h b/include/asm-ia64/xen/privop.h index dd3e5ec..95e8e8a 100644 --- a/include/asm-ia64/xen/privop.h +++ b/include/asm-ia64/xen/privop.h @@ -70,4 +70,516 @@ #define XSI_IHA (XSI_BASE + XSI_IHA_OFS) #endif +#ifndef __ASSEMBLY__ +#define XEN_HYPER_SSM_I asm("break %0" : : "i" (HYPERPRIVOP_SSM_I)) +#define XEN_HYPER_GET_IVR asm("break %0" : : "i" (HYPERPRIVOP_GET_IVR)) + +/************************************************/ +/* Instructions paravirtualized for correctness */ +/************************************************/ + +/* "fc" and "thash" are privilege-sensitive instructions, meaning they + * may have different semantics depending on whether they are executed + * at PL0 vs PL!=0. When paravirtualized, these instructions mustn't + * be allowed to execute directly, lest incorrect semantics result. 
*/
+#ifdef ASM_SUPPORTED
+static inline void
+xen_fc(unsigned long addr)
+{
+ register __u64 __addr asm ("r8") = addr;
+ asm volatile ("break %0":: "i"(HYPERPRIVOP_FC), "r"(__addr));
+}
+
+static inline unsigned long
+xen_thash(unsigned long addr)
+{
+ register __u64 ia64_intri_res asm ("r8");
+ register __u64 __addr asm ("r8") = addr;
+ asm volatile ("break %1":
+ "=r"(ia64_intri_res):
+ "i"(HYPERPRIVOP_THASH), "0"(__addr));
+ return ia64_intri_res;
+}
+#else
+extern void xen_fc(unsigned long addr);
+extern unsigned long xen_thash(unsigned long addr);
+#endif
+
+/* Note that "ttag" and "cover" are also privilege-sensitive; "ttag"
+ * is not currently used (though it may be in a long-format VHPT system!)
+ * and the semantics of cover only change if psr.ic is off, which is very
+ * rare (and currently non-existent outside of assembly code) */
+
+/* There are also privilege-sensitive registers. These registers are
+ * readable at any privilege level but only writable at PL0. */
+#ifdef ASM_SUPPORTED
+static inline unsigned long
+xen_get_cpuid(int index)
+{
+ register __u64 ia64_intri_res asm ("r8");
+ register __u64 __index asm ("r8") = index;
+ asm volatile ("break %1":
+ "=r"(ia64_intri_res):
+ "i"(HYPERPRIVOP_GET_CPUID), "0"(__index));
+ return ia64_intri_res;
+}
+
+static inline unsigned long
+xen_get_pmd(int index)
+{
+ register __u64 ia64_intri_res asm ("r8");
+ register __u64 __index asm ("r8") = index;
+ asm volatile ("break %1":
+ "=r"(ia64_intri_res):
+ "i"(HYPERPRIVOP_GET_PMD), "0"(__index));
+ return ia64_intri_res;
+}
+#else
+extern unsigned long xen_get_cpuid(int index);
+extern unsigned long xen_get_pmd(int index);
+#endif
+
+#ifdef ASM_SUPPORTED
+static inline unsigned long
+xen_get_eflag(void)
+{
+ register __u64 ia64_intri_res asm ("r8");
+ asm volatile ("break %1":
+ "=r"(ia64_intri_res): "i"(HYPERPRIVOP_GET_EFLAG));
+ return ia64_intri_res;
+}
+
+static inline void
+xen_set_eflag(unsigned long val)
+{
+ register __u64 __val asm ("r8") = val;
+ asm volatile ("break %0":: "i"(HYPERPRIVOP_SET_EFLAG), "r"(__val));
+}
+#else
+extern unsigned long xen_get_eflag(void); /* see xen_ia64_getreg */
+extern void xen_set_eflag(unsigned long); /* see xen_ia64_setreg */
+#endif
+
+/************************************************/
+/* Instructions paravirtualized for performance */
+/************************************************/
+
+/* Xen uses memory-mapped virtual privileged registers for access to many
+ * performance-sensitive privileged registers. Some, like the processor
+ * status register (psr), are broken up into multiple memory locations.
+ * Others, like "pend", are abstractions based on privileged registers.
+ * "Pend" is guaranteed to be set if reading cr.ivr would return a
+ * (non-spurious) interrupt. */
+#define XEN_MAPPEDREGS ((struct mapped_regs *)XMAPPEDREGS_BASE)
+
+#define XSI_PSR_I \
+ (*XEN_MAPPEDREGS->interrupt_mask_addr)
+#define xen_get_virtual_psr_i() \
+ (!XSI_PSR_I)
+#define xen_set_virtual_psr_i(_val) \
+ ({ XSI_PSR_I = (uint8_t)(_val) ? 0 : 1; })
+#define xen_set_virtual_psr_ic(_val) \
+ ({ XEN_MAPPEDREGS->interrupt_collection_enabled = _val ? 1 : 0; })
+#define xen_get_virtual_pend() \
+ (*(((uint8_t *)XEN_MAPPEDREGS->interrupt_mask_addr) - 1))
+
+/* Hyperprivops are "break" instructions with a well-defined API.
+ * In particular, the virtual psr.ic bit must be off; in this way
+ * it is guaranteed to never conflict with a linux break instruction. 
+ * Normally, this is done in a xen stub but this one is frequent enough
+ * that we inline it */
+#define xen_hyper_ssm_i() \
+({ \
+ XEN_HYPER_SSM_I; \
+})
+
+/* turning off interrupts can be paravirtualized simply by writing
+ * to a memory-mapped virtual psr.i bit (implemented as a 16-bit bool) */
+#define xen_rsm_i() \
+do { \
+ xen_set_virtual_psr_i(0); \
+ barrier(); \
+} while (0)
+
+/* turning on interrupts is a bit more complicated: write to the
+ * memory-mapped virtual psr.i bit first (to avoid a race condition),
+ * then if any interrupts were pending, we have to execute a hyperprivop
+ * to ensure the pending interrupt gets delivered; else we're done! */
+#define xen_ssm_i() \
+do { \
+ int old = xen_get_virtual_psr_i(); \
+ xen_set_virtual_psr_i(1); \
+ barrier(); \
+ if (!old && xen_get_virtual_pend()) \
+ xen_hyper_ssm_i(); \
+} while (0)
+
+#define xen_ia64_intrin_local_irq_restore(x) \
+do { \
+ if (is_running_on_xen()) { \
+ if ((x) & IA64_PSR_I) \
+ xen_ssm_i(); \
+ else \
+ xen_rsm_i(); \
+ } else { \
+ native_intrin_local_irq_restore((x)); \
+ } \
+} while (0)
+
+#define xen_get_psr_i() \
+({ \
+ (is_running_on_xen()) ? \
+ (xen_get_virtual_psr_i() ? IA64_PSR_I : 0) \
+ : native_get_psr_i(); \
+})
+
+#define xen_ia64_ssm(mask) \
+do { \
+ if ((mask) == IA64_PSR_I) { \
+ if (is_running_on_xen()) \
+ xen_ssm_i(); \
+ else \
+ native_ssm(mask); \
+ } else { \
+ native_ssm(mask); \
+ } \
+} while (0)
+
+#define xen_ia64_rsm(mask) \
+do { \
+ if ((mask) == IA64_PSR_I) { \
+ if (is_running_on_xen()) \
+ xen_rsm_i(); \
+ else \
+ native_rsm(mask); \
+ } else { \
+ native_rsm(mask); \
+ } \
+} while (0)
+
+/* Although all privileged operations can be left to trap and will
+ * be properly handled by Xen, some are frequent enough that we use
+ * hyperprivops for performance. 
*/
+
+#ifndef ASM_SUPPORTED
+extern unsigned long xen_get_psr(void);
+extern unsigned long xen_get_ivr(void);
+extern unsigned long xen_get_tpr(void);
+extern void xen_set_itm(unsigned long);
+extern void xen_set_tpr(unsigned long);
+extern void xen_eoi(unsigned long);
+extern void xen_set_rr(unsigned long index, unsigned long val);
+extern unsigned long xen_get_rr(unsigned long index);
+extern void xen_set_kr(unsigned long index, unsigned long val);
+extern void xen_ptcga(unsigned long addr, unsigned long size);
+#else
+static inline unsigned long
+xen_get_psr(void)
+{
+ register __u64 ia64_intri_res asm ("r8");
+ asm volatile ("break %1":
+ "=r"(ia64_intri_res): "i"(HYPERPRIVOP_GET_PSR));
+ return ia64_intri_res;
+}
+
+static inline unsigned long
+xen_get_ivr(void)
+{
+ register __u64 ia64_intri_res asm ("r8");
+ asm volatile ("break %1":
+ "=r"(ia64_intri_res): "i"(HYPERPRIVOP_GET_IVR));
+ return ia64_intri_res;
+}
+
+static inline unsigned long
+xen_get_tpr(void)
+{
+ register __u64 ia64_intri_res asm ("r8");
+ asm volatile ("break %1":
+ "=r"(ia64_intri_res): "i"(HYPERPRIVOP_GET_TPR));
+ return ia64_intri_res;
+}
+
+static inline void
+xen_set_tpr(unsigned long val)
+{
+ register __u64 __val asm ("r8") = val;
+ asm volatile ("break %0"::
+ "i"(HYPERPRIVOP_SET_TPR), "r"(__val));
+}
+
+static inline void
+xen_eoi(unsigned long val)
+{
+ register __u64 __val asm ("r8") = val;
+ asm volatile ("break %0"::
+ "i"(HYPERPRIVOP_EOI), "r"(__val));
+}
+
+static inline void
+xen_set_itm(unsigned long val)
+{
+ register __u64 __val asm ("r8") = val;
+ asm volatile ("break %0":: "i"(HYPERPRIVOP_SET_ITM), "r"(__val));
+}
+
+static inline void
+xen_ptcga(unsigned long addr, unsigned long size)
+{
+ register __u64 __addr asm ("r8") = addr;
+ register __u64 __size asm ("r9") = size;
+ asm volatile ("break %0"::
+ "i"(HYPERPRIVOP_PTC_GA), "r"(__addr), "r"(__size));
+}
+
+static inline unsigned long
+xen_get_rr(unsigned long index)
+{
+ register __u64 ia64_intri_res asm ("r8");
+ register __u64 __index asm ("r8") = index;
+ asm volatile ("break %1":
+ "=r"(ia64_intri_res):
+ "i"(HYPERPRIVOP_GET_RR), "0"(__index));
+ return ia64_intri_res;
+}
+
+static inline void
+xen_set_rr(unsigned long index, unsigned long val)
+{
+ register __u64 __index asm ("r8") = index;
+ register __u64 __val asm ("r9") = val;
+ asm volatile ("break %0"::
+ "i"(HYPERPRIVOP_SET_RR), "r"(__index), "r"(__val));
+}
+
+static inline void
+xen_set_rr0_to_rr4(unsigned long val0, unsigned long val1,
+ unsigned long val2, unsigned long val3, unsigned long val4)
+{
+ register __u64 __val0 asm ("r8") = val0;
+ register __u64 __val1 asm ("r9") = val1;
+ register __u64 __val2 asm ("r10") = val2;
+ register __u64 __val3 asm ("r11") = val3;
+ register __u64 __val4 asm ("r14") = val4;
+ asm volatile ("break %0" ::
+ "i"(HYPERPRIVOP_SET_RR0_TO_RR4),
+ "r"(__val0), "r"(__val1),
+ "r"(__val2), "r"(__val3), "r"(__val4));
+}
+
+static inline void
+xen_set_kr(unsigned long index, unsigned long val)
+{
+ register __u64 __index asm ("r8") = index;
+ register __u64 __val asm ("r9") = val;
+ asm volatile ("break %0"::
+ "i"(HYPERPRIVOP_SET_KR), "r"(__index), "r"(__val));
+}
+#endif
+
+/* Note: It may look wrong to test for is_running_on_xen() in each case.
+ * However regnum is always a constant so, as written, the compiler
+ * eliminates the switch statement, whereas is_running_on_xen() must be
+ * tested dynamically. 
*/ +#define xen_ia64_getreg(regnum) \ +({ \ + __u64 ia64_intri_res; \ + \ + switch (regnum) { \ + case _IA64_REG_PSR: \ + ia64_intri_res = (is_running_on_xen()) ? \ + xen_get_psr() : \ + native_getreg(regnum); \ + break; \ + case _IA64_REG_CR_IVR: \ + ia64_intri_res = (is_running_on_xen()) ? \ + xen_get_ivr() : \ + native_getreg(regnum); \ + break; \ + case _IA64_REG_CR_TPR: \ + ia64_intri_res = (is_running_on_xen()) ? \ + xen_get_tpr() : \ + native_getreg(regnum); \ + break; \ + case _IA64_REG_AR_EFLAG: \ + ia64_intri_res = (is_running_on_xen()) ? \ + xen_get_eflag() : \ + native_getreg(regnum); \ + break; \ + default: \ + ia64_intri_res = native_getreg(regnum); \ + break; \ + } \ + ia64_intri_res; \ +}) + +#define xen_ia64_setreg(regnum, val) \ +({ \ + switch (regnum) { \ + case _IA64_REG_AR_KR0 ... _IA64_REG_AR_KR7: \ + (is_running_on_xen()) ? \ + xen_set_kr(((regnum)-_IA64_REG_AR_KR0), (val)) :\ + native_setreg((regnum), (val)); \ + break; \ + case _IA64_REG_CR_ITM: \ + (is_running_on_xen()) ? \ + xen_set_itm(val) : \ + native_setreg((regnum), (val)); \ + break; \ + case _IA64_REG_CR_TPR: \ + (is_running_on_xen()) ? \ + xen_set_tpr(val) : \ + native_setreg((regnum), (val)); \ + break; \ + case _IA64_REG_CR_EOI: \ + (is_running_on_xen()) ? \ + xen_eoi(val) : \ + native_setreg((regnum), (val)); \ + break; \ + case _IA64_REG_AR_EFLAG: \ + (is_running_on_xen()) ? \ + xen_set_eflag(val) : \ + native_setreg((regnum), (val)); \ + break; \ + default: \ + native_setreg((regnum), (val)); \ + break; \ + } \ +}) + +#if defined(ASM_SUPPORTED) && !defined(CONFIG_PARAVIRT_ALT) + +#define IA64_PARAVIRTUALIZED_PRIVOP + +#define ia64_fc(addr) \ +do { \ + if (is_running_on_xen()) \ + xen_fc((unsigned long)(addr)); \ + else \ + native_fc(addr); \ +} while (0) + +#define ia64_thash(addr) \ +({ \ + unsigned long ia64_intri_res; \ + if (is_running_on_xen()) \ + ia64_intri_res = \ + xen_thash((unsigned long)(addr)); \ + else \ + ia64_intri_res = native_thash(addr); \ + ia64_intri_res; \ +}) + +#define ia64_get_cpuid(i) \ +({ \ + unsigned long ia64_intri_res; \ + if (is_running_on_xen()) \ + ia64_intri_res = xen_get_cpuid(i); \ + else \ + ia64_intri_res = native_get_cpuid(i); \ + ia64_intri_res; \ +}) + +#define ia64_get_pmd(i) \ +({ \ + unsigned long ia64_intri_res; \ + if (is_running_on_xen()) \ + ia64_intri_res = xen_get_pmd(i); \ + else \ + ia64_intri_res = native_get_pmd(i); \ + ia64_intri_res; \ +}) + + +#define ia64_ptcga(addr, size) \ +do { \ + if (is_running_on_xen()) \ + xen_ptcga((addr), (size)); \ + else \ + native_ptcga((addr), (size)); \ +} while (0) + +#define ia64_set_rr(index, val) \ +do { \ + if (is_running_on_xen()) \ + xen_set_rr((index), (val)); \ + else \ + native_set_rr((index), (val)); \ +} while (0) + +#define ia64_get_rr(index) \ +({ \ + __u64 ia64_intri_res; \ + if (is_running_on_xen()) \ + ia64_intri_res = xen_get_rr((index)); \ + else \ + ia64_intri_res = native_get_rr((index)); \ + ia64_intri_res; \ +}) + +#define ia64_set_rr0_to_rr4(val0, val1, val2, val3, val4) \ +do { \ + if (is_running_on_xen()) \ + xen_set_rr0_to_rr4((val0), (val1), (val2), \ + (val3), (val4)); \ + else \ + native_set_rr0_to_rr4((val0), (val1), (val2), \ + (val3), (val4)); \ +} while (0) + +#define ia64_getreg xen_ia64_getreg +#define ia64_setreg xen_ia64_setreg +#define ia64_ssm xen_ia64_ssm +#define ia64_rsm xen_ia64_rsm +#define ia64_intrin_local_irq_restore xen_ia64_intrin_local_irq_restore +#define ia64_get_psr_i xen_get_psr_i + +/* the remainder of these are not performance-sensitive so its + * OK to 
not paravirtualize and just take a privop trap and emulate */ +#define ia64_hint native_hint +#define ia64_set_pmd native_set_pmd +#define ia64_itci native_itci +#define ia64_itcd native_itcd +#define ia64_itri native_itri +#define ia64_itrd native_itrd +#define ia64_tpa native_tpa +#define ia64_set_ibr native_set_ibr +#define ia64_set_pkr native_set_pkr +#define ia64_set_pmc native_set_pmc +#define ia64_get_ibr native_get_ibr +#define ia64_get_pkr native_get_pkr +#define ia64_get_pmc native_get_pmc +#define ia64_ptce native_ptce +#define ia64_ptcl native_ptcl +#define ia64_ptri native_ptri +#define ia64_ptrd native_ptrd + +#endif /* ASM_SUPPORTED && !CONFIG_PARAVIRT_ALT */ + +#endif /* !__ASSEMBLY__ */ + +/* these routines utilize privilege-sensitive or performance-sensitive + * privileged instructions so the code must be replaced with + * paravirtualized versions */ +#ifndef CONFIG_PARAVIRT_ENTRY +#define IA64_PARAVIRTUALIZED_ENTRY +#define ia64_switch_to xen_switch_to +#define ia64_leave_syscall xen_leave_syscall +#define ia64_work_processed_syscall xen_work_processed_syscall_with_check +#define ia64_leave_kernel xen_leave_kernel +#define ia64_pal_call_static xen_pal_call_static +#endif /* !CONFIG_PARAVIRT_ENTRY */ + +#ifdef CONFIG_XEN +#ifdef __ASSEMBLY__ +#define BR_IF_NATIVE(target, reg, pred) \ + .body ; \ + movl reg=running_on_xen;; ; \ + ld4 reg=[reg];; ; \ + cmp.eq pred,p0=reg,r0 ; \ + (pred) br.cond.sptk.many target;; +#endif /* __ASSEMBLY__ */ +#endif + #endif /* _ASM_IA64_XEN_PRIVOP_H */ -- 1.5.3
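With these definitions in place, generic code needs no source change; the dispatch happens inside the intrinsics themselves. An illustrative caller (assuming this header is in effect and asm/kregs.h provides IA64_PSR_I):

#include <asm/intrinsics.h>
#include <asm/kregs.h>

static void irq_mask_example(void)
{
	/* Native: mov r = psr. Xen: HYPERPRIVOP_GET_PSR break. */
	unsigned long psr = ia64_getreg(_IA64_REG_PSR);

	/* Native: rsm psr.i. Xen: clear the virtual psr.i byte, no trap. */
	ia64_rsm(IA64_PSR_I);

	/* ... code that must run with interrupts masked ... */

	/* Xen: may issue HYPERPRIVOP_SSM_I if events became pending. */
	ia64_intrin_local_irq_restore(psr & IA64_PSR_I);
}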
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 40/50] ia64/pv_ops/xen: xen privileged instruction intrinsics with binary patch.
With binary patching, the intrinsics paravirtualization becomes hypervisor neutral. (So far the xen intrinsics haven't allowed for another hypervisor.) Privileged operations that need paravirtualization are marked, and when running on xen they are binary patched at early boot time. Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/xen/Makefile | 7 + arch/ia64/xen/paravirt_xen.c | 242 +++++++++++++++++++++++++++++ arch/ia64/xen/privops_asm.S | 221 ++++++++++++++++++++++++++++ arch/ia64/xen/privops_c.c | 279 +++++++++++++++++++++++++++++++++++++++++ arch/ia64/xen/xensetup.S | 10 ++ include/asm-ia64/xen/privop.h | 24 ++++ 6 files changed, 783 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/xen/Makefile create mode 100644 arch/ia64/xen/paravirt_xen.c create mode 100644 arch/ia64/xen/privops_asm.S create mode 100644 arch/ia64/xen/privops_c.c diff --git a/arch/ia64/xen/Makefile b/arch/ia64/xen/Makefile new file mode 100644 index 0000000..c219358 --- /dev/null +++ b/arch/ia64/xen/Makefile @@ -0,0 +1,7 @@ +# +# Makefile for Xen components +# + +obj-$(CONFIG_PARAVIRT_ALT) += paravirt_xen.o privops_asm.o privops_c.o +obj-$(CONFIG_PARAVIRT_NOP_B_PATCH) += paravirt_xen.o +obj-$(CONFIG_PARAVIRT_ENTRY) += paravirt_xen.o diff --git a/arch/ia64/xen/paravirt_xen.c b/arch/ia64/xen/paravirt_xen.c new file mode 100644 index 0000000..57b9dfd --- /dev/null +++ b/arch/ia64/xen/paravirt_xen.c @@ -0,0 +1,242 @@ +/****************************************************************************** + * linux/arch/ia64/xen/paravirt_xen.c + * + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ */
+
+#include <linux/types.h>
+#include <linux/string.h>
+#include <linux/init.h>
+#include <asm/intrinsics.h>
+#include <asm/bugs.h>
+#include <asm/kprobes.h> /* for bundle_t */
+#include <asm/paravirt_core.h>
+
+#ifdef CONFIG_PARAVIRT_ALT
+struct xen_alt_bundle_patch_elem {
+ const void *sbundle;
+ const void *ebundle;
+ unsigned long type;
+};
+
+static unsigned long __init_or_module
+__xen_alt_bundle_patch(void *sbundle, void *ebundle, unsigned long type)
+{
+ extern const struct xen_alt_bundle_patch_elem xen_alt_bundle_array[];
+ extern const unsigned long xen_alt_bundle_array_size;
+
+ unsigned long used = 0;
+ unsigned long i;
+
+ BUG_ON((((unsigned long)sbundle) % sizeof(bundle_t)) != 0);
+ BUG_ON((((unsigned long)ebundle) % sizeof(bundle_t)) != 0);
+
+ for (i = 0;
+ i < xen_alt_bundle_array_size / sizeof(xen_alt_bundle_array[0]);
+ i++) {
+ const struct xen_alt_bundle_patch_elem *p =
+ &xen_alt_bundle_array[i];
+ if (p->type == type) {
+ used = p->ebundle - p->sbundle;
+ BUG_ON(used > ebundle - sbundle);
+ memcpy(sbundle, p->sbundle, used);
+ break;
+ }
+ }
+
+ return used;
+}
+
+static void __init
+xen_alt_bundle_patch(void)
+{
+ extern struct paravirt_alt_bundle_patch __start_paravirt_bundles[];
+ extern struct paravirt_alt_bundle_patch __stop_paravirt_bundles[];
+
+ paravirt_alt_bundle_patch_apply(__start_paravirt_bundles,
+ __stop_paravirt_bundles,
+ &__xen_alt_bundle_patch);
+}
+
+#ifdef CONFIG_MODULES
+void
+xen_alt_bundle_patch_module(struct paravirt_alt_bundle_patch *start,
+ struct paravirt_alt_bundle_patch *end)
+{
+ if (is_running_on_xen())
+ paravirt_alt_bundle_patch_apply(start, end,
+ &__xen_alt_bundle_patch);
+}
+#endif /* CONFIG_MODULES */
+
+
+/*
+ * all the native instructions of hyperprivops are M-form or I-form
+ * mov ar.<imm>=r1 I26, M29
+ * mov r1=ar.<imm> I28, M31
+ * mov r1=cr.<imm> M32
+ * mov cr.<imm>=r1 M33
+ * mov r1=psr M36
+ * mov indirect<r1>=r2 M42
+ * mov r1=indirect<r2> M43
+ * ptc.ga M45
+ * thash r1=r2 M46
+ *
+ * The break.{m, i} instruction formats are the same.
+ * So we can safely replace every single instruction that is a target of
+ * hyperprivops with a break.{m, i} imm21 hyperprivop. 
+ */
+
+struct xen_alt_inst_patch_elem {
+	unsigned long stag;
+	unsigned long etag;
+	unsigned long type;
+};
+
+unsigned long
+__xen_alt_inst_patch(unsigned long stag, unsigned long etag,
+		     unsigned long type)
+{
+	extern const struct xen_alt_inst_patch_elem xen_alt_inst_array[];
+	extern const unsigned long xen_alt_inst_array_size;
+
+	unsigned long dest_tag = stag;
+	unsigned long i;
+
+	for (i = 0;
+	     i < xen_alt_inst_array_size / sizeof(xen_alt_inst_array[0]);
+	     i++) {
+		const struct xen_alt_inst_patch_elem *p =
+			&xen_alt_inst_array[i];
+		if (p->type == type) {
+			unsigned long src_tag;
+
+			for (src_tag = p->stag;
+			     src_tag < p->etag;
+			     src_tag = paravirt_get_next_tag(src_tag)) {
+				const cmp_inst_t inst =
+					paravirt_read_inst(src_tag);
+				paravirt_write_inst(dest_tag, inst);
+
+				BUG_ON(dest_tag >= etag);
+				dest_tag = paravirt_get_next_tag(dest_tag);
+			}
+			break;
+		}
+	}
+
+	return dest_tag;
+}
+
+void
+xen_alt_inst_patch(void)
+{
+	extern struct paravirt_alt_inst_patch __start_paravirt_insts[];
+	extern struct paravirt_alt_inst_patch __stop_paravirt_insts[];
+
+	paravirt_alt_inst_patch_apply(__start_paravirt_insts,
+				      __stop_paravirt_insts,
+				      &__xen_alt_inst_patch);
+}
+
+#ifdef CONFIG_MODULES
+void
+xen_alt_inst_patch_module(struct paravirt_alt_inst_patch *start,
+			  struct paravirt_alt_inst_patch *end)
+{
+	if (is_running_on_xen())
+		paravirt_alt_inst_patch_apply(start, end,
+					      &__xen_alt_inst_patch);
+}
+#endif
+
+#else
+#define xen_alt_bundle_patch()	do { } while (0)
+#define xen_alt_inst_patch()	do { } while (0)
+#endif /* CONFIG_PARAVIRT_ALT */
+
+
+#ifdef CONFIG_PARAVIRT_NOP_B_PATCH
+#include <asm/paravirt_nop.h>
+static void __init
+xen_nop_b_patch(void)
+{
+	extern const struct paravirt_nop_patch __start_paravirt_nop_b[];
+	extern const struct paravirt_nop_patch __stop_paravirt_nop_b[];
+
+	paravirt_nop_b_patch_apply(__start_paravirt_nop_b,
+				   __stop_paravirt_nop_b);
+}
+#else
+#define xen_nop_b_patch()	do { } while (0)
+#endif
+
+
+#ifdef CONFIG_PARAVIRT_ENTRY
+
+#include <asm/paravirt_entry.h>
+
+extern void *xen_switch_to;
+extern void *xen_leave_syscall;
+extern void *xen_leave_kernel;
+extern void *xen_pal_call_static;
+extern void *xen_work_processed_syscall;
+
+static const struct paravirt_entry xen_entries[] __initdata = {
+	{&xen_switch_to,		PARAVIRT_ENTRY_SWITCH_TO},
+	{&xen_leave_syscall,		PARAVIRT_ENTRY_LEAVE_SYSCALL},
+	{&xen_leave_kernel,		PARAVIRT_ENTRY_LEAVE_KERNEL},
+	{&xen_pal_call_static,		PARAVIRT_ENTRY_PAL_CALL_STATIC},
+	{&xen_work_processed_syscall,	PARAVIRT_ENTRY_WORK_PROCESSED_SYSCALL},
+};
+
+void __init
+xen_entry_patch(void)
+{
+	extern const struct paravirt_entry_patch __start_paravirt_entry[];
+	extern const struct paravirt_entry_patch __stop_paravirt_entry[];
+
+	paravirt_entry_patch_apply(__start_paravirt_entry,
+				   __stop_paravirt_entry,
+				   xen_entries,
+				   sizeof(xen_entries)/sizeof(xen_entries[0]));
+}
+#else
+#define xen_entry_patch()	do { } while (0)
+#endif
+
+
+void __init
+xen_paravirt_patch(void)
+{
+	xen_alt_bundle_patch();
+	xen_alt_inst_patch();
+	xen_nop_b_patch();
+	xen_entry_patch();
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "linux"
+ * c-basic-offset: 8
+ * tab-width: 8
+ * indent-tabs-mode: t
+ * End:
+ */
diff --git a/arch/ia64/xen/privops_asm.S b/arch/ia64/xen/privops_asm.S
new file mode 100644
index 0000000..40e400e
--- /dev/null
+++ b/arch/ia64/xen/privops_asm.S
@@ -0,0 +1,221 @@
+/******************************************************************************
+ * linux/arch/ia64/xen/privops_asm.S
+ *
+ * Copyright (c) 2007 Isaku
Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include <asm/intrinsics.h> +#include <linux/init.h> +#include <asm/paravirt_alt.h> + +#ifdef CONFIG_MODULES +#define __INIT_OR_MODULE .text +#define __INITDATA_OR_MODULE .data +#else +#define __INIT_OR_MODULE __INIT +#define __INITDATA_OR_MODULE __INITDATA +#endif /* CONFIG_MODULES */ + + __INIT_OR_MODULE + .align 32 + .proc nop_b_inst_bundle + .global nop_b_inst_bundle +nop_b_inst_bundle: + { + nop.b 0 + nop.b 0 + nop.b 0 + } + .endp nop_b_inst_bundle + __FINIT + + /* NOTE: nop.[mfi] has same format */ + __INIT_OR_MODULE + .align 32 + .proc nop_mfi_inst_bundle + .global nop_mfi_inst_bundle +nop_mfi_inst_bundle: + { + nop.m 0 + nop.f 0 + nop.i 0 + } + .endp nop_mfi_inst_bundle + __FINIT + + __INIT_OR_MODULE + .align 32 + .proc nop_bundle + .global nop_bundle +nop_bundle: +nop_bundle_start: + { + nop 0 + nop 0 + nop 0 + } +nop_bundle_end: + .endp nop_bundle + __FINIT + + __INITDATA_OR_MODULE + .align 8 + .global nop_bundle_size +nop_bundle_size: + data8 nop_bundle_end - nop_bundle_start + +#define DEFINE_PRIVOP(name, instr) \ + .align 32; \ + .proc xen_ ## name ## _instr; \ + xen_ ## name ## _instr:; \ + xen_ ## name ## _instr_start:; \ + {; \ + [xen_ ## name ## _stag:] \ + instr; \ + [xen_ ## name ## _etag:] \ + nop 0; \ + nop 0; \ + }; \ + xen_ ## name ## _instr_end:; \ + .endp xen_ ## name ## _instr; + + __INIT_OR_MODULE + DEFINE_PRIVOP(rfi, XEN_HYPER_RFI) + DEFINE_PRIVOP(rsm_psr_dt, XEN_HYPER_RSM_PSR_DT) + DEFINE_PRIVOP(ssm_psr_dt, XEN_HYPER_SSM_PSR_DT) + DEFINE_PRIVOP(cover, XEN_HYPER_COVER) + DEFINE_PRIVOP(itc_d, XEN_HYPER_ITC_D) + DEFINE_PRIVOP(itc_i, XEN_HYPER_ITC_I) + DEFINE_PRIVOP(ssm_i, XEN_HYPER_SSM_I) + DEFINE_PRIVOP(get_ivr, XEN_HYPER_GET_IVR) + DEFINE_PRIVOP(get_tpr, XEN_HYPER_GET_TPR) + DEFINE_PRIVOP(set_tpr, XEN_HYPER_SET_TPR) + DEFINE_PRIVOP(eoi, XEN_HYPER_EOI) + DEFINE_PRIVOP(set_itm, XEN_HYPER_SET_ITM) + DEFINE_PRIVOP(thash, XEN_HYPER_THASH) + DEFINE_PRIVOP(ptc_ga, XEN_HYPER_PTC_GA) + DEFINE_PRIVOP(itr_d, XEN_HYPER_ITR_D) + DEFINE_PRIVOP(get_rr, XEN_HYPER_GET_RR) + DEFINE_PRIVOP(set_rr, XEN_HYPER_SET_RR) + DEFINE_PRIVOP(set_kr, XEN_HYPER_SET_KR) + DEFINE_PRIVOP(fc, XEN_HYPER_FC) + DEFINE_PRIVOP(get_cpuid, XEN_HYPER_GET_CPUID) + DEFINE_PRIVOP(get_pmd, XEN_HYPER_GET_PMD) + DEFINE_PRIVOP(get_eflag, XEN_HYPER_GET_EFLAG) + DEFINE_PRIVOP(set_eflag, XEN_HYPER_SET_EFLAG) + DEFINE_PRIVOP(get_psr, XEN_HYPER_GET_PSR) + DEFINE_PRIVOP(set_rr0_to_rr4, XEN_HYPER_SET_RR0_TO_RR4) + __FINIT + + +#define PARAVIRT_ALT_BUNDLE_ELEM(name, type) \ + data8 xen_ ## name ## _instr_start; \ + data8 xen_ ## name ## _instr_end; \ + data8 type; + + __INITDATA_OR_MODULE + .align 8 + .global xen_alt_bundle_array +xen_alt_bundle_array: +xen_alt_bundle_array_start: + PARAVIRT_ALT_BUNDLE_ELEM(rfi, 
PARAVIRT_INST_RFI) + PARAVIRT_ALT_BUNDLE_ELEM(rsm_psr_dt, PARAVIRT_INST_RSM_DT) + PARAVIRT_ALT_BUNDLE_ELEM(ssm_psr_dt, PARAVIRT_INST_SSM_DT) + PARAVIRT_ALT_BUNDLE_ELEM(cover, PARAVIRT_INST_COVER) + PARAVIRT_ALT_BUNDLE_ELEM(itc_d, PARAVIRT_INST_ITC_D) + PARAVIRT_ALT_BUNDLE_ELEM(itc_i, PARAVIRT_INST_ITC_I) + PARAVIRT_ALT_BUNDLE_ELEM(ssm_i, PARAVIRT_INST_SSM_I) + PARAVIRT_ALT_BUNDLE_ELEM(get_ivr, PARAVIRT_INST_GET_IVR) + PARAVIRT_ALT_BUNDLE_ELEM(get_tpr, PARAVIRT_INST_GET_TPR) + PARAVIRT_ALT_BUNDLE_ELEM(set_tpr, PARAVIRT_INST_SET_TPR) + PARAVIRT_ALT_BUNDLE_ELEM(eoi, PARAVIRT_INST_EOI) + PARAVIRT_ALT_BUNDLE_ELEM(set_itm, PARAVIRT_INST_SET_ITM) + PARAVIRT_ALT_BUNDLE_ELEM(thash, PARAVIRT_INST_THASH) + PARAVIRT_ALT_BUNDLE_ELEM(ptc_ga, PARAVIRT_INST_PTC_GA) + PARAVIRT_ALT_BUNDLE_ELEM(itr_d, PARAVIRT_INST_ITR_D) + PARAVIRT_ALT_BUNDLE_ELEM(get_rr, PARAVIRT_INST_GET_RR) + PARAVIRT_ALT_BUNDLE_ELEM(set_rr, PARAVIRT_INST_SET_RR) + PARAVIRT_ALT_BUNDLE_ELEM(set_kr, PARAVIRT_INST_SET_KR) + PARAVIRT_ALT_BUNDLE_ELEM(fc, PARAVIRT_INST_FC) + PARAVIRT_ALT_BUNDLE_ELEM(get_cpuid, PARAVIRT_INST_GET_CPUID) + PARAVIRT_ALT_BUNDLE_ELEM(get_pmd, PARAVIRT_INST_GET_PMD) + PARAVIRT_ALT_BUNDLE_ELEM(get_eflag, PARAVIRT_INST_GET_EFLAG) + PARAVIRT_ALT_BUNDLE_ELEM(set_eflag, PARAVIRT_INST_SET_EFLAG) + PARAVIRT_ALT_BUNDLE_ELEM(get_psr, PARAVIRT_INST_GET_PSR) + + PARAVIRT_ALT_BUNDLE_ELEM(ssm_i, PARAVIRT_BNDL_SSM_I) + PARAVIRT_ALT_BUNDLE_ELEM(rsm_i, PARAVIRT_BNDL_RSM_I) + PARAVIRT_ALT_BUNDLE_ELEM(get_psr_i, PARAVIRT_BNDL_GET_PSR_I) + PARAVIRT_ALT_BUNDLE_ELEM(intrin_local_irq_restore, + PARAVIRT_BNDL_INTRIN_LOCAL_IRQ_RESTORE) +xen_alt_bundle_array_end: + + .align 8 + .global xen_alt_bundle_array_size +xen_alt_bundle_array_size: + .long xen_alt_bundle_array_end - xen_alt_bundle_array_start + + +#define PARAVIRT_ALT_INST_ELEM(name, type) \ + data8 xen_ ## name ## _stag ; \ + data8 xen_ ## name ## _etag ; \ + data8 type + + __INITDATA_OR_MODULE + .align 8 + .global xen_alt_inst_array +xen_alt_inst_array: +xen_alt_inst_array_start: + PARAVIRT_ALT_INST_ELEM(rfi, PARAVIRT_INST_RFI) + PARAVIRT_ALT_INST_ELEM(rsm_psr_dt, PARAVIRT_INST_RSM_DT) + PARAVIRT_ALT_INST_ELEM(ssm_psr_dt, PARAVIRT_INST_SSM_DT) + PARAVIRT_ALT_INST_ELEM(cover, PARAVIRT_INST_COVER) + PARAVIRT_ALT_INST_ELEM(itc_d, PARAVIRT_INST_ITC_D) + PARAVIRT_ALT_INST_ELEM(itc_i, PARAVIRT_INST_ITC_I) + PARAVIRT_ALT_INST_ELEM(ssm_i, PARAVIRT_INST_SSM_I) + PARAVIRT_ALT_INST_ELEM(get_ivr, PARAVIRT_INST_GET_IVR) + PARAVIRT_ALT_INST_ELEM(get_tpr, PARAVIRT_INST_GET_TPR) + PARAVIRT_ALT_INST_ELEM(set_tpr, PARAVIRT_INST_SET_TPR) + PARAVIRT_ALT_INST_ELEM(eoi, PARAVIRT_INST_EOI) + PARAVIRT_ALT_INST_ELEM(set_itm, PARAVIRT_INST_SET_ITM) + PARAVIRT_ALT_INST_ELEM(thash, PARAVIRT_INST_THASH) + PARAVIRT_ALT_INST_ELEM(ptc_ga, PARAVIRT_INST_PTC_GA) + PARAVIRT_ALT_INST_ELEM(itr_d, PARAVIRT_INST_ITR_D) + PARAVIRT_ALT_INST_ELEM(get_rr, PARAVIRT_INST_GET_RR) + PARAVIRT_ALT_INST_ELEM(set_rr, PARAVIRT_INST_SET_RR) + PARAVIRT_ALT_INST_ELEM(set_kr, PARAVIRT_INST_SET_KR) + PARAVIRT_ALT_INST_ELEM(fc, PARAVIRT_INST_FC) + PARAVIRT_ALT_INST_ELEM(get_cpuid, PARAVIRT_INST_GET_CPUID) + PARAVIRT_ALT_INST_ELEM(get_pmd, PARAVIRT_INST_GET_PMD) + PARAVIRT_ALT_INST_ELEM(get_eflag, PARAVIRT_INST_GET_EFLAG) + PARAVIRT_ALT_INST_ELEM(set_eflag, PARAVIRT_INST_SET_EFLAG) + PARAVIRT_ALT_INST_ELEM(get_psr, PARAVIRT_INST_GET_PSR) + PARAVIRT_ALT_INST_ELEM(set_rr0_to_rr4, PARAVIRT_INST_SET_RR0_TO_RR4) + + PARAVIRT_ALT_INST_ELEM(ssm_i, PARAVIRT_BNDL_SSM_I) + PARAVIRT_ALT_INST_ELEM(rsm_i, PARAVIRT_BNDL_RSM_I) + 
PARAVIRT_ALT_INST_ELEM(get_psr_i, PARAVIRT_BNDL_GET_PSR_I) + PARAVIRT_ALT_INST_ELEM(intrin_local_irq_restore, + PARAVIRT_BNDL_INTRIN_LOCAL_IRQ_RESTORE) +xen_alt_inst_array_end: + + .align 8 + .global xen_alt_inst_array_size +xen_alt_inst_array_size: + .long xen_alt_inst_array_end - xen_alt_inst_array_start diff --git a/arch/ia64/xen/privops_c.c b/arch/ia64/xen/privops_c.c new file mode 100644 index 0000000..0fa2e23 --- /dev/null +++ b/arch/ia64/xen/privops_c.c @@ -0,0 +1,279 @@ +/****************************************************************************** + * arch/ia64/xen/privops_c.c + * + * Copyright (c) 2008 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include <linux/linkage.h> +#include <linux/init.h> +#include <linux/module.h> + +#include <xen/interface/xen.h> + +#include <asm/asm-offsets.h> +#define XEN_PSR_I_ADDR_ADDR ((uint8_t **)(XSI_BASE + XSI_PSR_I_ADDR_OFS)) + + +void __init_or_module +xen_privop_ssm_i(void) +{ + /* + * int masked = !xen_get_virtual_psr_i(); + * // masked = *(*XEN_MAPPEDREGS->interrupt_mask_addr) + * xen_set_virtual_psr_i(1) + * // *(*XEN_MAPPEDREGS->interrupt_mask_addr) = 0 + * // compiler barrier + * if (masked) { + * uint8_t* pend_int_addr + * (uint8_t*)(*XEN_MAPPEDREGS->interrupt_mask_addr) - 1; + * uint8_t pending = *pend_int_addr; + * if (pending) + * XEN_HYPER_SSM_I + * } + */ + register uint8_t *tmp asm ("r8"); + register int masked asm ("r9"); + register uint8_t *pending_intr_addr asm ("r10"); + + asm volatile(".global xen_ssm_i_instr\n\t" + "xen_ssm_i_instr:\n\t" + ".global xen_ssm_i_instr_start\n\t" + "xen_ssm_i_instr_start:\n\t" + ".global xen_ssm_i_stag\n\t" + "[xen_ssm_i_stag:]\n\t" + /* tmp = &XEN_MAPPEDREGS->interrupt_mask_addr */ + "mov %[tmp]=%[XEN_PSR_I_ADDR_ADDR_IMM]\n\t" + ";;\n\t" + /* tmp = *XEN_MAPPEDREGS->interrupt_mask_addr */ + "ld8 %[tmp]=[%[tmp]]\n\t" + ";;\n\t" + /* pending_intr_addr = tmp - 1 */ + "add %[pending_intr_addr]=-1,%[tmp]\n\t" + /* masked = *tmp */ + "ld1 %[masked]=[%[tmp]]\n\t" + ";;\n\t" + /* *tmp = 0 */ + "st1 [%[tmp]]=r0\n\t" + /* p6 = !masked */ + "cmp.ne.unc p6,p0=%[masked],r0\n\t" + ";;\n\t" + /* tmp = *pending_intr_addr */ + "(p6) ld1 %[tmp]=[%[pending_intr_addr]]\n\t" + ";;\n\t" + /* p7 = p6 && !tmp */ + "(p6) cmp.ne.unc p7,p0=%[tmp],r0\n\t" + ";;\n\t" + "(p7) break %[HYPERPRIVOP_SSM_I_IMM]\n\t" + ".global xen_ssm_i_etag\n\t" + "[xen_ssm_i_etag:]\n\t" + ".global xen_ssm_i_instr_end\n\t" + "xen_ssm_i_instr_end:\n\t" + : + [tmp] "=r"(tmp), + [pending_intr_addr] "=r"(pending_intr_addr), + [masked] "=r"(masked), + + "=m"(**((uint8_t **)XEN_PSR_I_ADDR_ADDR)) + : + [XEN_PSR_I_ADDR_ADDR_IMM] "i"(XEN_PSR_I_ADDR_ADDR), + [HYPERPRIVOP_SSM_I_IMM] "i"(HYPERPRIVOP_SSM_I), + + "m"(*((uint8_t *)XEN_PSR_I_ADDR_ADDR)), + "m"(**((uint8_t **)XEN_PSR_I_ADDR_ADDR)), + 
"m"(*(*((uint8_t **)XEN_PSR_I_ADDR_ADDR) - 1)) + : + "memory", + /* + * predicate registers can't be specified as C variables + * so that we use p6, p7, p8 here. + */ + "p6", /* is_old */ + "p7" /* is_pending */ + ); +} + +void __init_or_module +xen_privop_rsm_i(void) +{ + /* + * psr_i_addr_addr = XEN_MAPPEDREGS->interrupt_mask_addr + * = XEN_PSR_I_ADDR_ADDR; + * psr_i_addr = *psr_i_addr_addr; + * *psr_i_addr = 1; + */ + register unsigned long psr_i_addr asm("r8"); + register uint8_t mask asm ("r9"); + asm volatile (".global xen_rsm_i_instr\n\t" + "xen_rsm_i_instr:\n\t" + ".global xen_rsm_i_instr_start\n\t" + "xen_rsm_i_instr_start:\n\t" + ".global xen_rsm_i_stag\n\t" + "[xen_rsm_i_stag:]\n\t" + "mov %[psr_i_addr]=%[XEN_PSR_I_ADDR_ADDR_IMM]\n\t" + "mov %[mask]=%[ONE_IMM]\n\t" + ";;\n\t" + "ld8 %[psr_i_addr]=[%[psr_i_addr]]\n\t" + ";;\n\t" + "st1 [%[psr_i_addr]]=%[mask]\n\t" + ".global xen_rsm_i_etag\n\t" + "[xen_rsm_i_etag:]\n\t" + ".global xen_rsm_i_instr_end\n\t" + "xen_rsm_i_instr_end:\n\t" + : + [psr_i_addr] "=r"(psr_i_addr), + [mask] "=r"(mask), + "=m"(**((uint8_t **)XEN_PSR_I_ADDR_ADDR)): + [XEN_PSR_I_ADDR_ADDR_IMM] "i"(XEN_PSR_I_ADDR_ADDR), + [ONE_IMM] "i"(1), + "m"(*((uint8_t **)XEN_PSR_I_ADDR_ADDR)): + "memory"); +} + +void __init_or_module +xen_privop_ia64_intrin_local_irq_restore(unsigned long val) +{ + /* + * psr_i_addr_addr = XEN_PSR_I_ADDR_ADDR + * psr_i_addr = *psr_i_addr_addr + * pending_intr_addr = psr_i_addr - 1 + * if (val & IA64_PSR_I) { + * masked = *psr_i_addr + * *psr_i_addr = 0 + * compiler barrier + * if (masked) { + * uint8_t pending = *pending_intr_addr; + * if (pending) + * XEN_HYPER_SSM_I + * } + * } else { + * *psr_i_addr = 1 + * } + */ + + register unsigned long __val asm("r8") = val; + register uint8_t *psr_i_addr asm ("r9"); + register uint8_t *pending_intr_addr asm ("r10"); + register uint8_t masked asm ("r11"); + register unsigned long one_or_pending asm ("r8"); + + asm volatile ( + ".global xen_intrin_local_irq_restore_instr\n\t" + "xen_intrin_local_irq_restore_instr:\n\t" + ".global xen_intrin_local_irq_restore_instr_start\n\t" + "xen_intrin_local_irq_restore_instr_start:\n\t" + ".global xen_intrin_local_irq_restore_stag\n\t" + "[xen_intrin_local_irq_restore_stag:]\n\t" + "tbit.nz p6,p7=%[val],%[IA64_PSR_I_BIT_IMM]\n\t" + "mov %[psr_i_addr]=%[XEN_PSR_I_ADDR_ADDR_IMM]\n\t" + ";;\n\t" + "ld8 %[psr_i_addr]=[%[psr_i_addr]]\n\t" + "(p7)mov %[one_or_pending]=%[ONE_IMM]\n\t" + ";;\n\t" + "add %[pending_intr_addr]=-1,%[psr_i_addr]\n\t" + ";;\n\t" + "(p6) ld1 %[masked]=[%[psr_i_addr]]\n\t" + "(p7) st1 [%[psr_i_addr]]=%[one_or_pending]\n\t" + ";;\n\t" + "(p6) st1 [%[psr_i_addr]]=r0\n\t" + "(p6) cmp.ne.unc p8,p0=%[masked],r0\n\t" + "(p6) ld1 %[one_or_pending]=[%[pending_intr_addr]]\n\t" + ";;\n\t" + "(p8) cmp.eq.unc p9,p0=%[one_or_pending],r0\n\t" + ";;\n\t" + "(p9) break %[HYPERPRIVOP_SSM_I_IMM]\n\t" + ".global xen_intrin_local_irq_restore_etag\n\t" + "[xen_intrin_local_irq_restore_etag:]\n\t" + ".global xen_intrin_local_irq_restore_instr_end\n\t" + "xen_intrin_local_irq_restore_instr_end:\n\t" + : + [psr_i_addr] "=r"(psr_i_addr), + [pending_intr_addr] "=r"(pending_intr_addr), + [masked] "=r"(masked), + [one_or_pending] "=r"(one_or_pending), + + "=m"(**((uint8_t **)XEN_PSR_I_ADDR_ADDR)) + : + [val] "r"(__val), + [IA64_PSR_I_BIT_IMM] "i"(IA64_PSR_I_BIT), + [ONE_IMM] "i"(1), + + [XEN_PSR_I_ADDR_ADDR_IMM] "i"(XEN_PSR_I_ADDR_ADDR), + [HYPERPRIVOP_SSM_I_IMM] "i"(HYPERPRIVOP_SSM_I), + + "m"(*((uint8_t *)XEN_PSR_I_ADDR_ADDR)), + "m"(**((uint8_t **)XEN_PSR_I_ADDR_ADDR)), 
+ "m"(*(*((uint8_t **)XEN_PSR_I_ADDR_ADDR) - 1)) + : + "memory", + "p6", /* is_psr_i_set */ + "p7", /* not_psr_i_set */ + "p8", /* is_masked && is_psr_i_set */ + "p9" /* is_pending && is_masked && is_psr_i_set */ + ); +} + +unsigned long __init_or_module +xen_privop_get_psr_i(void) +{ + /* + * tmp = XEN_MAPPEDREGS->interrupt_mask_addr = XEN_PSR_I_ADDR_ADDR; + * tmp = *tmp + * tmp = *tmp; + * psr_i = tmp? 0: IA64_PSR_I; + */ + register unsigned long psr_i asm ("r8"); + register unsigned long tmp asm ("r9"); + + asm volatile (".global xen_get_psr_i_instr\n\t" + "xen_get_psr_i_instr:\n\t" + ".global xen_get_psr_i_instr_start\n\t" + "xen_get_psr_i_instr_start:\n\t" + ".global xen_get_psr_i_stag\n\t" + "[xen_get_psr_i_stag:]\n\t" + /* tmp = XEN_PSR_I_ADDR_ADDR */ + "mov %[tmp]=%[XEN_PSR_I_ADDR_ADDR_IMM]\n\t" + ";;\n\t" + /* tmp = *tmp = *XEN_PSR_I_ADDR_ADDR */ + "ld8 %[tmp]=[%[tmp]]\n\t" + /* psr_i = 0 */ + "mov %[psr_i]=0\n\t" + ";;\n\t" + /* tmp = *(uint8_t*)tmp */ + "ld1 %[tmp]=[%[tmp]]\n\t" + ";;\n\t" + /* if (!tmp) psr_i = IA64_PSR_I */ + "cmp.eq.unc p6,p0=%[tmp],r0\n\t" + ";;\n\t" + "(p6) mov %[psr_i]=%[IA64_PSR_I_IMM]\n\t" + ".global xen_get_psr_i_etag\n\t" + "[xen_get_psr_i_etag:]\n\t" + ".global xen_get_psr_i_instr_end\n\t" + "xen_get_psr_i_instr_end:\n\t" + : + [tmp] "=r"(tmp), + [psr_i] "=r"(psr_i) + : + [XEN_PSR_I_ADDR_ADDR_IMM] "i"(XEN_PSR_I_ADDR_ADDR), + [IA64_PSR_I_IMM] "i"(IA64_PSR_I), + "m"(*((uint8_t **)XEN_PSR_I_ADDR_ADDR)), + "m"(**((uint8_t **)XEN_PSR_I_ADDR_ADDR)) + : + "p6"); + return psr_i; +} diff --git a/arch/ia64/xen/xensetup.S b/arch/ia64/xen/xensetup.S index 17ad297..2d3d5d4 100644 --- a/arch/ia64/xen/xensetup.S +++ b/arch/ia64/xen/xensetup.S @@ -35,6 +35,16 @@ GLOBAL_ENTRY(early_xen_setup) (isBP) movl r28=XSI_BASE;; (isBP) break 0x1000;; +#ifdef CONFIG_PARAVIRT + /* patch privops */ +(isBP) mov r4=rp + ;; +(isBP) br.call.sptk.many rp=xen_paravirt_patch + ;; +(isBP) mov rp=r4 + ;; +#endif + br.ret.sptk.many rp ;; END(early_xen_setup) diff --git a/include/asm-ia64/xen/privop.h b/include/asm-ia64/xen/privop.h index 95e8e8a..d59cc31 100644 --- a/include/asm-ia64/xen/privop.h +++ b/include/asm-ia64/xen/privop.h @@ -557,6 +557,18 @@ do { \ #endif /* ASM_SUPPORTED && !CONFIG_PARAVIRT_ALT */ +#ifdef CONFIG_PARAVIRT_ALT +#if defined(CONFIG_MODULES) && defined(CONFIG_XEN) +void xen_alt_bundle_patch_module(struct paravirt_alt_bundle_patch *start, + struct paravirt_alt_bundle_patch *end); +void xen_alt_inst_patch_module(struct paravirt_alt_inst_patch *start, + struct paravirt_alt_inst_patch *end); +#else +#define xen_alt_bundle_patch_module(start, end) do { } while (0) +#define xen_alt_inst_patch_module(start, end) do { } while (0) +#endif +#endif /* CONFIG_PARAVIRT_ALT */ + #endif /* !__ASSEMBLY__ */ /* these routines utilize privilege-sensitive or performance-sensitive @@ -573,12 +585,24 @@ do { \ #ifdef CONFIG_XEN #ifdef __ASSEMBLY__ +#ifdef CONFIG_PARAVIRT_ENTRY +#define BR_IF_NATIVE(target, reg_unused, pred_unused) /* nothing */ +#elif defined(CONFIG_PARAVIRT_NOP_B_PATCH) +#define BR_IF_NATIVE(target, reg_unused, pred_unused) \ + .body ; \ + [1:] ; \ + br.cond.sptk.many target;; ; \ + .section .paravirt_nop_b, "a" ; \ + .previous ; \ + .xdata8 ".paravirt_nop_b", 1b +#else #define BR_IF_NATIVE(target, reg, pred) \ .body ; \ movl reg=running_on_xen;; ; \ ld4 reg=[reg];; ; \ cmp.eq pred,p0=reg,r0 ; \ (pred) br.cond.sptk.many target;; +#endif #endif /* __ASSEMBLY__ */ #endif -- 1.5.3
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 41/50] ia64/xen: introduce xen hypercall routines necessary for domU.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/xen/hypercall.S | 7 + include/asm-ia64/xen/hypercall.h | 426 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 433 insertions(+), 0 deletions(-) create mode 100644 include/asm-ia64/xen/hypercall.h diff --git a/arch/ia64/xen/hypercall.S b/arch/ia64/xen/hypercall.S index a96f278..7c5242b 100644 --- a/arch/ia64/xen/hypercall.S +++ b/arch/ia64/xen/hypercall.S @@ -122,3 +122,10 @@ GLOBAL_ENTRY(xen_set_eflag) END(xen_set_eflag) #endif /* CONFIG_IA32_SUPPORT */ #endif /* ASM_SUPPORTED */ + +GLOBAL_ENTRY(__hypercall) + mov r2=r37 + break 0x1000 + br.ret.sptk.many b0 + ;; +END(__hypercall) diff --git a/include/asm-ia64/xen/hypercall.h b/include/asm-ia64/xen/hypercall.h new file mode 100644 index 0000000..a266e44 --- /dev/null +++ b/include/asm-ia64/xen/hypercall.h @@ -0,0 +1,426 @@ +/****************************************************************************** + * hypercall.h + * + * Linux-specific hypervisor handling. + * + * Copyright (c) 2002-2004, K A Fraser + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version 2 + * as published by the Free Software Foundation; or, when distributed + * separately from the Linux kernel or incorporated into other + * software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#ifndef _ASM_IA64_XEN_HYPERCALL_H +#define _ASM_IA64_XEN_HYPERCALL_H + +#ifndef _ASM_IA64_XEN_HYPERVISOR_H +# error "please don't include this file directly" +#endif + +#include <asm/xen/xcom_hcall.h> +struct xencomm_handle; +extern unsigned long __hypercall(unsigned long a1, unsigned long a2, + unsigned long a3, unsigned long a4, + unsigned long a5, unsigned long cmd); + +/* + * Assembler stubs for hyper-calls. 
+ */ + +#define _hypercall0(type, name) \ +({ \ + long __res; \ + __res = __hypercall(0, 0, 0, 0, 0, __HYPERVISOR_##name);\ + (type)__res; \ +}) + +#define _hypercall1(type, name, a1) \ +({ \ + long __res; \ + __res = __hypercall((unsigned long)a1, \ + 0, 0, 0, 0, __HYPERVISOR_##name); \ + (type)__res; \ +}) + +#define _hypercall2(type, name, a1, a2) \ +({ \ + long __res; \ + __res = __hypercall((unsigned long)a1, \ + (unsigned long)a2, \ + 0, 0, 0, __HYPERVISOR_##name); \ + (type)__res; \ +}) + +#define _hypercall3(type, name, a1, a2, a3) \ +({ \ + long __res; \ + __res = __hypercall((unsigned long)a1, \ + (unsigned long)a2, \ + (unsigned long)a3, \ + 0, 0, __HYPERVISOR_##name); \ + (type)__res; \ +}) + +#define _hypercall4(type, name, a1, a2, a3, a4) \ +({ \ + long __res; \ + __res = __hypercall((unsigned long)a1, \ + (unsigned long)a2, \ + (unsigned long)a3, \ + (unsigned long)a4, \ + 0, __HYPERVISOR_##name); \ + (type)__res; \ +}) + +#define _hypercall5(type, name, a1, a2, a3, a4, a5) \ +({ \ + long __res; \ + __res = __hypercall((unsigned long)a1, \ + (unsigned long)a2, \ + (unsigned long)a3, \ + (unsigned long)a4, \ + (unsigned long)a5, \ + __HYPERVISOR_##name); \ + (type)__res; \ +}) + + +static inline int +xencomm_arch_hypercall_sched_op(int cmd, struct xencomm_handle *arg) +{ + return _hypercall2(int, sched_op, cmd, arg); +} + +static inline long +HYPERVISOR_set_timer_op(u64 timeout) +{ + unsigned long timeout_hi = (unsigned long)(timeout >> 32); + unsigned long timeout_lo = (unsigned long)timeout; + return _hypercall2(long, set_timer_op, timeout_lo, timeout_hi); +} + +static inline int +xencomm_arch_hypercall_multicall(struct xencomm_handle *call_list, + int nr_calls) +{ + return _hypercall2(int, multicall, call_list, nr_calls); +} + +static inline int +xencomm_arch_hypercall_memory_op(unsigned int cmd, struct xencomm_handle *arg) +{ + return _hypercall2(int, memory_op, cmd, arg); +} + +static inline int +xencomm_arch_hypercall_event_channel_op(int cmd, struct xencomm_handle *arg) +{ + return _hypercall2(int, event_channel_op, cmd, arg); +} + +static inline int +xencomm_arch_hypercall_xen_version(int cmd, struct xencomm_handle *arg) +{ + return _hypercall2(int, xen_version, cmd, arg); +} + +static inline int +xencomm_arch_hypercall_console_io(int cmd, int count, + struct xencomm_handle *str) +{ + return _hypercall3(int, console_io, cmd, count, str); +} + +static inline int +xencomm_arch_hypercall_physdev_op(int cmd, struct xencomm_handle *arg) +{ + return _hypercall2(int, physdev_op, cmd, arg); +} + +static inline int +xencomm_arch_hypercall_grant_table_op(unsigned int cmd, + struct xencomm_handle *uop, + unsigned int count) +{ + return _hypercall3(int, grant_table_op, cmd, uop, count); +} + +int HYPERVISOR_grant_table_op(unsigned int cmd, void *uop, unsigned int count); + +extern int xencomm_arch_hypercall_suspend(struct xencomm_handle *arg); + +static inline int +xencomm_arch_hypercall_callback_op(int cmd, struct xencomm_handle *arg) +{ + return _hypercall2(int, callback_op, cmd, arg); +} + +static inline unsigned long +xencomm_arch_hypercall_hvm_op(int cmd, void *arg) +{ + return _hypercall2(unsigned long, hvm_op, cmd, arg); +} + +static inline long +xencomm_arch_hypercall_vcpu_op(int cmd, int cpu, void *arg) +{ + return _hypercall3(long, vcpu_op, cmd, cpu, arg); +} + +static inline int +HYPERVISOR_physdev_op(int cmd, void *arg) +{ + switch (cmd) { + case PHYSDEVOP_eoi: + return _hypercall1(int, ia64_fast_eoi, + ((struct physdev_eoi *)arg)->irq); + default: + return 
xencomm_hypercall_physdev_op(cmd, arg); + } +} + +static inline int +xencomm_arch_hypercall_xenoprof_op(int op, struct xencomm_handle *arg) +{ + return _hypercall2(int, xenoprof_op, op, arg); +} + +static inline long +xencomm_arch_hypercall_opt_feature(struct xencomm_handle *arg) +{ + return _hypercall1(long, opt_feature, arg); +} + +#define xen_do_IRQ(irq, regs) \ +do { \ + struct pt_regs *old_regs; \ + old_regs = set_irq_regs(regs); \ + irq_enter(); \ + __do_IRQ(irq); \ + irq_exit(); \ + set_irq_regs(old_regs); \ +} while (0) +#define irq_ctx_init(cpu) do { } while (0) + +#include <linux/err.h> +#ifdef HAVE_XEN_PLATFORM_COMPAT_H +#include <xen/platform-compat.h> +#endif + +static inline unsigned long +__HYPERVISOR_ioremap(unsigned long ioaddr, unsigned long size) +{ + return _hypercall3(unsigned long, ia64_dom0vp_op, + IA64_DOM0VP_ioremap, ioaddr, size); +} + +static inline unsigned long +HYPERVISOR_ioremap(unsigned long ioaddr, unsigned long size) +{ + unsigned long ret = ioaddr; + if (is_running_on_xen()) { + ret = __HYPERVISOR_ioremap(ioaddr, size); + if (unlikely(ret == -ENOSYS)) + panic("hypercall %s failed with %ld. " + "Please check Xen and Linux config mismatch\n", + __func__, -ret); + else if (unlikely(IS_ERR_VALUE(ret))) + ret = ioaddr; + } + return ret; +} + +static inline unsigned long +__HYPERVISOR_phystomach(unsigned long gpfn) +{ + return _hypercall2(unsigned long, ia64_dom0vp_op, + IA64_DOM0VP_phystomach, gpfn); +} + +static inline unsigned long +HYPERVISOR_phystomach(unsigned long gpfn) +{ + unsigned long ret = gpfn; + if (is_running_on_xen()) + ret = __HYPERVISOR_phystomach(gpfn); + return ret; +} + +static inline unsigned long +__HYPERVISOR_machtophys(unsigned long mfn) +{ + return _hypercall2(unsigned long, ia64_dom0vp_op, + IA64_DOM0VP_machtophys, mfn); +} + +static inline unsigned long +HYPERVISOR_machtophys(unsigned long mfn) +{ + unsigned long ret = mfn; + if (is_running_on_xen()) + ret = __HYPERVISOR_machtophys(mfn); + return ret; +} + +static inline unsigned long +__HYPERVISOR_zap_physmap(unsigned long gpfn, unsigned int extent_order) +{ + return _hypercall3(unsigned long, ia64_dom0vp_op, + IA64_DOM0VP_zap_physmap, gpfn, extent_order); +} + +static inline unsigned long +HYPERVISOR_zap_physmap(unsigned long gpfn, unsigned int extent_order) +{ + unsigned long ret = 0; + if (is_running_on_xen()) + ret = __HYPERVISOR_zap_physmap(gpfn, extent_order); + return ret; +} + +static inline unsigned long +__HYPERVISOR_add_physmap(unsigned long gpfn, unsigned long mfn, + unsigned long flags, domid_t domid) +{ + return _hypercall5(unsigned long, ia64_dom0vp_op, + IA64_DOM0VP_add_physmap, gpfn, mfn, flags, domid); +} + +static inline unsigned long +HYPERVISOR_add_physmap(unsigned long gpfn, unsigned long mfn, + unsigned long flags, domid_t domid) +{ + unsigned long ret = 0; + BUG_ON(!is_running_on_xen()); + if (is_running_on_xen()) + ret = __HYPERVISOR_add_physmap(gpfn, mfn, flags, domid); + return ret; +} + +static inline unsigned long +__HYPERVISOR_add_physmap_with_gmfn(unsigned long gpfn, unsigned long gmfn, + unsigned long flags, domid_t domid) +{ + return _hypercall5(unsigned long, ia64_dom0vp_op, + IA64_DOM0VP_add_physmap_with_gmfn, + gpfn, gmfn, flags, domid); +} + +static inline unsigned long +HYPERVISOR_add_physmap_with_gmfn(unsigned long gpfn, unsigned long gmfn, + unsigned long flags, domid_t domid) +{ + unsigned long ret = 0; + BUG_ON(!is_running_on_xen()); + if (is_running_on_xen()) + ret = __HYPERVISOR_add_physmap_with_gmfn(gpfn, gmfn, + flags, domid); + return 
ret; +} + +#ifdef CONFIG_XEN_IA64_EXPOSE_P2M +static inline unsigned long +HYPERVISOR_expose_p2m(unsigned long conv_start_gpfn, + unsigned long assign_start_gpfn, + unsigned long expose_size, unsigned long granule_pfn) +{ + return _hypercall5(unsigned long, ia64_dom0vp_op, + IA64_DOM0VP_expose_p2m, conv_start_gpfn, + assign_start_gpfn, expose_size, granule_pfn); +} + +static inline int +xencomm_arch_expose_foreign_p2m(unsigned long gpfn, + domid_t domid, struct xencomm_handle *arg, + unsigned long flags) +{ + return _hypercall5(int, ia64_dom0vp_op, + IA64_DOM0VP_expose_foreign_p2m, + gpfn, domid, arg, flags); +} + +static inline int +HYPERVISOR_unexpose_foreign_p2m(unsigned long gpfn, domid_t domid) +{ + return _hypercall3(int, ia64_dom0vp_op, + IA64_DOM0VP_unexpose_foreign_p2m, gpfn, domid); +} +#endif + +static inline int +xencomm_arch_hypercall_perfmon_op(unsigned long cmd, + struct xencomm_handle *arg, + unsigned long count) +{ + return _hypercall4(int, ia64_dom0vp_op, + IA64_DOM0VP_perfmon, cmd, arg, count); +} + +static inline int +xencomm_arch_hypercall_fpswa_revision(struct xencomm_handle *arg) +{ + return _hypercall2(int, ia64_dom0vp_op, + IA64_DOM0VP_fpswa_revision, arg); +} + +static inline int +xencomm_arch_hypercall_ia64_debug_op(unsigned long cmd, + unsigned long domain, + struct xencomm_handle *arg) +{ + return _hypercall3(int, ia64_debug_op, cmd, domain, arg); +} + +static inline int +HYPERVISOR_add_io_space(unsigned long phys_base, + unsigned long sparse, + unsigned long space_number) +{ + return _hypercall4(int, ia64_dom0vp_op, IA64_DOM0VP_add_io_space, + phys_base, sparse, space_number); +} + +/* for balloon driver */ +#define HYPERVISOR_update_va_mapping(va, new_val, flags) (0) + +/* Use xencomm to do hypercalls. */ +#define HYPERVISOR_sched_op xencomm_hypercall_sched_op +#define HYPERVISOR_event_channel_op xencomm_hypercall_event_channel_op +#define HYPERVISOR_callback_op xencomm_hypercall_callback_op +#define HYPERVISOR_multicall xencomm_hypercall_multicall +#define HYPERVISOR_xen_version xencomm_hypercall_xen_version +#define HYPERVISOR_console_io xencomm_hypercall_console_io +#define HYPERVISOR_hvm_op xencomm_hypercall_hvm_op +#define HYPERVISOR_memory_op xencomm_hypercall_memory_op +#define HYPERVISOR_xenoprof_op xencomm_hypercall_xenoprof_op +#define HYPERVISOR_perfmon_op xencomm_hypercall_perfmon_op +#define HYPERVISOR_fpswa_revision xencomm_hypercall_fpswa_revision +#define HYPERVISOR_suspend xencomm_hypercall_suspend +#define HYPERVISOR_vcpu_op xencomm_hypercall_vcpu_op +#define HYPERVISOR_opt_feature xencomm_hypercall_opt_feature +#define HYPERVISOR_kexec_op xencomm_hypercall_kexec_op + +/* to compile gnttab_copy_grant_page() in drivers/xen/core/gnttab.c */ +#define HYPERVISOR_mmu_update(req, count, success_count, domid) ({ BUG(); 0; }) + +#endif /* _ASM_IA64_XEN_HYPERCALL_H */ -- 1.5.3
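To see how the pieces above fit together: each _hypercallN() macro pads the argument list out to five and passes the hypercall number as the sixth argument of __hypercall(). Under the ia64 calling convention that sixth argument arrives in r37, which the stub moves into r2 before break 0x1000 — r2 being where Xen's break handler expects the hypercall number (the xcom_asm.S suspend stub in the next patch loads r2 the same way). Hand-expanding one wrapper makes this concrete; example_sched_op is a hypothetical name, and the body is just what _hypercall2(int, sched_op, cmd, arg) reduces to:

	static inline int
	example_sched_op(int cmd, struct xencomm_handle *arg)
	{
		long __res;

		/* cmd/arg land in r32/r33, the zero padding in r34-r36,
		 * and __HYPERVISOR_sched_op in r37 -> r2 -> break 0x1000 */
		__res = __hypercall((unsigned long)cmd,
				    (unsigned long)arg,
				    0, 0, 0, __HYPERVISOR_sched_op);
		return (int)__res;
	}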
import ia64 specific part of xencomm which converts hypercall argument in virtual address into pseudo physical address (guest physical address). Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/xen/xcom_asm.S | 27 +++ arch/ia64/xen/xcom_hcall.c | 458 +++++++++++++++++++++++++++++++++++++ arch/ia64/xen/xencomm.c | 108 +++++++++ include/asm-ia64/xen/xcom_hcall.h | 55 +++++ include/asm-ia64/xen/xencomm.h | 33 +++ 5 files changed, 681 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/xen/xcom_asm.S create mode 100644 arch/ia64/xen/xcom_hcall.c create mode 100644 arch/ia64/xen/xencomm.c create mode 100644 include/asm-ia64/xen/xcom_hcall.h create mode 100644 include/asm-ia64/xen/xencomm.h diff --git a/arch/ia64/xen/xcom_asm.S b/arch/ia64/xen/xcom_asm.S new file mode 100644 index 0000000..8747908 --- /dev/null +++ b/arch/ia64/xen/xcom_asm.S @@ -0,0 +1,27 @@ +/* + * xencomm suspend support + * Support routines for Xen + * + * Copyright (C) 2005 Dan Magenheimer <dan.magenheimer at hp.com> + */ +#include <asm/asmmacro.h> +#include <xen/interface/xen.h> + +/* + * Stub for suspend. + * Just force the stacked registers to be written in memory. + */ +GLOBAL_ENTRY(xencomm_arch_hypercall_suspend) + ;; + alloc r20=ar.pfs,0,0,6,0 + mov r2=__HYPERVISOR_sched_op + ;; + /* We don't want to deal with RSE. */ + flushrs + mov r33=r32 + mov r32=2 // SCHEDOP_shutdown + ;; + break 0x1000 + ;; + br.ret.sptk.many b0 +END(xencomm_arch_hypercall_suspend) diff --git a/arch/ia64/xen/xcom_hcall.c b/arch/ia64/xen/xcom_hcall.c new file mode 100644 index 0000000..bfddbd7 --- /dev/null +++ b/arch/ia64/xen/xcom_hcall.c @@ -0,0 +1,458 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + * + * Tristan Gingold <tristan.gingold at bull.net> + * + * Copyright (c) 2007 + * Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * consolidate mini and inline version. + */ +#include <linux/types.h> +#include <linux/errno.h> +#include <linux/kernel.h> +#include <linux/gfp.h> +#include <linux/module.h> +#include <xen/interface/xen.h> +#include <xen/interface/memory.h> +#include <xen/interface/xencomm.h> +#include <xen/interface/version.h> +#include <xen/interface/sched.h> +#include <xen/interface/event_channel.h> +#include <xen/interface/physdev.h> +#include <xen/interface/grant_table.h> +#include <xen/interface/callback.h> +#include <xen/interface/vcpu.h> +#include <asm/xen/hypervisor.h> +#include <asm/page.h> +#include <asm/uaccess.h> +#include <asm/xen/xencomm.h> + +/* Xencomm notes: + * This file defines hypercalls to be used by xencomm. 
The hypercalls simply + * create inlines or mini descriptors for pointers and then call the raw arch + * hypercall xencomm_arch_hypercall_XXX + * + * If the arch wants to directly use these hypercalls, simply define macros + * in asm/xen/hypercall.h, eg: + * #define HYPERVISOR_sched_op xencomm_hypercall_sched_op + * + * The arch may also define HYPERVISOR_xxx as a function and do more operations + * before/after doing the hypercall. + * + * Note: because only inline or mini descriptors are created these functions + * must only be called with in kernel memory parameters. + */ + +int +xencomm_hypercall_console_io(int cmd, int count, char *str) +{ + return xencomm_arch_hypercall_console_io + (cmd, count, xencomm_map_no_alloc(str, count)); +} +EXPORT_SYMBOL_GPL(xencomm_hypercall_console_io); + +int +xencomm_hypercall_event_channel_op(int cmd, void *op) +{ + struct xencomm_handle *desc; + desc = xencomm_map_no_alloc(op, sizeof(struct evtchn_op)); + if (desc == NULL) + return -EINVAL; + + return xencomm_arch_hypercall_event_channel_op(cmd, desc); +} +EXPORT_SYMBOL_GPL(xencomm_hypercall_event_channel_op); + +int +xencomm_hypercall_xen_version(int cmd, void *arg) +{ + struct xencomm_handle *desc; + unsigned int argsize; + + switch (cmd) { + case XENVER_version: + /* do not actually pass an argument */ + return xencomm_arch_hypercall_xen_version(cmd, 0); + case XENVER_extraversion: + argsize = sizeof(struct xen_extraversion); + break; + case XENVER_compile_info: + argsize = sizeof(struct xen_compile_info); + break; + case XENVER_capabilities: + argsize = sizeof(struct xen_capabilities_info); + break; + case XENVER_changeset: + argsize = sizeof(struct xen_changeset_info); + break; + case XENVER_platform_parameters: + argsize = sizeof(struct xen_platform_parameters); + break; + case XENVER_get_features: + argsize = (arg == NULL) ? 
0 : sizeof(struct xen_feature_info); + break; + + default: + printk(KERN_DEBUG + "%s: unknown version op %d\n", __func__, cmd); + return -ENOSYS; + } + + desc = xencomm_map_no_alloc(arg, argsize); + if (desc == NULL) + return -EINVAL; + + return xencomm_arch_hypercall_xen_version(cmd, desc); +} +EXPORT_SYMBOL_GPL(xencomm_hypercall_xen_version); + +int +xencomm_hypercall_physdev_op(int cmd, void *op) +{ + unsigned int argsize; + + switch (cmd) { + case PHYSDEVOP_apic_read: + case PHYSDEVOP_apic_write: + argsize = sizeof(struct physdev_apic); + break; + case PHYSDEVOP_alloc_irq_vector: + case PHYSDEVOP_free_irq_vector: + argsize = sizeof(struct physdev_irq); + break; + case PHYSDEVOP_irq_status_query: + argsize = sizeof(struct physdev_irq_status_query); + break; + + default: + printk(KERN_DEBUG + "%s: unknown physdev op %d\n", __func__, cmd); + return -ENOSYS; + } + + return xencomm_arch_hypercall_physdev_op + (cmd, xencomm_map_no_alloc(op, argsize)); +} + +static int +xencommize_grant_table_op(struct xencomm_mini **xc_area, + unsigned int cmd, void *op, unsigned int count, + struct xencomm_handle **desc) +{ + struct xencomm_handle *desc1; + unsigned int argsize; + + switch (cmd) { + case GNTTABOP_map_grant_ref: + argsize = sizeof(struct gnttab_map_grant_ref); + break; + case GNTTABOP_unmap_grant_ref: + argsize = sizeof(struct gnttab_unmap_grant_ref); + break; + case GNTTABOP_setup_table: + { + struct gnttab_setup_table *setup = op; + + argsize = sizeof(*setup); + + if (count != 1) + return -EINVAL; + desc1 = __xencomm_map_no_alloc + (xen_guest_handle(setup->frame_list), + setup->nr_frames * + sizeof(*xen_guest_handle(setup->frame_list)), + *xc_area); + if (desc1 == NULL) + return -EINVAL; + (*xc_area)++; + set_xen_guest_handle(setup->frame_list, (void *)desc1); + break; + } + case GNTTABOP_dump_table: + argsize = sizeof(struct gnttab_dump_table); + break; + case GNTTABOP_transfer: + argsize = sizeof(struct gnttab_transfer); + break; + case GNTTABOP_copy: + argsize = sizeof(struct gnttab_copy); + break; + case GNTTABOP_query_size: + argsize = sizeof(struct gnttab_query_size); + break; + default: + printk(KERN_DEBUG "%s: unknown hypercall grant table op %d\n", + __func__, cmd); + BUG(); + } + + *desc = __xencomm_map_no_alloc(op, count * argsize, *xc_area); + if (*desc == NULL) + return -EINVAL; + (*xc_area)++; + + return 0; +} + +int +xencomm_hypercall_grant_table_op(unsigned int cmd, void *op, + unsigned int count) +{ + int rc; + struct xencomm_handle *desc; + XENCOMM_MINI_ALIGNED(xc_area, 2); + + rc = xencommize_grant_table_op(&xc_area, cmd, op, count, &desc); + if (rc) + return rc; + + return xencomm_arch_hypercall_grant_table_op(cmd, desc, count); +} +EXPORT_SYMBOL_GPL(xencomm_hypercall_grant_table_op); + +int +xencomm_hypercall_sched_op(int cmd, void *arg) +{ + struct xencomm_handle *desc; + unsigned int argsize; + + switch (cmd) { + case SCHEDOP_yield: + case SCHEDOP_block: + argsize = 0; + break; + case SCHEDOP_shutdown: + argsize = sizeof(struct sched_shutdown); + break; + case SCHEDOP_poll: + { + struct sched_poll *poll = arg; + struct xencomm_handle *ports; + + argsize = sizeof(struct sched_poll); + ports = xencomm_map_no_alloc(xen_guest_handle(poll->ports), + sizeof(*xen_guest_handle(poll->ports))); + + set_xen_guest_handle(poll->ports, (void *)ports); + break; + } + default: + printk(KERN_DEBUG "%s: unknown sched op %d\n", __func__, cmd); + return -ENOSYS; + } + + desc = xencomm_map_no_alloc(arg, argsize); + if (desc == NULL) + return -EINVAL; + + return 
xencomm_arch_hypercall_sched_op(cmd, desc); +} +EXPORT_SYMBOL_GPL(xencomm_hypercall_sched_op); + +int +xencomm_hypercall_multicall(void *call_list, int nr_calls) +{ + int rc; + int i; + struct multicall_entry *mce; + struct xencomm_handle *desc; + XENCOMM_MINI_ALIGNED(xc_area, nr_calls * 2); + + for (i = 0; i < nr_calls; i++) { + mce = (struct multicall_entry *)call_list + i; + + switch (mce->op) { + case __HYPERVISOR_update_va_mapping: + case __HYPERVISOR_mmu_update: + /* No-op on ia64. */ + break; + case __HYPERVISOR_grant_table_op: + rc = xencommize_grant_table_op + (&xc_area, + mce->args[0], (void *)mce->args[1], + mce->args[2], &desc); + if (rc) + return rc; + mce->args[1] = (unsigned long)desc; + break; + case __HYPERVISOR_memory_op: + default: + printk(KERN_DEBUG + "%s: unhandled multicall op entry op %lu\n", + __func__, mce->op); + return -ENOSYS; + } + } + + desc = xencomm_map_no_alloc(call_list, + nr_calls * sizeof(struct multicall_entry)); + if (desc == NULL) + return -EINVAL; + + return xencomm_arch_hypercall_multicall(desc, nr_calls); +} +EXPORT_SYMBOL_GPL(xencomm_hypercall_multicall); + +int +xencomm_hypercall_callback_op(int cmd, void *arg) +{ + unsigned int argsize; + switch (cmd) { + case CALLBACKOP_register: + argsize = sizeof(struct callback_register); + break; + case CALLBACKOP_unregister: + argsize = sizeof(struct callback_unregister); + break; + default: + printk(KERN_DEBUG + "%s: unknown callback op %d\n", __func__, cmd); + return -ENOSYS; + } + + return xencomm_arch_hypercall_callback_op + (cmd, xencomm_map_no_alloc(arg, argsize)); +} + +static int +xencommize_memory_reservation(struct xencomm_mini *xc_area, + struct xen_memory_reservation *mop) +{ + struct xencomm_handle *desc; + + desc = __xencomm_map_no_alloc(xen_guest_handle(mop->extent_start), + mop->nr_extents * + sizeof(*xen_guest_handle(mop->extent_start)), + xc_area); + if (desc == NULL) + return -EINVAL; + + set_xen_guest_handle(mop->extent_start, (void *)desc); + return 0; +} + +int +xencomm_hypercall_memory_op(unsigned int cmd, void *arg) +{ + GUEST_HANDLE(xen_pfn_t) extent_start_va[2] = {{NULL}, {NULL}}; + struct xen_memory_reservation *xmr = NULL; + int rc; + struct xencomm_handle *desc; + unsigned int argsize; + XENCOMM_MINI_ALIGNED(xc_area, 2); + + switch (cmd) { + case XENMEM_increase_reservation: + case XENMEM_decrease_reservation: + case XENMEM_populate_physmap: + xmr = (struct xen_memory_reservation *)arg; + set_xen_guest_handle(extent_start_va[0], + xen_guest_handle(xmr->extent_start)); + + argsize = sizeof(*xmr); + rc = xencommize_memory_reservation(xc_area, xmr); + if (rc) + return rc; + xc_area++; + break; + + case XENMEM_maximum_ram_page: + argsize = 0; + break; + + case XENMEM_add_to_physmap: + argsize = sizeof(struct xen_add_to_physmap); + break; + + default: + printk(KERN_DEBUG "%s: unknown memory op %d\n", __func__, cmd); + return -ENOSYS; + } + + desc = xencomm_map_no_alloc(arg, argsize); + if (desc == NULL) + return -EINVAL; + + rc = xencomm_arch_hypercall_memory_op(cmd, desc); + + switch (cmd) { + case XENMEM_increase_reservation: + case XENMEM_decrease_reservation: + case XENMEM_populate_physmap: + set_xen_guest_handle(xmr->extent_start, + xen_guest_handle(extent_start_va[0])); + break; + } + + return rc; +} +EXPORT_SYMBOL_GPL(xencomm_hypercall_memory_op); + +int +xencomm_hypercall_suspend(unsigned long srec) +{ + struct sched_shutdown arg; + + arg.reason = SHUTDOWN_suspend; + + return xencomm_arch_hypercall_suspend( + xencomm_map_no_alloc(&arg, sizeof(arg))); +} + +long 
+xencomm_hypercall_vcpu_op(int cmd, int cpu, void *arg) +{ + unsigned int argsize; + switch (cmd) { + case VCPUOP_register_runstate_memory_area: { + struct vcpu_register_runstate_memory_area *area + (struct vcpu_register_runstate_memory_area *)arg; + argsize = sizeof(*arg); + set_xen_guest_handle(area->addr.h, + (void *)xencomm_map_no_alloc(area->addr.v, + sizeof(area->addr.v))); + break; + } + + default: + printk(KERN_DEBUG "%s: unknown vcpu op %d\n", __func__, cmd); + return -ENOSYS; + } + + return xencomm_arch_hypercall_vcpu_op(cmd, cpu, + xencomm_map_no_alloc(arg, argsize)); +} + +long +xencomm_hypercall_opt_feature(void *arg) +{ + return xencomm_arch_hypercall_opt_feature( + xencomm_map_no_alloc(arg, + sizeof(struct xen_ia64_opt_feature))); +} + +int +xencomm_hypercall_fpswa_revision(unsigned int *revision) +{ + struct xencomm_handle *desc; + + desc = xencomm_map_no_alloc(revision, sizeof(*revision)); + if (desc == NULL) + return -EINVAL; + + return xencomm_arch_hypercall_fpswa_revision(desc); +} +EXPORT_SYMBOL_GPL(xencomm_hypercall_fpswa_revision); diff --git a/arch/ia64/xen/xencomm.c b/arch/ia64/xen/xencomm.c new file mode 100644 index 0000000..6e9da66 --- /dev/null +++ b/arch/ia64/xen/xencomm.c @@ -0,0 +1,108 @@ +/* + * Copyright (C) 2006 Hollis Blanchard <hollisb at us.ibm.com>, IBM Corporation + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include <linux/gfp.h> +#include <linux/mm.h> +#include <xen/interface/xen.h> +#include <asm/page.h> + +#ifdef HAVE_XEN_PLATFORM_COMPAT_H +#include <xen/platform-compat.h> +#endif + +#include <asm/xen/xencomm.h> + +static unsigned long kernel_start_pa; + +void +xencomm_initialize(void) +{ + kernel_start_pa = KERNEL_START - ia64_tpa(KERNEL_START); +} + +/* Translate virtual address to physical address. */ +unsigned long +xencomm_vtop(unsigned long vaddr) +{ +#ifndef CONFIG_VMX_GUEST + struct page *page; + struct vm_area_struct *vma; +#endif + + if (vaddr == 0) + return 0; + +#ifdef __ia64__ + if (REGION_NUMBER(vaddr) == 5) { + pgd_t *pgd; + pud_t *pud; + pmd_t *pmd; + pte_t *ptep; + + /* On ia64, TASK_SIZE refers to current. It is not initialized + during boot. + Furthermore the kernel is relocatable and __pa() doesn't + work on addresses. */ + if (vaddr >= KERNEL_START + && vaddr < (KERNEL_START + KERNEL_TR_PAGE_SIZE)) + return vaddr - kernel_start_pa; + + /* In kernel area -- virtually mapped. 
*/ + pgd = pgd_offset_k(vaddr); + if (pgd_none(*pgd) || pgd_bad(*pgd)) + return ~0UL; + + pud = pud_offset(pgd, vaddr); + if (pud_none(*pud) || pud_bad(*pud)) + return ~0UL; + + pmd = pmd_offset(pud, vaddr); + if (pmd_none(*pmd) || pmd_bad(*pmd)) + return ~0UL; + + ptep = pte_offset_kernel(pmd, vaddr); + if (!ptep) + return ~0UL; + + return (pte_val(*ptep) & _PFN_MASK) | (vaddr & ~PAGE_MASK); + } +#endif + + if (vaddr > TASK_SIZE) { + /* kernel address */ + return __pa(vaddr); + } + + +#ifdef CONFIG_VMX_GUEST + /* No privcmd within vmx guest. */ + return ~0UL; +#else + /* XXX double-check (lack of) locking */ + vma = find_extend_vma(current->mm, vaddr); + if (!vma) + return ~0UL; + + /* We assume the page is modified. */ + page = follow_page(vma, vaddr, FOLL_WRITE | FOLL_TOUCH); + if (!page) + return ~0UL; + + return (page_to_pfn(page) << PAGE_SHIFT) | (vaddr & ~PAGE_MASK); +#endif +} diff --git a/include/asm-ia64/xen/xcom_hcall.h b/include/asm-ia64/xen/xcom_hcall.h new file mode 100644 index 0000000..8b1f74e --- /dev/null +++ b/include/asm-ia64/xen/xcom_hcall.h @@ -0,0 +1,55 @@ +/* + * Copyright (C) 2006 Tristan Gingold <tristan.gingold at bull.net>, Bull SAS + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#ifndef _ASM_IA64_XEN_XCOM_HCALL_H +#define _ASM_IA64_XEN_XCOM_HCALL_H + +/* These function creates inline or mini descriptor for the parameters and + calls the corresponding xencomm_arch_hypercall_X. + Architectures should defines HYPERVISOR_xxx as xencomm_hypercall_xxx unless + they want to use their own wrapper. 
*/ +extern int xencomm_hypercall_console_io(int cmd, int count, char *str); + +extern int xencomm_hypercall_event_channel_op(int cmd, void *op); + +extern int xencomm_hypercall_xen_version(int cmd, void *arg); + +extern int xencomm_hypercall_physdev_op(int cmd, void *op); + +extern int xencomm_hypercall_grant_table_op(unsigned int cmd, void *op, + unsigned int count); + +extern int xencomm_hypercall_sched_op(int cmd, void *arg); + +extern int xencomm_hypercall_multicall(void *call_list, int nr_calls); + +extern int xencomm_hypercall_callback_op(int cmd, void *arg); + +extern int xencomm_hypercall_memory_op(unsigned int cmd, void *arg); + +extern unsigned long xencomm_hypercall_hvm_op(int cmd, void *arg); + +extern int xencomm_hypercall_suspend(unsigned long srec); + +extern long xencomm_hypercall_vcpu_op(int cmd, int cpu, void *arg); + +extern long xencomm_hypercall_opt_feature(void *arg); + +extern int xencomm_hypercall_kexec_op(int cmd, void *arg); + +#endif /* _ASM_IA64_XEN_XCOM_HCALL_H */ diff --git a/include/asm-ia64/xen/xencomm.h b/include/asm-ia64/xen/xencomm.h new file mode 100644 index 0000000..e95db51 --- /dev/null +++ b/include/asm-ia64/xen/xencomm.h @@ -0,0 +1,33 @@ +/* + * Copyright (C) 2006 Hollis Blanchard <hollisb at us.ibm.com>, IBM Corporation + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#ifndef _ASM_IA64_XEN_XENCOMM_H +#define _ASM_IA64_XEN_XENCOMM_H + +#define is_kernel_addr(x) \ + ((PAGE_OFFSET <= (x) && \ + (x) < (PAGE_OFFSET + (1UL << IA64_MAX_PHYS_BITS))) || \ + (KERNEL_START <= (x) && \ + (x) < KERNEL_START + KERNEL_TR_PAGE_SIZE)) + +/* Must be called before any hypercall. */ +extern void xencomm_initialize(void); + +#include <xen/xencomm.h> + +#endif /* _ASM_IA64_XEN_XENCOMM_H */ -- 1.5.3
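The arch code above only translates addresses; turning a translated address into a handle that the hypervisor accepts is the job of the generic xencomm layer (drivers/xen/xencomm.c, added elsewhere in this series). As a rough sketch of the "inline" fast path — example_map_inline() is hypothetical, it assumes the XENCOMM_INLINE_FLAG marker from xen/interface/xencomm.h and a physically contiguous buffer, and the real code falls back to mini descriptors of address/length extents when contiguity doesn't hold:

	static inline struct xencomm_handle *
	example_map_inline(void *ptr)
	{
		unsigned long paddr;

		if (ptr == NULL)
			return NULL;

		paddr = xencomm_vtop((unsigned long)ptr);
		if (paddr == ~0UL)
			return NULL;	/* not translatable */

		/* the flag bit tells the hypervisor this handle is a raw
		 * guest-physical address, not a descriptor of extents */
		return (struct xencomm_handle *)(paddr | XENCOMM_INLINE_FLAG);
	}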
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 43/50] ia64/xen: define xen_alloc_vm_area()/xen_free_vm_area() for ia64 arch.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/xen/util.c | 101 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 101 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/xen/util.c diff --git a/arch/ia64/xen/util.c b/arch/ia64/xen/util.c new file mode 100644 index 0000000..242a1a4 --- /dev/null +++ b/arch/ia64/xen/util.c @@ -0,0 +1,101 @@ +/****************************************************************************** + * arch/ia64/xen/util.c + * This file is the ia64 counterpart of drivers/xen/util.c + * + * Copyright (c) 2006 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include <linux/mm.h> +#include <linux/module.h> +#include <linux/slab.h> +#include <linux/vmalloc.h> +#include <asm/uaccess.h> +#include <xen/interface/memory.h> +#include <asm/xen/hypercall.h> + +struct vm_struct *xen_alloc_vm_area(unsigned long size) +{ + int order; + unsigned long virt; + unsigned long nr_pages; + struct vm_struct *area; + + order = get_order(size); + virt = __get_free_pages(GFP_KERNEL, order); + if (virt == 0) + goto err0; + nr_pages = 1 << order; + scrub_pages(virt, nr_pages); + + area = kmalloc(sizeof(*area), GFP_KERNEL); + if (area == NULL) + goto err1; + + area->flags = VM_IOREMAP; + area->addr = (void *)virt; + area->size = size; + area->pages = NULL; + area->nr_pages = nr_pages; + area->phys_addr = 0; /* xenbus_map_ring_valloc uses this field! */ + + return area; + +err1: + free_pages(virt, order); +err0: + return NULL; +} +EXPORT_SYMBOL_GPL(xen_alloc_vm_area); + +void xen_free_vm_area(struct vm_struct *area) +{ + unsigned int order = get_order(area->size); + unsigned long i; + unsigned long phys_addr = __pa(area->addr); + + /* This area is used for foreign page mappping. + * So underlying machine page may not be assigned. */ + for (i = 0; i < (1 << order); i++) { + unsigned long ret; + unsigned long gpfn = (phys_addr >> PAGE_SHIFT) + i; + struct xen_memory_reservation reservation = { + .nr_extents = 1, + .address_bits = 0, + .extent_order = 0, + .domid = DOMID_SELF + }; + set_xen_guest_handle(reservation.extent_start, &gpfn); + ret = HYPERVISOR_memory_op(XENMEM_populate_physmap, + &reservation); + BUG_ON(ret != 1); + } + free_pages((unsigned long)area->addr, order); + kfree(area); +} +EXPORT_SYMBOL_GPL(xen_free_vm_area); + +/* + * Local variables: + * c-file-style: "linux" + * indent-tabs-mode: t + * c-indent-level: 8 + * c-basic-offset: 8 + * tab-width: 8 + * End: + */ -- 1.5.3
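For context on how these two helpers pair up: a caller grabs an area whose backing frames can be handed over for foreign mappings, and xen_free_vm_area() later repopulates the underlying p2m entries via XENMEM_populate_physmap before returning the pages, since the machine pages may have been replaced by the foreign mapping. A hypothetical caller in the style of xenbus_map_ring_valloc() (which relies on the phys_addr field noted in the code above) might look like this:

	static int
	example_ring_area(void)
	{
		struct vm_struct *area;

		area = xen_alloc_vm_area(PAGE_SIZE);
		if (area == NULL)
			return -ENOMEM;

		/* a real caller would now issue GNTTABOP_map_grant_ref
		 * against area->addr to pull in a foreign ring page */

		xen_free_vm_area(area);	/* re-assigns machine pages, frees */
		return 0;
	}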
Isaku Yamahata
2008-Mar-05 18:18 UTC
[PATCH 44/50] ia64/xen: basic helper routines for xen/ia64.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 arch/ia64/xen/hypervisor.c        |  235 +++++++++++++++++++++++++++++++++++++
 include/asm-ia64/xen/hypervisor.h |  194 ++++++++++++++++++++++++++++++
 2 files changed, 429 insertions(+), 0 deletions(-)
 create mode 100644 arch/ia64/xen/hypervisor.c

diff --git a/arch/ia64/xen/hypervisor.c b/arch/ia64/xen/hypervisor.c
new file mode 100644
index 0000000..cb4b27f
--- /dev/null
+++ b/arch/ia64/xen/hypervisor.c
@@ -0,0 +1,235 @@
+/******************************************************************************
+ * arch/ia64/xen/hypervisor.c
+ *
+ * Copyright (c) 2006 Isaku Yamahata <yamahata at valinux co jp>
+ *                    VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ */
+
+#include <linux/spinlock.h>
+#include <linux/bootmem.h>
+#include <linux/module.h>
+#include <linux/vmalloc.h>
+#include <linux/efi.h>
+#include <asm/page.h>
+#include <asm/pgalloc.h>
+#include <asm/meminit.h>
+#include <asm/xen/hypervisor.h>
+#include <asm/xen/hypercall.h>
+#include <xen/interface/memory.h>
+
+#include "irq_xen.h"
+
+struct shared_info *HYPERVISOR_shared_info __read_mostly =
+	(struct shared_info *)XSI_BASE;
+EXPORT_SYMBOL(HYPERVISOR_shared_info);
+
+DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
+#ifdef notyet
+DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info);
+#endif
+
+struct start_info *xen_start_info;
+EXPORT_SYMBOL(xen_start_info);
+
+EXPORT_SYMBOL(running_on_xen);
+
+EXPORT_SYMBOL(__hypercall);
+
+/* Stolen from arch/x86/xen/enlighten.c */
+/*
+ * Flag to determine whether vcpu info placement is available on all
+ * VCPUs.  We assume it is to start with, and then set it to zero on
+ * the first failure.  This is because it can succeed on some VCPUs
+ * and not others, since it can involve hypervisor memory allocation,
+ * or because the guest failed to guarantee all the appropriate
+ * constraints on all VCPUs (ie buffer can't cross a page boundary).
+ *
+ * Note that any particular CPU may be using a placed vcpu structure,
+ * but we can only optimise if they all are.
+ * + * 0: not available, 1: available + */ +#ifdef notyet +static int have_vcpu_info_placement; +#endif + +static void __init xen_vcpu_setup(int cpu) +{ +/* on Xen/IA64 VCPUOP_register_vcpu_info isn't supported */ +#ifdef notyet + struct vcpu_register_vcpu_info info; + int err; + struct vcpu_info *vcpup; +#endif + + /* + * WARNING: + * before changing MAX_VIRT_CPUS, + * check that shared_info fits on a page + */ + BUILD_BUG_ON(sizeof(struct shared_info) > PAGE_SIZE); + per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu]; + +#ifdef notyet + if (!have_vcpu_info_placement) + return; /* already tested, not available */ + + vcpup = &per_cpu(xen_vcpu_info, cpu); + + info.mfn = virt_to_mfn(vcpup); + info.offset = offset_in_page(vcpup); + + printk(KERN_DEBUG + "trying to map vcpu_info %d at %p, mfn %llx, offset %d\n", + cpu, vcpup, info.mfn, info.offset); + + /* Check to see if the hypervisor will put the vcpu_info + structure where we want it, which allows direct access via + a percpu-variable. */ + err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info); + + if (err) { + printk(KERN_DEBUG "register_vcpu_info failed: err=%d\n", err); + have_vcpu_info_placement = 0; + } else { + /* This cpu is using the registered vcpu info, even if + later ones fail to. */ + per_cpu(xen_vcpu, cpu) = vcpup; + + printk(KERN_DEBUG "cpu %d using vcpu_info at %p\n", + cpu, vcpup); + } +#endif +} + +void __init xen_setup_vcpu_info_placement(void) +{ + int cpu; + + for_each_possible_cpu(cpu) + xen_vcpu_setup(cpu); +} + +void __init +xen_setup(char **cmdline_p) +{ + extern void dig_setup(char **cmdline_p); + + if (ia64_platform_is("xen")) + dig_setup(cmdline_p); +} + +void __cpuinit +xen_cpu_init(void) +{ + xen_smp_intr_init(); +} + +/**************************************************************************** + * grant table hack + * cmd: GNTTABOP_xxx + */ + +#include <linux/mm.h> +#include <xen/interface/xen.h> +#include <xen/grant_table.h> + +int arch_gnttab_map_shared(unsigned long *frames, unsigned long nr_gframes, + unsigned long max_nr_gframes, + struct grant_entry **__shared) +{ + *__shared = __va(frames[0] << PAGE_SHIFT); + return 0; +} + +void arch_gnttab_unmap_shared(struct grant_entry *shared, + unsigned long nr_gframes) +{ + /* nothing */ +} + +static void +gnttab_map_grant_ref_pre(struct gnttab_map_grant_ref *uop) +{ + uint32_t flags; + + flags = uop->flags; + + if (flags & GNTMAP_host_map) { + if (flags & GNTMAP_application_map) { + printk(KERN_DEBUG + "GNTMAP_application_map is not supported yet: " + "flags 0x%x\n", flags); + BUG(); + } + if (flags & GNTMAP_contains_pte) { + printk(KERN_DEBUG + "GNTMAP_contains_pte is not supported yet: " + "flags 0x%x\n", flags); + BUG(); + } + } else if (flags & GNTMAP_device_map) { + printk("GNTMAP_device_map is not supported yet 0x%x\n", flags); + BUG(); /* XXX not yet. actually this flag is not used. 
*/ + } else { + BUG(); + } +} + +int +HYPERVISOR_grant_table_op(unsigned int cmd, void *uop, unsigned int count) +{ + if (cmd == GNTTABOP_map_grant_ref) { + unsigned int i; + for (i = 0; i < count; i++) { + gnttab_map_grant_ref_pre( + (struct gnttab_map_grant_ref *)uop + i); + } + } + return xencomm_hypercall_grant_table_op(cmd, uop, count); +} +EXPORT_SYMBOL(HYPERVISOR_grant_table_op); + +/************************************************************************** + * opt feature + */ +void +xen_ia64_enable_opt_feature(void) +{ + /* Enable region 7 identity map optimizations in Xen */ + struct xen_ia64_opt_feature optf; + + optf.cmd = XEN_IA64_OPTF_IDENT_MAP_REG7; + optf.on = XEN_IA64_OPTF_ON; + optf.pgprot = pgprot_val(PAGE_KERNEL); + optf.key = 0; /* No key on linux. */ + HYPERVISOR_opt_feature(&optf); +} + +/************************************************************************** + * suspend/resume + */ +void +xen_post_suspend(int suspend_cancelled) +{ + if (suspend_cancelled) + return; + + xen_ia64_enable_opt_feature(); + /* add more if necessary */ +} diff --git a/include/asm-ia64/xen/hypervisor.h b/include/asm-ia64/xen/hypervisor.h index 78c5635..3c93109 100644 --- a/include/asm-ia64/xen/hypervisor.h +++ b/include/asm-ia64/xen/hypervisor.h @@ -42,9 +42,203 @@ extern const int running_on_xen; # define is_running_on_xen() (1) # else /* CONFIG_VMX_GUEST */ # define is_running_on_xen() (0) +# define HYPERVISOR_ioremap(offset, size) (offset) # endif /* CONFIG_VMX_GUEST */ #endif /* CONFIG_XEN */ +#if defined(CONFIG_XEN) || defined(CONFIG_VMX_GUEST) +#include <linux/types.h> +#include <linux/kernel.h> +#include <linux/version.h> +#include <linux/errno.h> +#include <linux/init.h> +#include <xen/interface/xen.h> +#include <xen/interface/version.h> /* to compile feature.c */ +#include <xen/interface/event_channel.h> +#include <xen/interface/physdev.h> +#include <xen/interface/sched.h> +#include <asm/ptrace.h> +#include <asm/page.h> +#include <asm/percpu.h> +#ifdef CONFIG_XEN +#include <asm/xen/hypercall.h> +#endif + +extern struct shared_info *HYPERVISOR_shared_info; +extern struct start_info *xen_start_info; + +DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu); +void __init xen_setup_vcpu_info_placement(void); +void force_evtchn_callback(void); + +struct vm_struct *xen_alloc_vm_area(unsigned long size); +void xen_free_vm_area(struct vm_struct *area); + +/* Turn jiffies into Xen system time. XXX Implement me. 
*/ +#define jiffies_to_st(j) 0 + +static inline int +HYPERVISOR_yield( + void) +{ + int rc = HYPERVISOR_sched_op(SCHEDOP_yield, NULL); + + return rc; +} + +static inline int +HYPERVISOR_block( + void) +{ + int rc = HYPERVISOR_sched_op(SCHEDOP_block, NULL); + + return rc; +} + +static inline int +HYPERVISOR_shutdown( + unsigned int reason) +{ + struct sched_shutdown sched_shutdown = { + .reason = reason + }; + + int rc = HYPERVISOR_sched_op(SCHEDOP_shutdown, &sched_shutdown); + + return rc; +} + +static inline int +HYPERVISOR_poll( + evtchn_port_t *ports, unsigned int nr_ports, u64 timeout) +{ + struct sched_poll sched_poll = { + .nr_ports = nr_ports, + .timeout = jiffies_to_st(timeout) + }; + + int rc; + + set_xen_guest_handle(sched_poll.ports, ports); + rc = HYPERVISOR_sched_op(SCHEDOP_poll, &sched_poll); + + return rc; +} + +#ifndef CONFIG_VMX_GUEST +/* for drivers/xen/privcmd/privcmd.c */ +#define machine_to_phys_mapping 0 +struct vm_area_struct; +int direct_remap_pfn_range(struct vm_area_struct *vma, + unsigned long address, + unsigned long mfn, + unsigned long size, + pgprot_t prot, + domid_t domid); +struct file; +int privcmd_enforce_singleshot_mapping(struct vm_area_struct *vma); +int privcmd_mmap(struct file *file, struct vm_area_struct *vma); +#define HAVE_ARCH_PRIVCMD_MMAP + +/* for drivers/xen/balloon/balloon.c */ +#ifdef CONFIG_XEN_SCRUB_PAGES +#define scrub_pages(_p, _n) memset((void *)(_p), 0, (_n) << PAGE_SHIFT) +#else +#define scrub_pages(_p, _n) ((void)0) +#endif +#define pte_mfn(_x) pte_pfn(_x) +#define phys_to_machine_mapping_valid(_x) (1) + +void xen_contiguous_bitmap_init(unsigned long end_pfn); +int __xen_create_contiguous_region(unsigned long vstart, unsigned int order, + unsigned int address_bits); +static inline int +xen_create_contiguous_region(unsigned long vstart, + unsigned int order, unsigned int address_bits) +{ + int ret = 0; + if (is_running_on_xen()) { + ret = __xen_create_contiguous_region(vstart, order, + address_bits); + } + return ret; +} + +void __xen_destroy_contiguous_region(unsigned long vstart, unsigned int order); +static inline void +xen_destroy_contiguous_region(unsigned long vstart, unsigned int order) +{ + if (is_running_on_xen()) + __xen_destroy_contiguous_region(vstart, order); +} + +struct page; + +int xen_limit_pages_to_max_mfn(struct page *pages, unsigned int order, + unsigned int address_bits); + +/* For drivers/xen/core/machine_reboot.c */ +#define HAVE_XEN_POST_SUSPEND +void xen_post_suspend(int suspend_cancelled); + +/* For setup_arch() in arch/ia64/kernel/setup.c */ +void xen_ia64_enable_opt_feature(void); +#endif /* !CONFIG_VMX_GUEST */ + +#define __pte_ma(_x) ((pte_t) {(_x)}) /* unmodified use */ +#define mfn_pte(_x, _y) __pte_ma(0) /* unmodified use */ + +/* for netfront.c, netback.c */ +#define MULTI_UVMFLAGS_INDEX 0 /* XXX any value */ + +static inline void +MULTI_update_va_mapping( + struct multicall_entry *mcl, unsigned long va, + pte_t new_val, unsigned long flags) +{ + mcl->op = __HYPERVISOR_update_va_mapping; + mcl->result = 0; +} + +static inline void +MULTI_grant_table_op(struct multicall_entry *mcl, unsigned int cmd, + void *uop, unsigned int count) +{ + mcl->op = __HYPERVISOR_grant_table_op; + mcl->args[0] = cmd; + mcl->args[1] = (unsigned long)uop; + mcl->args[2] = count; +} + +static inline void +MULTI_mmu_update(struct multicall_entry *mcl, struct mmu_update *req, + int count, int *success_count, domid_t domid) +{ + mcl->op = __HYPERVISOR_mmu_update; + mcl->args[0] = (unsigned long)req; + mcl->args[1] = count; + 
mcl->args[2] = (unsigned long)success_count; + mcl->args[3] = domid; +} + +/* + * for blktap.c + * int create_lookup_pte_addr(struct mm_struct *mm, + * unsigned long address, + * uint64_t *ptep); + */ +#define create_lookup_pte_addr(mm, address, ptep) \ + ({ \ + printk(KERN_EMERG \ + "%s:%d " \ + "create_lookup_pte_addr() isn't supported.\n", \ + __func__, __LINE__); \ + BUG(); \ + (-ENOSYS); \ + }) + +#endif /* CONFIG_XEN || CONFIG_VMX_GUEST */ + #ifdef CONFIG_XEN_PRIVILEGED_GUEST #define is_initial_xendomain() \ (is_running_on_xen() ? xen_start_info->flags & SIF_INITDOMAIN : 0) -- 1.5.3
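The sched_op wrappers above are deliberately thin. A rough sketch of idling until an event channel fires; 'port' is a hypothetical, already-bound event channel, and because jiffies_to_st() is still a stub in this series only a timeout of 0 (wait indefinitely) is meaningful:

#include <asm/xen/hypervisor.h>

/* Block the calling vcpu until 'port' is signalled; fall back to a
 * plain yield if the poll hypercall itself fails. */
static void example_wait_for_port(evtchn_port_t port)
{
	if (HYPERVISOR_poll(&port, 1, 0 /* no timeout */) != 0)
		HYPERVISOR_yield();
}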
Isaku Yamahata
2008-Mar-05 18:19 UTC
[PATCH 45/50] ia64/xen: domU xen machine vector without dma api.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/acpi.c | 4 ++++ arch/ia64/xen/machvec.c | 4 ++++ include/asm-ia64/machvec.h | 2 ++ include/asm-ia64/machvec_xen.h | 22 ++++++++++++++++++++++ 4 files changed, 32 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/xen/machvec.c create mode 100644 include/asm-ia64/machvec_xen.h diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c index 78f28d8..adf475a 100644 --- a/arch/ia64/kernel/acpi.c +++ b/arch/ia64/kernel/acpi.c @@ -118,6 +118,8 @@ acpi_get_sysname(void) return "hpzx1"; } else if (!strcmp(hdr->oem_id, "SGI")) { return "sn2"; + } else if (is_running_on_xen() && !strcmp(hdr->oem_id, "XEN")) { + return "xen"; } return "dig"; @@ -132,6 +134,8 @@ acpi_get_sysname(void) return "sn2"; # elif defined (CONFIG_IA64_DIG) return "dig"; +# elif defined (CONFIG_IA64_XEN) + return "xen"; # else # error Unknown platform. Fix acpi.c. # endif diff --git a/arch/ia64/xen/machvec.c b/arch/ia64/xen/machvec.c new file mode 100644 index 0000000..4ad588a --- /dev/null +++ b/arch/ia64/xen/machvec.c @@ -0,0 +1,4 @@ +#define MACHVEC_PLATFORM_NAME xen +#define MACHVEC_PLATFORM_HEADER <asm/machvec_xen.h> +#include <asm/machvec_init.h> + diff --git a/include/asm-ia64/machvec.h b/include/asm-ia64/machvec.h index c201a20..cea8d63 100644 --- a/include/asm-ia64/machvec.h +++ b/include/asm-ia64/machvec.h @@ -120,6 +120,8 @@ extern void machvec_tlb_migrate_finish (struct mm_struct *); # include <asm/machvec_hpzx1_swiotlb.h> # elif defined (CONFIG_IA64_SGI_SN2) # include <asm/machvec_sn2.h> +# elif defined (CONFIG_IA64_XEN) +# include <asm/machvec_xen.h> # elif defined (CONFIG_IA64_GENERIC) # ifdef MACHVEC_PLATFORM_HEADER diff --git a/include/asm-ia64/machvec_xen.h b/include/asm-ia64/machvec_xen.h new file mode 100644 index 0000000..ed0f84d --- /dev/null +++ b/include/asm-ia64/machvec_xen.h @@ -0,0 +1,22 @@ +#ifndef _ASM_IA64_MACHVEC_XEN_h +#define _ASM_IA64_MACHVEC_XEN_h + +extern ia64_mv_setup_t xen_setup; +extern ia64_mv_cpu_init_t xen_cpu_init; +extern ia64_mv_irq_init_t xen_irq_init; +extern ia64_mv_send_ipi_t xen_platform_send_ipi; + +/* + * This stuff has dual use! + * + * For a generic kernel, the macros are used to initialize the + * platform's machvec structure. When compiling a non-generic kernel, + * the macros are used directly. + */ +#define platform_name "xen" +#define platform_setup xen_setup +#define platform_cpu_init xen_cpu_init +#define platform_irq_init xen_irq_init +#define platform_send_ipi xen_platform_send_ipi + +#endif /* _ASM_IA64_MACHVEC_XEN_h */ -- 1.5.3
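The dual-use comment in machvec_xen.h is easier to see in a self-contained toy model of the pattern (purely illustrative, not kernel code): the same macro name either binds the call at compile time or routes it through a runtime-selected vector, which is what a generic kernel does after acpi_get_sysname() returns "xen".

#include <stdio.h>

struct machine_vector {
	const char *name;
	void (*setup)(char **cmdline);
};

static void xen_setup_stub(char **cmdline)
{
	puts("xen platform setup");
}

#ifdef GENERIC_BUILD
/* Generic kernel: the macros populate a vector and calls dispatch
 * through it at run time. */
static struct machine_vector ia64_mv = { "xen", xen_setup_stub };
#define platform_setup	(*ia64_mv.setup)
#else
/* Platform-specific kernel: the macro binds the call directly. */
#define platform_setup	xen_setup_stub
#endif

int main(void)
{
	char *cmdline = NULL;

	platform_setup(&cmdline);
	return 0;
}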
Isaku Yamahata
2008-Mar-05 18:19 UTC
[PATCH 46/50] ia64/xen: define xen related address conversion helper functions for domU
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- include/asm-ia64/page.h | 8 ++++++++ include/asm-ia64/xen/page.h | 41 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 49 insertions(+), 0 deletions(-) create mode 100644 include/asm-ia64/xen/page.h diff --git a/include/asm-ia64/page.h b/include/asm-ia64/page.h index 4999a6c..5508dc2 100644 --- a/include/asm-ia64/page.h +++ b/include/asm-ia64/page.h @@ -227,4 +227,12 @@ get_order (unsigned long size) (((current->personality & READ_IMPLIES_EXEC) != 0) \ ? VM_EXEC : 0)) +/* + * XXX: to compile + * after pv_ops'fication of xen paravirtualization, this should be removed. + */ +#if !defined(__ASSEMBLY__) && defined(CONFIG_XEN) +#include <asm/xen/page.h> +#endif /* !__ASSEMBLY__ && CONFIG_XEN */ + #endif /* _ASM_IA64_PAGE_H */ diff --git a/include/asm-ia64/xen/page.h b/include/asm-ia64/xen/page.h new file mode 100644 index 0000000..c562036 --- /dev/null +++ b/include/asm-ia64/xen/page.h @@ -0,0 +1,41 @@ +#ifndef _ASM_IA64_XEN_PAGE_H +#define _ASM_IA64_XEN_PAGE_H + +#include <linux/kernel.h> +#include <asm/xen/hypervisor.h> +#include <asm/xen/hypercall.h> +#include <xen/features.h> +#include <xen/interface/xen.h> + +static inline unsigned long mfn_to_pfn(unsigned long mfn) +{ + return mfn; +} + +static inline unsigned long pfn_to_mfn(unsigned long pfn) +{ + return pfn; +} + +static inline void *mfn_to_virt(unsigned long mfn) +{ + return __va(mfn << PAGE_SHIFT); +} + +static inline unsigned long virt_to_mfn(void *virt) +{ + return __pa(virt) >> PAGE_SHIFT; +} + +/* for tpmfront.c */ +static inline unsigned long virt_to_machine(void *virt) +{ + return __pa(virt); +} + +static inline void set_phys_to_machine(unsigned long pfn, unsigned long mfn) +{ + /* nothing */ +} + +#endif /* _ASM_IA64_XEN_PAGE_H */ -- 1.5.3
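Xen/ia64 guests run auto-translated, with the hypervisor maintaining the physical-to-machine table on the guest's behalf, so pseudo-physical and machine frame numbers coincide; that is why every conversion above collapses to an identity and set_phys_to_machine() is a no-op. A quick sketch of the resulting invariant (illustrative only; assumes 'p' is a directly mapped kernel address):

#include <asm/xen/page.h>

/* virt -> mfn -> virt is just the ordinary virt/phys round trip on
 * ia64; on x86 paravirt the same helpers consult a p2m table. */
static int example_roundtrip(void *p)
{
	return mfn_to_virt(virt_to_mfn(p)) == p;
}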
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 arch/ia64/xen/Makefile     |    2 +
 arch/ia64/xen/xen_pv_ops.c |   69 ++++++++++++++++++++++++++++++++++++++++++++
 arch/ia64/xen/xensetup.S   |   10 ++++++
 3 files changed, 81 insertions(+), 0 deletions(-)
 create mode 100644 arch/ia64/xen/xen_pv_ops.c

diff --git a/arch/ia64/xen/Makefile b/arch/ia64/xen/Makefile
index c219358..4b1db56 100644
--- a/arch/ia64/xen/Makefile
+++ b/arch/ia64/xen/Makefile
@@ -2,6 +2,8 @@
 # Makefile for Xen components
 #
 
+obj-y := xen_pv_ops.o
+
 obj-$(CONFIG_PARAVIRT_ALT) += paravirt_xen.o privops_asm.o privops_c.o
 obj-$(CONFIG_PARAVIRT_NOP_B_PATCH) += paravirt_xen.o
 obj-$(CONFIG_PARAVIRT_ENTRY) += paravirt_xen.o
diff --git a/arch/ia64/xen/xen_pv_ops.c b/arch/ia64/xen/xen_pv_ops.c
new file mode 100644
index 0000000..18aa2f6
--- /dev/null
+++ b/arch/ia64/xen/xen_pv_ops.c
@@ -0,0 +1,69 @@
+/******************************************************************************
+ * arch/ia64/xen/xen_pv_ops.c
+ *
+ * Copyright (c) 2006 Isaku Yamahata <yamahata at valinux co jp>
+ *                    VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ */
+
+#include <linux/console.h>
+#include <linux/kernel.h>
+#include <linux/notifier.h>
+#include <linux/pm.h>
+#include <linux/sched.h>
+#include <linux/string.h>
+
+#include <asm/paravirt.h>
+#include <asm/unwind.h>
+
+#include <xen/features.h>
+#include <asm/xen/hypervisor.h>
+#include <asm/xen/xencomm.h>
+
+/***************************************************************************
+ * general info
+ */
+static struct pv_info xen_info __initdata = {
+	.kernel_rpl = 2,	/* or 1: determined at runtime */
+	.paravirt_enabled = 1,
+	.name = "Xen/ia64",
+};
+
+#define IA64_RSC_PL_SHIFT	2
+#define IA64_RSC_PL_BIT_SIZE	2
+#define IA64_RSC_PL_MASK \
+	(((1UL << IA64_RSC_PL_BIT_SIZE) - 1) << IA64_RSC_PL_SHIFT)
+
+static void __init
+xen_info_init(void)
+{
+	/* Xenified Linux/ia64 may run on pl = 1 or 2.
+	 * Determine it at run time. */
+	unsigned long rsc = ia64_getreg(_IA64_REG_AR_RSC);
+	unsigned int rpl = (rsc & IA64_RSC_PL_MASK) >> IA64_RSC_PL_SHIFT;
+	xen_info.kernel_rpl = rpl;
+}
+
+/***************************************************************************
+ * pv_ops initialization
+ */
+
+void __init
+xen_setup_pv_ops(void)
+{
+	xen_info_init();
+	pv_info = xen_info;
+}
diff --git a/arch/ia64/xen/xensetup.S b/arch/ia64/xen/xensetup.S
index 2d3d5d4..cb3432b 100644
--- a/arch/ia64/xen/xensetup.S
+++ b/arch/ia64/xen/xensetup.S
@@ -35,6 +35,16 @@ GLOBAL_ENTRY(early_xen_setup)
 (isBP)	movl r28=XSI_BASE;;
 (isBP)	break 0x1000;;
 
+#ifdef CONFIG_PARAVIRT_GUEST
+	/* set up pv_ops */
+(isBP)	mov r4=rp
+	;;
+(isBP)	br.call.sptk.many rp=xen_setup_pv_ops
+	;;
+(isBP)	mov rp=r4
+	;;
+#endif
+
 #ifdef CONFIG_PARAVIRT
 	/* patch privops */
 (isBP)	mov r4=rp
-- 1.5.3
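The pl extraction in xen_info_init() is plain bit-field arithmetic: RSC.pl is a two-bit field starting at bit 2, so the mask is 0xc. A standalone sketch of the same computation with fabricated ar.rsc values (assumption: only the pl field is populated):

#include <assert.h>

#define IA64_RSC_PL_SHIFT	2
#define IA64_RSC_PL_BIT_SIZE	2
#define IA64_RSC_PL_MASK \
	(((1UL << IA64_RSC_PL_BIT_SIZE) - 1) << IA64_RSC_PL_SHIFT)

int main(void)
{
	unsigned long rsc;

	rsc = 2UL << IA64_RSC_PL_SHIFT;		/* fabricated ar.rsc, pl = 2 */
	assert(((rsc & IA64_RSC_PL_MASK) >> IA64_RSC_PL_SHIFT) == 2);

	rsc = 1UL << IA64_RSC_PL_SHIFT;		/* pl = 1 */
	assert(((rsc & IA64_RSC_PL_MASK) >> IA64_RSC_PL_SHIFT) == 1);
	return 0;
}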
Isaku Yamahata
2008-Mar-05 18:19 UTC
[PATCH 48/50] ia64/pv_ops/xen: define xen pv_init_ops.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/xen/xen_pv_ops.c | 194 ++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 194 insertions(+), 0 deletions(-) diff --git a/arch/ia64/xen/xen_pv_ops.c b/arch/ia64/xen/xen_pv_ops.c index 18aa2f6..a2a7493 100644 --- a/arch/ia64/xen/xen_pv_ops.c +++ b/arch/ia64/xen/xen_pv_ops.c @@ -58,6 +58,199 @@ xen_info_init(void) } /*************************************************************************** + * pv_init_ops + * initialization hooks. + */ + +static void +xen_panic_hypercall(struct unw_frame_info *info, void *arg) +{ + current->thread.ksp = (__u64)info->sw - 16; + HYPERVISOR_shutdown(SHUTDOWN_crash); + /* we're never actually going to get here... */ +} + +static int +xen_panic_event(struct notifier_block *this, unsigned long event, void *ptr) +{ + unw_init_running(xen_panic_hypercall, NULL); + /* we're never actually going to get here... */ + return NOTIFY_DONE; +} + +static struct notifier_block xen_panic_block = { + xen_panic_event, NULL, 0 /* try to go last */ +}; + +static void xen_pm_power_off(void) +{ + local_irq_disable(); + HYPERVISOR_shutdown(SHUTDOWN_poweroff); +} + +static void __init +xen_banner(void) +{ + printk(KERN_INFO + "Running on Xen! pl = %d start_info_pfn=0x%lx nr_pages=%ld " + "flags=0x%x\n", + xen_info.kernel_rpl, + HYPERVISOR_shared_info->arch.start_info_pfn, + xen_start_info->nr_pages, xen_start_info->flags); +} + +static int __init +xen_reserve_memory(struct rsvd_region *region) +{ + region->start = (unsigned long)__va((HYPERVISOR_shared_info->arch.start_info_pfn << PAGE_SHIFT)); + region->end = region->start + PAGE_SIZE; + return 1; +} + +static void __init +xen_arch_setup_early(void) +{ + struct shared_info *s; + BUG_ON(!is_running_on_xen()); + + s = HYPERVISOR_shared_info; + xen_start_info = __va(s->arch.start_info_pfn << PAGE_SHIFT); + + /* Must be done before any hypercall. */ + xencomm_initialize(); + + xen_setup_features(); + /* Register a call for panic conditions. */ + atomic_notifier_chain_register(&panic_notifier_list, + &xen_panic_block); + pm_power_off = xen_pm_power_off; + + xen_ia64_enable_opt_feature(); +} + +static void __init +xen_arch_setup_console(char **cmdline_p) +{ + /* + * If a console= is NOT specified, we assume using the + * xencons console is desired. By default, this is xvc0 + * for both dom0 and domU. + */ + if (!strstr(*cmdline_p, "console=")) { + char *p, *q, name[5] = "xvc"; + int offset = 0; + +#if defined(CONFIG_VGA_CONSOLE) + /* + * conswitchp might be set intelligently from the + * PCDP code. If set to VGA console, use it. 
+ */ + if (is_initial_xendomain() && conswitchp == &vga_con) + strncpy(name, "tty", 3); +#endif + + p = strstr(*cmdline_p, "xencons="); + + if (p) { + p += 8; + if (!strncmp(p, "ttyS", 4)) { + strncpy(name, p, 4); + p += 4; + offset = simple_strtol(p, &q, 10); + if (p == q) + offset = 0; + } else if (!strncmp(p, "tty", 3) || + !strncmp(p, "xvc", 3)) { + strncpy(name, p, 3); + p += 3; + offset = simple_strtol(p, &q, 10); + if (p == q) + offset = 0; + } else if (!strncmp(p, "off", 3)) + offset = -1; + } + + if (offset >= 0) + add_preferred_console(name, offset, NULL); + } else if (!is_initial_xendomain()) { + /* use hvc_xen */ + add_preferred_console("hvc", 0, NULL); + } + +#if !defined(CONFIG_VT) || !defined(CONFIG_DUMMY_CONSOLE) + if (!is_initial_xendomain()) { + conswitchp = NULL; + } +#endif +} + +static int __init +xen_arch_setup_nomca(void) +{ + if (!is_initial_xendomain()) + return 1; + return 0; +} + +static void __init +xen_post_platform_setup(void) +{ +#ifdef CONFIG_XEN_PRIVILEGED_GUEST + if (is_running_on_xen() && !ia64_platform_is("xen")) { + extern ia64_mv_setup_t xen_setup; + xen_setup(cmdline_p); + } +#endif +} + +static void __init +xen_post_paging_init(void) +{ +#ifdef notyet /* XXX: notyet dma api paravirtualization*/ +#ifdef CONFIG_XEN + xen_contiguous_bitmap_init(max_pfn); +#endif +#endif +} + +static void __init +__xen_cpu_init(void) +{ +#ifdef CONFIG_XEN_PRIVILEGED_GUEST + if (is_running_on_xen() && !ia64_platform_is("xen")) { + extern ia64_mv_cpu_init_t xen_cpu_init; + xen_cpu_init(); + } +#endif +} + +static void __init +xen_post_smp_prepare_boot_cpu(void) +{ + xen_setup_vcpu_info_placement(); +} + +static const struct pv_init_ops xen_init_ops __initdata = { + .banner = xen_banner, + + .reserve_memory = xen_reserve_memory, + + .arch_setup_early = xen_arch_setup_early, + .arch_setup_console = xen_arch_setup_console, + .arch_setup_nomca = xen_arch_setup_nomca, + .post_platform_setup = xen_post_platform_setup, + .post_paging_init = xen_post_paging_init, + + .cpu_init = __xen_cpu_init, + + .post_smp_prepare_boot_cpu = xen_post_smp_prepare_boot_cpu, + + .bundle_patch_module = &xen_alt_bundle_patch_module, + .inst_patch_module = &xen_alt_inst_patch_module, +}; + + +/*************************************************************************** * pv_ops initialization */ @@ -66,4 +259,5 @@ xen_setup_pv_ops(void) { xen_info_init(); pv_info = xen_info; + pv_init_ops = xen_init_ops; } -- 1.5.3
Isaku Yamahata
2008-Mar-05 18:19 UTC
[PATCH 49/50] ia64/pv_ops/xen: define xen pv_iosapic_ops.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/xen/xen_pv_ops.c | 53 ++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 53 insertions(+), 0 deletions(-) diff --git a/arch/ia64/xen/xen_pv_ops.c b/arch/ia64/xen/xen_pv_ops.c index a2a7493..c35bb23 100644 --- a/arch/ia64/xen/xen_pv_ops.c +++ b/arch/ia64/xen/xen_pv_ops.c @@ -21,6 +21,7 @@ */ #include <linux/console.h> +#include <linux/irq.h> #include <linux/kernel.h> #include <linux/notifier.h> #include <linux/pm.h> @@ -251,6 +252,57 @@ static const struct pv_init_ops xen_init_ops __initdata = { /*************************************************************************** + * pv_iosapic_ops + * iosapic read/write hooks. + */ +static void +xen_pcat_compat_init(void) +{ + /* nothing */ +} + +static struct irq_chip* +xen_iosapic_get_irq_chip(unsigned long trigger) +{ + return NULL; +} + +static unsigned int +xen_iosapic_read(char __iomem *iosapic, unsigned int reg) +{ + struct physdev_apic apic_op; + int ret; + + apic_op.apic_physbase = (unsigned long)iosapic - + __IA64_UNCACHED_OFFSET; + apic_op.reg = reg; + ret = HYPERVISOR_physdev_op(PHYSDEVOP_apic_read, &apic_op); + if (ret) + return ret; + return apic_op.value; +} + +static void +xen_iosapic_write(char __iomem *iosapic, unsigned int reg, u32 val) +{ + struct physdev_apic apic_op; + + apic_op.apic_physbase = (unsigned long)iosapic - + __IA64_UNCACHED_OFFSET; + apic_op.reg = reg; + apic_op.value = val; + HYPERVISOR_physdev_op(PHYSDEVOP_apic_write, &apic_op); +} + +static const struct pv_iosapic_ops xen_iosapic_ops __initdata = { + .pcat_compat_init = xen_pcat_compat_init, + .get_irq_chip = xen_iosapic_get_irq_chip, + + .__read = xen_iosapic_read, + .__write = xen_iosapic_write, +}; + +/*************************************************************************** * pv_ops initialization */ @@ -260,4 +312,5 @@ xen_setup_pv_ops(void) xen_info_init(); pv_info = xen_info; pv_init_ops = xen_init_ops; + pv_iosapic_ops = xen_iosapic_ops; } -- 1.5.3
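A domU has no direct IOSAPIC access, so the hooks above bounce register accesses through PHYSDEVOP_apic_read/write. A hedged read sketch; IOSAPIC_VERSION (register index 0x1, from asm/iosapic.h) is used for illustration, and real callers would normally go through the iosapic_read() wrapper rather than poking the ops table directly:

#include <asm/iosapic.h>
#include <asm/paravirt.h>

/* Fetch the version register of the IOSAPIC at 'addr'.  Under Xen
 * this traps to PHYSDEVOP_apic_read instead of touching mmio. */
static unsigned int example_iosapic_version(char __iomem *addr)
{
	return pv_iosapic_ops.__read(addr, IOSAPIC_VERSION);
}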
Isaku Yamahata
2008-Mar-05 18:19 UTC
[PATCH 50/50] ia64/pv_ops/xen: define xen pv_irq_ops.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/xen/Makefile | 2 +- arch/ia64/xen/hypercall.S | 10 + arch/ia64/xen/irq_xen.c | 435 ++++++++++++++++++++++++++++++++++++++++++++ arch/ia64/xen/irq_xen.h | 8 + arch/ia64/xen/xen_pv_ops.c | 3 + include/asm-ia64/hw_irq.h | 4 + include/asm-ia64/irq.h | 33 ++++ 7 files changed, 494 insertions(+), 1 deletions(-) create mode 100644 arch/ia64/xen/irq_xen.c create mode 100644 arch/ia64/xen/irq_xen.h diff --git a/arch/ia64/xen/Makefile b/arch/ia64/xen/Makefile index 4b1db56..ff7a58d 100644 --- a/arch/ia64/xen/Makefile +++ b/arch/ia64/xen/Makefile @@ -2,7 +2,7 @@ # Makefile for Xen components # -obj-y := xen_pv_ops.o +obj-y := xen_pv_ops.o irq_xen.o obj-$(CONFIG_PARAVIRT_ALT) += paravirt_xen.o privops_asm.o privops_c.o obj-$(CONFIG_PARAVIRT_NOP_B_PATCH) += paravirt_xen.o diff --git a/arch/ia64/xen/hypercall.S b/arch/ia64/xen/hypercall.S index 7c5242b..3fad2fe 100644 --- a/arch/ia64/xen/hypercall.S +++ b/arch/ia64/xen/hypercall.S @@ -123,6 +123,16 @@ END(xen_set_eflag) #endif /* CONFIG_IA32_SUPPORT */ #endif /* ASM_SUPPORTED */ +GLOBAL_ENTRY(xen_send_ipi) + mov r14=r32 + mov r15=r33 + mov r2=0x400 + break 0x1000 + ;; + br.ret.sptk.many rp + ;; +END(xen_send_ipi) + GLOBAL_ENTRY(__hypercall) mov r2=r37 break 0x1000 diff --git a/arch/ia64/xen/irq_xen.c b/arch/ia64/xen/irq_xen.c new file mode 100644 index 0000000..57fab2b --- /dev/null +++ b/arch/ia64/xen/irq_xen.c @@ -0,0 +1,435 @@ +/****************************************************************************** + * arch/ia64/xen/irq_xen.c + * + * Copyright (c) 2006 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ */
+
+#include <linux/cpu.h>
+
+#include <xen/events.h>
+#include <xen/interface/callback.h>
+
+#include "irq_xen.h"
+
+/***************************************************************************
+ * pv_irq_ops
+ * irq operations
+ */
+
+static int
+xen_assign_irq_vector(int irq)
+{
+	struct physdev_irq irq_op;
+
+	irq_op.irq = irq;
+	if (HYPERVISOR_physdev_op(PHYSDEVOP_alloc_irq_vector, &irq_op))
+		return -ENOSPC;
+
+	return irq_op.vector;
+}
+
+static void
+xen_free_irq_vector(int vector)
+{
+	struct physdev_irq irq_op;
+
+	if (vector < IA64_FIRST_DEVICE_VECTOR ||
+	    vector > IA64_LAST_DEVICE_VECTOR)
+		return;
+
+	irq_op.vector = vector;
+	if (HYPERVISOR_physdev_op(PHYSDEVOP_free_irq_vector, &irq_op))
+		printk(KERN_WARNING "%s: xen_free_irq_vector failed, "
+		       "vector=%d\n", __func__, vector);
+}
+
+
+static DEFINE_PER_CPU(int, timer_irq) = -1;
+static DEFINE_PER_CPU(int, ipi_irq) = -1;
+static DEFINE_PER_CPU(int, resched_irq) = -1;
+static DEFINE_PER_CPU(int, cmc_irq) = -1;
+static DEFINE_PER_CPU(int, cmcp_irq) = -1;
+static DEFINE_PER_CPU(int, cpep_irq) = -1;
+#define NAME_SIZE	15
+static DEFINE_PER_CPU(char[NAME_SIZE], timer_name);
+static DEFINE_PER_CPU(char[NAME_SIZE], ipi_name);
+static DEFINE_PER_CPU(char[NAME_SIZE], resched_name);
+static DEFINE_PER_CPU(char[NAME_SIZE], cmc_name);
+static DEFINE_PER_CPU(char[NAME_SIZE], cmcp_name);
+static DEFINE_PER_CPU(char[NAME_SIZE], cpep_name);
+#undef NAME_SIZE
+
+struct saved_irq {
+	unsigned int irq;
+	struct irqaction *action;
+};
+/* 16 should be a generously optimistic value, since only a few percpu
+ * irqs are registered early.
+ */
+#define MAX_LATE_IRQ	16
+static struct saved_irq saved_percpu_irqs[MAX_LATE_IRQ];
+static unsigned short late_irq_cnt = 0;
+static unsigned short saved_irq_cnt = 0;
+static int xen_slab_ready = 0;
+
+#ifdef CONFIG_SMP
+/* Dummy stub.  Although we could check for RESCHEDULE_VECTOR before
+ * calling __do_IRQ, doing so would issue several memory accesses to
+ * percpu data and thus add unnecessary traffic to other paths.
+ */
+static irqreturn_t
+xen_dummy_handler(int irq, void *dev_id)
+{
+
+	return IRQ_HANDLED;
+}
+
+static struct irqaction xen_resched_irqaction = {
+	.handler =	xen_dummy_handler,
+	.flags =	IRQF_DISABLED,
+	.name =		"resched"
+};
+
+static struct irqaction xen_tlb_irqaction = {
+	.handler =	xen_dummy_handler,
+	.flags =	IRQF_DISABLED,
+	.name =		"tlb_flush"
+};
+#endif
+
+/*
+ * This is the Xen version of percpu irq registration, which needs to
+ * bind to the Xen-specific evtchn subsystem.  One trick here is that
+ * the Xen evtchn binding interface depends on kmalloc because the
+ * related port needs to be freed at device/cpu teardown.  So we cache
+ * registrations made on the BSP before the slab allocator is ready
+ * and then deal with them later.  Any instance registered after slab
+ * is ready is hooked up to the Xen evtchn immediately.
+ *
+ * FIXME: MCA is not supported so far, and thus the "nomca" boot param
+ * is required.
+ */ +static void +__xen_register_percpu_irq(unsigned int cpu, unsigned int vec, + struct irqaction *action, int save) +{ + irq_desc_t *desc; + int irq = 0; + + if (xen_slab_ready) { + switch (vec) { + case IA64_TIMER_VECTOR: + snprintf(per_cpu(timer_name, cpu), + sizeof(per_cpu(timer_name, cpu)), + "%s%d", action->name, cpu); + irq = bind_virq_to_irqhandler(VIRQ_ITC, cpu, + action->handler, action->flags, + per_cpu(timer_name, cpu), action->dev_id); + per_cpu(timer_irq, cpu) = irq; + break; + case IA64_IPI_RESCHEDULE: + snprintf(per_cpu(resched_name, cpu), + sizeof(per_cpu(resched_name, cpu)), + "%s%d", action->name, cpu); + irq = bind_ipi_to_irqhandler(RESCHEDULE_VECTOR, cpu, + action->handler, action->flags, + per_cpu(resched_name, cpu), action->dev_id); + per_cpu(resched_irq, cpu) = irq; + break; + case IA64_IPI_VECTOR: + snprintf(per_cpu(ipi_name, cpu), + sizeof(per_cpu(ipi_name, cpu)), + "%s%d", action->name, cpu); + irq = bind_ipi_to_irqhandler(IPI_VECTOR, cpu, + action->handler, action->flags, + per_cpu(ipi_name, cpu), action->dev_id); + per_cpu(ipi_irq, cpu) = irq; + break; + case IA64_CMC_VECTOR: + snprintf(per_cpu(cmc_name, cpu), + sizeof(per_cpu(cmc_name, cpu)), + "%s%d", action->name, cpu); + irq = bind_virq_to_irqhandler(VIRQ_MCA_CMC, cpu, + action->handler, + action->flags, + per_cpu(cmc_name, cpu), + action->dev_id); + per_cpu(cmc_irq, cpu) = irq; + break; + case IA64_CMCP_VECTOR: + snprintf(per_cpu(cmcp_name, cpu), + sizeof(per_cpu(cmcp_name, cpu)), + "%s%d", action->name, cpu); + irq = bind_ipi_to_irqhandler(CMCP_VECTOR, cpu, + action->handler, + action->flags, + per_cpu(cmcp_name, cpu), + action->dev_id); + per_cpu(cmcp_irq, cpu) = irq; + break; + case IA64_CPEP_VECTOR: + snprintf(per_cpu(cpep_name, cpu), + sizeof(per_cpu(cpep_name, cpu)), + "%s%d", action->name, cpu); + irq = bind_ipi_to_irqhandler(CPEP_VECTOR, cpu, + action->handler, + action->flags, + per_cpu(cpep_name, cpu), + action->dev_id); + per_cpu(cpep_irq, cpu) = irq; + break; + case IA64_CPE_VECTOR: + case IA64_MCA_RENDEZ_VECTOR: + case IA64_PERFMON_VECTOR: + case IA64_MCA_WAKEUP_VECTOR: + case IA64_SPURIOUS_INT_VECTOR: + /* No need to complain, these aren't supported. */ + break; + default: + printk(KERN_WARNING "Percpu irq %d is unsupported " + "by xen!\n", vec); + break; + } + BUG_ON(irq < 0); + + if (irq > 0) { + /* + * Mark percpu. Without this, migrate_irqs() will + * mark the interrupt for migrations and trigger it + * on cpu hotplug. + */ + desc = irq_desc + irq; + desc->status |= IRQ_PER_CPU; + } + } + + /* For BSP, we cache registered percpu irqs, and then re-walk + * them when initializing APs + */ + if (!cpu && save) { + BUG_ON(saved_irq_cnt == MAX_LATE_IRQ); + saved_percpu_irqs[saved_irq_cnt].irq = vec; + saved_percpu_irqs[saved_irq_cnt].action = action; + saved_irq_cnt++; + if (!xen_slab_ready) + late_irq_cnt++; + } +} + +static void +xen_register_percpu_irq(ia64_vector vec, struct irqaction *action) +{ + __xen_register_percpu_irq(smp_processor_id(), vec, action, 1); +} + +static void +xen_bind_early_percpu_irq(void) +{ + int i; + + xen_slab_ready = 1; + /* There's no race when accessing this cached array, since only + * BSP will face with such step shortly + */ + for (i = 0; i < late_irq_cnt; i++) + __xen_register_percpu_irq(smp_processor_id(), + saved_percpu_irqs[i].irq, + saved_percpu_irqs[i].action, 0); +} + +/* FIXME: There's no obvious point to check whether slab is ready. So + * a hack is used here by utilizing a late time hook. 
+ */ +extern void (*late_time_init)(void); +extern char xen_event_callback; +extern void xen_init_IRQ(void); + +#ifdef CONFIG_HOTPLUG_CPU +static int __devinit +unbind_evtchn_callback(struct notifier_block *nfb, + unsigned long action, void *hcpu) +{ + unsigned int cpu = (unsigned long)hcpu; + + if (action == CPU_DEAD) { + /* Unregister evtchn. */ + if (per_cpu(cpep_irq, cpu) >= 0) { + unbind_from_irqhandler(per_cpu(cpep_irq, cpu), NULL); + per_cpu(cpep_irq, cpu) = -1; + } + if (per_cpu(cmcp_irq, cpu) >= 0) { + unbind_from_irqhandler(per_cpu(cmcp_irq, cpu), NULL); + per_cpu(cmcp_irq, cpu) = -1; + } + if (per_cpu(cmc_irq, cpu) >= 0) { + unbind_from_irqhandler(per_cpu(cmc_irq, cpu), NULL); + per_cpu(cmc_irq, cpu) = -1; + } + if (per_cpu(ipi_irq, cpu) >= 0) { + unbind_from_irqhandler(per_cpu(ipi_irq, cpu), NULL); + per_cpu(ipi_irq, cpu) = -1; + } + if (per_cpu(resched_irq, cpu) >= 0) { + unbind_from_irqhandler(per_cpu(resched_irq, cpu), + NULL); + per_cpu(resched_irq, cpu) = -1; + } + if (per_cpu(timer_irq, cpu) >= 0) { + unbind_from_irqhandler(per_cpu(timer_irq, cpu), NULL); + per_cpu(timer_irq, cpu) = -1; + } + } + return NOTIFY_OK; +} + +static struct notifier_block unbind_evtchn_notifier = { + .notifier_call = unbind_evtchn_callback, + .priority = 0 +}; +#endif + +DECLARE_PER_CPU(int, ipi_to_irq[NR_IPIS]); +void xen_smp_intr_init_early(unsigned int cpu) +{ +#ifdef CONFIG_SMP + unsigned int i; + + for (i = 0; i < saved_irq_cnt; i++) + __xen_register_percpu_irq(cpu, saved_percpu_irqs[i].irq, + saved_percpu_irqs[i].action, 0); +#endif +} + +void xen_smp_intr_init(void) +{ +#ifdef CONFIG_SMP + unsigned int cpu = smp_processor_id(); + struct callback_register event = { + .type = CALLBACKTYPE_event, + .address = (unsigned long)&xen_event_callback, + }; + + if (cpu == 0) { + /* Initialization was already done for boot cpu. */ +#ifdef CONFIG_HOTPLUG_CPU + /* Register the notifier only once. */ + register_cpu_notifier(&unbind_evtchn_notifier); +#endif + return; + } + + /* This should be piggyback when setup vcpu guest context */ + BUG_ON(HYPERVISOR_callback_op(CALLBACKOP_register, &event)); +#endif /* CONFIG_SMP */ +} + +void __init +xen_irq_init(void) +{ + struct callback_register event = { + .type = CALLBACKTYPE_event, + .address = (unsigned long)&xen_event_callback, + }; + + xen_init_IRQ(); + BUG_ON(HYPERVISOR_callback_op(CALLBACKOP_register, &event)); + late_time_init = xen_bind_early_percpu_irq; +} + +void +xen_platform_send_ipi(int cpu, int vector, int delivery_mode, int redirect) +{ + int irq = -1; + +#ifdef CONFIG_SMP + /* TODO: we need to call vcpu_up here */ + if (unlikely(vector == ap_wakeup_vector)) { + /* XXX + * This should be in __cpu_up(cpu) in ia64 smpboot.c + * like x86. But don't want to modify it, + * keep it untouched. 
+ */ + xen_smp_intr_init_early(cpu); + + xen_send_ipi(cpu, vector); + /* vcpu_prepare_and_up(cpu); */ + return; + } +#endif + + switch (vector) { + case IA64_IPI_VECTOR: + irq = per_cpu(ipi_to_irq, cpu)[IPI_VECTOR]; + break; + case IA64_IPI_RESCHEDULE: + irq = per_cpu(ipi_to_irq, cpu)[RESCHEDULE_VECTOR]; + break; + case IA64_CMCP_VECTOR: + irq = per_cpu(ipi_to_irq, cpu)[CMCP_VECTOR]; + break; + case IA64_CPEP_VECTOR: + irq = per_cpu(ipi_to_irq, cpu)[CPEP_VECTOR]; + break; + default: + printk(KERN_WARNING "Unsupported IPI type 0x%x\n", + vector); + irq = 0; + break; + } + + BUG_ON(irq < 0); + notify_remote_via_irq(irq); + return; +} + +static void __init +xen_init_IRQ_early(void) +{ +#ifdef CONFIG_SMP + register_percpu_irq(IA64_IPI_RESCHEDULE, &xen_resched_irqaction); + register_percpu_irq(IA64_IPI_LOCAL_TLB_FLUSH, &xen_tlb_irqaction); +#endif +} + +static void __init +xen_init_IRQ_late(void) +{ +#ifdef CONFIG_XEN_PRIVILEGED_GUEST + if (is_running_on_xen() && !ia64_platform_is("xen")) + xen_irq_init(); +#endif +} + +static void +xen_resend_irq(unsigned int vector) +{ + (void)resend_irq_on_evtchn(vector); +} + +const struct pv_irq_ops xen_irq_ops __initdata = { + .init_IRQ_early = xen_init_IRQ_early, + .init_IRQ_late = xen_init_IRQ_late, + + .assign_irq_vector = xen_assign_irq_vector, + .free_irq_vector = xen_free_irq_vector, + .register_percpu_irq = xen_register_percpu_irq, + + .send_ipi = xen_platform_send_ipi, + .resend_irq = xen_resend_irq, +}; diff --git a/arch/ia64/xen/irq_xen.h b/arch/ia64/xen/irq_xen.h new file mode 100644 index 0000000..a2c3ed9 --- /dev/null +++ b/arch/ia64/xen/irq_xen.h @@ -0,0 +1,8 @@ +#ifndef IRQ_XEN_H +#define IRQ_XEN_H + +extern const struct pv_irq_ops xen_irq_ops __initdata; +extern void xen_smp_intr_init(void); +extern void xen_send_ipi(int cpu, int vec); + +#endif /* IRQ_XEN_H */ diff --git a/arch/ia64/xen/xen_pv_ops.c b/arch/ia64/xen/xen_pv_ops.c index c35bb23..93a5c64 100644 --- a/arch/ia64/xen/xen_pv_ops.c +++ b/arch/ia64/xen/xen_pv_ops.c @@ -35,6 +35,8 @@ #include <asm/xen/hypervisor.h> #include <asm/xen/xencomm.h> +#include "irq_xen.h" + /*************************************************************************** * general info */ @@ -313,4 +315,5 @@ xen_setup_pv_ops(void) pv_info = xen_info; pv_init_ops = xen_init_ops; pv_iosapic_ops = xen_iosapic_ops; + pv_irq_ops = xen_irq_ops; } diff --git a/include/asm-ia64/hw_irq.h b/include/asm-ia64/hw_irq.h index 678efec..80009cd 100644 --- a/include/asm-ia64/hw_irq.h +++ b/include/asm-ia64/hw_irq.h @@ -15,7 +15,11 @@ #include <asm/ptrace.h> #include <asm/smp.h> +#ifndef CONFIG_XEN typedef u8 ia64_vector; +#else +typedef u16 ia64_vector; +#endif /* * 0 special diff --git a/include/asm-ia64/irq.h b/include/asm-ia64/irq.h index a66d268..aead249 100644 --- a/include/asm-ia64/irq.h +++ b/include/asm-ia64/irq.h @@ -14,6 +14,7 @@ #include <linux/types.h> #include <linux/cpumask.h> +#ifndef CONFIG_XEN #define NR_VECTORS 256 #if (NR_VECTORS + 32 * NR_CPUS) < 1024 @@ -21,6 +22,38 @@ #else #define NR_IRQS 1024 #endif +#else +/* + * The flat IRQ space is divided into two regions: + * 1. A one-to-one mapping of real physical IRQs. This space is only used + * if we have physical device-access privilege. This region is at the + * start of the IRQ space so that existing device drivers do not need + * to be modified to translate physical IRQ numbers into our IRQ space. + * 3. A dynamic mapping of inter-domain and Xen-sourced virtual IRQs. These + * are bound using the provided bind/unbind functions. 
+ */ + +#define PIRQ_BASE 0 +#define NR_PIRQS 256 + +#define DYNIRQ_BASE (PIRQ_BASE + NR_PIRQS) +#define NR_DYNIRQS (CONFIG_NR_CPUS * 8) + +#define NR_IRQS (NR_PIRQS + NR_DYNIRQS) +#define NR_IRQ_VECTORS NR_IRQS + +#define pirq_to_irq(_x) ((_x) + PIRQ_BASE) +#define irq_to_pirq(_x) ((_x) - PIRQ_BASE) + +#define dynirq_to_irq(_x) ((_x) + DYNIRQ_BASE) +#define irq_to_dynirq(_x) ((_x) - DYNIRQ_BASE) + +#define RESCHEDULE_VECTOR 0 +#define IPI_VECTOR 1 +#define CMCP_VECTOR 2 +#define CPEP_VECTOR 3 +#define NR_IPIS 4 +#endif /* CONFIG_XEN */ static __inline__ int irq_canonicalize (int irq) -- 1.5.3
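Tying the pieces together: under Xen an ia64 IPI "vector" is only a key into the per-cpu ipi_to_irq[] table, and delivery becomes an event-channel notification. A hedged sketch of kicking another cpu through the hook (delivery mode and redirect are ignored by the Xen implementation; IA64_IPI_DM_INT is passed only to keep the native signature):

#include <asm/hw_irq.h>
#include <asm/paravirt.h>

/* Ask 'cpu' to reschedule; the native equivalent would be
 * ia64_send_ipi(cpu, IA64_IPI_RESCHEDULE, IA64_IPI_DM_INT, 0). */
static void example_kick_cpu(int cpu)
{
	pv_irq_ops.send_ipi(cpu, IA64_IPI_RESCHEDULE, IA64_IPI_DM_INT, 0);
}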
Jeremy & all:
    The current Xen kernel code lives in arch/x86/xen, but the Xen dynamic
irqchip (events.c) is common to other architectures such as IA64. We are in
the middle of enabling pv_ops for IA64 and want to reuse the same code. Do
we need to move it to some common place? Suggestions?

Thanks, eddie