Isaku Yamahata
2008-Feb-28 09:57 UTC
[PATCH 0/5] RFC: ia64/pv_ops: ia64 intrinsics paravirtualization
Hi. Thank you for the comments on asm code paravirtualization; its direction
is getting clearer. Although that work hasn't been finished yet, I'd like to
start the discussion on ia64 intrinsics paravirtualization. This patch set is
just for discussion, so it is a subset of the xen Linux/ia64 domU
paravirtualization, not self-complete. You can get the full patched tree by
typing

  git clone http://people.valinux.co.jp/~yamahata/xen-ia64/linux-2.6-xen-ia64.git/

A paravirtualized guest wants to replace the ia64 intrinsics, i.e. the
operations defined in include/asm-ia64/gcc_intrin.h or
include/asm-ia64/intel_intrin.h, with its own versions. (At least
xenLinux/ia64 does.) So we need some sort of interface to do so. I want to
discuss which direction to go, so please comment.

This paravirtualization corresponds to the part of x86 pv_ops described as
"performance critical code written in C": basically indirect function calls
via pv_xxx_ops. For performance, each pv instance is allowed to binary patch,
replacing the function call instruction in place with its own predefined
instructions. The ia64 intrinsics correspond to this kind of interface.

The discussion points so far are:

- Should binary patching be mandatory or optional?
  The current patch requires binary patching, but some people think requiring
  binary patching of pv instances is a bad idea. I think that by providing a
  reasonable set of helper functions, binary patching won't be a burden for
  pv instances.

- How much should this differ from x86 pv_ops?
  Some people think the very similarity to x86 pv_ops is important,
  presumably with maintenance cost in mind. In any case ia64 is already
  different from x86, so such a difference doesn't matter as long as the ia64
  paravirtualization interface is clean enough to maintain.

Note: the chosen way can differ from one operation to another, but that might
cause some inconsistency. The following ways have been proposed so far.

* Option 1: the current way

The code would look like:

	static inline unsigned long
	paravirt_get_cpuid(int index)
	{
		register __u64 ia64_intri_res asm ("r8");
		register __u64 __index asm ("r8") = index;
		asm volatile (paravirt_alt_inst("mov %0=cpuid[%r1]",
						PARAVIRT_INST_GET_CPUID):
			      "=r"(ia64_intri_res): "0O"(__index));
		return ia64_intri_res;
	}
	#define ia64_get_cpuid	paravirt_get_cpuid

note: using r8 is derived from the xen hypercall ABI. We have to define which
registers should be used or can be clobbered.

Pros:
- in-place binary patching is possible.
  (We may want to pad with nops. How many?)
- native case performance is good.
- the native case doesn't need any modification.

Cons:
- binary patching is required for pv instances.
- the current implementation is probably too xen-biased; reviewing the
  annotations would be necessary for hypervisor neutrality.

* Option 2: direct branch

The code would look like:

	static inline unsigned long
	paravirt_get_cpuid(int index)
	{
		register __u64 ia64_intri_res asm ("r8");
		register __u64 __index asm ("r8") = index;
		register __u64 ret_addr asm ("r9");
		asm volatile (paravirt_alt_inst(
				"br.cond b0=native_get_cpuid",
				/* or brl.cond for fast hypercall */
				PARAVIRT_INST_GET_CPUID):
			      "=r"(ia64_intri_res), "=r"(ret_addr):
			      "0O"(__index): "b0");
		return ia64_intri_res;
	}
	#define ia64_get_cpuid	paravirt_get_cpuid

note: using r8 is derived from the xen hypercall ABI. We have to define which
registers should be used or can be clobbered.

Pros:
- in-place binary patching is possible
  (We may want to pad with nops. How many?),
  so performance would be good for a native case that uses it.

Cons:
- binary patching is required for pv instances.
- the native case needs binary patching for optimal performance.
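For illustration, the native branch target that option #2 assumes might look
roughly like the following. This is a minimal sketch; the name
native_get_cpuid as an assembly symbol and the b0-based return convention are
assumptions about the still-unsettled ABI, not part of the posted patches:

	GLOBAL_ENTRY(native_get_cpuid)
		// index arrives in r8, result is returned in r8,
		// per the xen hypercall ABI noted above
		mov r8=cpuid[r8]
		;;
		// assume the binary-patched call site has set b0
		// to the return address
		br.cond.sptk.many b0
	END(native_get_cpuid)

The point is that the target cannot use the normal C calling convention; it
has to honor whatever register contract the patched call sites follow.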
* Option 3: indirect branch

The code would look like:

	static inline unsigned long
	paravirt_get_cpuid(int index)
	{
		register __u64 ia64_intri_res asm ("r8");
		register __u64 __index asm ("r8") = index;
		register __u64 func asm ("r9");
		asm volatile (paravirt_alt_inst(
				"mov %1 = pv_cpu_ops\n\t"
				"add %1 = %1, PV_CPU_GET_CPUID_OFFSET\n\t"
				"ld8 %1 = [%1]\n\t"
				"mov b1 = %1\n\t"
				"br.cond b0=b1",
				PARAVIRT_INST_GET_CPUID):
			      "=r"(ia64_intri_res), "=r"(func):
			      "0O"(__index): "b0", "b1");
		return ia64_intri_res;
	}
	#define ia64_get_cpuid	paravirt_get_cpuid

note: using r8 is derived from the xen hypercall ABI. We have to define which
registers should be used or can be clobbered.

Pros:
- binary patching isn't required for pv instances.
- in-place binary patching is possible
  (We may want to pad with nops. How many?),
  so performance would be good for a native case that uses it.

Cons:
- uses more space than option #2.
- for optimal performance, binary patching is necessary anyway.

* Option 4: indirect function call

The code would look like (a minimal native-instance sketch is appended at the
end of this mail):

	struct pv_cpu_ops {
		unsigned long (*get_cpuid)(unsigned long index);
		....
	};
	extern struct pv_cpu_ops pv_cpu_ops;

	...

	static inline unsigned long
	paravirt_get_cpuid(unsigned long index)
	{
		return pv_cpu_ops.get_cpuid(index);
	}
	#define ia64_get_cpuid	paravirt_get_cpuid

Pros:
- binary patching isn't required.
- the indirect function call is the very way x86 pv_ops went.
- if a hypervisor supports fast hypercalls using a gate page, it may want to
  use a function call.

Cons:
- binary patching is difficult: an ia64 function call uses stacked registers,
  so marking the br.call instruction is difficult,
- so the performance is suboptimal, especially for the native case.

Possibly the alternative is a direct function call: at boot time, scan the
whole text for branch instructions which jump to the given functions and
binary patch the branch targets.

My current preference is option #1 or #2, making the ABI more
hypervisor-neutral.

thanks,
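For option #4 above, the native instance it implies might look roughly like
the following. This is a minimal hypothetical sketch under the declarations
shown in the option: pv_cpu_ops, native_get_cpuid_fn and xen_get_cpuid are
illustrative names, not part of the posted patches:

	/* native instance of pv_cpu_ops; get_cpuid simply falls
	 * through to the raw intrinsic */
	static unsigned long
	native_get_cpuid_fn(unsigned long index)
	{
		return native_get_cpuid(index);
	}

	struct pv_cpu_ops pv_cpu_ops = {
		.get_cpuid = native_get_cpuid_fn,
		/* ... the other intrinsics ... */
	};

	/* a pv instance would overwrite the fields early at boot */
	extern unsigned long xen_get_cpuid(unsigned long index);

	void __init
	xen_init_pv_cpu_ops(void)
	{
		pv_cpu_ops.get_cpuid = xen_get_cpuid;
	}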
Isaku Yamahata
2008-Feb-28 09:57 UTC
[PATCH 1/5] ia64/pv_ops: preparation for ia64 intrinsics operations paravirtualization
To make them cleanly overridable, change their prefix from ia64_ to native_
and #define each ia64_xxx to native_xxx. Later, ia64_xxx will be redefined to
the pv_ops'ed one.

Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 include/asm-ia64/gcc_intrin.h   |   58 +++++++++++++++++-----------------
 include/asm-ia64/intel_intrin.h |   64 +++++++++++++++++++-------------------
 include/asm-ia64/intrinsics.h   |   14 ++++----
 include/asm-ia64/privop.h       |   36 ++++++++++++++++++++++
 4 files changed, 104 insertions(+), 68 deletions(-)

diff --git a/include/asm-ia64/gcc_intrin.h b/include/asm-ia64/gcc_intrin.h
index de2ed2c..31db638 100644
--- a/include/asm-ia64/gcc_intrin.h
+++ b/include/asm-ia64/gcc_intrin.h
@@ -28,7 +28,7 @@ extern void ia64_bad_param_for_getreg (void);
 register unsigned long ia64_r13 asm ("r13") __used;
 #endif
 
-#define ia64_setreg(regnum, val)	\
+#define native_setreg(regnum, val)	\
 ({	\
 	switch (regnum) {	\
 	case _IA64_REG_PSR_L:	\
@@ -57,7 +57,7 @@ register unsigned long ia64_r13 asm ("r13") __used;
 	}	\
 })
 
-#define ia64_getreg(regnum)	\
+#define native_getreg(regnum)	\
 ({	\
 	__u64 ia64_intri_res;	\
 	\
@@ -94,7 +94,7 @@ register unsigned long ia64_r13 asm ("r13") __used;
 
 #define ia64_hint_pause 0
 
-#define ia64_hint(mode)	\
+#define native_hint(mode)	\
 ({	\
 	switch (mode) {	\
 	case ia64_hint_pause:	\
@@ -381,7 +381,7 @@ register unsigned long ia64_r13 asm ("r13") __used;
 
 #define ia64_invala() asm volatile ("invala" ::: "memory")
 
-#define ia64_thash(addr)	\
+#define native_thash(addr)	\
 ({	\
 	__u64 ia64_intri_res;	\
 	asm volatile ("thash %0=%1" : "=r"(ia64_intri_res) : "r" (addr));	\
@@ -401,18 +401,18 @@ register unsigned long ia64_r13 asm ("r13") __used;
 
 #define ia64_nop(x)	asm volatile ("nop %0"::"i"(x));
 
-#define ia64_itci(addr)	asm volatile ("itc.i %0;;" :: "r"(addr) : "memory")
+#define native_itci(addr)	asm volatile ("itc.i %0;;" :: "r"(addr) : "memory")
 
-#define ia64_itcd(addr)	asm volatile ("itc.d %0;;" :: "r"(addr) : "memory")
+#define native_itcd(addr)	asm volatile ("itc.d %0;;" :: "r"(addr) : "memory")
 
-#define ia64_itri(trnum, addr) asm volatile ("itr.i itr[%0]=%1"	\
+#define native_itri(trnum, addr) asm volatile ("itr.i itr[%0]=%1"	\
 					     :: "r"(trnum), "r"(addr) : "memory")
 
-#define ia64_itrd(trnum, addr) asm volatile ("itr.d dtr[%0]=%1"	\
+#define native_itrd(trnum, addr) asm volatile ("itr.d dtr[%0]=%1"	\
 					     :: "r"(trnum), "r"(addr) : "memory")
 
-#define ia64_tpa(addr)	\
+#define native_tpa(addr)	\
 ({	\
 	__u64 ia64_pa;	\
 	asm volatile ("tpa %0 = %1" : "=r"(ia64_pa) : "r"(addr) : "memory");	\
@@ -422,22 +422,22 @@ register unsigned long ia64_r13 asm ("r13") __used;
 #define __ia64_set_dbr(index, val)	\
 	asm volatile ("mov dbr[%0]=%1" :: "r"(index), "r"(val) : "memory")
 
-#define ia64_set_ibr(index, val)	\
+#define native_set_ibr(index, val)	\
 	asm volatile ("mov ibr[%0]=%1" :: "r"(index), "r"(val) : "memory")
 
-#define ia64_set_pkr(index, val)	\
+#define native_set_pkr(index, val)	\
 	asm volatile ("mov pkr[%0]=%1" :: "r"(index), "r"(val) : "memory")
 
-#define ia64_set_pmc(index, val)	\
+#define native_set_pmc(index, val)	\
 	asm volatile ("mov pmc[%0]=%1" :: "r"(index), "r"(val) : "memory")
 
-#define ia64_set_pmd(index, val)	\
+#define native_set_pmd(index, val)	\
 	asm volatile ("mov pmd[%0]=%1" :: "r"(index), "r"(val) : "memory")
 
-#define ia64_set_rr(index, val)	\
+#define native_set_rr(index, val)	\
 	asm volatile ("mov rr[%0]=%1" :: "r"(index), "r"(val) : "memory");
 
-#define ia64_get_cpuid(index)	\
+#define native_get_cpuid(index)	\
 ({	\
 	__u64 ia64_intri_res;	\
 	asm volatile ("mov %0=cpuid[%r1]" :
"=r"(ia64_intri_res) : "rO"(index)); \ @@ -451,21 +451,21 @@ register unsigned long ia64_r13 asm ("r13") __used; ia64_intri_res; \ }) -#define ia64_get_ibr(index) \ +#define native_get_ibr(index) \ ({ \ __u64 ia64_intri_res; \ asm volatile ("mov %0=ibr[%1]" : "=r"(ia64_intri_res) : "r"(index)); \ ia64_intri_res; \ }) -#define ia64_get_pkr(index) \ +#define native_get_pkr(index) \ ({ \ __u64 ia64_intri_res; \ asm volatile ("mov %0=pkr[%1]" : "=r"(ia64_intri_res) : "r"(index)); \ ia64_intri_res; \ }) -#define ia64_get_pmc(index) \ +#define native_get_pmc(index) \ ({ \ __u64 ia64_intri_res; \ asm volatile ("mov %0=pmc[%1]" : "=r"(ia64_intri_res) : "r"(index)); \ @@ -473,48 +473,48 @@ register unsigned long ia64_r13 asm ("r13") __used; }) -#define ia64_get_pmd(index) \ +#define native_get_pmd(index) \ ({ \ __u64 ia64_intri_res; \ asm volatile ("mov %0=pmd[%1]" : "=r"(ia64_intri_res) : "r"(index)); \ ia64_intri_res; \ }) -#define ia64_get_rr(index) \ +#define native_get_rr(index) \ ({ \ __u64 ia64_intri_res; \ asm volatile ("mov %0=rr[%1]" : "=r"(ia64_intri_res) : "r" (index)); \ ia64_intri_res; \ }) -#define ia64_fc(addr) asm volatile ("fc %0" :: "r"(addr) : "memory") +#define native_fc(addr) asm volatile ("fc %0" :: "r"(addr) : "memory") #define ia64_sync_i() asm volatile (";; sync.i" ::: "memory") -#define ia64_ssm(mask) asm volatile ("ssm %0":: "i"((mask)) : "memory") -#define ia64_rsm(mask) asm volatile ("rsm %0":: "i"((mask)) : "memory") +#define native_ssm(mask) asm volatile ("ssm %0":: "i"((mask)) : "memory") +#define native_rsm(mask) asm volatile ("rsm %0":: "i"((mask)) : "memory") #define ia64_sum(mask) asm volatile ("sum %0":: "i"((mask)) : "memory") #define ia64_rum(mask) asm volatile ("rum %0":: "i"((mask)) : "memory") -#define ia64_ptce(addr) asm volatile ("ptc.e %0" :: "r"(addr)) +#define native_ptce(addr) asm volatile ("ptc.e %0" :: "r"(addr)) -#define ia64_ptcga(addr, size) \ +#define native_ptcga(addr, size) \ do { \ asm volatile ("ptc.ga %0,%1" :: "r"(addr), "r"(size) : "memory"); \ ia64_dv_serialize_data(); \ } while (0) -#define ia64_ptcl(addr, size) \ +#define native_ptcl(addr, size) \ do { \ asm volatile ("ptc.l %0,%1" :: "r"(addr), "r"(size) : "memory"); \ ia64_dv_serialize_data(); \ } while (0) -#define ia64_ptri(addr, size) \ +#define native_ptri(addr, size) \ asm volatile ("ptr.i %0,%1" :: "r"(addr), "r"(size) : "memory") -#define ia64_ptrd(addr, size) \ +#define native_ptrd(addr, size) \ asm volatile ("ptr.d %0,%1" :: "r"(addr), "r"(size) : "memory") /* Values for lfhint in ia64_lfetch and ia64_lfetch_fault */ @@ -596,7 +596,7 @@ do { \ } \ }) -#define ia64_intrin_local_irq_restore(x) \ +#define native_intrin_local_irq_restore(x) \ do { \ asm volatile (";; cmp.ne p6,p7=%0,r0;;" \ "(p6) ssm psr.i;" \ diff --git a/include/asm-ia64/intel_intrin.h b/include/asm-ia64/intel_intrin.h index a520d10..ab3c8a3 100644 --- a/include/asm-ia64/intel_intrin.h +++ b/include/asm-ia64/intel_intrin.h @@ -16,8 +16,8 @@ * intrinsic */ -#define ia64_getreg __getReg -#define ia64_setreg __setReg +#define native_getreg __getReg +#define native_setreg __setReg #define ia64_hint __hint #define ia64_hint_pause __hint_pause @@ -33,16 +33,16 @@ #define ia64_getf_exp __getf_exp #define ia64_shrp _m64_shrp -#define ia64_tpa __tpa +#define native_tpa __tpa #define ia64_invala __invala #define ia64_invala_gr __invala_gr #define ia64_invala_fr __invala_fr #define ia64_nop __nop #define ia64_sum __sum -#define ia64_ssm __ssm +#define native_ssm __ssm #define ia64_rum __rum -#define ia64_rsm __rsm 
-#define ia64_fc __fc +#define native_rsm __rsm +#define native_fc __fc #define ia64_ldfs __ldfs #define ia64_ldfd __ldfd @@ -80,24 +80,24 @@ #define __ia64_set_dbr(index, val) \ __setIndReg(_IA64_REG_INDR_DBR, index, val) -#define ia64_set_ibr(index, val) \ +#define native_set_ibr(index, val) \ __setIndReg(_IA64_REG_INDR_IBR, index, val) -#define ia64_set_pkr(index, val) \ +#define native_set_pkr(index, val) \ __setIndReg(_IA64_REG_INDR_PKR, index, val) -#define ia64_set_pmc(index, val) \ +#define native_set_pmc(index, val) \ __setIndReg(_IA64_REG_INDR_PMC, index, val) -#define ia64_set_pmd(index, val) \ +#define native_set_pmd(index, val) \ __setIndReg(_IA64_REG_INDR_PMD, index, val) -#define ia64_set_rr(index, val) \ +#define native_set_rr(index, val) \ __setIndReg(_IA64_REG_INDR_RR, index, val) -#define ia64_get_cpuid(index) __getIndReg(_IA64_REG_INDR_CPUID, index) +#define native_get_cpuid(index) __getIndReg(_IA64_REG_INDR_CPUID, index) #define __ia64_get_dbr(index) __getIndReg(_IA64_REG_INDR_DBR, index) -#define ia64_get_ibr(index) __getIndReg(_IA64_REG_INDR_IBR, index) -#define ia64_get_pkr(index) __getIndReg(_IA64_REG_INDR_PKR, index) -#define ia64_get_pmc(index) __getIndReg(_IA64_REG_INDR_PMC, index) -#define ia64_get_pmd(index) __getIndReg(_IA64_REG_INDR_PMD, index) -#define ia64_get_rr(index) __getIndReg(_IA64_REG_INDR_RR, index) +#define native_get_ibr(index) __getIndReg(_IA64_REG_INDR_IBR, index) +#define native_get_pkr(index) __getIndReg(_IA64_REG_INDR_PKR, index) +#define native_get_pmc(index) __getIndReg(_IA64_REG_INDR_PMC, index) +#define native_get_pmd(index) __getIndReg(_IA64_REG_INDR_PMD, index) +#define native_get_rr(index) __getIndReg(_IA64_REG_INDR_RR, index) #define ia64_srlz_d __dsrlz #define ia64_srlz_i __isrlz @@ -119,18 +119,18 @@ #define ia64_ld8_acq __ld8_acq #define ia64_sync_i __synci -#define ia64_thash __thash -#define ia64_ttag __ttag -#define ia64_itcd __itcd -#define ia64_itci __itci -#define ia64_itrd __itrd -#define ia64_itri __itri -#define ia64_ptce __ptce -#define ia64_ptcl __ptcl -#define ia64_ptcg __ptcg -#define ia64_ptcga __ptcga -#define ia64_ptri __ptri -#define ia64_ptrd __ptrd +#define native_thash __thash +#define native_ttag __ttag +#define native_itcd __itcd +#define native_itci __itci +#define native_itrd __itrd +#define native_itri __itri +#define native_ptce __ptce +#define native_ptcl __ptcl +#define native_ptcg __ptcg +#define native_ptcga __ptcga +#define native_ptri __ptri +#define native_ptrd __ptrd #define ia64_dep_mi _m64_dep_mi /* Values for lfhint in __lfetch and __lfetch_fault */ @@ -145,13 +145,13 @@ #define ia64_lfetch_fault __lfetch_fault #define ia64_lfetch_fault_excl __lfetch_fault_excl -#define ia64_intrin_local_irq_restore(x) \ +#define native_intrin_local_irq_restore(x) \ do { \ if ((x) != 0) { \ - ia64_ssm(IA64_PSR_I); \ + native_ssm(IA64_PSR_I); \ ia64_srlz_d(); \ } else { \ - ia64_rsm(IA64_PSR_I); \ + native_rsm(IA64_PSR_I); \ } \ } while (0) diff --git a/include/asm-ia64/intrinsics.h b/include/asm-ia64/intrinsics.h index 5800ad0..3a58069 100644 --- a/include/asm-ia64/intrinsics.h +++ b/include/asm-ia64/intrinsics.h @@ -18,15 +18,15 @@ # include <asm/gcc_intrin.h> #endif -#define ia64_get_psr_i() (ia64_getreg(_IA64_REG_PSR) & IA64_PSR_I) +#define native_get_psr_i() (native_getreg(_IA64_REG_PSR) & IA64_PSR_I) -#define ia64_set_rr0_to_rr4(val0, val1, val2, val3, val4) \ +#define native_set_rr0_to_rr4(val0, val1, val2, val3, val4) \ do { \ - ia64_set_rr(0x0000000000000000UL, (val0)); \ - 
ia64_set_rr(0x2000000000000000UL, (val1)); \ - ia64_set_rr(0x4000000000000000UL, (val2)); \ - ia64_set_rr(0x6000000000000000UL, (val3)); \ - ia64_set_rr(0x8000000000000000UL, (val4)); \ + native_set_rr(0x0000000000000000UL, (val0)); \ + native_set_rr(0x2000000000000000UL, (val1)); \ + native_set_rr(0x4000000000000000UL, (val2)); \ + native_set_rr(0x6000000000000000UL, (val3)); \ + native_set_rr(0x8000000000000000UL, (val4)); \ } while (0) /* diff --git a/include/asm-ia64/privop.h b/include/asm-ia64/privop.h index 7b9de4f..b0b74fd 100644 --- a/include/asm-ia64/privop.h +++ b/include/asm-ia64/privop.h @@ -16,6 +16,42 @@ /* fallback for native case */ +#ifndef IA64_PARAVIRTUALIZED_PRIVOP +#ifndef __ASSEMBLY +#define ia64_getreg native_getreg +#define ia64_setreg native_setreg +#define ia64_hint native_hint +#define ia64_thash native_thash +#define ia64_itci native_itci +#define ia64_itcd native_itcd +#define ia64_itri native_itri +#define ia64_itrd native_itrd +#define ia64_tpa native_tpa +#define ia64_set_ibr native_set_ibr +#define ia64_set_pkr native_set_pkr +#define ia64_set_pmc native_set_pmc +#define ia64_set_pmd native_set_pmd +#define ia64_set_rr native_set_rr +#define ia64_get_cpuid native_get_cpuid +#define ia64_get_ibr native_get_ibr +#define ia64_get_pkr native_get_pkr +#define ia64_get_pmc native_get_pmc +#define ia64_get_pmd native_get_pmd +#define ia64_get_rr native_get_rr +#define ia64_fc native_fc +#define ia64_ssm native_ssm +#define ia64_rsm native_rsm +#define ia64_ptce native_ptce +#define ia64_ptcga native_ptcga +#define ia64_ptcl native_ptcl +#define ia64_ptri native_ptri +#define ia64_ptrd native_ptrd +#define ia64_get_psr_i native_get_psr_i +#define ia64_intrin_local_irq_restore native_intrin_local_irq_restore +#define ia64_set_rr0_to_rr4 native_set_rr0_to_rr4 +#endif /* !__ASSEMBLY */ +#endif /* !IA64_PARAVIRTUALIZED_PRIVOP */ + #ifndef IA64_PARAVIRTUALIZED_ENTRY #define ia64_switch_to native_switch_to #define ia64_leave_syscall native_leave_syscall -- 1.5.3
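With this renaming in place, privop.h only falls back to the native_
definitions when no pv instance has claimed the intrinsics. A pv instance
hooks in by defining IA64_PARAVIRTUALIZED_PRIVOP together with its own
mappings before the fallback is seen. Roughly, as a sketch (the xen_ names
here are hypothetical; the actual paravirt_ mappings appear in patch 3/5):

	/* in the pv instance's privop header, pulled in from
	 * include/asm-ia64/privop.h before the native fallback */
	#define IA64_PARAVIRTUALIZED_PRIVOP

	#define ia64_get_cpuid		xen_get_cpuid
	#define ia64_getreg		xen_getreg
	#define ia64_setreg		xen_setreg
	/* ... and so on for the remaining intrinsics ... */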
Isaku Yamahata
2008-Feb-28 09:57 UTC
[PATCH 2/5] ia64: introduce basic facilities for binary patching.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 arch/ia64/Kconfig                 |   72 +++++++++++
 arch/ia64/kernel/Makefile         |    5 +
 arch/ia64/kernel/paravirt_alt.c   |  118 ++++++++++++++++
 arch/ia64/kernel/paravirt_core.c  |  201 +++++++++++++++++++++++++++++++++
 arch/ia64/kernel/paravirt_entry.c |   99 ++++++++++++++++++
 arch/ia64/kernel/paravirt_nop.c   |   49 +++++++++
 arch/ia64/kernel/vmlinux.lds.S    |   35 +++++++
 include/asm-ia64/module.h         |    6 +
 include/asm-ia64/paravirt_alt.h   |   82 +++++++++++++
 include/asm-ia64/paravirt_core.h  |   54 ++++++++++
 include/asm-ia64/paravirt_entry.h |   62 +++++++++++
 include/asm-ia64/paravirt_nop.h   |   46 +++++++++
 12 files changed, 829 insertions(+), 0 deletions(-)
 create mode 100644 arch/ia64/kernel/paravirt_alt.c
 create mode 100644 arch/ia64/kernel/paravirt_core.c
 create mode 100644 arch/ia64/kernel/paravirt_entry.c
 create mode 100644 arch/ia64/kernel/paravirt_nop.c
 create mode 100644 include/asm-ia64/paravirt_alt.h
 create mode 100644 include/asm-ia64/paravirt_core.h
 create mode 100644 include/asm-ia64/paravirt_entry.h
 create mode 100644 include/asm-ia64/paravirt_nop.h

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index dff9edf..bc84008 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -110,6 +110,78 @@ config AUDIT_ARCH
 	bool
 	default y
 
+menuconfig PARAVIRT_GUEST
+	bool "Paravirtualized guest support"
+	help
+	  Say Y here to get to see options related to running Linux under
+	  various hypervisors. This option alone does not add any kernel code.
+
+	  If you say N, all options in this submenu will be skipped and disabled.
+
+if PARAVIRT_GUEST
+
+config PARAVIRT
+	bool
+	default y
+	help
+	  This changes the kernel so it can modify itself when it is run
+	  under a hypervisor, potentially improving performance significantly
+	  over full virtualization.  However, when run without a hypervisor
+	  the kernel is theoretically slower and slightly larger.
+
+config PARAVIRT_ALT
+	bool "paravirt_alt binary patching infrastructure"
+	depends on PARAVIRT
+	default y
+	help
+	  The binary patching infrastructure to replace some privileged
+	  instructions with hypervisor-specific instructions.
+	  There are several sensitive (i.e. non-virtualizable) instructions
+	  and performance-critical privileged instructions which Xen
+	  paravirtualizes as hyperprivops.
+	  For transparent paravirtualization (i.e. a single binary should
+	  run both on bare metal and in a xen environment), xenLinux/IA64
+	  needs something like "if (is_running_on_xen()) {} else {}" where
+	  is_running_on_xen() is determined at boot time.
+	  This configuration tries to eliminate that overhead by annotating
+	  such instructions and replacing them with hyperprivops at boot
+	  time.
+
+config PARAVIRT_ENTRY
+	bool "paravirt entry"
+	depends on PARAVIRT
+	default y
+	help
+	  The entry-point hooking infrastructure to change the execution
+	  path at boot time.
+	  There are several paravirtualized paths in hand-coded assembly
+	  code which aren't easily binary patched by the paravirt_alt
+	  infrastructure, e.g. ia64_switch_to, ia64_leave_syscall,
+	  ia64_leave_kernel and ia64_pal_call_static.
+	  For that hand-written assembly code, change the execution path
+	  by hooking the entry points and jumping to hand-paravirtualized
+	  code.
+
+config PARAVIRT_NOP_B_PATCH
+	bool "paravirt branch if native"
+	depends on PARAVIRT
+	default y
+	help
+	  Paravirt branch if native.
+	  There are several paravirtualized paths in hand-coded assembly
+	  code. For transparent paravirtualization, they contain code like
+	  GLOBAL_ENTRY(xen_xxx)
+		'movl reg=running_on_xen;;'
+		'ld4 reg=[reg];;'
+		'cmp.eq pred,p0=reg,r0'
+		'(pred) br.cond.sptk.many <native_xxx>;;'
+	  To reduce the overhead when running on bare metal, this is
+	  collapsed to a single "br.cond.sptk.many <native_xxx>", which is
+	  then replaced with 'nop.b 0' at boot time when running on xen.
+
+#source "arch/ia64/xen/Kconfig"
+
+endif
+
 choice
 	prompt "System type"
 	default IA64_GENERIC

diff --git a/arch/ia64/kernel/Makefile b/arch/ia64/kernel/Makefile
index 9281bf6..185e0e2 100644
--- a/arch/ia64/kernel/Makefile
+++ b/arch/ia64/kernel/Makefile
@@ -36,6 +36,11 @@ obj-$(CONFIG_PCI_MSI)	+= msi_ia64.o
 mca_recovery-y			+= mca_drv.o mca_drv_asm.o
 obj-$(CONFIG_IA64_MC_ERR_INJECT)+= err_inject.o
 
+obj-$(CONFIG_PARAVIRT)		+= paravirt_core.o
+obj-$(CONFIG_PARAVIRT_ALT)	+= paravirt_alt.o
+obj-$(CONFIG_PARAVIRT_ENTRY)	+= paravirt_entry.o paravirtentry.o
+obj-$(CONFIG_PARAVIRT_NOP_B_PATCH) += paravirt_nop.o
+
 obj-$(CONFIG_IA64_ESI)		+= esi.o
 ifneq ($(CONFIG_IA64_ESI),)
 obj-y				+= esi_stub.o	# must be in kernel proper

diff --git a/arch/ia64/kernel/paravirt_alt.c b/arch/ia64/kernel/paravirt_alt.c
new file mode 100644
index 0000000..d0a34a7
--- /dev/null
+++ b/arch/ia64/kernel/paravirt_alt.c
@@ -0,0 +1,118 @@
+/******************************************************************************
+ * linux/arch/ia64/xen/paravirt_alt.c
+ *
+ * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp>
+ *                    VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ */
+
+#include <asm/paravirt_core.h>
+
+extern const char nop_bundle[];
+extern const unsigned long nop_bundle_size;
+
+static void __init_or_module
+fill_nop(void *sbundle, void *ebundle)
+{
+	void *bundle = sbundle;
+	BUG_ON((((unsigned long)sbundle) % sizeof(bundle_t)) != 0);
+	BUG_ON((((unsigned long)ebundle) % sizeof(bundle_t)) != 0);
+
+	while (bundle < ebundle) {
+		memcpy(bundle, nop_bundle, nop_bundle_size);
+
+		bundle += nop_bundle_size;
+	}
+}
+
+void __init_or_module
+paravirt_alt_bundle_patch_apply(struct paravirt_alt_bundle_patch *start,
+				struct paravirt_alt_bundle_patch *end,
+				unsigned long(*patch)(void *sbundle,
+						      void *ebundle,
+						      unsigned long type))
+{
+	struct paravirt_alt_bundle_patch *p;
+
+	for (p = start; p < end; p++) {
+		unsigned long used;
+
+		used = (*patch)(p->sbundle, p->ebundle, p->type);
+		if (used == 0)
+			continue;
+
+		fill_nop(p->sbundle + used, p->ebundle);
+		paravirt_flush_i_cache_range(p->sbundle,
+					     p->ebundle - p->sbundle);
+	}
+	ia64_sync_i();
+	ia64_srlz_i();
+}
+
+/*
+ * nop.i, nop.m and nop.f instructions have the same format,
+ * but nop.b has a different format.
+ * This doesn't support nop.b for now.
+ */ +static void __init_or_module +fill_nop_inst(unsigned long stag, unsigned long etag) +{ + extern const bundle_t nop_mfi_inst_bundle[]; + unsigned long tag; + const cmp_inst_t nop_inst = paravirt_read_slot0(nop_mfi_inst_bundle); + + for (tag = stag; tag < etag; tag = paravirt_get_next_tag(tag)) + paravirt_write_inst(tag, nop_inst); +} + +void __init_or_module +paravirt_alt_inst_patch_apply(struct paravirt_alt_inst_patch *start, + struct paravirt_alt_inst_patch *end, + unsigned long (*patch)(unsigned long stag, + unsigned long etag, + unsigned long type)) +{ + struct paravirt_alt_inst_patch *p; + + for (p = start; p < end; p++) { + unsigned long tag; + bundle_t *sbundle; + bundle_t *ebundle; + + tag = (*patch)(p->stag, p->etag, p->type); + if (tag == p->stag) + continue; + + fill_nop_inst(tag, p->etag); + sbundle = paravirt_get_bundle(p->stag); + ebundle = paravirt_get_bundle(p->etag) + 1; + paravirt_flush_i_cache_range(sbundle, (ebundle - sbundle) * + sizeof(bundle_t)); + } + ia64_sync_i(); + ia64_srlz_i(); +} + +/* + * Local variables: + * mode: C + * c-set-style: "linux" + * c-basic-offset: 8 + * tab-width: 8 + * indent-tabs-mode: t + * End: + */ diff --git a/arch/ia64/kernel/paravirt_core.c b/arch/ia64/kernel/paravirt_core.c new file mode 100644 index 0000000..6b7c70f --- /dev/null +++ b/arch/ia64/kernel/paravirt_core.c @@ -0,0 +1,201 @@ +/****************************************************************************** + * linux/arch/ia64/xen/paravirt_core.c + * + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include <asm/paravirt_core.h> + +/* + * flush_icache_range() can't be used here. + * we are here before cpu_init() which initializes + * ia64_i_cache_stride_shift. flush_icache_range() uses it. 
+ */ +void __init_or_module +paravirt_flush_i_cache_range(const void *instr, unsigned long size) +{ + unsigned long i; + + for (i = 0; i < size; i += sizeof(bundle_t)) + asm volatile ("fc.i %0":: "r"(instr + i): "memory"); +} + +bundle_t* __init_or_module +paravirt_get_bundle(unsigned long tag) +{ + return (bundle_t *)(tag & ~3UL); +} + +unsigned long __init_or_module +paravirt_get_slot(unsigned long tag) +{ + return tag & 3UL; +} + +#if 0 +unsigned long __init_or_module +paravirt_get_num_inst(unsigned long stag, unsigned long etag) +{ + bundle_t *sbundle = paravirt_get_bundle(stag); + unsigned long sslot = paravirt_get_slot(stag); + bundle_t *ebundle = paravirt_get_bundle(etag); + unsigned long eslot = paravirt_get_slot(etag); + + return (ebundle - sbundle) * 3 + eslot - sslot + 1; +} +#endif + +unsigned long __init_or_module +paravirt_get_next_tag(unsigned long tag) +{ + unsigned long slot = paravirt_get_slot(tag); + + switch (slot) { + case 0: + case 1: + return tag + 1; + case 2: { + bundle_t *bundle = paravirt_get_bundle(tag); + return (unsigned long)(bundle + 1); + } + default: + BUG(); + } + /* NOTREACHED */ +} + +cmp_inst_t __init_or_module +paravirt_read_slot0(const bundle_t *bundle) +{ + cmp_inst_t inst; + inst.l = bundle->quad0.slot0; + return inst; +} + +cmp_inst_t __init_or_module +paravirt_read_slot1(const bundle_t *bundle) +{ + cmp_inst_t inst; + inst.l = bundle->quad0.slot1_p0 | + ((unsigned long long)bundle->quad1.slot1_p1 << 18UL); + return inst; +} + +cmp_inst_t __init_or_module +paravirt_read_slot2(const bundle_t *bundle) +{ + cmp_inst_t inst; + inst.l = bundle->quad1.slot2; + return inst; +} + +cmp_inst_t __init_or_module +paravirt_read_inst(unsigned long tag) +{ + bundle_t *bundle = paravirt_get_bundle(tag); + unsigned long slot = paravirt_get_slot(tag); + + switch (slot) { + case 0: + return paravirt_read_slot0(bundle); + case 1: + return paravirt_read_slot1(bundle); + case 2: + return paravirt_read_slot2(bundle); + default: + BUG(); + } + /* NOTREACHED */ +} + +void __init_or_module +paravirt_write_slot0(bundle_t *bundle, cmp_inst_t inst) +{ + bundle->quad0.slot0 = inst.l; +} + +void __init_or_module +paravirt_write_slot1(bundle_t *bundle, cmp_inst_t inst) +{ + bundle->quad0.slot1_p0 = inst.l; + bundle->quad1.slot1_p1 = inst.l >> 18UL; +} + +void __init_or_module +paravirt_write_slot2(bundle_t *bundle, cmp_inst_t inst) +{ + bundle->quad1.slot2 = inst.l; +} + +void __init_or_module +paravirt_write_inst(unsigned long tag, cmp_inst_t inst) +{ + bundle_t *bundle = paravirt_get_bundle(tag); + unsigned long slot = paravirt_get_slot(tag); + + switch (slot) { + case 0: + paravirt_write_slot0(bundle, inst); + break; + case 1: + paravirt_write_slot1(bundle, inst); + break; + case 2: + paravirt_write_slot2(bundle, inst); + break; + default: + BUG(); + } + paravirt_flush_i_cache_range(bundle, sizeof(*bundle)); +} + +/* for debug */ +void +print_bundle(const bundle_t *bundle) +{ + const unsigned long *quad = (const unsigned long *)bundle; + cmp_inst_t slot0 = paravirt_read_slot0(bundle); + cmp_inst_t slot1 = paravirt_read_slot1(bundle); + cmp_inst_t slot2 = paravirt_read_slot2(bundle); + + printk(KERN_DEBUG + "bundle 0x%p 0x%016lx 0x%016lx\n", bundle, quad[0], quad[1]); + printk(KERN_DEBUG + "bundle template 0x%x\n", + bundle->quad0.template); + printk(KERN_DEBUG + "slot0 0x%lx slot1_p0 0x%lx slot1_p1 0x%lx slot2 0x%lx\n", + (unsigned long)bundle->quad0.slot0, + (unsigned long)bundle->quad0.slot1_p0, + (unsigned long)bundle->quad1.slot1_p1, + (unsigned long)bundle->quad1.slot2); 
+ printk(KERN_DEBUG + "slot0 0x%016llx slot1 0x%016llx slot2 0x%016llx\n", + slot0.l, slot1.l, slot2.l); +} + +/* + * Local variables: + * mode: C + * c-set-style: "linux" + * c-basic-offset: 8 + * tab-width: 8 + * indent-tabs-mode: t + * End: + */ diff --git a/arch/ia64/kernel/paravirt_entry.c b/arch/ia64/kernel/paravirt_entry.c new file mode 100644 index 0000000..708287a --- /dev/null +++ b/arch/ia64/kernel/paravirt_entry.c @@ -0,0 +1,99 @@ +/****************************************************************************** + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include <asm/paravirt_core.h> +#include <asm/paravirt_entry.h> + +/* br.cond.sptk.many <target25> B1 */ +typedef union inst_b1 { + cmp_inst_t inst; + struct { + unsigned long qp: 6; + unsigned long btype: 3; + unsigned long unused: 3; + unsigned long p: 1; + unsigned long imm20b: 20; + unsigned long wh: 2; + unsigned long d: 1; + unsigned long s: 1; + unsigned long opcode: 4; + }; + unsigned long l; +} inst_b1_t; + +static void __init +__paravirt_entry_apply(unsigned long tag, const void *target) +{ + bundle_t *bundle = paravirt_get_bundle(tag); + cmp_inst_t inst = paravirt_read_inst(tag); + unsigned long target25 = (unsigned long)target - (unsigned long)bundle; + inst_b1_t inst_b1; + + inst_b1.l = inst.l; + if (target25 & (1UL << 63)) + inst_b1.s = 1; + else + inst_b1.s = 0; + + inst_b1.imm20b = target25 >> 4; + inst.l = inst_b1.l; + + paravirt_write_inst(tag, inst); + paravirt_flush_i_cache_range(bundle, sizeof(*bundle)); +} + +static void __init +paravirt_entry_apply(const struct paravirt_entry_patch *entry_patch, + const struct paravirt_entry *entries, + unsigned int nr_entries) +{ + unsigned int i; + for (i = 0; i < nr_entries; i++) { + if (entry_patch->type == entries[i].type) { + __paravirt_entry_apply(entry_patch->tag, + entries[i].entry); + break; + } + } +} + +void __init +paravirt_entry_patch_apply(const struct paravirt_entry_patch *start, + const struct paravirt_entry_patch *end, + const struct paravirt_entry *entries, + unsigned int nr_entries) +{ + const struct paravirt_entry_patch *p; + for (p = start; p < end; p++) + paravirt_entry_apply(p, entries, nr_entries); + + ia64_sync_i(); + ia64_srlz_i(); +} + +/* + * Local variables: + * mode: C + * c-set-style: "linux" + * c-basic-offset: 8 + * tab-width: 8 + * indent-tabs-mode: t + * End: + */ diff --git a/arch/ia64/kernel/paravirt_nop.c b/arch/ia64/kernel/paravirt_nop.c new file mode 100644 index 0000000..ee5a204 --- /dev/null +++ b/arch/ia64/kernel/paravirt_nop.c @@ -0,0 +1,49 @@ +/****************************************************************************** + * linux/arch/ia64/xen/paravirt_nop.c + * + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems 
Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include <asm/paravirt_core.h> +#include <asm/paravirt_nop.h> + +void __init_or_module +paravirt_nop_b_patch_apply(const struct paravirt_nop_patch *start, + const struct paravirt_nop_patch *end) +{ + extern const bundle_t nop_b_inst_bundle; + const cmp_inst_t nop_b_inst = paravirt_read_slot0(&nop_b_inst_bundle); + const struct paravirt_nop_patch *p; + + for (p = start; p < end; p++) + paravirt_write_inst(p->tag, nop_b_inst); + + ia64_sync_i(); + ia64_srlz_i(); +} + +/* + * Local variables: + * mode: C + * c-set-style: "linux" + * c-basic-offset: 8 + * tab-width: 8 + * indent-tabs-mode: t + * End: + */ diff --git a/arch/ia64/kernel/vmlinux.lds.S b/arch/ia64/kernel/vmlinux.lds.S index 80622ac..0cbe0a1 100644 --- a/arch/ia64/kernel/vmlinux.lds.S +++ b/arch/ia64/kernel/vmlinux.lds.S @@ -163,6 +163,41 @@ SECTIONS __end___mckinley_e9_bundles = .; } +#if defined(CONFIG_PARAVIRT_ALT) + . = ALIGN(16); + .paravirt_bundles : AT(ADDR(.paravirt_bundles) - LOAD_OFFSET) + { + __start_paravirt_bundles = .; + *(.paravirt_bundles) + __stop_paravirt_bundles = .; + } + . = ALIGN(16); + .paravirt_insts : AT(ADDR(.paravirt_insts) - LOAD_OFFSET) + { + __start_paravirt_insts = .; + *(.paravirt_insts) + __stop_paravirt_insts = .; + } +#endif +#if defined(CONFIG_PARAVIRT_NOP_B_PATCH) + . = ALIGN(16); + .paravirt_nop_b : AT(ADDR(.paravirt_nop_b) - LOAD_OFFSET) + { + __start_paravirt_nop_b = .; + *(.paravirt_nop_b) + __stop_paravirt_nop_b = .; + } +#endif +#if defined(CONFIG_PARAVIRT_ENTRY) + . = ALIGN(16); + .paravirt_entry : AT(ADDR(.paravirt_entry) - LOAD_OFFSET) + { + __start_paravirt_entry = .; + *(.paravirt_entry) + __stop_paravirt_entry = .; + } +#endif + #if defined(CONFIG_IA64_GENERIC) /* Machine Vector */ . = ALIGN(16); diff --git a/include/asm-ia64/module.h b/include/asm-ia64/module.h index d2da61e..44f63ff 100644 --- a/include/asm-ia64/module.h +++ b/include/asm-ia64/module.h @@ -16,6 +16,12 @@ struct mod_arch_specific { struct elf64_shdr *got; /* global offset table */ struct elf64_shdr *opd; /* official procedure descriptors */ struct elf64_shdr *unwind; /* unwind-table section */ +#ifdef CONFIG_PARAVIRT_ALT + struct elf64_shdr *paravirt_bundles; + /* paravirt_alt_bundle_patch table */ + struct elf64_shdr *paravirt_insts; + /* paravirt_alt_inst_patch table */ +#endif unsigned long gp; /* global-pointer for module */ void *core_unw_table; /* core unwind-table cookie returned by unwinder */ diff --git a/include/asm-ia64/paravirt_alt.h b/include/asm-ia64/paravirt_alt.h new file mode 100644 index 0000000..34c5473 --- /dev/null +++ b/include/asm-ia64/paravirt_alt.h @@ -0,0 +1,82 @@ +/****************************************************************************** + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. 
+ * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifndef __ASM_PARAVIRT_ALT_H +#define __ASM_PARAVIRT_ALT_H + +#ifndef __ASSEMBLER__ +/* for binary patch */ +struct paravirt_alt_bundle_patch { + void *sbundle; + void *ebundle; + unsigned long type; +}; + +/* label means the beginning of new bundle */ +#define paravirt_alt_bundle(instr, privop) \ + "\t1:\n" \ + "\t" instr "\n" \ + "\t2:\n" \ + "\t.section .paravirt_bundles, \"a\"\n" \ + "\t.previous\n" \ + "\t.xdata8 \".paravirt_bundles\", 1b, 2b, " \ + __stringify(privop) "\n" + +struct paravirt_alt_inst_patch { + unsigned long stag; + unsigned long etag; + unsigned long type; +}; + +#define paravirt_alt_inst(instr, privop) \ + "\t[1:]\n" \ + "\t" instr "\n" \ + "\t[2:]\n" \ + "\t.section .paravirt_insts, \"a\"\n" \ + "\t.previous\n" \ + "\t.xdata8 \".paravirt_insts\", 1b, 2b, " \ + __stringify(privop) "\n" + +void +paravirt_alt_bundle_patch_apply(struct paravirt_alt_bundle_patch *start, + struct paravirt_alt_bundle_patch *end, + unsigned long(*patch)(void *sbundle, + void *ebundle, + unsigned long type)); + +void +paravirt_alt_inst_patch_apply(struct paravirt_alt_inst_patch *start, + struct paravirt_alt_inst_patch *end, + unsigned long (*patch)(unsigned long stag, + unsigned long etag, + unsigned long type)); +#endif /* __ASSEMBLER__ */ + +#endif /* __ASM_PARAVIRT_ALT_H */ + +/* + * Local variables: + * mode: C + * c-set-style: "linux" + * c-basic-offset: 8 + * tab-width: 8 + * indent-tabs-mode: t + * End: + */ diff --git a/include/asm-ia64/paravirt_core.h b/include/asm-ia64/paravirt_core.h new file mode 100644 index 0000000..9979740 --- /dev/null +++ b/include/asm-ia64/paravirt_core.h @@ -0,0 +1,54 @@ +/****************************************************************************** + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifndef __ASM_PARAVIRT_CORE_H +#define __ASM_PARAVIRT_CORE_H + +#include <asm/kprobes.h> + +void paravirt_flush_i_cache_range(const void *instr, unsigned long size); + +bundle_t *paravirt_get_bundle(unsigned long tag); +unsigned long paravirt_get_slot(unsigned long tag); +unsigned long paravirt_get_next_tag(unsigned long tag); + +cmp_inst_t paravirt_read_slot0(const bundle_t *bundle); +cmp_inst_t paravirt_read_slot1(const bundle_t *bundle); +cmp_inst_t paravirt_read_slot2(const bundle_t *bundle); +cmp_inst_t paravirt_read_inst(unsigned long tag); + +void paravirt_write_slot0(bundle_t *bundle, cmp_inst_t inst); +void paravirt_write_slot1(bundle_t *bundle, cmp_inst_t inst); +void paravirt_write_slot2(bundle_t *bundle, cmp_inst_t inst); +void paravirt_write_inst(unsigned long tag, cmp_inst_t inst); + +void print_bundle(const bundle_t *bundle); + +#endif /* __ASM_PARAVIRT_CORE_H */ + +/* + * Local variables: + * mode: C + * c-set-style: "linux" + * c-basic-offset: 8 + * tab-width: 8 + * indent-tabs-mode: t + * End: + */ diff --git a/include/asm-ia64/paravirt_entry.h b/include/asm-ia64/paravirt_entry.h new file mode 100644 index 0000000..857fd37 --- /dev/null +++ b/include/asm-ia64/paravirt_entry.h @@ -0,0 +1,62 @@ +/****************************************************************************** + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifndef __ASM_PARAVIRT_ENTRY_H +#define __ASM_PARAVIRT_ENTRY_H + +#ifdef __ASSEMBLY__ + +#define BR_COND_SPTK_MANY(target, type) \ + [1:] ; \ + br.cond.sptk.many target;; ; \ + .section .paravirt_entry, "a" ; \ + .previous ; \ + .xdata8 ".paravirt_entry", 1b, type + +#else /* __ASSEMBLY__ */ + +struct paravirt_entry_patch { + unsigned long tag; + unsigned long type; +}; + +struct paravirt_entry { + void *entry; + unsigned long type; +}; + +void +paravirt_entry_patch_apply(const struct paravirt_entry_patch *start, + const struct paravirt_entry_patch *end, + const struct paravirt_entry *entries, + unsigned int nr_entries); + +#endif /* __ASSEMBLY__ */ + +#endif /* __ASM_PARAVIRT_ENTRY_H */ +/* + * Local variables: + * mode: C + * c-set-style: "linux" + * c-basic-offset: 8 + * tab-width: 8 + * indent-tabs-mode: t + * End: + */ diff --git a/include/asm-ia64/paravirt_nop.h b/include/asm-ia64/paravirt_nop.h new file mode 100644 index 0000000..2b05430 --- /dev/null +++ b/include/asm-ia64/paravirt_nop.h @@ -0,0 +1,46 @@ +/****************************************************************************** + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifndef __ASM_PARAVIRT_OPS_H +#define __ASM_PARAVIRT_OPS_H + +#ifndef __ASSEMBLY__ + +struct paravirt_nop_patch { + unsigned long tag; +}; + +void +paravirt_nop_b_patch_apply(const struct paravirt_nop_patch *start, + const struct paravirt_nop_patch *end); + +#endif /* !__ASSEMBLEY__ */ + +#endif /* __ASM_PARAVIRT_OPS_H */ + +/* + * Local variables: + * mode: C + * c-set-style: "linux" + * c-basic-offset: 8 + * tab-width: 8 + * indent-tabs-mode: t + * End: + */ -- 1.5.3
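To see how a pv instance would drive this at boot: the patch callback
rewrites the marked instructions of one site and returns the tag just past
the last instruction it wrote; paravirt_alt_inst_patch_apply() then fills the
rest of the site with nops and flushes the icache. A minimal hypothetical
sketch (xen_inst_patch and xen_get_psr_inst_bundle are made-up names,
PARAVIRT_INST_GET_PSR comes from the next patch, and the section symbols are
defined by the vmlinux.lds.S change above):

	#include <asm/paravirt_core.h>
	#include <asm/paravirt_alt.h>

	static unsigned long __init
	xen_inst_patch(unsigned long stag, unsigned long etag,
		       unsigned long type)
	{
		/* template bundle whose slot 0 holds the replacement */
		extern const bundle_t xen_get_psr_inst_bundle;

		switch (type) {
		case PARAVIRT_INST_GET_PSR:
			paravirt_write_inst(
				stag,
				paravirt_read_slot0(&xen_get_psr_inst_bundle));
			return paravirt_get_next_tag(stag);
		default:
			/* returning stag leaves the site untouched */
			return stag;
		}
	}

	void __init
	xen_alt_inst_patch_all(void)
	{
		extern struct paravirt_alt_inst_patch __start_paravirt_insts[];
		extern struct paravirt_alt_inst_patch __stop_paravirt_insts[];

		paravirt_alt_inst_patch_apply(__start_paravirt_insts,
					      __stop_paravirt_insts,
					      &xen_inst_patch);
	}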
Isaku Yamahata
2008-Feb-28 09:57 UTC
[PATCH 3/5] ia64/pv_ops: define ia64 privileged instruction intrinsics for paravirtualized guest kernel.
Make the ia64 privileged-instruction intrinsics paravirtualizable via binary
patching, allowing each pv instance to override each intrinsic. Mark the
privileged instructions which need paravirtualization so that a pv instance
can binary patch them at early boot time.

Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 arch/ia64/kernel/paravirtentry.S   |   37 +++
 include/asm-ia64/privop.h          |    4 +
 include/asm-ia64/privop_paravirt.h |  587 ++++++++++++++++++++++++++++++++++++
 3 files changed, 628 insertions(+), 0 deletions(-)
 create mode 100644 arch/ia64/kernel/paravirtentry.S
 create mode 100644 include/asm-ia64/privop_paravirt.h

diff --git a/arch/ia64/kernel/paravirtentry.S b/arch/ia64/kernel/paravirtentry.S
new file mode 100644
index 0000000..013511f
--- /dev/null
+++ b/arch/ia64/kernel/paravirtentry.S
@@ -0,0 +1,37 @@
+/******************************************************************************
+ * linux/arch/ia64/xen/paravirtentry.S
+ *
+ * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp>
+ *                    VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ */
+
+#include <asm/types.h>
+#include <asm/asmmacro.h>
+#include <asm/paravirt_entry.h>
+#include <asm/privop_paravirt.h>
+
+#define BRANCH(sym, type)					\
+	GLOBAL_ENTRY(paravirt_ ## sym) ;			\
+		BR_COND_SPTK_MANY(native_ ## sym, type) ;	\
+	END(paravirt_ ## sym)
+
+	BRANCH(switch_to, PARAVIRT_ENTRY_SWITCH_TO)
+	BRANCH(leave_syscall, PARAVIRT_ENTRY_LEAVE_SYSCALL)
+	BRANCH(work_processed_syscall, PARAVIRT_ENTRY_WORK_PROCESSED_SYSCALL)
+	BRANCH(leave_kernel, PARAVIRT_ENTRY_LEAVE_KERNEL)
+	BRANCH(pal_call_static, PARAVIRT_ENTRY_PAL_CALL_STATIC)
diff --git a/include/asm-ia64/privop.h b/include/asm-ia64/privop.h
index b0b74fd..69591e0 100644
--- a/include/asm-ia64/privop.h
+++ b/include/asm-ia64/privop.h
@@ -10,6 +10,10 @@
  *
  */
 
+#ifdef CONFIG_PARAVIRT
+#include <asm/privop_paravirt.h>
+#endif
+
 #ifdef CONFIG_XEN
 #include <asm/xen/privop.h>
 #endif
diff --git a/include/asm-ia64/privop_paravirt.h b/include/asm-ia64/privop_paravirt.h
new file mode 100644
index 0000000..bd7de70
--- /dev/null
+++ b/include/asm-ia64/privop_paravirt.h
@@ -0,0 +1,587 @@
+/******************************************************************************
+ * privop_paravirt.h
+ *
+ * Copyright (c) 2008 Isaku Yamahata <yamahata at valinux co jp>
+ *                    VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifndef _ASM_IA64_PRIVOP_PARAVIRT_H +#define _ASM_IA64_PRIVOP_PARAVIRT_H + +#define PARAVIRT_INST_START 0x1 +#define PARAVIRT_INST_RFI (PARAVIRT_INST_START + 0x0) +#define PARAVIRT_INST_RSM_DT (PARAVIRT_INST_START + 0x1) +#define PARAVIRT_INST_SSM_DT (PARAVIRT_INST_START + 0x2) +#define PARAVIRT_INST_COVER (PARAVIRT_INST_START + 0x3) +#define PARAVIRT_INST_ITC_D (PARAVIRT_INST_START + 0x4) +#define PARAVIRT_INST_ITC_I (PARAVIRT_INST_START + 0x5) +#define PARAVIRT_INST_SSM_I (PARAVIRT_INST_START + 0x6) +#define PARAVIRT_INST_GET_IVR (PARAVIRT_INST_START + 0x7) +#define PARAVIRT_INST_GET_TPR (PARAVIRT_INST_START + 0x8) +#define PARAVIRT_INST_SET_TPR (PARAVIRT_INST_START + 0x9) +#define PARAVIRT_INST_EOI (PARAVIRT_INST_START + 0xa) +#define PARAVIRT_INST_SET_ITM (PARAVIRT_INST_START + 0xb) +#define PARAVIRT_INST_THASH (PARAVIRT_INST_START + 0xc) +#define PARAVIRT_INST_PTC_GA (PARAVIRT_INST_START + 0xd) +#define PARAVIRT_INST_ITR_D (PARAVIRT_INST_START + 0xe) +#define PARAVIRT_INST_GET_RR (PARAVIRT_INST_START + 0xf) +#define PARAVIRT_INST_SET_RR (PARAVIRT_INST_START + 0x10) +#define PARAVIRT_INST_SET_KR (PARAVIRT_INST_START + 0x11) +#define PARAVIRT_INST_FC (PARAVIRT_INST_START + 0x12) +#define PARAVIRT_INST_GET_CPUID (PARAVIRT_INST_START + 0x13) +#define PARAVIRT_INST_GET_PMD (PARAVIRT_INST_START + 0x14) +#define PARAVIRT_INST_GET_EFLAG (PARAVIRT_INST_START + 0x15) +#define PARAVIRT_INST_SET_EFLAG (PARAVIRT_INST_START + 0x16) +#define PARAVIRT_INST_RSM_BE (PARAVIRT_INST_START + 0x17) +#define PARAVIRT_INST_GET_PSR (PARAVIRT_INST_START + 0x18) +#define PARAVIRT_INST_SET_RR0_TO_RR4 (PARAVIRT_INST_START + 0x19) + +#define PARAVIRT_BNDL_START 0x10000000 +#define PARAVIRT_BNDL_SSM_I (PARAVIRT_BNDL_START + 0x0) +#define PARAVIRT_BNDL_RSM_I (PARAVIRT_BNDL_START + 0x1) +#define PARAVIRT_BNDL_GET_PSR_I (PARAVIRT_BNDL_START + 0x2) +#define PARAVIRT_BNDL_INTRIN_LOCAL_IRQ_RESTORE (PARAVIRT_BNDL_START + 0x3) + +/* + * struct task_struct* (*ia64_switch_to)(void* next_task); + * void *ia64_leave_syscall; + * void *ia64_work_processed_syscall + * void *ia64_leave_kernel; + * struct ia64_pal_retval (*pal_call_static)(u64, u64, u64, u64, u64); + */ + +#define PARAVIRT_ENTRY_START 0x20000000 +#define PARAVIRT_ENTRY_SWITCH_TO (PARAVIRT_ENTRY_START + 0) +#define PARAVIRT_ENTRY_LEAVE_SYSCALL (PARAVIRT_ENTRY_START + 1) +#define PARAVIRT_ENTRY_WORK_PROCESSED_SYSCALL (PARAVIRT_ENTRY_START + 2) +#define PARAVIRT_ENTRY_LEAVE_KERNEL (PARAVIRT_ENTRY_START + 3) +#define PARAVIRT_ENTRY_PAL_CALL_STATIC (PARAVIRT_ENTRY_START + 4) + + +#ifndef __ASSEMBLER__ + +#include <linux/stringify.h> +#include <linux/types.h> +#include <asm/paravirt_alt.h> +#include <asm/kregs.h> /* for IA64_PSR_I */ +#include <asm/xen/interface.h> + +/************************************************/ +/* Instructions paravirtualized for correctness */ +/************************************************/ +/* Note that "ttag" and "cover" are also privilege-sensitive; "ttag" + * is not currently used (though it may be in a long-format VHPT system!) 
+ */
+#ifdef ASM_SUPPORTED
+static inline unsigned long
+paravirt_fc(unsigned long addr)
+{
+	register __u64 ia64_intri_res asm ("r8");
+	register __u64 __addr asm ("r8") = addr;
+	asm volatile (paravirt_alt_inst("fc %1", PARAVIRT_INST_FC):
+		      "=r"(ia64_intri_res): "0"(__addr): "memory");
+	return ia64_intri_res;
+}
+#define paravirt_fc(addr)	paravirt_fc((unsigned long)addr)
+
+static inline unsigned long
+paravirt_thash(unsigned long addr)
+{
+	register __u64 ia64_intri_res asm ("r8");
+	register __u64 __addr asm ("r8") = addr;
+	asm volatile (paravirt_alt_inst("thash %0=%1", PARAVIRT_INST_THASH):
+		      "=r"(ia64_intri_res): "0"(__addr));
+	return ia64_intri_res;
+}
+
+static inline unsigned long
+paravirt_get_cpuid(int index)
+{
+	register __u64 ia64_intri_res asm ("r8");
+	register __u64 __index asm ("r8") = index;
+	asm volatile (paravirt_alt_inst("mov %0=cpuid[%r1]",
+					PARAVIRT_INST_GET_CPUID):
+		      "=r"(ia64_intri_res): "0O"(__index));
+	return ia64_intri_res;
+}
+
+static inline unsigned long
+paravirt_get_pmd(int index)
+{
+	register __u64 ia64_intri_res asm ("r8");
+	register __u64 __index asm ("r8") = index;
+	asm volatile (paravirt_alt_inst("mov %0=pmd[%1]",
+					PARAVIRT_INST_GET_PMD):
+		      "=r"(ia64_intri_res): "0"(__index));
+	return ia64_intri_res;
+}
+
+static inline unsigned long
+paravirt_get_eflag(void)
+{
+	register __u64 ia64_intri_res asm ("r8");
+	asm volatile (paravirt_alt_inst("mov %0=ar%1",
+					PARAVIRT_INST_GET_EFLAG):
+		      "=r"(ia64_intri_res):
+		      "i"(_IA64_REG_AR_EFLAG - _IA64_REG_AR_KR0): "memory");
+	return ia64_intri_res;
+}
+
+static inline void
+paravirt_set_eflag(unsigned long val)
+{
+	register __u64 __val asm ("r8") = val;
+	asm volatile (paravirt_alt_inst("mov ar%0=%1",
+					PARAVIRT_INST_SET_EFLAG)::
+		      "i"(_IA64_REG_AR_EFLAG - _IA64_REG_AR_KR0), "r"(__val):
+		      "memory");
+}
+
+/************************************************/
+/* Instructions paravirtualized for performance */
+/************************************************/
+
+static inline unsigned long
+paravirt_get_psr(void)
+{
+	register __u64 ia64_intri_res asm ("r8");
+	asm volatile (paravirt_alt_inst("mov %0=psr", PARAVIRT_INST_GET_PSR):
+		      "=r"(ia64_intri_res));
+	return ia64_intri_res;
+}
+
+static inline unsigned long
+paravirt_get_ivr(void)
+{
+	register __u64 ia64_intri_res asm ("r8");
+	asm volatile (paravirt_alt_inst("mov %0=cr%1", PARAVIRT_INST_GET_IVR):
+		      "=r"(ia64_intri_res):
+		      "i" (_IA64_REG_CR_IVR - _IA64_REG_CR_DCR));
+	return ia64_intri_res;
+}
+
+static inline unsigned long
+paravirt_get_tpr(void)
+{
+	register __u64 ia64_intri_res asm ("r8");
+	asm volatile (paravirt_alt_inst("mov %0=cr%1", PARAVIRT_INST_GET_TPR):
+		      "=r"(ia64_intri_res):
+		      "i" (_IA64_REG_CR_TPR - _IA64_REG_CR_DCR));
+	return ia64_intri_res;
+}
+
+static inline void
+paravirt_set_tpr(unsigned long val)
+{
+	register __u64 __val asm ("r8") = val;
+	asm volatile (paravirt_alt_inst("mov cr%0=%1", PARAVIRT_INST_SET_TPR)::
+		      "i" (_IA64_REG_CR_TPR - _IA64_REG_CR_DCR), "r"(__val):
+		      "memory");
+}
+
+static inline void
+paravirt_eoi(unsigned long val)
+{
+	register __u64 __val asm ("r8") = val;
+	asm volatile (paravirt_alt_inst("mov cr%0=%1", PARAVIRT_INST_EOI)::
+		      "i" (_IA64_REG_CR_EOI - _IA64_REG_CR_DCR), "r"(__val):
+		      "memory");
+}
+
+static inline void
+paravirt_set_itm(unsigned long val)
+{
+	register __u64 __val asm ("r8") = val;
+	asm volatile (paravirt_alt_inst("mov cr%0=%1", PARAVIRT_INST_SET_ITM)::
+		      "i" (_IA64_REG_CR_ITM - _IA64_REG_CR_DCR), "r"(__val):
+		      "memory");
+}
+
+static inline void
+paravirt_ptcga(unsigned long addr,
unsigned long size) +{ + register __u64 __addr asm ("r8") = addr; + register __u64 __size asm ("r9") = size; + asm volatile (paravirt_alt_inst("ptc.ga %0,%1", PARAVIRT_INST_PTC_GA):: + "r"(__addr), "r"(__size): "memory"); + ia64_dv_serialize_data(); +} + +static inline unsigned long +paravirt_get_rr(unsigned long index) +{ + register __u64 ia64_intri_res asm ("r8"); + register __u64 __index asm ("r8") = index; + asm volatile (paravirt_alt_inst("mov %0=rr[%1]", PARAVIRT_INST_GET_RR): + "=r"(ia64_intri_res) : "0" (__index)); + return ia64_intri_res; +} + +static inline void +paravirt_set_rr(unsigned long index, unsigned long val) +{ + register __u64 __index asm ("r8") = index; + register __u64 __val asm ("r9") = val; + asm volatile (paravirt_alt_inst("mov rr[%0]=%1", PARAVIRT_INST_SET_RR):: + "r"(__index), "r"(__val): "memory"); +} + +static inline void +paravirt_set_rr0_to_rr4(unsigned long val0, unsigned long val1, + unsigned long val2, unsigned long val3, + unsigned long val4) +{ + register __u64 __val0 asm ("r8") = val0; + register __u64 __val1 asm ("r9") = val1; + register __u64 __val2 asm ("r10") = val2; + register __u64 __val3 asm ("r11") = val3; + register __u64 __val4 asm ("r14") = val4; + asm volatile (paravirt_alt_inst("\t;;\n" + "\t{.mmi\n" + "\tmov rr[%0]=%1\n" + /* + * without this stop bit + * assembler complains. + */ + "\t;;\n" + "\tmov rr[%2]=%3\n" + "\tnop.i 0\n" + "\t}\n" + "\t{.mmi\n" + "\tmov rr[%4]=%5\n" + "\tmov rr[%6]=%7\n" + "\tnop.i 0\n" + "\t}\n" + "\tmov rr[%8]=%9;;\n", + PARAVIRT_INST_SET_RR0_TO_RR4):: + "r"(0x0000000000000000UL), "r"(__val0), + "r"(0x2000000000000000UL), "r"(__val1), + "r"(0x4000000000000000UL), "r"(__val2), + "r"(0x6000000000000000UL), "r"(__val3), + "r"(0x8000000000000000UL), "r"(__val4) : + "memory"); +} + +static inline void +paravirt_set_kr(unsigned long index, unsigned long val) +{ + register __u64 __index asm ("r8") = index - _IA64_REG_AR_KR0; + register __u64 __val asm ("r9") = val; + + /* + * asm volatile ("break %0":: + * "i"(PARAVIRT_INST_SET_KR), "r"(__index), "r"(__val)); + */ +#ifndef BUILD_BUG_ON +#define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)])) +#endif + BUILD_BUG_ON(!__builtin_constant_p(__index)); + switch (index) { + case _IA64_REG_AR_KR0: + asm volatile (paravirt_alt_inst("mov ar%0=%2", + PARAVIRT_INST_SET_KR):: + "i" (_IA64_REG_AR_KR0 - _IA64_REG_AR_KR0), + "r"(__index), "r"(__val): + "memory"); + break; + case _IA64_REG_AR_KR1: + asm volatile (paravirt_alt_inst("mov ar%0=%2", + PARAVIRT_INST_SET_KR):: + "i" (_IA64_REG_AR_KR1 - _IA64_REG_AR_KR0), + "r"(__index), "r"(__val): + "memory"); + break; + case _IA64_REG_AR_KR2: + asm volatile (paravirt_alt_inst("mov ar%0=%2", + PARAVIRT_INST_SET_KR):: + "i" (_IA64_REG_AR_KR2 - _IA64_REG_AR_KR0), + "r"(__index), "r"(__val): + "memory"); + break; + case _IA64_REG_AR_KR3: + asm volatile (paravirt_alt_inst("mov ar%0=%2", + PARAVIRT_INST_SET_KR):: + "i" (_IA64_REG_AR_KR3 - _IA64_REG_AR_KR0), + "r"(__index), "r"(__val): + "memory"); + break; + case _IA64_REG_AR_KR4: + asm volatile (paravirt_alt_inst("mov ar%0=%2", + PARAVIRT_INST_SET_KR):: + "i" (_IA64_REG_AR_KR4 - _IA64_REG_AR_KR0), + "r"(__index), "r"(__val): + "memory"); + break; + case _IA64_REG_AR_KR5: + asm volatile (paravirt_alt_inst("mov ar%0=%2", + PARAVIRT_INST_SET_KR):: + "i" (_IA64_REG_AR_KR5 - _IA64_REG_AR_KR0), + "r"(__index), "r"(__val): + "memory"); + break; + case _IA64_REG_AR_KR6: + asm volatile (paravirt_alt_inst("mov ar%0=%2", + PARAVIRT_INST_SET_KR):: + "i" (_IA64_REG_AR_KR6 - 
_IA64_REG_AR_KR0), + "r"(__index), "r"(__val): + "memory"); + break; + case _IA64_REG_AR_KR7: + asm volatile (paravirt_alt_inst("mov ar%0=%2", + PARAVIRT_INST_SET_KR):: + "i" (_IA64_REG_AR_KR7 - _IA64_REG_AR_KR0), + "r"(__index), "r"(__val): + "memory"); + break; + default: { + extern void compile_error_ar_kr_index_must_be_compile_time_constant(void); + compile_error_ar_kr_index_must_be_compile_time_constant(); + break; + } + } +} +#endif /* ASM_SUPPORTED */ + +static inline unsigned long +paravirt_getreg(unsigned long regnum) +{ + __u64 ia64_intri_res; + + switch (regnum) { + case _IA64_REG_PSR: + ia64_intri_res = paravirt_get_psr(); + break; + case _IA64_REG_CR_IVR: + ia64_intri_res = paravirt_get_ivr(); + break; + case _IA64_REG_CR_TPR: + ia64_intri_res = paravirt_get_tpr(); + break; + case _IA64_REG_AR_EFLAG: + ia64_intri_res = paravirt_get_eflag(); + break; + default: + ia64_intri_res = native_getreg(regnum); + break; + } + return ia64_intri_res; +} + +static inline void +paravirt_setreg(unsigned long regnum, unsigned long val) +{ + switch (regnum) { + case _IA64_REG_AR_KR0 ... _IA64_REG_AR_KR7: + paravirt_set_kr(regnum, val); + break; + case _IA64_REG_CR_ITM: + paravirt_set_itm(val); + break; + case _IA64_REG_CR_TPR: + paravirt_set_tpr(val); + break; + case _IA64_REG_CR_EOI: + paravirt_eoi(val); + break; + case _IA64_REG_AR_EFLAG: + paravirt_set_eflag(val); + break; + default: + native_setreg(regnum, val); + break; + } +} + +#ifdef ASM_SUPPORTED + +#define NOP_BUNDLE \ + "{\n\t" \ + "nop 0\n\t" \ + "nop 0\n\t" \ + "nop 0\n\t" \ + "}\n\t" + +static inline void +paravirt_ssm_i(void) +{ + /* five bundles */ + asm volatile (paravirt_alt_bundle("{\n\t" + "ssm psr.i\n\t" + "nop 0\n\t" + "nop 0\n\t" + "}\n\t" + NOP_BUNDLE + NOP_BUNDLE + NOP_BUNDLE + NOP_BUNDLE, + PARAVIRT_BNDL_SSM_I)::: + "r8", "r9", "r10", + "p6", "p7", + "memory"); +} + +static inline void +paravirt_rsm_i(void) +{ + /* two bundles */ + asm volatile (paravirt_alt_bundle("{\n\t" + "rsm psr.i\n\t" + "nop 0\n\t" + "nop 0\n\t" + "}\n\t" + NOP_BUNDLE, + PARAVIRT_BNDL_RSM_I)::: + "r8", "r9", + "memory"); +} + +static inline unsigned long +paravirt_get_psr_i(void) +{ + register unsigned long psr_i asm ("r8"); + register unsigned long mask asm ("r9"); + + /* three bundles */ + asm volatile (paravirt_alt_bundle("{\n\t" + "mov %0=psr\n\t" + "mov %1=%2\n\t" + ";;\n\t" + "and %0=%0,%1\n\t" + "}\n\t" + NOP_BUNDLE + NOP_BUNDLE, + PARAVIRT_BNDL_GET_PSR_I): + "=r"(psr_i), + "=r"(mask) + : + "i"(IA64_PSR_I) + : + /* "r8", "r9", */ + "p6"); + return psr_i; +} + +static inline void +paravirt_intrin_local_irq_restore(unsigned long flags) +{ + register unsigned long __flags asm ("r8") = flags; + + /* six bundles */ + asm volatile (paravirt_alt_bundle(";;\n\t" + "{\n\t" + "cmp.ne p6,p7=%0,r0;;\n\t" + "(p6) ssm psr.i;\n\t" + "nop 0\n\t" + "}\n\t" + "{\n\t" + "(p7) rsm psr.i;;\n\t" + "(p6) srlz.d\n\t" + "nop 0\n\t" + "}\n\t" + NOP_BUNDLE + NOP_BUNDLE + NOP_BUNDLE + NOP_BUNDLE, + PARAVIRT_BNDL_INTRIN_LOCAL_IRQ_RESTORE):: + "r"(__flags) : + /* "r8",*/ "r9", "r10", "r11", + "p6", "p7", "p8", "p9", + "memory"); + +} + +#undef NOP_BUNDLE + +#endif /* ASM_SUPPORTED */ + +static inline void +paravirt_ssm(unsigned long mask) +{ + if (mask == IA64_PSR_I) + paravirt_ssm_i(); + else + native_ssm(mask); +} + +static inline void +paravirt_rsm(unsigned long mask) +{ + if (mask == IA64_PSR_I) + paravirt_rsm_i(); + else + native_rsm(mask); +} + +#if defined(ASM_SUPPORTED) && defined(CONFIG_PARAVIRT_ALT) + +#define IA64_PARAVIRTUALIZED_PRIVOP + +#define
ia64_fc(addr) paravirt_fc(addr) +#define ia64_thash(addr) paravirt_thash(addr) +#define ia64_get_cpuid(i) paravirt_get_cpuid(i) +#define ia64_get_pmd(i) paravirt_get_pmd(i) +#define ia64_ptcga(addr, size) paravirt_ptcga((addr), (size)) +#define ia64_set_rr(index, val) paravirt_set_rr((index), (val)) +#define ia64_get_rr(index) paravirt_get_rr(index) +#define ia64_getreg(regnum) paravirt_getreg(regnum) +#define ia64_setreg(regnum, val) paravirt_setreg((regnum), (val)) +#define ia64_set_rr0_to_rr4(val0, val1, val2, val3, val4) \ + paravirt_set_rr0_to_rr4((val0), (val1), (val2), (val3), (val4)) + +#define ia64_ssm(mask) paravirt_ssm(mask) +#define ia64_rsm(mask) paravirt_rsm(mask) +#define ia64_get_psr_i() paravirt_get_psr_i() +#define ia64_intrin_local_irq_restore(x) \ + paravirt_intrin_local_irq_restore(x) + +/* the remainder of these are not performance-sensitive so it's + * OK to not paravirtualize and just take a privop trap and emulate */ +#define ia64_hint native_hint +#define ia64_set_pmd native_set_pmd +#define ia64_itci native_itci +#define ia64_itcd native_itcd +#define ia64_itri native_itri +#define ia64_itrd native_itrd +#define ia64_tpa native_tpa +#define ia64_set_ibr native_set_ibr +#define ia64_set_pkr native_set_pkr +#define ia64_set_pmc native_set_pmc +#define ia64_get_ibr native_get_ibr +#define ia64_get_pkr native_get_pkr +#define ia64_get_pmc native_get_pmc +#define ia64_ptce native_ptce +#define ia64_ptcl native_ptcl +#define ia64_ptri native_ptri +#define ia64_ptrd native_ptrd + +#endif /* ASM_SUPPORTED && CONFIG_PARAVIRT_ALT */ + +#endif /* __ASSEMBLER__*/ + +/* these routines utilize privilege-sensitive or performance-sensitive + * privileged instructions so the code must be replaced with + * paravirtualized versions */ +#ifdef CONFIG_PARAVIRT_ENTRY +#define IA64_PARAVIRTUALIZED_ENTRY +#define ia64_switch_to paravirt_switch_to +#define ia64_work_processed_syscall paravirt_work_processed_syscall +#define ia64_leave_syscall paravirt_leave_syscall +#define ia64_leave_kernel paravirt_leave_kernel +#define ia64_pal_call_static paravirt_pal_call_static +#endif /* CONFIG_PARAVIRT_ENTRY */ + +#endif /* _ASM_IA64_PRIVOP_PARAVIRT_H */ -- 1.5.3
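A note on the nop padding visible above: each paravirt_alt template reserves more bundles than the native sequence needs (the NOP_BUNDLE blocks, five bundles for ssm_i, two for rsm_i, and so on) so that a pv instance can later overwrite the site in place. A minimal sketch of the size discipline this implies is below; the function and parameter names are illustrative only, not the series' actual patching API:

#include <linux/string.h>	/* memcpy */
#include <asm/bug.h>		/* BUG_ON */

/* copy a replacement sequence over a nop-padded patch site */
static unsigned long
apply_bundle_patch(void *sbundle, void *ebundle,	 /* padded site */
		   const void *rstart, const void *rend) /* replacement */
{
	unsigned long used = (const char *)rend - (const char *)rstart;

	/* the replacement must fit inside the padded site */
	BUG_ON(used > (unsigned long)((char *)ebundle - (char *)sbundle));
	memcpy(sbundle, rstart, used);
	/* bundles beyond 'used' keep their nops and simply fall through */
	return used;
}

The per-operation padding count is then the worst case over all pv instances that want to patch that operation, which is presumably why the bundle counts in the comments above differ from one operation to the next.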
Isaku Yamahata
2008-Feb-28 09:57 UTC
[PATCH 4/5] ia64/xen: introduce xen paravirtualized intrinsic operations for privileged instruction.
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/xen/hypercall.S | 124 +++++++++++ include/asm-ia64/xen/privop.h | 489 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 613 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/xen/hypercall.S diff --git a/arch/ia64/xen/hypercall.S b/arch/ia64/xen/hypercall.S new file mode 100644 index 0000000..a96f278 --- /dev/null +++ b/arch/ia64/xen/hypercall.S @@ -0,0 +1,124 @@ +/* + * Support routines for Xen hypercalls + * + * Copyright (C) 2005 Dan Magenheimer <dan.magenheimer at hp.com> + */ + +#include <asm/asmmacro.h> +#include <asm/intrinsics.h> + +#ifdef __INTEL_COMPILER +# undef ASM_SUPPORTED +#else +# define ASM_SUPPORTED +#endif + +#ifndef ASM_SUPPORTED +GLOBAL_ENTRY(xen_get_psr) + XEN_HYPER_GET_PSR + br.ret.sptk.many rp + ;; +END(xen_get_psr) + +GLOBAL_ENTRY(xen_get_ivr) + XEN_HYPER_GET_IVR + br.ret.sptk.many rp + ;; +END(xen_get_ivr) + +GLOBAL_ENTRY(xen_get_tpr) + XEN_HYPER_GET_TPR + br.ret.sptk.many rp + ;; +END(xen_get_tpr) + +GLOBAL_ENTRY(xen_set_tpr) + mov r8=r32 + XEN_HYPER_SET_TPR + br.ret.sptk.many rp + ;; +END(xen_set_tpr) + +GLOBAL_ENTRY(xen_eoi) + mov r8=r32 + XEN_HYPER_EOI + br.ret.sptk.many rp + ;; +END(xen_eoi) + +GLOBAL_ENTRY(xen_thash) + mov r8=r32 + XEN_HYPER_THASH + br.ret.sptk.many rp + ;; +END(xen_thash) + +GLOBAL_ENTRY(xen_set_itm) + mov r8=r32 + XEN_HYPER_SET_ITM + br.ret.sptk.many rp + ;; +END(xen_set_itm) + +GLOBAL_ENTRY(xen_ptcga) + mov r8=r32 + mov r9=r33 + XEN_HYPER_PTC_GA + br.ret.sptk.many rp + ;; +END(xen_ptcga) + +GLOBAL_ENTRY(xen_get_rr) + mov r8=r32 + XEN_HYPER_GET_RR + br.ret.sptk.many rp + ;; +END(xen_get_rr) + +GLOBAL_ENTRY(xen_set_rr) + mov r8=r32 + mov r9=r33 + XEN_HYPER_SET_RR + br.ret.sptk.many rp + ;; +END(xen_set_rr) + +GLOBAL_ENTRY(xen_set_kr) + mov r8=r32 + mov r9=r33 + XEN_HYPER_SET_KR + br.ret.sptk.many rp +END(xen_set_kr) + +GLOBAL_ENTRY(xen_fc) + mov r8=r32 + XEN_HYPER_FC + br.ret.sptk.many rp +END(xen_fc) + +GLOBAL_ENTRY(xen_get_cpuid) + mov r8=r32 + XEN_HYPER_GET_CPUID + br.ret.sptk.many rp +END(xen_get_cpuid) + +GLOBAL_ENTRY(xen_get_pmd) + mov r8=r32 + XEN_HYPER_GET_PMD + br.ret.sptk.many rp +END(xen_get_pmd) + +#ifdef CONFIG_IA32_SUPPORT +GLOBAL_ENTRY(xen_get_eflag) + XEN_HYPER_GET_EFLAG + br.ret.sptk.many rp +END(xen_get_eflag) + +// some bits aren't set if pl!=0, see SDM vol1 3.1.8 +GLOBAL_ENTRY(xen_set_eflag) + mov r8=r32 + XEN_HYPER_SET_EFLAG + br.ret.sptk.many rp +END(xen_set_eflag) +#endif /* CONFIG_IA32_SUPPORT */ +#endif /* ASM_SUPPORTED */ diff --git a/include/asm-ia64/xen/privop.h b/include/asm-ia64/xen/privop.h index 0fa8aa6..95e8e8a 100644 --- a/include/asm-ia64/xen/privop.h +++ b/include/asm-ia64/xen/privop.h @@ -70,6 +70,495 @@ #define XSI_IHA (XSI_BASE + XSI_IHA_OFS) #endif +#ifndef __ASSEMBLY__ +#define XEN_HYPER_SSM_I asm("break %0" : : "i" (HYPERPRIVOP_SSM_I)) +#define XEN_HYPER_GET_IVR asm("break %0" : : "i" (HYPERPRIVOP_GET_IVR)) + +/************************************************/ +/* Instructions paravirtualized for correctness */ +/************************************************/ + +/* "fc" and "thash" are privilege-sensitive instructions, meaning they + * may have different semantics depending on whether they are executed + * at PL0 vs PL!=0. When paravirtualized, these instructions mustn't + * be allowed to execute directly, lest incorrect semantics result. 
*/ +#ifdef ASM_SUPPORTED +static inline void +xen_fc(unsigned long addr) +{ + register __u64 __addr asm ("r8") = addr; + asm volatile ("break %0":: "i"(HYPERPRIVOP_FC), "r"(__addr)); +} + +static inline unsigned long +xen_thash(unsigned long addr) +{ + register __u64 ia64_intri_res asm ("r8"); + register __u64 __addr asm ("r8") = addr; + asm volatile ("break %1": + "=r"(ia64_intri_res): + "i"(HYPERPRIVOP_THASH), "0"(__addr)); + return ia64_intri_res; +} +#else +extern void xen_fc(unsigned long addr); +extern unsigned long xen_thash(unsigned long addr); +#endif + +/* Note that "ttag" and "cover" are also privilege-sensitive; "ttag" + * is not currently used (though it may be in a long-format VHPT system!) + * and the semantics of cover only change if psr.ic is off, which is very + * rare (and currently non-existent outside of assembly code) */ + +/* There are also privilege-sensitive registers. These registers are + * readable at any privilege level but only writable at PL0. */ +#ifdef ASM_SUPPORTED +static inline unsigned long +xen_get_cpuid(int index) +{ + register __u64 ia64_intri_res asm ("r8"); + register __u64 __index asm ("r8") = index; + asm volatile ("break %1": + "=r"(ia64_intri_res): + "i"(HYPERPRIVOP_GET_CPUID), "0"(__index)); + return ia64_intri_res; +} + +static inline unsigned long +xen_get_pmd(int index) +{ + register __u64 ia64_intri_res asm ("r8"); + register __u64 __index asm ("r8") = index; + asm volatile ("break %1": + "=r"(ia64_intri_res): + "i"(HYPERPRIVOP_GET_PMD), "0O"(__index)); + return ia64_intri_res; +} +#else +extern unsigned long xen_get_cpuid(int index); +extern unsigned long xen_get_pmd(int index); +#endif + +#ifdef ASM_SUPPORTED +static inline unsigned long +xen_get_eflag(void) +{ + register __u64 ia64_intri_res asm ("r8"); + asm volatile ("break %1": + "=r"(ia64_intri_res): "i"(HYPERPRIVOP_GET_EFLAG)); + return ia64_intri_res; +} + +static inline void +xen_set_eflag(unsigned long val) +{ + register __u64 __val asm ("r8") = val; + asm volatile ("break %0":: "i"(HYPERPRIVOP_SET_EFLAG), "r"(__val)); +} +#else +extern unsigned long xen_get_eflag(void); /* see xen_ia64_getreg */ +extern void xen_set_eflag(unsigned long); /* see xen_ia64_setreg */ +#endif + +/************************************************/ +/* Instructions paravirtualized for performance */ +/************************************************/ + +/* Xen uses memory-mapped virtual privileged registers for access to many + * performance-sensitive privileged registers. Some, like the processor + * status register (psr), are broken up into multiple memory locations. + * Others, like "pend", are abstractions based on privileged registers. + * "Pend" is guaranteed to be set if reading cr.ivr would return a + * (non-spurious) interrupt. */ +#define XEN_MAPPEDREGS ((struct mapped_regs *)XMAPPEDREGS_BASE) + +#define XSI_PSR_I \ + (*XEN_MAPPEDREGS->interrupt_mask_addr) +#define xen_get_virtual_psr_i() \ + (!XSI_PSR_I) +#define xen_set_virtual_psr_i(_val) \ + ({ XSI_PSR_I = (uint8_t)(_val) ? 0 : 1; }) +#define xen_set_virtual_psr_ic(_val) \ + ({ XEN_MAPPEDREGS->interrupt_collection_enabled = _val ? 1 : 0; }) +#define xen_get_virtual_pend() \ + (*(((uint8_t *)XEN_MAPPEDREGS->interrupt_mask_addr) - 1)) + +/* Hyperprivops are "break" instructions with a well-defined API. + * In particular, the virtual psr.ic bit must be off; in this way + * it is guaranteed to never conflict with a linux break instruction.
+ * Normally, this is done in a xen stub but this one is frequent enough + * that we inline it */ +#define xen_hyper_ssm_i() \ +({ \ + XEN_HYPER_SSM_I; \ +}) + +/* turning off interrupts can be paravirtualized simply by writing + * to a memory-mapped virtual psr.i bit (implemented as an 8-bit bool) */ +#define xen_rsm_i() \ +do { \ + xen_set_virtual_psr_i(0); \ + barrier(); \ +} while (0) + +/* turning on interrupts is a bit more complicated: write to the + * memory-mapped virtual psr.i bit first (to avoid a race condition), + * then if any interrupts were pending, we have to execute a hyperprivop + * to ensure the pending interrupt gets delivered; else we're done! */ +#define xen_ssm_i() \ +do { \ + int old = xen_get_virtual_psr_i(); \ + xen_set_virtual_psr_i(1); \ + barrier(); \ + if (!old && xen_get_virtual_pend()) \ + xen_hyper_ssm_i(); \ +} while (0) + +#define xen_ia64_intrin_local_irq_restore(x) \ +do { \ + if (is_running_on_xen()) { \ + if ((x) & IA64_PSR_I) \ + xen_ssm_i(); \ + else \ + xen_rsm_i(); \ + } else { \ + native_intrin_local_irq_restore((x)); \ + } \ +} while (0) + +#define xen_get_psr_i() \ +({ \ + (is_running_on_xen()) ? \ + (xen_get_virtual_psr_i() ? IA64_PSR_I : 0) \ + : native_get_psr_i(); \ +}) + +#define xen_ia64_ssm(mask) \ +do { \ + if ((mask) == IA64_PSR_I) { \ + if (is_running_on_xen()) \ + xen_ssm_i(); \ + else \ + native_ssm(mask); \ + } else { \ + native_ssm(mask); \ + } \ +} while (0) + +#define xen_ia64_rsm(mask) \ +do { \ + if ((mask) == IA64_PSR_I) { \ + if (is_running_on_xen()) \ + xen_rsm_i(); \ + else \ + native_rsm(mask); \ + } else { \ + native_rsm(mask); \ + } \ +} while (0) + +/* Although all privileged operations can be left to trap and will + * be properly handled by Xen, some are frequent enough that we use + * hyperprivops for performance.
*/ + +#ifndef ASM_SUPPORTED +extern unsigned long xen_get_psr(void); +extern unsigned long xen_get_ivr(void); +extern unsigned long xen_get_tpr(void); +extern void xen_set_itm(unsigned long); +extern void xen_set_tpr(unsigned long); +extern void xen_eoi(unsigned long); +extern void xen_set_rr(unsigned long index, unsigned long val); +extern unsigned long xen_get_rr(unsigned long index); +extern void xen_set_kr(unsigned long index, unsigned long val); +extern void xen_ptcga(unsigned long addr, unsigned long size); +#else +static inline unsigned long +xen_get_psr(void) +{ + register __u64 ia64_intri_res asm ("r8"); + asm volatile ("break %1": + "=r"(ia64_intri_res): "i"(HYPERPRIVOP_GET_PSR)); + return ia64_intri_res; +} + +static inline unsigned long +xen_get_ivr(void) +{ + register __u64 ia64_intri_res asm ("r8"); + asm volatile ("break %1": + "=r"(ia64_intri_res): "i"(HYPERPRIVOP_GET_IVR)); + return ia64_intri_res; +} + +static inline unsigned long +xen_get_tpr(void) +{ + register __u64 ia64_intri_res asm ("r8"); + asm volatile ("break %1": + "=r"(ia64_intri_res): "i"(HYPERPRIVOP_GET_TPR)); + return ia64_intri_res; +} + +static inline void +xen_set_tpr(unsigned long val) +{ + register __u64 __val asm ("r8") = val; + asm volatile ("break %0":: + "i"(HYPERPRIVOP_SET_TPR), "r"(__val)); +} + +static inline void +xen_eoi(unsigned long val) +{ + register __u64 __val asm ("r8") = val; + asm volatile ("break %0":: + "i"(HYPERPRIVOP_EOI), "r"(__val)); +} + +static inline void +xen_set_itm(unsigned long val) +{ + register __u64 __val asm ("r8") = val; + asm volatile ("break %0":: "i"(HYPERPRIVOP_SET_ITM), "r"(__val)); +} + +static inline void +xen_ptcga(unsigned long addr, unsigned long size) +{ + register __u64 __addr asm ("r8") = addr; + register __u64 __size asm ("r9") = size; + asm volatile ("break %0":: + "i"(HYPERPRIVOP_PTC_GA), "r"(__addr), "r"(__size)); +} + +static inline unsigned long +xen_get_rr(unsigned long index) +{ + register __u64 ia64_intri_res asm ("r8"); + register __u64 __index asm ("r8") = index; + asm volatile ("break %1": + "=r"(ia64_intri_res): + "i"(HYPERPRIVOP_GET_RR), "0"(__index)); + return ia64_intri_res; +} + +static inline void +xen_set_rr(unsigned long index, unsigned long val) +{ + register __u64 __index asm ("r8") = index; + register __u64 __val asm ("r9") = val; + asm volatile ("break %0":: + "i"(HYPERPRIVOP_SET_RR), "r"(__index), "r"(__val)); +} + +static inline void +xen_set_rr0_to_rr4(unsigned long val0, unsigned long val1, + unsigned long val2, unsigned long val3, unsigned long val4) +{ + register __u64 __val0 asm ("r8") = val0; + register __u64 __val1 asm ("r9") = val1; + register __u64 __val2 asm ("r10") = val2; + register __u64 __val3 asm ("r11") = val3; + register __u64 __val4 asm ("r14") = val4; + asm volatile ("break %0" :: + "i"(HYPERPRIVOP_SET_RR0_TO_RR4), + "r"(__val0), "r"(__val1), + "r"(__val2), "r"(__val3), "r"(__val4)); +} + +static inline void +xen_set_kr(unsigned long index, unsigned long val) +{ + register __u64 __index asm ("r8") = index; + register __u64 __val asm ("r9") = val; + asm volatile ("break %0":: + "i"(HYPERPRIVOP_SET_KR), "r"(__index), "r"(__val)); +} +#endif + +/* Note: It may look wrong to test for is_running_on_xen() in each case. + * However regnum is always a constant so, as written, the compiler + * eliminates the switch statement, whereas is_running_on_xen() must be + * tested dynamically.
*/ +#define xen_ia64_getreg(regnum) \ +({ \ + __u64 ia64_intri_res; \ + \ + switch (regnum) { \ + case _IA64_REG_PSR: \ + ia64_intri_res = (is_running_on_xen()) ? \ + xen_get_psr() : \ + native_getreg(regnum); \ + break; \ + case _IA64_REG_CR_IVR: \ + ia64_intri_res = (is_running_on_xen()) ? \ + xen_get_ivr() : \ + native_getreg(regnum); \ + break; \ + case _IA64_REG_CR_TPR: \ + ia64_intri_res = (is_running_on_xen()) ? \ + xen_get_tpr() : \ + native_getreg(regnum); \ + break; \ + case _IA64_REG_AR_EFLAG: \ + ia64_intri_res = (is_running_on_xen()) ? \ + xen_get_eflag() : \ + native_getreg(regnum); \ + break; \ + default: \ + ia64_intri_res = native_getreg(regnum); \ + break; \ + } \ + ia64_intri_res; \ +}) + +#define xen_ia64_setreg(regnum, val) \ +({ \ + switch (regnum) { \ + case _IA64_REG_AR_KR0 ... _IA64_REG_AR_KR7: \ + (is_running_on_xen()) ? \ + xen_set_kr(((regnum)-_IA64_REG_AR_KR0), (val)) :\ + native_setreg((regnum), (val)); \ + break; \ + case _IA64_REG_CR_ITM: \ + (is_running_on_xen()) ? \ + xen_set_itm(val) : \ + native_setreg((regnum), (val)); \ + break; \ + case _IA64_REG_CR_TPR: \ + (is_running_on_xen()) ? \ + xen_set_tpr(val) : \ + native_setreg((regnum), (val)); \ + break; \ + case _IA64_REG_CR_EOI: \ + (is_running_on_xen()) ? \ + xen_eoi(val) : \ + native_setreg((regnum), (val)); \ + break; \ + case _IA64_REG_AR_EFLAG: \ + (is_running_on_xen()) ? \ + xen_set_eflag(val) : \ + native_setreg((regnum), (val)); \ + break; \ + default: \ + native_setreg((regnum), (val)); \ + break; \ + } \ +}) + +#if defined(ASM_SUPPORTED) && !defined(CONFIG_PARAVIRT_ALT) + +#define IA64_PARAVIRTUALIZED_PRIVOP + +#define ia64_fc(addr) \ +do { \ + if (is_running_on_xen()) \ + xen_fc((unsigned long)(addr)); \ + else \ + native_fc(addr); \ +} while (0) + +#define ia64_thash(addr) \ +({ \ + unsigned long ia64_intri_res; \ + if (is_running_on_xen()) \ + ia64_intri_res = \ + xen_thash((unsigned long)(addr)); \ + else \ + ia64_intri_res = native_thash(addr); \ + ia64_intri_res; \ +}) + +#define ia64_get_cpuid(i) \ +({ \ + unsigned long ia64_intri_res; \ + if (is_running_on_xen()) \ + ia64_intri_res = xen_get_cpuid(i); \ + else \ + ia64_intri_res = native_get_cpuid(i); \ + ia64_intri_res; \ +}) + +#define ia64_get_pmd(i) \ +({ \ + unsigned long ia64_intri_res; \ + if (is_running_on_xen()) \ + ia64_intri_res = xen_get_pmd(i); \ + else \ + ia64_intri_res = native_get_pmd(i); \ + ia64_intri_res; \ +}) + + +#define ia64_ptcga(addr, size) \ +do { \ + if (is_running_on_xen()) \ + xen_ptcga((addr), (size)); \ + else \ + native_ptcga((addr), (size)); \ +} while (0) + +#define ia64_set_rr(index, val) \ +do { \ + if (is_running_on_xen()) \ + xen_set_rr((index), (val)); \ + else \ + native_set_rr((index), (val)); \ +} while (0) + +#define ia64_get_rr(index) \ +({ \ + __u64 ia64_intri_res; \ + if (is_running_on_xen()) \ + ia64_intri_res = xen_get_rr((index)); \ + else \ + ia64_intri_res = native_get_rr((index)); \ + ia64_intri_res; \ +}) + +#define ia64_set_rr0_to_rr4(val0, val1, val2, val3, val4) \ +do { \ + if (is_running_on_xen()) \ + xen_set_rr0_to_rr4((val0), (val1), (val2), \ + (val3), (val4)); \ + else \ + native_set_rr0_to_rr4((val0), (val1), (val2), \ + (val3), (val4)); \ +} while (0) + +#define ia64_getreg xen_ia64_getreg +#define ia64_setreg xen_ia64_setreg +#define ia64_ssm xen_ia64_ssm +#define ia64_rsm xen_ia64_rsm +#define ia64_intrin_local_irq_restore xen_ia64_intrin_local_irq_restore +#define ia64_get_psr_i xen_get_psr_i + +/* the remainder of these are not performance-sensitive so it's + * OK to
not paravirtualize and just take a privop trap and emulate */ +#define ia64_hint native_hint +#define ia64_set_pmd native_set_pmd +#define ia64_itci native_itci +#define ia64_itcd native_itcd +#define ia64_itri native_itri +#define ia64_itrd native_itrd +#define ia64_tpa native_tpa +#define ia64_set_ibr native_set_ibr +#define ia64_set_pkr native_set_pkr +#define ia64_set_pmc native_set_pmc +#define ia64_get_ibr native_get_ibr +#define ia64_get_pkr native_get_pkr +#define ia64_get_pmc native_get_pmc +#define ia64_ptce native_ptce +#define ia64_ptcl native_ptcl +#define ia64_ptri native_ptri +#define ia64_ptrd native_ptrd + +#endif /* ASM_SUPPORTED && !CONFIG_PARAVIRT_ALT */ + +#endif /* !__ASSEMBLY__ */ + /* these routines utilize privilege-sensitive or performance-sensitive * privileged instructions so the code must be replaced with * paravirtualized versions */ -- 1.5.3
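Spelled out as plain C, the virtual psr.i protocol implemented by the xen_rsm_i()/xen_ssm_i() macros in this patch looks like the following sketch. The extern names stand in for the mapped-regs fields (the interrupt mask byte, with the pending byte directly below it) and are made up for illustration:

#include <linux/compiler.h>	/* barrier() */
#include <linux/types.h>

extern uint8_t *xen_psr_i_mask;		/* 1 = interrupts masked */
extern void xen_hyper_ssm_i_stub(void);	/* issues the SSM_I hyperprivop */

static inline void virt_local_irq_disable(void)
{
	*xen_psr_i_mask = 1;		/* a plain store, no hypercall */
	barrier();
}

static inline void virt_local_irq_enable(void)
{
	uint8_t *pend = xen_psr_i_mask - 1;	/* pending byte below mask */
	int was_masked = *xen_psr_i_mask;

	*xen_psr_i_mask = 0;		/* unmask first, so an interrupt
					 * arriving right here is not lost */
	barrier();
	if (was_masked && *pend)	/* something was already pending, */
		xen_hyper_ssm_i_stub();	/* so force delivery now */
}

The ordering is the point: unmask before checking the pending byte, so an interrupt that arrives in between is either delivered directly by the hypervisor or still visible in the pending byte and flushed by the hyperprivop.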
Isaku Yamahata
2008-Feb-28 09:57 UTC
[PATCH 5/5] ia64/pv_ops/xen: xen privileged instruction intrinsics with binary patch.
With binary patching, make the intrinsics paravirtualization hypervisor neutral. So far the xen intrinsics don't allow another hypervisor. At early boot time, if running on xen, binary patch the marked privileged operations that need paravirtualization. Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp> --- arch/ia64/kernel/module.c | 32 +++++ arch/ia64/xen/Makefile | 7 + arch/ia64/xen/paravirt_xen.c | 242 +++++++++++++++++++++++++++++++++++ arch/ia64/xen/privops_asm.S | 221 ++++++++++++++++++++++++++++++++ arch/ia64/xen/privops_c.c | 279 +++++++++++++++++++++++++++++++++++++++++ arch/ia64/xen/xensetup.S | 10 ++ include/asm-ia64/xen/privop.h | 24 ++++ 7 files changed, 815 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/xen/Makefile create mode 100644 arch/ia64/xen/paravirt_xen.c create mode 100644 arch/ia64/xen/privops_asm.S create mode 100644 arch/ia64/xen/privops_c.c diff --git a/arch/ia64/kernel/module.c b/arch/ia64/kernel/module.c index e58f436..2806f70 100644 --- a/arch/ia64/kernel/module.c +++ b/arch/ia64/kernel/module.c @@ -454,6 +454,14 @@ module_frob_arch_sections (Elf_Ehdr *ehdr, Elf_Shdr *sechdrs, char *secstrings, mod->arch.opd = s; else if (strcmp(".IA_64.unwind", secstrings + s->sh_name) == 0) mod->arch.unwind = s; +#ifdef CONFIG_PARAVIRT_ALT + else if (strcmp(".paravirt_bundles", + secstrings + s->sh_name) == 0) + mod->arch.paravirt_bundles = s; + else if (strcmp(".paravirt_insts", + secstrings + s->sh_name) == 0) + mod->arch.paravirt_insts = s; +#endif if (!mod->arch.core_plt || !mod->arch.init_plt || !mod->arch.got || !mod->arch.opd) { printk(KERN_ERR "%s: sections missing\n", mod->name); @@ -929,6 +937,30 @@ module_finalize (const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs, struct module *mo DEBUGP("%s: init: entry=%p\n", __FUNCTION__, mod->init); if (mod->arch.unwind) register_unwind_table(mod); +#ifdef CONFIG_PARAVIRT_ALT + if (mod->arch.paravirt_bundles) { + struct paravirt_alt_bundle_patch *start = + (struct paravirt_alt_bundle_patch *) + mod->arch.paravirt_bundles->sh_addr; + struct paravirt_alt_bundle_patch *end = + (struct paravirt_alt_bundle_patch *) + (mod->arch.paravirt_bundles->sh_addr + + mod->arch.paravirt_bundles->sh_size); + + xen_alt_bundle_patch_module(start, end); + } + if (mod->arch.paravirt_insts) { + struct paravirt_alt_inst_patch *start = + (struct paravirt_alt_inst_patch *) + mod->arch.paravirt_insts->sh_addr; + struct paravirt_alt_inst_patch *end = + (struct paravirt_alt_inst_patch *) + (mod->arch.paravirt_insts->sh_addr + + mod->arch.paravirt_insts->sh_size); + + xen_alt_inst_patch_module(start, end); + } +#endif return 0; } diff --git a/arch/ia64/xen/Makefile b/arch/ia64/xen/Makefile new file mode 100644 index 0000000..c219358 --- /dev/null +++ b/arch/ia64/xen/Makefile @@ -0,0 +1,7 @@ +# +# Makefile for Xen components +# + +obj-$(CONFIG_PARAVIRT_ALT) += paravirt_xen.o privops_asm.o privops_c.o +obj-$(CONFIG_PARAVIRT_NOP_B_PATCH) += paravirt_xen.o +obj-$(CONFIG_PARAVIRT_ENTRY) += paravirt_xen.o diff --git a/arch/ia64/xen/paravirt_xen.c b/arch/ia64/xen/paravirt_xen.c new file mode 100644 index 0000000..57b9dfd --- /dev/null +++ b/arch/ia64/xen/paravirt_xen.c @@ -0,0 +1,242 @@ +/****************************************************************************** + * linux/arch/ia64/xen/paravirt_xen.c + * + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K.
+ * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include <linux/types.h> +#include <linux/string.h> +#include <linux/init.h> +#include <asm/intrinsics.h> +#include <asm/bugs.h> +#include <asm/kprobes.h> /* for bundle_t */ +#include <asm/paravirt_core.h> + +#ifdef CONFIG_PARAVIRT_ALT +struct xen_alt_bundle_patch_elem { + const void *sbundle; + const void *ebundle; + unsigned long type; +}; + +static unsigned long __init_or_module +__xen_alt_bundle_patch(void *sbundle, void *ebundle, unsigned long type) +{ + extern const struct xen_alt_bundle_patch_elem xen_alt_bundle_array[]; + extern const unsigned long xen_alt_bundle_array_size; + + unsigned long used = 0; + unsigned long i; + + BUG_ON((((unsigned long)sbundle) % sizeof(bundle_t)) != 0); + BUG_ON((((unsigned long)ebundle) % sizeof(bundle_t)) != 0); + + for (i = 0; + i < xen_alt_bundle_array_size / sizeof(xen_alt_bundle_array[0]); + i++) { + const struct xen_alt_bundle_patch_elem *p = + &xen_alt_bundle_array[i]; + if (p->type == type) { + used = p->ebundle - p->sbundle; + BUG_ON(used > ebundle - sbundle); + memcpy(sbundle, p->sbundle, used); + break; + } + } + + return used; +} + +static void __init +xen_alt_bundle_patch(void) +{ + extern struct paravirt_alt_bundle_patch __start_paravirt_bundles[]; + extern struct paravirt_alt_bundle_patch __stop_paravirt_bundles[]; + + paravirt_alt_bundle_patch_apply(__start_paravirt_bundles, + __stop_paravirt_bundles, + &__xen_alt_bundle_patch); +} + +#ifdef CONFIG_MODULES +void +xen_alt_bundle_patch_module(struct paravirt_alt_bundle_patch *start, + struct paravirt_alt_bundle_patch *end) +{ + if (is_running_on_xen()) + paravirt_alt_bundle_patch_apply(start, end, + &__xen_alt_bundle_patch); +} +#endif /* CONFIG_MODULES */ + + +/* + * all the native instructions of hyperprivops are M-form or I-form + * mov ar.<imm>=r1 I26, M29 + * mov r1=ar.<imm> I28, M31 + * mov r1=cr.<imm> M32 + * mov cr.<imm>=r1 M33 + * mov r1=psr M36 + * mov indirect<r1>=r2 M42 + * mov r1=indirect<r2> M43 + * ptc.ga M45 + * thash r1=r2 M46 + * + * The break.{m, i} instruction formats are the same. + * So we can safely replace any single instruction which is a target of + * hyperprivops with a break.{m, i} imm21 hyperprivop.
+ */ + +struct xen_alt_inst_patch_elem { + unsigned long stag; + unsigned long etag; + unsigned long type; +}; + +unsigned long +__xen_alt_inst_patch(unsigned long stag, unsigned long etag, + unsigned long type) +{ + extern const struct xen_alt_inst_patch_elem xen_alt_inst_array[]; + extern const unsigned long xen_alt_inst_array_size; + + unsigned long dest_tag = stag; + unsigned long i; + + for (i = 0; + i < xen_alt_inst_array_size / sizeof(xen_alt_inst_array[0]); + i++) { + const struct xen_alt_inst_patch_elem *p = + &xen_alt_inst_array[i]; + if (p->type == type) { + unsigned long src_tag; + + for (src_tag = p->stag; + src_tag < p->etag; + src_tag = paravirt_get_next_tag(src_tag)) { + const cmp_inst_t inst = + paravirt_read_inst(src_tag); + paravirt_write_inst(dest_tag, inst); + + BUG_ON(dest_tag >= etag); + dest_tag = paravirt_get_next_tag(dest_tag); + } + break; + } + } + + return dest_tag; +} + +void +xen_alt_inst_patch(void) +{ + extern struct paravirt_alt_inst_patch __start_paravirt_insts[]; + extern struct paravirt_alt_inst_patch __stop_paravirt_insts[]; + + paravirt_alt_inst_patch_apply(__start_paravirt_insts, + __stop_paravirt_insts, + &__xen_alt_inst_patch); } + +#ifdef CONFIG_MODULES +void +xen_alt_inst_patch_module(struct paravirt_alt_inst_patch *start, + struct paravirt_alt_inst_patch *end) +{ + if (is_running_on_xen()) + paravirt_alt_inst_patch_apply(start, end, + &__xen_alt_inst_patch); +} +#endif + +#else +#define xen_alt_bundle_patch() do { } while (0) +#define xen_alt_inst_patch() do { } while (0) +#endif /* CONFIG_PARAVIRT_ALT */ + + +#ifdef CONFIG_PARAVIRT_NOP_B_PATCH +#include <asm/paravirt_nop.h> +static void __init +xen_nop_b_patch(void) +{ + extern const struct paravirt_nop_patch __start_paravirt_nop_b[]; + extern const struct paravirt_nop_patch __stop_paravirt_nop_b[]; + + paravirt_nop_b_patch_apply(__start_paravirt_nop_b, + __stop_paravirt_nop_b); } +#else +#define xen_nop_b_patch() do { } while (0) +#endif + + +#ifdef CONFIG_PARAVIRT_ENTRY + +#include <asm/paravirt_entry.h> + +extern void *xen_switch_to; +extern void *xen_leave_syscall; +extern void *xen_leave_kernel; +extern void *xen_pal_call_static; +extern void *xen_work_processed_syscall; + +static const struct paravirt_entry xen_entries[] __initdata = { + {&xen_switch_to, PARAVIRT_ENTRY_SWITCH_TO}, + {&xen_leave_syscall, PARAVIRT_ENTRY_LEAVE_SYSCALL}, + {&xen_leave_kernel, PARAVIRT_ENTRY_LEAVE_KERNEL}, + {&xen_pal_call_static, PARAVIRT_ENTRY_PAL_CALL_STATIC}, + {&xen_work_processed_syscall, PARAVIRT_ENTRY_WORK_PROCESSED_SYSCALL}, +}; + +void __init +xen_entry_patch(void) +{ + extern const struct paravirt_entry_patch __start_paravirt_entry[]; + extern const struct paravirt_entry_patch __stop_paravirt_entry[]; + + paravirt_entry_patch_apply(__start_paravirt_entry, + __stop_paravirt_entry, + xen_entries, + sizeof(xen_entries)/sizeof(xen_entries[0])); } +#else +#define xen_entry_patch() do { } while (0) +#endif + + +void __init +xen_paravirt_patch(void) +{ + xen_alt_bundle_patch(); + xen_alt_inst_patch(); + xen_nop_b_patch(); + xen_entry_patch(); } + +/* + * Local variables: + * mode: C + * c-set-style: "linux" + * c-basic-offset: 8 + * tab-width: 8 + * indent-tabs-mode: t + * End: + */ diff --git a/arch/ia64/xen/privops_asm.S b/arch/ia64/xen/privops_asm.S new file mode 100644 index 0000000..40e400e --- /dev/null +++ b/arch/ia64/xen/privops_asm.S @@ -0,0 +1,221 @@ +/****************************************************************************** + * linux/arch/ia64/xen/privops_asm.S + * + * Copyright (c) 2007 Isaku
Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include <asm/intrinsics.h> +#include <linux/init.h> +#include <asm/paravirt_alt.h> + +#ifdef CONFIG_MODULES +#define __INIT_OR_MODULE .text +#define __INITDATA_OR_MODULE .data +#else +#define __INIT_OR_MODULE __INIT +#define __INITDATA_OR_MODULE __INITDATA +#endif /* CONFIG_MODULES */ + + __INIT_OR_MODULE + .align 32 + .proc nop_b_inst_bundle + .global nop_b_inst_bundle +nop_b_inst_bundle: + { + nop.b 0 + nop.b 0 + nop.b 0 + } + .endp nop_b_inst_bundle + __FINIT + + /* NOTE: nop.[mfi] has same format */ + __INIT_OR_MODULE + .align 32 + .proc nop_mfi_inst_bundle + .global nop_mfi_inst_bundle +nop_mfi_inst_bundle: + { + nop.m 0 + nop.f 0 + nop.i 0 + } + .endp nop_mfi_inst_bundle + __FINIT + + __INIT_OR_MODULE + .align 32 + .proc nop_bundle + .global nop_bundle +nop_bundle: +nop_bundle_start: + { + nop 0 + nop 0 + nop 0 + } +nop_bundle_end: + .endp nop_bundle + __FINIT + + __INITDATA_OR_MODULE + .align 8 + .global nop_bundle_size +nop_bundle_size: + data8 nop_bundle_end - nop_bundle_start + +#define DEFINE_PRIVOP(name, instr) \ + .align 32; \ + .proc xen_ ## name ## _instr; \ + xen_ ## name ## _instr:; \ + xen_ ## name ## _instr_start:; \ + {; \ + [xen_ ## name ## _stag:] \ + instr; \ + [xen_ ## name ## _etag:] \ + nop 0; \ + nop 0; \ + }; \ + xen_ ## name ## _instr_end:; \ + .endp xen_ ## name ## _instr; + + __INIT_OR_MODULE + DEFINE_PRIVOP(rfi, XEN_HYPER_RFI) + DEFINE_PRIVOP(rsm_psr_dt, XEN_HYPER_RSM_PSR_DT) + DEFINE_PRIVOP(ssm_psr_dt, XEN_HYPER_SSM_PSR_DT) + DEFINE_PRIVOP(cover, XEN_HYPER_COVER) + DEFINE_PRIVOP(itc_d, XEN_HYPER_ITC_D) + DEFINE_PRIVOP(itc_i, XEN_HYPER_ITC_I) + DEFINE_PRIVOP(ssm_i, XEN_HYPER_SSM_I) + DEFINE_PRIVOP(get_ivr, XEN_HYPER_GET_IVR) + DEFINE_PRIVOP(get_tpr, XEN_HYPER_GET_TPR) + DEFINE_PRIVOP(set_tpr, XEN_HYPER_SET_TPR) + DEFINE_PRIVOP(eoi, XEN_HYPER_EOI) + DEFINE_PRIVOP(set_itm, XEN_HYPER_SET_ITM) + DEFINE_PRIVOP(thash, XEN_HYPER_THASH) + DEFINE_PRIVOP(ptc_ga, XEN_HYPER_PTC_GA) + DEFINE_PRIVOP(itr_d, XEN_HYPER_ITR_D) + DEFINE_PRIVOP(get_rr, XEN_HYPER_GET_RR) + DEFINE_PRIVOP(set_rr, XEN_HYPER_SET_RR) + DEFINE_PRIVOP(set_kr, XEN_HYPER_SET_KR) + DEFINE_PRIVOP(fc, XEN_HYPER_FC) + DEFINE_PRIVOP(get_cpuid, XEN_HYPER_GET_CPUID) + DEFINE_PRIVOP(get_pmd, XEN_HYPER_GET_PMD) + DEFINE_PRIVOP(get_eflag, XEN_HYPER_GET_EFLAG) + DEFINE_PRIVOP(set_eflag, XEN_HYPER_SET_EFLAG) + DEFINE_PRIVOP(get_psr, XEN_HYPER_GET_PSR) + DEFINE_PRIVOP(set_rr0_to_rr4, XEN_HYPER_SET_RR0_TO_RR4) + __FINIT + + +#define PARAVIRT_ALT_BUNDLE_ELEM(name, type) \ + data8 xen_ ## name ## _instr_start; \ + data8 xen_ ## name ## _instr_end; \ + data8 type; + + __INITDATA_OR_MODULE + .align 8 + .global xen_alt_bundle_array +xen_alt_bundle_array: +xen_alt_bundle_array_start: + PARAVIRT_ALT_BUNDLE_ELEM(rfi, 
PARAVIRT_INST_RFI) + PARAVIRT_ALT_BUNDLE_ELEM(rsm_psr_dt, PARAVIRT_INST_RSM_DT) + PARAVIRT_ALT_BUNDLE_ELEM(ssm_psr_dt, PARAVIRT_INST_SSM_DT) + PARAVIRT_ALT_BUNDLE_ELEM(cover, PARAVIRT_INST_COVER) + PARAVIRT_ALT_BUNDLE_ELEM(itc_d, PARAVIRT_INST_ITC_D) + PARAVIRT_ALT_BUNDLE_ELEM(itc_i, PARAVIRT_INST_ITC_I) + PARAVIRT_ALT_BUNDLE_ELEM(ssm_i, PARAVIRT_INST_SSM_I) + PARAVIRT_ALT_BUNDLE_ELEM(get_ivr, PARAVIRT_INST_GET_IVR) + PARAVIRT_ALT_BUNDLE_ELEM(get_tpr, PARAVIRT_INST_GET_TPR) + PARAVIRT_ALT_BUNDLE_ELEM(set_tpr, PARAVIRT_INST_SET_TPR) + PARAVIRT_ALT_BUNDLE_ELEM(eoi, PARAVIRT_INST_EOI) + PARAVIRT_ALT_BUNDLE_ELEM(set_itm, PARAVIRT_INST_SET_ITM) + PARAVIRT_ALT_BUNDLE_ELEM(thash, PARAVIRT_INST_THASH) + PARAVIRT_ALT_BUNDLE_ELEM(ptc_ga, PARAVIRT_INST_PTC_GA) + PARAVIRT_ALT_BUNDLE_ELEM(itr_d, PARAVIRT_INST_ITR_D) + PARAVIRT_ALT_BUNDLE_ELEM(get_rr, PARAVIRT_INST_GET_RR) + PARAVIRT_ALT_BUNDLE_ELEM(set_rr, PARAVIRT_INST_SET_RR) + PARAVIRT_ALT_BUNDLE_ELEM(set_kr, PARAVIRT_INST_SET_KR) + PARAVIRT_ALT_BUNDLE_ELEM(fc, PARAVIRT_INST_FC) + PARAVIRT_ALT_BUNDLE_ELEM(get_cpuid, PARAVIRT_INST_GET_CPUID) + PARAVIRT_ALT_BUNDLE_ELEM(get_pmd, PARAVIRT_INST_GET_PMD) + PARAVIRT_ALT_BUNDLE_ELEM(get_eflag, PARAVIRT_INST_GET_EFLAG) + PARAVIRT_ALT_BUNDLE_ELEM(set_eflag, PARAVIRT_INST_SET_EFLAG) + PARAVIRT_ALT_BUNDLE_ELEM(get_psr, PARAVIRT_INST_GET_PSR) + + PARAVIRT_ALT_BUNDLE_ELEM(ssm_i, PARAVIRT_BNDL_SSM_I) + PARAVIRT_ALT_BUNDLE_ELEM(rsm_i, PARAVIRT_BNDL_RSM_I) + PARAVIRT_ALT_BUNDLE_ELEM(get_psr_i, PARAVIRT_BNDL_GET_PSR_I) + PARAVIRT_ALT_BUNDLE_ELEM(intrin_local_irq_restore, + PARAVIRT_BNDL_INTRIN_LOCAL_IRQ_RESTORE) +xen_alt_bundle_array_end: + + .align 8 + .global xen_alt_bundle_array_size +xen_alt_bundle_array_size: + .long xen_alt_bundle_array_end - xen_alt_bundle_array_start + + +#define PARAVIRT_ALT_INST_ELEM(name, type) \ + data8 xen_ ## name ## _stag ; \ + data8 xen_ ## name ## _etag ; \ + data8 type + + __INITDATA_OR_MODULE + .align 8 + .global xen_alt_inst_array +xen_alt_inst_array: +xen_alt_inst_array_start: + PARAVIRT_ALT_INST_ELEM(rfi, PARAVIRT_INST_RFI) + PARAVIRT_ALT_INST_ELEM(rsm_psr_dt, PARAVIRT_INST_RSM_DT) + PARAVIRT_ALT_INST_ELEM(ssm_psr_dt, PARAVIRT_INST_SSM_DT) + PARAVIRT_ALT_INST_ELEM(cover, PARAVIRT_INST_COVER) + PARAVIRT_ALT_INST_ELEM(itc_d, PARAVIRT_INST_ITC_D) + PARAVIRT_ALT_INST_ELEM(itc_i, PARAVIRT_INST_ITC_I) + PARAVIRT_ALT_INST_ELEM(ssm_i, PARAVIRT_INST_SSM_I) + PARAVIRT_ALT_INST_ELEM(get_ivr, PARAVIRT_INST_GET_IVR) + PARAVIRT_ALT_INST_ELEM(get_tpr, PARAVIRT_INST_GET_TPR) + PARAVIRT_ALT_INST_ELEM(set_tpr, PARAVIRT_INST_SET_TPR) + PARAVIRT_ALT_INST_ELEM(eoi, PARAVIRT_INST_EOI) + PARAVIRT_ALT_INST_ELEM(set_itm, PARAVIRT_INST_SET_ITM) + PARAVIRT_ALT_INST_ELEM(thash, PARAVIRT_INST_THASH) + PARAVIRT_ALT_INST_ELEM(ptc_ga, PARAVIRT_INST_PTC_GA) + PARAVIRT_ALT_INST_ELEM(itr_d, PARAVIRT_INST_ITR_D) + PARAVIRT_ALT_INST_ELEM(get_rr, PARAVIRT_INST_GET_RR) + PARAVIRT_ALT_INST_ELEM(set_rr, PARAVIRT_INST_SET_RR) + PARAVIRT_ALT_INST_ELEM(set_kr, PARAVIRT_INST_SET_KR) + PARAVIRT_ALT_INST_ELEM(fc, PARAVIRT_INST_FC) + PARAVIRT_ALT_INST_ELEM(get_cpuid, PARAVIRT_INST_GET_CPUID) + PARAVIRT_ALT_INST_ELEM(get_pmd, PARAVIRT_INST_GET_PMD) + PARAVIRT_ALT_INST_ELEM(get_eflag, PARAVIRT_INST_GET_EFLAG) + PARAVIRT_ALT_INST_ELEM(set_eflag, PARAVIRT_INST_SET_EFLAG) + PARAVIRT_ALT_INST_ELEM(get_psr, PARAVIRT_INST_GET_PSR) + PARAVIRT_ALT_INST_ELEM(set_rr0_to_rr4, PARAVIRT_INST_SET_RR0_TO_RR4) + + PARAVIRT_ALT_INST_ELEM(ssm_i, PARAVIRT_BNDL_SSM_I) + PARAVIRT_ALT_INST_ELEM(rsm_i, PARAVIRT_BNDL_RSM_I) + 
PARAVIRT_ALT_INST_ELEM(get_psr_i, PARAVIRT_BNDL_GET_PSR_I) + PARAVIRT_ALT_INST_ELEM(intrin_local_irq_restore, + PARAVIRT_BNDL_INTRIN_LOCAL_IRQ_RESTORE) +xen_alt_inst_array_end: + + .align 8 + .global xen_alt_inst_array_size +xen_alt_inst_array_size: + .long xen_alt_inst_array_end - xen_alt_inst_array_start diff --git a/arch/ia64/xen/privops_c.c b/arch/ia64/xen/privops_c.c new file mode 100644 index 0000000..0fa2e23 --- /dev/null +++ b/arch/ia64/xen/privops_c.c @@ -0,0 +1,279 @@ +/****************************************************************************** + * arch/ia64/xen/privops_c.c + * + * Copyright (c) 2008 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include <linux/linkage.h> +#include <linux/init.h> +#include <linux/module.h> + +#include <xen/interface/xen.h> + +#include <asm/asm-offsets.h> +#define XEN_PSR_I_ADDR_ADDR ((uint8_t **)(XSI_BASE + XSI_PSR_I_ADDR_OFS)) + + +void __init_or_module +xen_privop_ssm_i(void) +{ + /* + * int masked = !xen_get_virtual_psr_i(); + * // masked = *(*XEN_MAPPEDREGS->interrupt_mask_addr) + * xen_set_virtual_psr_i(1) + * // *(*XEN_MAPPEDREGS->interrupt_mask_addr) = 0 + * // compiler barrier + * if (masked) { + * uint8_t* pend_int_addr + * (uint8_t*)(*XEN_MAPPEDREGS->interrupt_mask_addr) - 1; + * uint8_t pending = *pend_int_addr; + * if (pending) + * XEN_HYPER_SSM_I + * } + */ + register uint8_t *tmp asm ("r8"); + register int masked asm ("r9"); + register uint8_t *pending_intr_addr asm ("r10"); + + asm volatile(".global xen_ssm_i_instr\n\t" + "xen_ssm_i_instr:\n\t" + ".global xen_ssm_i_instr_start\n\t" + "xen_ssm_i_instr_start:\n\t" + ".global xen_ssm_i_stag\n\t" + "[xen_ssm_i_stag:]\n\t" + /* tmp = &XEN_MAPPEDREGS->interrupt_mask_addr */ + "mov %[tmp]=%[XEN_PSR_I_ADDR_ADDR_IMM]\n\t" + ";;\n\t" + /* tmp = *XEN_MAPPEDREGS->interrupt_mask_addr */ + "ld8 %[tmp]=[%[tmp]]\n\t" + ";;\n\t" + /* pending_intr_addr = tmp - 1 */ + "add %[pending_intr_addr]=-1,%[tmp]\n\t" + /* masked = *tmp */ + "ld1 %[masked]=[%[tmp]]\n\t" + ";;\n\t" + /* *tmp = 0 */ + "st1 [%[tmp]]=r0\n\t" + /* p6 = !masked */ + "cmp.ne.unc p6,p0=%[masked],r0\n\t" + ";;\n\t" + /* tmp = *pending_intr_addr */ + "(p6) ld1 %[tmp]=[%[pending_intr_addr]]\n\t" + ";;\n\t" + /* p7 = p6 && !tmp */ + "(p6) cmp.ne.unc p7,p0=%[tmp],r0\n\t" + ";;\n\t" + "(p7) break %[HYPERPRIVOP_SSM_I_IMM]\n\t" + ".global xen_ssm_i_etag\n\t" + "[xen_ssm_i_etag:]\n\t" + ".global xen_ssm_i_instr_end\n\t" + "xen_ssm_i_instr_end:\n\t" + : + [tmp] "=r"(tmp), + [pending_intr_addr] "=r"(pending_intr_addr), + [masked] "=r"(masked), + + "=m"(**((uint8_t **)XEN_PSR_I_ADDR_ADDR)) + : + [XEN_PSR_I_ADDR_ADDR_IMM] "i"(XEN_PSR_I_ADDR_ADDR), + [HYPERPRIVOP_SSM_I_IMM] "i"(HYPERPRIVOP_SSM_I), + + "m"(*((uint8_t *)XEN_PSR_I_ADDR_ADDR)), + "m"(**((uint8_t **)XEN_PSR_I_ADDR_ADDR)), + 
"m"(*(*((uint8_t **)XEN_PSR_I_ADDR_ADDR) - 1)) + : + "memory", + /* + * predicate registers can't be specified as C variables + * so that we use p6, p7, p8 here. + */ + "p6", /* is_old */ + "p7" /* is_pending */ + ); +} + +void __init_or_module +xen_privop_rsm_i(void) +{ + /* + * psr_i_addr_addr = XEN_MAPPEDREGS->interrupt_mask_addr + * = XEN_PSR_I_ADDR_ADDR; + * psr_i_addr = *psr_i_addr_addr; + * *psr_i_addr = 1; + */ + register unsigned long psr_i_addr asm("r8"); + register uint8_t mask asm ("r9"); + asm volatile (".global xen_rsm_i_instr\n\t" + "xen_rsm_i_instr:\n\t" + ".global xen_rsm_i_instr_start\n\t" + "xen_rsm_i_instr_start:\n\t" + ".global xen_rsm_i_stag\n\t" + "[xen_rsm_i_stag:]\n\t" + "mov %[psr_i_addr]=%[XEN_PSR_I_ADDR_ADDR_IMM]\n\t" + "mov %[mask]=%[ONE_IMM]\n\t" + ";;\n\t" + "ld8 %[psr_i_addr]=[%[psr_i_addr]]\n\t" + ";;\n\t" + "st1 [%[psr_i_addr]]=%[mask]\n\t" + ".global xen_rsm_i_etag\n\t" + "[xen_rsm_i_etag:]\n\t" + ".global xen_rsm_i_instr_end\n\t" + "xen_rsm_i_instr_end:\n\t" + : + [psr_i_addr] "=r"(psr_i_addr), + [mask] "=r"(mask), + "=m"(**((uint8_t **)XEN_PSR_I_ADDR_ADDR)): + [XEN_PSR_I_ADDR_ADDR_IMM] "i"(XEN_PSR_I_ADDR_ADDR), + [ONE_IMM] "i"(1), + "m"(*((uint8_t **)XEN_PSR_I_ADDR_ADDR)): + "memory"); +} + +void __init_or_module +xen_privop_ia64_intrin_local_irq_restore(unsigned long val) +{ + /* + * psr_i_addr_addr = XEN_PSR_I_ADDR_ADDR + * psr_i_addr = *psr_i_addr_addr + * pending_intr_addr = psr_i_addr - 1 + * if (val & IA64_PSR_I) { + * masked = *psr_i_addr + * *psr_i_addr = 0 + * compiler barrier + * if (masked) { + * uint8_t pending = *pending_intr_addr; + * if (pending) + * XEN_HYPER_SSM_I + * } + * } else { + * *psr_i_addr = 1 + * } + */ + + register unsigned long __val asm("r8") = val; + register uint8_t *psr_i_addr asm ("r9"); + register uint8_t *pending_intr_addr asm ("r10"); + register uint8_t masked asm ("r11"); + register unsigned long one_or_pending asm ("r8"); + + asm volatile ( + ".global xen_intrin_local_irq_restore_instr\n\t" + "xen_intrin_local_irq_restore_instr:\n\t" + ".global xen_intrin_local_irq_restore_instr_start\n\t" + "xen_intrin_local_irq_restore_instr_start:\n\t" + ".global xen_intrin_local_irq_restore_stag\n\t" + "[xen_intrin_local_irq_restore_stag:]\n\t" + "tbit.nz p6,p7=%[val],%[IA64_PSR_I_BIT_IMM]\n\t" + "mov %[psr_i_addr]=%[XEN_PSR_I_ADDR_ADDR_IMM]\n\t" + ";;\n\t" + "ld8 %[psr_i_addr]=[%[psr_i_addr]]\n\t" + "(p7)mov %[one_or_pending]=%[ONE_IMM]\n\t" + ";;\n\t" + "add %[pending_intr_addr]=-1,%[psr_i_addr]\n\t" + ";;\n\t" + "(p6) ld1 %[masked]=[%[psr_i_addr]]\n\t" + "(p7) st1 [%[psr_i_addr]]=%[one_or_pending]\n\t" + ";;\n\t" + "(p6) st1 [%[psr_i_addr]]=r0\n\t" + "(p6) cmp.ne.unc p8,p0=%[masked],r0\n\t" + "(p6) ld1 %[one_or_pending]=[%[pending_intr_addr]]\n\t" + ";;\n\t" + "(p8) cmp.eq.unc p9,p0=%[one_or_pending],r0\n\t" + ";;\n\t" + "(p9) break %[HYPERPRIVOP_SSM_I_IMM]\n\t" + ".global xen_intrin_local_irq_restore_etag\n\t" + "[xen_intrin_local_irq_restore_etag:]\n\t" + ".global xen_intrin_local_irq_restore_instr_end\n\t" + "xen_intrin_local_irq_restore_instr_end:\n\t" + : + [psr_i_addr] "=r"(psr_i_addr), + [pending_intr_addr] "=r"(pending_intr_addr), + [masked] "=r"(masked), + [one_or_pending] "=r"(one_or_pending), + + "=m"(**((uint8_t **)XEN_PSR_I_ADDR_ADDR)) + : + [val] "r"(__val), + [IA64_PSR_I_BIT_IMM] "i"(IA64_PSR_I_BIT), + [ONE_IMM] "i"(1), + + [XEN_PSR_I_ADDR_ADDR_IMM] "i"(XEN_PSR_I_ADDR_ADDR), + [HYPERPRIVOP_SSM_I_IMM] "i"(HYPERPRIVOP_SSM_I), + + "m"(*((uint8_t *)XEN_PSR_I_ADDR_ADDR)), + "m"(**((uint8_t **)XEN_PSR_I_ADDR_ADDR)), 
+ "m"(*(*((uint8_t **)XEN_PSR_I_ADDR_ADDR) - 1)) + : + "memory", + "p6", /* is_psr_i_set */ + "p7", /* not_psr_i_set */ + "p8", /* is_masked && is_psr_i_set */ + "p9" /* is_pending && is_masked && is_psr_i_set */ + ); +} + +unsigned long __init_or_module +xen_privop_get_psr_i(void) +{ + /* + * tmp = XEN_MAPPEDREGS->interrupt_mask_addr = XEN_PSR_I_ADDR_ADDR; + * tmp = *tmp + * tmp = *tmp; + * psr_i = tmp? 0: IA64_PSR_I; + */ + register unsigned long psr_i asm ("r8"); + register unsigned long tmp asm ("r9"); + + asm volatile (".global xen_get_psr_i_instr\n\t" + "xen_get_psr_i_instr:\n\t" + ".global xen_get_psr_i_instr_start\n\t" + "xen_get_psr_i_instr_start:\n\t" + ".global xen_get_psr_i_stag\n\t" + "[xen_get_psr_i_stag:]\n\t" + /* tmp = XEN_PSR_I_ADDR_ADDR */ + "mov %[tmp]=%[XEN_PSR_I_ADDR_ADDR_IMM]\n\t" + ";;\n\t" + /* tmp = *tmp = *XEN_PSR_I_ADDR_ADDR */ + "ld8 %[tmp]=[%[tmp]]\n\t" + /* psr_i = 0 */ + "mov %[psr_i]=0\n\t" + ";;\n\t" + /* tmp = *(uint8_t*)tmp */ + "ld1 %[tmp]=[%[tmp]]\n\t" + ";;\n\t" + /* if (!tmp) psr_i = IA64_PSR_I */ + "cmp.eq.unc p6,p0=%[tmp],r0\n\t" + ";;\n\t" + "(p6) mov %[psr_i]=%[IA64_PSR_I_IMM]\n\t" + ".global xen_get_psr_i_etag\n\t" + "[xen_get_psr_i_etag:]\n\t" + ".global xen_get_psr_i_instr_end\n\t" + "xen_get_psr_i_instr_end:\n\t" + : + [tmp] "=r"(tmp), + [psr_i] "=r"(psr_i) + : + [XEN_PSR_I_ADDR_ADDR_IMM] "i"(XEN_PSR_I_ADDR_ADDR), + [IA64_PSR_I_IMM] "i"(IA64_PSR_I), + "m"(*((uint8_t **)XEN_PSR_I_ADDR_ADDR)), + "m"(**((uint8_t **)XEN_PSR_I_ADDR_ADDR)) + : + "p6"); + return psr_i; +} diff --git a/arch/ia64/xen/xensetup.S b/arch/ia64/xen/xensetup.S index 17ad297..2d3d5d4 100644 --- a/arch/ia64/xen/xensetup.S +++ b/arch/ia64/xen/xensetup.S @@ -35,6 +35,16 @@ GLOBAL_ENTRY(early_xen_setup) (isBP) movl r28=XSI_BASE;; (isBP) break 0x1000;; +#ifdef CONFIG_PARAVIRT + /* patch privops */ +(isBP) mov r4=rp + ;; +(isBP) br.call.sptk.many rp=xen_paravirt_patch + ;; +(isBP) mov rp=r4 + ;; +#endif + br.ret.sptk.many rp ;; END(early_xen_setup) diff --git a/include/asm-ia64/xen/privop.h b/include/asm-ia64/xen/privop.h index 95e8e8a..d59cc31 100644 --- a/include/asm-ia64/xen/privop.h +++ b/include/asm-ia64/xen/privop.h @@ -557,6 +557,18 @@ do { \ #endif /* ASM_SUPPORTED && !CONFIG_PARAVIRT_ALT */ +#ifdef CONFIG_PARAVIRT_ALT +#if defined(CONFIG_MODULES) && defined(CONFIG_XEN) +void xen_alt_bundle_patch_module(struct paravirt_alt_bundle_patch *start, + struct paravirt_alt_bundle_patch *end); +void xen_alt_inst_patch_module(struct paravirt_alt_inst_patch *start, + struct paravirt_alt_inst_patch *end); +#else +#define xen_alt_bundle_patch_module(start, end) do { } while (0) +#define xen_alt_inst_patch_module(start, end) do { } while (0) +#endif +#endif /* CONFIG_PARAVIRT_ALT */ + #endif /* !__ASSEMBLY__ */ /* these routines utilize privilege-sensitive or performance-sensitive @@ -573,12 +585,24 @@ do { \ #ifdef CONFIG_XEN #ifdef __ASSEMBLY__ +#ifdef CONFIG_PARAVIRT_ENTRY +#define BR_IF_NATIVE(target, reg_unused, pred_unused) /* nothing */ +#elif defined(CONFIG_PARAVIRT_NOP_B_PATCH) +#define BR_IF_NATIVE(target, reg_unused, pred_unused) \ + .body ; \ + [1:] ; \ + br.cond.sptk.many target;; ; \ + .section .paravirt_nop_b, "a" ; \ + .previous ; \ + .xdata8 ".paravirt_nop_b", 1b +#else #define BR_IF_NATIVE(target, reg, pred) \ .body ; \ movl reg=running_on_xen;; ; \ ld4 reg=[reg];; ; \ cmp.eq pred,p0=reg,r0 ; \ (pred) br.cond.sptk.many target;; +#endif #endif /* __ASSEMBLY__ */ #endif -- 1.5.3
Dong, Eddie
2008-Feb-28 17:21 UTC
[PATCH 0/5] RFC: ia64/pv_ops: ia64 intrinsics paravirtualization
Isaku Yamahata wrote:> Hi. Thank you for comments on asm code paravirtualization. > Its direction is getting clear. Although it hasn't been finished yet, > I'd like to start discussion on ia64 intrinsics paravirtualization. > This patch set is just for discussion so that it is a subset of > xen Linux/ia64 domU paravirtualization, not self complete. > You can get the full patched tree by typing > git clone > http://people.valinux.co.jp/~yamahata/xen-ia64/linux-2.6-xen-ia64.git/ > > > A paravirtualized guest wants to replace ia64 intrinsics, i.e. > the operations defined in include/asm-ia64/gcc_instrin.h or > include/asm-ia64/intel_instrin.h, with its own version. > (At least xenLinux/ia64 does.) > So we need a sort of interface to do so. > I want to discuss on which direction to go for, please comment. > > > This paravirtualization corresponds to the part of x86 pv_ops, > Performance critical code written in C. They are basically indirect > function call via pv_xxx_ops. For performance, each pv instance is > allowed to binary patch in order to replace function call > instruction with their predefined instructions in place. > The ia64 intrinsics corresonds to this kind of interface. > > The discussion points so far are > - binary patching should be mandatory or optional? > The current patch requires binary patch, but some people think > requiring binary patch for pv instances is a bad idea. > I think by providing reasonable helper functions set, binary patch > won't be burden for pv instances. > > - How differ from x86 pv_ops? > Some people think that the very similarity to x86 pv_ops is > important. I guess they're thinking so considering maintenance > cost. Anyway ia64 is already different from x86, so such difference > doesn't matter as long as ia64 paravirtualization interface is > clean enough for maintenance. > > Note: the way can differ from one operation from another, but it > might cause some inconsistency. > The following ways are proposed so far. > > > * Option 1: the current way > The code would look like > static inline unsigned long > paravirt_get_cpuid(int index) > { > register __u64 ia64_intri_res asm ("r8"); > register __u64 __index asm ("r8") = index; > asm volatile (paravirt_alt_inst("mov %0=cpuid[%r1]", > PARAVIRT_INST_GET_CPUID): > "=r"(ia64_intri_res): "0O"(__index)); > return ia64_intri_res; > } > #define ia64_get_cpuid paravirt_get_cpuid > > note: > Using r8 is derived from xen hypercall abi. > We have to define which register should be used or can be > clobbered. > > Pros: > - in-place binary patch is possible. > (We may want to pad with nop. How many?) > - native case performance is good. > - native case doesn't need any modification. > > Cons: > - binary patch is required for pv instances. > - Probably current implementation might be too xen-biased. > Reviewing them would be necessary for hypervisor neutrality. > > * Option 2: direct branch > The code would look like > static inline unsigned long > paravirt_get_cpuid(int index) > { > register __u64 ia64_intri_res asm ("r8"); > register __u64 __index asm ("r8") = index; > register __u64 ret_addr asm ("r9"); > asm volatile (paravirt_alt_inst( > "br.cond b0=native_get_cpuid", > /* or brl.cond for fast hypercall */ > PARAVIRT_INST_GET_CPUID): > "=r"(ia64_intri_res), "=r"(ret_addr): > "0O"(__index)" > "b0"); > return ia64_intri_res; > } > #define ia64_get_cpuid paravirt_get_cpuid > > note: > Using r8 is derived from xen hypercall abi. > We have to define which register should be used or can be > clobbered. 
> > Pros: > - in-place binary patch is possible. > (We may want to pad with nop. How many?) > - so that performance would be good for native case using it. > > Cons: > - binary patch is required for pv instances. > - native case needs binary patch for optimal performance. > > * Option 3: indirect branch > The code would look like > static inline unsigned long > paravirt_get_cpuid(int index) > { > register __u64 ia64_intri_res asm ("r8"); > register __u64 __index asm ("r8") = index; > register __u64 func asm ("r9"); > asm volatile (paravirt_alt_inst( > "mov %1 = pv_cpu_ops" > "add %1 = %1, PV_CPU_GET_CPUID_OFFSET" > "ld8 %1 = [%1]" > "mov b1 = %1" > "br.cond b0=b1" > PARAVIRT_INST_GET_CPUID): > "=r"(ia64_intri_res), > "=r"(func): > "0O"(__index): > "b0", "b1"); > return ia64_intri_res; > } > #define ia64_get_cpuid paravirt_get_cpuid > > note: > Using r8 is derived from xen hypercall abi. > We have to define which register should be used or can be > clobbered. > > Pros: > - binary patching isn't required for pv instances. > - in-place binary patch is possible > (We may want to pad with nop. How many?) > - so that performance would be good for native case using it. > > Cons: > - use more spaces than the option #2. > - For optimal performance binary patch is necessary anyway. > > * Option 4: indirect function call > The code would look like > struct pv_cpu_ops { > unsigned long (*get_cpuid)(unsigned long index) > .... > }; > extern struct pv_cpu_ops pv_cpu_ops; > ... > static inline unsigned long > paravirt_get_cpuid(unsigned long index) > { > return pv_cpu_ops->get_cpuid(index); > } > #define ia64_get_cpuid paravirt_get_cpuid > > Pros: > - Binary patch isn't required. > - indirect function call is the very way x86 pv_ops adopted. > - If hypervisor supports fast hypercall using gate page, > it may want to use function call. > > Cons: > - Binary patch is difficult. > ia64 function call uses stacked registers, so that marking br.call > instruction is difficult. > - so that the performance is suboptimal especially for native case. >I am not sure if this statement is true. We can still patch it: for example, using the same inline asm code for the paravirt_get_cpuid definition, it could be exactly the same as on x86.> Possibly the alternative is direct function call. At boot time, > scan all text detecting branch instructions which jumps to given > functions and binary patch branch target. > > > My current preference is option #1 or #2 making abi more hypervisor > neutral. > > thanks,
Isaku Yamahata
2008-Feb-29 09:39 UTC
[kvm-ia64-devel] [PATCH 0/5] RFC: ia64/pv_ops: ia64 intrinsicsparavirtualization
On Fri, Feb 29, 2008 at 04:19:27PM +0800, Dong, Eddie wrote:> Seems rebounded, just resend. > >wrote:>>>> Cons: >>>> - Binary patch is difficult. >>>> ia64 function call uses stacked registers, so that marking >>>> br.call instruction is difficult. - so that the performance >>>> is suboptimal especially for native case. >>>> >>> >>> I am not sure if this statement is true. We can still patch it: >>> for example, using the same inline asm code for the paravirt_get_cpuid >>> definition, it could be exactly the same as on x86. >> >> Stacked registers must be allocated by the alloc instruction, >> which is issued in the caller function's prologue. I.e. gcc maintains >> how many local registers (sol) and output registers (sof - >> sol) are used. > > It depends on where we start to patch, i.e. on whether the patched > code will replace the prologue code or not. I think we can solve this by > replacing the prologue, but I may be missing something.Yes, we can scan instructions backward looking for the alloc instruction, rewrite it, and learn the frame size (sol and sof). Thus we can guarantee that the output registers are accessible. In fact, by specifying "out0", "out1", ... as clobbered registers in inline assembler code, gcc allocates them and we can clobber those registers. However, we can't clobber stacked registers beyond the ones we specified, so this convention differs from the C function calling convention. For example func() // out0 and out1 are allocated. paravirt_get_cpuid(index); // asm volatile ("..." // "br.call xen_get_cpuid" // "...": // input: output: // "out0"); other_func(arg0, arg1); In xen_get_cpuid() we can't clobber out1, so xen_get_cpuid() isn't allowed to allocate any extra stacked registers. It means that xen_get_cpuid() can't be written in C.>> So if we call a function from inline assembly, we have to tell gcc >> how many output registers are used. I haven't found a way to do >> that. > > The new (patched) code comes from the type of pv_ops, so it knows > how many parameters it uses and how to alloc, etc. > >> On the other hand, on x86, just telling gcc the clobbered registers is okay. >> >> Even if we find a way to tell it to gcc, the next issue is how to >> determine how many local registers (sol) are needed. > > The original prologue is replaced, so we only care about the new code's > prologue, which is known to us if we still call somewhere. Sometimes > it doesn't > need to call another function if the code size is enough to hold the new > code.I don't say it's impossible. (The ultimate way is to add such an extension to gcc.) My claim is that the C function call option is much more difficult than the other options, and it's worthwhile to consider those other options. Why not some kind of static calling convention? -- yamahata
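To make the closing suggestion concrete: a static calling convention here would mean the pv stub is reached by a plain br.call with all state in fixed scratch registers, so gcc never has to know about stacked output registers at all. A rough sketch, with made-up names and an ABI chosen purely for illustration (argument and result in r8; r9-r11 and b0 are the only other registers the stub may touch):

static inline unsigned long
pv_static_call_get_cpuid(int index)
{
	register unsigned long arg_ret asm ("r8") = index;

	/* the branch target can later be binary patched in place; the
	 * callee must be hand-written asm honoring the fixed register
	 * contract above (no alloc, no stacked frame) */
	asm volatile ("br.call.sptk.many b0=pv_get_cpuid_stub;;"
		      : "+r"(arg_ret)
		      :
		      : "r9", "r10", "r11", "b0", "memory");
	return arg_ret;
}

The native stub can then be a two-instruction leaf (mov r8=cpuid[r8] followed by br.ret.sptk.many b0); since it allocates no stacked frame it is safe to call from inline asm, which is exactly the property the thread observes a C-compiled callee cannot guarantee.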