The KVM introspection subsystem provides a facility for applications running on the host or in a separate VM, to control the execution of other VMs (pause, resume, shutdown), query the state of the vCPUs (GPRs, MSRs etc.), alter the page access bits in the shadow page tables (only for the hardware backed ones, eg. Intel's EPT) and receive notifications when events of interest have taken place (shadow page table level faults, key MSR writes, hypercalls etc.). Some notifications can be responded to with an action (like preventing an MSR from being written), others are mere informative (like breakpoint events which can be used for execution tracing). With few exceptions, all events are optional. An application using this subsystem will explicitly register for them. The use case that gave way for the creation of this subsystem is to monitor the guest OS and as such the ABI/API is highly influenced by how the guest software (kernel, applications) sees the world. For example, some events provide information specific for the host CPU architecture (eg. MSR_IA32_SYSENTER_EIP) merely because its leveraged by guest software to implement a critical feature (fast system calls). At the moment, the target audience for KVMI are security software authors that wish to perform forensics on newly discovered threats (exploits) or to implement another layer of security like preventing a large set of kernel rootkits simply by "locking" the kernel image in the shadow page tables (ie. enforce .text r-x, .rodata rw- etc.). It's the latter case that made KVMI a separate subsystem, even though many of these features are available in the device manager. The ability to build a security application that does not interfere (in terms of performance) with the guest software asks for a specialized interface that is designed for minimum overhead. This patch series is based on kvm/master, commit 3d9fdc252b52 ("KVM: MIPS: Fix build errors for 32bit kernel"). The previous version (v8) can be read here: https://lore.kernel.org/kvm/20200330101308.21702-1-alazar at bitdefender.com/ Patches 1-36: make preparatory changes Patches 38-82: add basic introspection capabilities Patch 83: support introspection tools that write-protect guest page tables Patch 84: notify the introspection tool even on emulation failures (when the read/write callbacks used by the emulator, kvm_page_preread/kvm_page_prewrite, are not invoked) Changes since v8: - rebase on 5.8 - fix non-x86 builds (avoid including the UAPI headers from kvmi_host.h) - fix the clean-up for KVMI_VCPU_SINGLESTEP [Mathieu] - extend KVMI_VM_SET_PAGE_ACCESS with the 'visible' option - improve KVMI_VM_GET_MAX_GFN (skip read-only, invalid or non-user memslots) - add KVMI_VM_CONTROL_CLEANUP [Tamas, Mathieu] - add KVMI_VCPU_GET_XCR and KVMI_VCPU_SET_XSAVE (SSE emulation) - move KVM_REQ_INTROSPECTION in the range of arch-independent requests - better split of x86 vs arch-independent code - cover more error codes with tools/testing/selftests/kvm/x86_64/kvmi_test.c - remove more error messages and close the introspection connection when an error code can't be sent back or it doesn't make sense to send it - other small changes (code refactoring, message validation, etc.). Adalbert Laz?r (22): KVM: UAPI: add error codes used by the VM introspection code KVM: add kvm_vcpu_kick_and_wait() KVM: doc: fix the hypercall numbering KVM: x86: add .control_cr3_intercept() to struct kvm_x86_ops KVM: x86: add .desc_ctrl_supported() KVM: x86: add .control_desc_intercept() KVM: x86: export kvm_vcpu_ioctl_x86_set_xsave() KVM: introspection: add hook/unhook ioctls KVM: introspection: add permission access ioctls KVM: introspection: add the read/dispatch message function KVM: introspection: add KVMI_GET_VERSION KVM: introspection: add KVMI_VM_CHECK_COMMAND and KVMI_VM_CHECK_EVENT KVM: introspection: add KVMI_EVENT_UNHOOK KVM: introspection: add KVMI_VM_CONTROL_EVENTS KVM: introspection: add a jobs list to every introspected vCPU KVM: introspection: add KVMI_VCPU_PAUSE KVM: introspection: add KVMI_EVENT_PAUSE_VCPU KVM: introspection: add KVMI_VM_CONTROL_CLEANUP KVM: introspection: add KVMI_VCPU_GET_XCR KVM: introspection: add KVMI_VCPU_SET_XSAVE KVM: introspection: extend KVMI_GET_VERSION with struct kvmi_features KVM: introspection: add KVMI_VCPU_TRANSLATE_GVA Marian Rotariu (1): KVM: introspection: add KVMI_VCPU_GET_CPUID Mathieu Tarral (1): signal: export kill_pid_info() Mihai Don?u (35): KVM: x86: add kvm_arch_vcpu_get_regs() and kvm_arch_vcpu_get_sregs() KVM: x86: avoid injecting #PF when emulate the VMCALL instruction KVM: x86: add .control_msr_intercept() KVM: x86: vmx: use a symbolic constant when checking the exit qualifications KVM: x86: save the error code during EPT/NPF exits handling KVM: x86: add .fault_gla() KVM: x86: add .spt_fault() KVM: x86: add .gpt_translation_fault() KVM: x86: extend kvm_mmu_gva_to_gpa_system() with the 'access' parameter KVM: x86: page track: provide all callbacks with the guest virtual address KVM: x86: page track: add track_create_slot() callback KVM: x86: page_track: add support for preread, prewrite and preexec KVM: x86: wire in the preread/prewrite/preexec page trackers KVM: introduce VM introspection KVM: introspection: add KVMI_VM_GET_INFO KVM: introspection: add KVMI_VM_READ_PHYSICAL/KVMI_VM_WRITE_PHYSICAL KVM: introspection: handle vCPU introspection requests KVM: introspection: handle vCPU commands KVM: introspection: add KVMI_VCPU_GET_INFO KVM: introspection: add the crash action handling on the event reply KVM: introspection: add KVMI_VCPU_CONTROL_EVENTS KVM: introspection: add KVMI_VCPU_GET_REGISTERS KVM: introspection: add KVMI_VCPU_SET_REGISTERS KVM: introspection: add KVMI_EVENT_HYPERCALL KVM: introspection: add KVMI_EVENT_BREAKPOINT KVM: introspection: add KVMI_VCPU_CONTROL_CR and KVMI_EVENT_CR KVM: introspection: add KVMI_VCPU_INJECT_EXCEPTION + KVMI_EVENT_TRAP KVM: introspection: add KVMI_EVENT_XSETBV KVM: introspection: add KVMI_VCPU_GET_XSAVE KVM: introspection: add KVMI_VCPU_GET_MTRR_TYPE KVM: introspection: add KVMI_VCPU_CONTROL_MSR and KVMI_EVENT_MSR KVM: introspection: add KVMI_VM_SET_PAGE_ACCESS KVM: introspection: add KVMI_EVENT_PF KVM: introspection: emulate a guest page table walk on SPT violations due to A/D bit updates KVM: x86: call the page tracking code on emulation failure Mircea C?rjaliu (2): KVM: x86: disable gpa_available optimization for fetch and page-walk SPT violations KVM: introspection: add vCPU related data Nicu?or C??u (21): KVM: x86: add kvm_arch_vcpu_set_regs() KVM: x86: add .bp_intercepted() to struct kvm_x86_ops KVM: x86: add .cr3_write_intercepted() KVM: svm: add support for descriptor-table exits KVM: x86: add .desc_intercepted() KVM: x86: export .msr_write_intercepted() KVM: x86: use MSR_TYPE_R, MSR_TYPE_W and MSR_TYPE_RW with AMD KVM: svm: pass struct kvm_vcpu to set_msr_interception() KVM: vmx: pass struct kvm_vcpu to the intercept msr related functions KVM: x86: add .control_singlestep() KVM: x86: export kvm_arch_vcpu_set_guest_debug() KVM: x86: export kvm_inject_pending_exception() KVM: x86: export kvm_vcpu_ioctl_x86_get_xsave() KVM: introspection: add cleanup support for vCPUs KVM: introspection: restore the state of #BP interception on unhook KVM: introspection: restore the state of CR3 interception on unhook KVM: introspection: add KVMI_EVENT_DESCRIPTOR KVM: introspection: restore the state of descriptor-table register interception on unhook KVM: introspection: restore the state of MSR interception on unhook KVM: introspection: add KVMI_VCPU_CONTROL_SINGLESTEP KVM: introspection: add KVMI_EVENT_SINGLESTEP ?tefan ?icleru (2): KVM: add kvm_get_max_gfn() KVM: introspection: add KVMI_VM_GET_MAX_GFN Documentation/virt/kvm/api.rst | 149 ++ Documentation/virt/kvm/hypercalls.rst | 39 +- Documentation/virt/kvm/kvmi.rst | 1546 ++++++++++++ arch/x86/include/asm/kvm_host.h | 41 +- arch/x86/include/asm/kvm_page_track.h | 71 +- arch/x86/include/asm/kvmi_host.h | 96 + arch/x86/include/asm/vmx.h | 2 + arch/x86/include/uapi/asm/kvmi.h | 153 ++ arch/x86/kvm/Kconfig | 13 + arch/x86/kvm/Makefile | 2 + arch/x86/kvm/emulate.c | 4 + arch/x86/kvm/kvm_emulate.h | 1 + arch/x86/kvm/kvmi.c | 1413 +++++++++++ arch/x86/kvm/mmu.h | 4 + arch/x86/kvm/mmu/mmu.c | 161 +- arch/x86/kvm/mmu/page_track.c | 142 +- arch/x86/kvm/svm/svm.c | 268 ++- arch/x86/kvm/svm/svm.h | 14 + arch/x86/kvm/vmx/capabilities.h | 7 +- arch/x86/kvm/vmx/vmx.c | 261 +- arch/x86/kvm/vmx/vmx.h | 4 - arch/x86/kvm/x86.c | 305 ++- drivers/gpu/drm/i915/gvt/kvmgt.c | 2 +- include/linux/kvm_host.h | 20 + include/linux/kvmi_host.h | 125 + include/uapi/linux/kvm.h | 20 + include/uapi/linux/kvm_para.h | 5 + include/uapi/linux/kvmi.h | 254 ++ kernel/signal.c | 1 + tools/testing/selftests/kvm/Makefile | 1 + .../testing/selftests/kvm/x86_64/kvmi_test.c | 2143 +++++++++++++++++ virt/kvm/introspection/kvmi.c | 1409 +++++++++++ virt/kvm/introspection/kvmi_int.h | 146 ++ virt/kvm/introspection/kvmi_msg.c | 1059 ++++++++ virt/kvm/kvm_main.c | 92 + 35 files changed, 9795 insertions(+), 178 deletions(-) create mode 100644 Documentation/virt/kvm/kvmi.rst create mode 100644 arch/x86/include/asm/kvmi_host.h create mode 100644 arch/x86/include/uapi/asm/kvmi.h create mode 100644 arch/x86/kvm/kvmi.c create mode 100644 include/linux/kvmi_host.h create mode 100644 include/uapi/linux/kvmi.h create mode 100644 tools/testing/selftests/kvm/x86_64/kvmi_test.c create mode 100644 virt/kvm/introspection/kvmi.c create mode 100644 virt/kvm/introspection/kvmi_int.h create mode 100644 virt/kvm/introspection/kvmi_msg.c base-commit: 3d9fdc252b52023260de1d12399cb3157ed28c07 CC: Edwin Zhai <edwin.zhai at intel.com> CC: Jan Kiszka <jan.kiszka at siemens.com> CC: Konrad Rzeszutek Wilk <konrad.wilk at oracle.com> CC: Mathieu Tarral <mathieu.tarral at protonmail.com> CC: Patrick Colp <patrick.colp at oracle.com> CC: Samuel Laur?n <samuel.lauren at iki.fi> CC: Stefan Hajnoczi <stefanha at redhat.com> CC: Tamas K Lengyel <tamas at tklengyel.com> CC: Weijiang Yang <weijiang.yang at intel.com> CC: Yu C Zhang <yu.c.zhang at intel.com> CC: Sean Christopherson <sean.j.christopherson at intel.com> CC: Joerg Roedel <joro at 8bytes.org> CC: Vitaly Kuznetsov <vkuznets at redhat.com> CC: Wanpeng Li <wanpengli at tencent.com> CC: Jim Mattson <jmattson at google.com>
From: Mathieu Tarral <mathieu.tarral at protonmail.com> This function is used by VM introspection code to ungracefully shutdown a guest at the request of the introspection tool. A security application will use this as the last resort to stop the spread of a malware from a guest. Signed-off-by: Mathieu Tarral <mathieu.tarral at protonmail.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- kernel/signal.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/signal.c b/kernel/signal.c index 5ca48cc5da76..c3af81d7b62a 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -1456,6 +1456,7 @@ int kill_pid_info(int sig, struct kernel_siginfo *info, struct pid *pid) */ } } +EXPORT_SYMBOL(kill_pid_info); static int kill_proc_info(int sig, struct kernel_siginfo *info, pid_t pid) {
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 02/84] KVM: UAPI: add error codes used by the VM introspection code
These new error codes help the introspection tool to identify the cause of the introspection command failure and to recover from some error cases or to give more information to the user. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- include/uapi/linux/kvm_para.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h index 8b86609849b9..3ce388249682 100644 --- a/include/uapi/linux/kvm_para.h +++ b/include/uapi/linux/kvm_para.h @@ -17,6 +17,10 @@ #define KVM_E2BIG E2BIG #define KVM_EPERM EPERM #define KVM_EOPNOTSUPP 95 +#define KVM_EAGAIN 11 +#define KVM_ENOENT ENOENT +#define KVM_ENOMEM ENOMEM +#define KVM_EBUSY EBUSY #define KVM_HC_VAPIC_POLL_IRQ 1 #define KVM_HC_MMU_OP 2
This function is needed for the KVMI_VCPU_PAUSE command, which sets the introspection request flag, kicks the vCPU out of guest and returns a success error code (0). The vCPU will send the KVMI_EVENT_PAUSE event as soon as possible. Once the introspection tool receives the event, it knows that the vCPU doesn't run guest code and can handle introspection commands (until the reply for the pause event is sent). To implement the "pause VM" command, the userspace code will send a KVMI_VCPU_PAUSE command for every vCPU. To know when the VM is paused, userspace has to receive and "parse" all events. For example, with a 4 vCPU VM, if the "pause VM" was sent by userspace while handling an event from vCPU0 and at the same time a new vCPU was hot-plugged (which could send another event for vCPU4), the "pause VM" command has to receive and check all events until it gets the pause events for vCPU1, vCPU2 and vCPU3 before returning to the upper layer. In order to make it easier for userspace to implement the "pause VM" command, the KVMI_VCPU_PAUSE has an optional 'wait' parameter. If this is set, kvm_vcpu_kick_and_wait() will be used instead of kvm_vcpu_kick(). And because this vCPU command (KVMI_VCPU_PAUSE) is handled by the receiving thread (instead of the vCPU thread), once a string of KVMI_VCPU_PAUSE commands with the 'wait' flag set is handled, the introspection tool can consider the VM paused, without the need to wait and check events. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 10 ++++++++++ 2 files changed, 11 insertions(+) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 62ec926c78a0..92490279d65a 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -810,6 +810,7 @@ void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu); void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu); bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu); void kvm_vcpu_kick(struct kvm_vcpu *vcpu); +void kvm_vcpu_kick_and_wait(struct kvm_vcpu *vcpu); int kvm_vcpu_yield_to(struct kvm_vcpu *target); void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu, bool usermode_vcpu_not_eligible); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 0a68c9d3d3ab..4d965913d347 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -2802,6 +2802,16 @@ void kvm_vcpu_kick(struct kvm_vcpu *vcpu) EXPORT_SYMBOL_GPL(kvm_vcpu_kick); #endif /* !CONFIG_S390 */ +void kvm_vcpu_kick_and_wait(struct kvm_vcpu *vcpu) +{ + if (kvm_vcpu_wake_up(vcpu)) + return; + + if (kvm_request_needs_ipi(vcpu, KVM_REQUEST_WAIT)) + smp_call_function_single(vcpu->cpu, ack_flush, NULL, 1); +} +EXPORT_SYMBOL_GPL(kvm_vcpu_kick_and_wait); + int kvm_vcpu_yield_to(struct kvm_vcpu *target) { struct pid *pid;
From: ?tefan ?icleru <ssicleru at bitdefender.com> This function is needed for the KVMI_VM_GET_MAX_GFN command. Signed-off-by: ?tefan ?icleru <ssicleru at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 24 ++++++++++++++++++++++++ 2 files changed, 25 insertions(+) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 92490279d65a..a4249fc88fc2 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -776,6 +776,7 @@ struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn); bool kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn); unsigned long kvm_host_page_size(struct kvm_vcpu *vcpu, gfn_t gfn); void mark_page_dirty(struct kvm *kvm, gfn_t gfn); +gfn_t kvm_get_max_gfn(struct kvm *kvm); struct kvm_memslots *kvm_vcpu_memslots(struct kvm_vcpu *vcpu); struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 4d965913d347..8c4bccf33c8c 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1347,6 +1347,30 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm, return kvm_set_memory_region(kvm, mem); } +gfn_t kvm_get_max_gfn(struct kvm *kvm) +{ + u32 skip_mask = KVM_MEM_READONLY | KVM_MEMSLOT_INVALID; + struct kvm_memory_slot *memslot; + struct kvm_memslots *slots; + gfn_t max_gfn = 0; + int idx; + + idx = srcu_read_lock(&kvm->srcu); + spin_lock(&kvm->mmu_lock); + + slots = kvm_memslots(kvm); + kvm_for_each_memslot(memslot, slots) + if (memslot->id < KVM_USER_MEM_SLOTS && + (memslot->flags & skip_mask) == 0) + max_gfn = max(max_gfn, memslot->base_gfn + + memslot->npages); + + spin_unlock(&kvm->mmu_lock); + srcu_read_unlock(&kvm->srcu, idx); + + return max_gfn; +} + #ifndef CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT /** * kvm_get_dirty_log - get a snapshot of dirty pages
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 05/84] KVM: doc: fix the hypercall numbering
The next hypercalls will be correctly numbered. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/hypercalls.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/virt/kvm/hypercalls.rst b/Documentation/virt/kvm/hypercalls.rst index ed4fddd364ea..70e77c66b64c 100644 --- a/Documentation/virt/kvm/hypercalls.rst +++ b/Documentation/virt/kvm/hypercalls.rst @@ -137,7 +137,7 @@ compute the CLOCK_REALTIME for its clock, at the same instant. Returns KVM_EOPNOTSUPP if the host does not use TSC clocksource, or if clock type is different than KVM_CLOCK_PAIRING_WALLCLOCK. -6. KVM_HC_SEND_IPI +7. KVM_HC_SEND_IPI ------------------ :Architecture: x86 @@ -158,7 +158,7 @@ corresponds to the APIC ID a2+1, and so on. Returns the number of CPUs to which the IPIs were delivered successfully. -7. KVM_HC_SCHED_YIELD +8. KVM_HC_SCHED_YIELD --------------------- :Architecture: x86
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 06/84] KVM: x86: add kvm_arch_vcpu_get_regs() and kvm_arch_vcpu_get_sregs()
From: Mihai Don?u <mdontu at bitdefender.com> These functions are used by the VM introspection code (for the KVMI_VCPU_GET_REGISTERS command and all events sending the vCPU registers to the introspection tool). Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/kvm/x86.c | 10 ++++++++++ include/linux/kvm_host.h | 3 +++ 2 files changed, 13 insertions(+) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 88c593f83b28..10410ebda034 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8939,6 +8939,11 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) return 0; } +void kvm_arch_vcpu_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) +{ + __get_regs(vcpu, regs); +} + static void __set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) { vcpu->arch.emulate_regs_need_sync_from_vcpu = true; @@ -9034,6 +9039,11 @@ int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu, return 0; } +void kvm_arch_vcpu_get_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs) +{ + __get_sregs(vcpu, sregs); +} + int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu, struct kvm_mp_state *mp_state) { diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index a4249fc88fc2..23ab4932f7e7 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -864,9 +864,12 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu, struct kvm_translation *tr); int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs); +void kvm_arch_vcpu_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs); int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs); int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs); +void kvm_arch_vcpu_get_sregs(struct kvm_vcpu *vcpu, + struct kvm_sregs *sregs); int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs); int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 07/84] KVM: x86: add kvm_arch_vcpu_set_regs()
From: Nicu?or C??u <ncitu at bitdefender.com> This is needed for the KVMI_VCPU_SET_REGISTERS command, without clearing the pending exception. The KVMI_VCPU_SET_REGISTERS commmand allows the introspectiont tool to override the kvm_regs structure of a specific vCPU. But in most cases this is used to increment the program counter. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/kvm/x86.c | 21 ++++++++++++++------- include/linux/kvm_host.h | 2 ++ 2 files changed, 16 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 10410ebda034..e973ffe04d54 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8970,16 +8970,23 @@ static void __set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) kvm_rip_write(vcpu, regs->rip); kvm_set_rflags(vcpu, regs->rflags | X86_EFLAGS_FIXED); - - vcpu->arch.exception.pending = false; - - kvm_make_request(KVM_REQ_EVENT, vcpu); } -int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) +void kvm_arch_vcpu_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs, + bool clear_exception) { - vcpu_load(vcpu); __set_regs(vcpu, regs); + + if (clear_exception) + vcpu->arch.exception.pending = false; + + kvm_make_request(KVM_REQ_EVENT, vcpu); +} + +int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) +{ + vcpu_load(vcpu); + kvm_arch_vcpu_set_regs(vcpu, regs, true); vcpu_put(vcpu); return 0; } @@ -9386,7 +9393,7 @@ static int sync_regs(struct kvm_vcpu *vcpu) return -EINVAL; if (vcpu->run->kvm_dirty_regs & KVM_SYNC_X86_REGS) { - __set_regs(vcpu, &vcpu->run->s.regs.regs); + kvm_arch_vcpu_set_regs(vcpu, &vcpu->run->s.regs.regs, true); vcpu->run->kvm_dirty_regs &= ~KVM_SYNC_X86_REGS; } if (vcpu->run->kvm_dirty_regs & KVM_SYNC_X86_SREGS) { diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 23ab4932f7e7..49cbd175f45b 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -866,6 +866,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu, int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs); void kvm_arch_vcpu_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs); int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs); +void kvm_arch_vcpu_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs, + bool clear_exception); int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs); void kvm_arch_vcpu_get_sregs(struct kvm_vcpu *vcpu,
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 08/84] KVM: x86: avoid injecting #PF when emulate the VMCALL instruction
From: Mihai Don?u <mdontu at bitdefender.com> It can happened to end up emulating the VMCALL instruction as a result of the handling of an EPT write fault. In this situation, the emulator will try to unconditionally patch the correct hypercall opcode bytes using emulator_write_emulated(). However, this last call uses the fault GPA (if available) or walks the guest page tables at RIP, otherwise. The trouble begins when using VM introspection, when we forbid the use of the fault GPA and fallback to the guest pt walk: in Windows (8.1 and newer) the page that we try to write into is marked read-execute and as such emulator_write_emulated() fails and we inject a write #PF, leading to a guest crash. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/kvm/x86.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index e973ffe04d54..23bce3ef26d8 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7714,11 +7714,15 @@ static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt) struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt); char instruction[3]; unsigned long rip = kvm_rip_read(vcpu); + int err; kvm_x86_ops.patch_hypercall(vcpu, instruction); - return emulator_write_emulated(ctxt, rip, instruction, 3, + err = emulator_write_emulated(ctxt, rip, instruction, 3, &ctxt->exception); + if (err == X86EMUL_PROPAGATE_FAULT) + err = X86EMUL_CONTINUE; + return err; } static int dm_request_for_irq_injection(struct kvm_vcpu *vcpu)
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 09/84] KVM: x86: add .bp_intercepted() to struct kvm_x86_ops
From: Nicu?or C??u <ncitu at bitdefender.com> Both, the introspection tool and the device manager can request #BP interception. This function will be used to check if this interception is enabled by either side. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/svm/svm.c | 8 ++++++++ arch/x86/kvm/svm/svm.h | 7 +++++++ arch/x86/kvm/vmx/vmx.c | 6 ++++++ 4 files changed, 22 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index be5363b21540..78fe3c7c814c 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1098,6 +1098,7 @@ struct kvm_x86_ops { void (*vcpu_load)(struct kvm_vcpu *vcpu, int cpu); void (*vcpu_put)(struct kvm_vcpu *vcpu); + bool (*bp_intercepted)(struct kvm_vcpu *vcpu); void (*update_bp_intercept)(struct kvm_vcpu *vcpu); int (*get_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr); int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr); diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index c0da4dd78ac5..23b3cd057753 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1627,6 +1627,13 @@ static void svm_set_segment(struct kvm_vcpu *vcpu, mark_dirty(svm->vmcb, VMCB_SEG); } +static bool svm_bp_intercepted(struct kvm_vcpu *vcpu) +{ + struct vcpu_svm *svm = to_svm(vcpu); + + return get_exception_intercept(svm, BP_VECTOR); +} + static void update_bp_intercept(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); @@ -3989,6 +3996,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .vcpu_blocking = svm_vcpu_blocking, .vcpu_unblocking = svm_vcpu_unblocking, + .bp_intercepted = svm_bp_intercepted, .update_bp_intercept = update_bp_intercept, .get_msr_feature = svm_get_msr_feature, .get_msr = svm_get_msr, diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 6ac4c00a5d82..d5c956e07c12 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -293,6 +293,13 @@ static inline void clr_exception_intercept(struct vcpu_svm *svm, int bit) recalc_intercepts(svm); } +static inline bool get_exception_intercept(struct vcpu_svm *svm, int bit) +{ + struct vmcb *vmcb = get_host_vmcb(svm); + + return (vmcb->control.intercept_exceptions & (1U << bit)); +} + static inline void set_intercept(struct vcpu_svm *svm, int bit) { struct vmcb *vmcb = get_host_vmcb(svm); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 13745f2a5ecd..069593f2f504 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -760,6 +760,11 @@ static u32 vmx_read_guest_seg_ar(struct vcpu_vmx *vmx, unsigned seg) return *p; } +static bool vmx_bp_intercepted(struct kvm_vcpu *vcpu) +{ + return (vmcs_read32(EXCEPTION_BITMAP) & (1u << BP_VECTOR)); +} + void update_exception_bitmap(struct kvm_vcpu *vcpu) { u32 eb; @@ -7859,6 +7864,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .vcpu_load = vmx_vcpu_load, .vcpu_put = vmx_vcpu_put, + .bp_intercepted = vmx_bp_intercepted, .update_bp_intercept = update_exception_bitmap, .get_msr_feature = vmx_get_msr_feature, .get_msr = vmx_get_msr,
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 10/84] KVM: x86: add .control_cr3_intercept() to struct kvm_x86_ops
This function is needed for the KVMI_VCPU_CONTROL_CR command, when the introspection tool has to intercept the read/write access to CR3. Co-developed-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 6 ++++++ arch/x86/kvm/svm/svm.c | 14 ++++++++++++++ arch/x86/kvm/vmx/vmx.c | 26 ++++++++++++++++++++------ 3 files changed, 40 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 78fe3c7c814c..89c0bd6529a5 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -136,6 +136,10 @@ static inline gfn_t gfn_to_index(gfn_t gfn, gfn_t base_gfn, int level) #define KVM_NR_FIXED_MTRR_REGION 88 #define KVM_NR_VAR_MTRR 8 +#define CR_TYPE_R 1 +#define CR_TYPE_W 2 +#define CR_TYPE_RW 3 + #define ASYNC_PF_PER_VCPU 64 enum kvm_reg { @@ -1111,6 +1115,8 @@ struct kvm_x86_ops { void (*get_cs_db_l_bits)(struct kvm_vcpu *vcpu, int *db, int *l); void (*set_cr0)(struct kvm_vcpu *vcpu, unsigned long cr0); int (*set_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4); + void (*control_cr3_intercept)(struct kvm_vcpu *vcpu, int type, + bool enable); void (*set_efer)(struct kvm_vcpu *vcpu, u64 efer); void (*get_idt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt); void (*set_idt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt); diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 23b3cd057753..f14fc940538b 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1596,6 +1596,19 @@ int svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) return 0; } +static void svm_control_cr3_intercept(struct kvm_vcpu *vcpu, int type, + bool enable) +{ + struct vcpu_svm *svm = to_svm(vcpu); + + if (type & CR_TYPE_R) + enable ? set_cr_intercept(svm, INTERCEPT_CR3_READ) : + clr_cr_intercept(svm, INTERCEPT_CR3_READ); + if (type & CR_TYPE_W) + enable ? set_cr_intercept(svm, INTERCEPT_CR3_WRITE) : + clr_cr_intercept(svm, INTERCEPT_CR3_WRITE); +} + static void svm_set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg) { @@ -4008,6 +4021,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .get_cs_db_l_bits = kvm_get_cs_db_l_bits, .set_cr0 = svm_set_cr0, .set_cr4 = svm_set_cr4, + .control_cr3_intercept = svm_control_cr3_intercept, .set_efer = svm_set_efer, .get_idt = svm_get_idt, .set_idt = svm_set_idt, diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 069593f2f504..6b9639703560 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3003,24 +3003,37 @@ void ept_save_pdptrs(struct kvm_vcpu *vcpu) kvm_register_mark_dirty(vcpu, VCPU_EXREG_PDPTR); } +static void vmx_control_cr3_intercept(struct kvm_vcpu *vcpu, int type, + bool enable) +{ + struct vcpu_vmx *vmx = to_vmx(vcpu); + u32 cr3_exec_control = 0; + + if (type & CR_TYPE_R) + cr3_exec_control |= CPU_BASED_CR3_STORE_EXITING; + if (type & CR_TYPE_W) + cr3_exec_control |= CPU_BASED_CR3_LOAD_EXITING; + + if (enable) + exec_controls_setbit(vmx, cr3_exec_control); + else + exec_controls_clearbit(vmx, cr3_exec_control); +} + static void ept_update_paging_mode_cr0(unsigned long *hw_cr0, unsigned long cr0, struct kvm_vcpu *vcpu) { - struct vcpu_vmx *vmx = to_vmx(vcpu); - if (!kvm_register_is_available(vcpu, VCPU_EXREG_CR3)) vmx_cache_reg(vcpu, VCPU_EXREG_CR3); if (!(cr0 & X86_CR0_PG)) { /* From paging/starting to nonpaging */ - exec_controls_setbit(vmx, CPU_BASED_CR3_LOAD_EXITING | - CPU_BASED_CR3_STORE_EXITING); + vmx_control_cr3_intercept(vcpu, CR_TYPE_RW, true); vcpu->arch.cr0 = cr0; vmx_set_cr4(vcpu, kvm_read_cr4(vcpu)); } else if (!is_paging(vcpu)) { /* From nonpaging to paging */ - exec_controls_clearbit(vmx, CPU_BASED_CR3_LOAD_EXITING | - CPU_BASED_CR3_STORE_EXITING); + vmx_control_cr3_intercept(vcpu, CR_TYPE_RW, false); vcpu->arch.cr0 = cr0; vmx_set_cr4(vcpu, kvm_read_cr4(vcpu)); } @@ -7876,6 +7889,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .get_cs_db_l_bits = vmx_get_cs_db_l_bits, .set_cr0 = vmx_set_cr0, .set_cr4 = vmx_set_cr4, + .control_cr3_intercept = vmx_control_cr3_intercept, .set_efer = vmx_set_efer, .get_idt = vmx_get_idt, .set_idt = vmx_set_idt,
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 11/84] KVM: x86: add .cr3_write_intercepted()
From: Nicu?or C??u <ncitu at bitdefender.com> This function will be used to allow the introspection tool to disable the CR3-write interception when it is no longer interested in these events, but only if nothing else depends on these VM-exits. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/svm/svm.c | 8 ++++++++ arch/x86/kvm/vmx/vmx.c | 8 ++++++++ 3 files changed, 17 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 89c0bd6529a5..ac45aacc9fc0 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1117,6 +1117,7 @@ struct kvm_x86_ops { int (*set_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4); void (*control_cr3_intercept)(struct kvm_vcpu *vcpu, int type, bool enable); + bool (*cr3_write_intercepted)(struct kvm_vcpu *vcpu); void (*set_efer)(struct kvm_vcpu *vcpu, u64 efer); void (*get_idt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt); void (*set_idt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt); diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index f14fc940538b..7a4ec6fbffb9 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1609,6 +1609,13 @@ static void svm_control_cr3_intercept(struct kvm_vcpu *vcpu, int type, clr_cr_intercept(svm, INTERCEPT_CR3_WRITE); } +static bool svm_cr3_write_intercepted(struct kvm_vcpu *vcpu) +{ + struct vcpu_svm *svm = to_svm(vcpu); + + return is_cr_intercept(svm, INTERCEPT_CR3_WRITE); +} + static void svm_set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg) { @@ -4022,6 +4029,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .set_cr0 = svm_set_cr0, .set_cr4 = svm_set_cr4, .control_cr3_intercept = svm_control_cr3_intercept, + .cr3_write_intercepted = svm_cr3_write_intercepted, .set_efer = svm_set_efer, .get_idt = svm_get_idt, .set_idt = svm_set_idt, diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 6b9639703560..61eb64cf25c7 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3020,6 +3020,13 @@ static void vmx_control_cr3_intercept(struct kvm_vcpu *vcpu, int type, exec_controls_clearbit(vmx, cr3_exec_control); } +static bool vmx_cr3_write_intercepted(struct kvm_vcpu *vcpu) +{ + struct vcpu_vmx *vmx = to_vmx(vcpu); + + return !!(exec_controls_get(vmx) & CPU_BASED_CR3_LOAD_EXITING); +} + static void ept_update_paging_mode_cr0(unsigned long *hw_cr0, unsigned long cr0, struct kvm_vcpu *vcpu) @@ -7890,6 +7897,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .set_cr0 = vmx_set_cr0, .set_cr4 = vmx_set_cr4, .control_cr3_intercept = vmx_control_cr3_intercept, + .cr3_write_intercepted = vmx_cr3_write_intercepted, .set_efer = vmx_set_efer, .get_idt = vmx_get_idt, .set_idt = vmx_set_idt,
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 12/84] KVM: x86: add .desc_ctrl_supported()
When the introspection tool tries to enable the KVMI_EVENT_DESCRIPTOR event, this function is used to check if it is supported. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/svm/svm.c | 6 ++++++ arch/x86/kvm/vmx/capabilities.h | 7 ++++++- arch/x86/kvm/vmx/vmx.c | 1 + 4 files changed, 14 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index ac45aacc9fc0..b3ca64a70bb5 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1123,6 +1123,7 @@ struct kvm_x86_ops { void (*set_idt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt); void (*get_gdt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt); void (*set_gdt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt); + bool (*desc_ctrl_supported)(void); void (*sync_dirty_debug_regs)(struct kvm_vcpu *vcpu); void (*set_dr7)(struct kvm_vcpu *vcpu, unsigned long value); void (*cache_reg)(struct kvm_vcpu *vcpu, enum kvm_reg reg); diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 7a4ec6fbffb9..f4d882ca0060 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1523,6 +1523,11 @@ static void svm_set_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt) mark_dirty(svm->vmcb, VMCB_DT); } +static bool svm_desc_ctrl_supported(void) +{ + return true; +} + static void update_cr0_intercept(struct vcpu_svm *svm) { ulong gcr0 = svm->vcpu.arch.cr0; @@ -4035,6 +4040,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .set_idt = svm_set_idt, .get_gdt = svm_get_gdt, .set_gdt = svm_set_gdt, + .desc_ctrl_supported = svm_desc_ctrl_supported, .set_dr7 = svm_set_dr7, .sync_dirty_debug_regs = svm_sync_dirty_debug_regs, .cache_reg = svm_cache_reg, diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h index 4bbd8b448d22..e7d7fcb7e17f 100644 --- a/arch/x86/kvm/vmx/capabilities.h +++ b/arch/x86/kvm/vmx/capabilities.h @@ -142,12 +142,17 @@ static inline bool cpu_has_vmx_ept(void) SECONDARY_EXEC_ENABLE_EPT; } -static inline bool vmx_umip_emulated(void) +static inline bool vmx_desc_ctrl_supported(void) { return vmcs_config.cpu_based_2nd_exec_ctrl & SECONDARY_EXEC_DESC; } +static inline bool vmx_umip_emulated(void) +{ + return vmx_desc_ctrl_supported(); +} + static inline bool cpu_has_vmx_rdtscp(void) { return vmcs_config.cpu_based_2nd_exec_ctrl & diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 61eb64cf25c7..ecd4c50bf1a2 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7903,6 +7903,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .set_idt = vmx_set_idt, .get_gdt = vmx_get_gdt, .set_gdt = vmx_set_gdt, + .desc_ctrl_supported = vmx_desc_ctrl_supported, .set_dr7 = vmx_set_dr7, .sync_dirty_debug_regs = vmx_sync_dirty_debug_regs, .cache_reg = vmx_cache_reg,
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 13/84] KVM: svm: add support for descriptor-table exits
From: Nicu?or C??u <ncitu at bitdefender.com> This function is needed for the KVMI_EVENT_DESCRIPTOR event. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/kvm/svm/svm.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index f4d882ca0060..b540af04b384 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -2220,6 +2220,13 @@ static int rsm_interception(struct vcpu_svm *svm) return kvm_emulate_instruction_from_buffer(&svm->vcpu, rsm_ins_bytes, 2); } +static int descriptor_access_interception(struct vcpu_svm *svm) +{ + struct kvm_vcpu *vcpu = &svm->vcpu; + + return kvm_emulate_instruction(vcpu, 0); +} + static int rdpmc_interception(struct vcpu_svm *svm) { int err; @@ -2815,6 +2822,14 @@ static int (*const svm_exit_handlers[])(struct vcpu_svm *svm) = { [SVM_EXIT_RSM] = rsm_interception, [SVM_EXIT_AVIC_INCOMPLETE_IPI] = avic_incomplete_ipi_interception, [SVM_EXIT_AVIC_UNACCELERATED_ACCESS] = avic_unaccelerated_access_interception, + [SVM_EXIT_IDTR_READ] = descriptor_access_interception, + [SVM_EXIT_GDTR_READ] = descriptor_access_interception, + [SVM_EXIT_LDTR_READ] = descriptor_access_interception, + [SVM_EXIT_TR_READ] = descriptor_access_interception, + [SVM_EXIT_IDTR_WRITE] = descriptor_access_interception, + [SVM_EXIT_GDTR_WRITE] = descriptor_access_interception, + [SVM_EXIT_LDTR_WRITE] = descriptor_access_interception, + [SVM_EXIT_TR_WRITE] = descriptor_access_interception, }; static void dump_vmcb(struct kvm_vcpu *vcpu)
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 14/84] KVM: x86: add .control_desc_intercept()
This function is needed to intercept descriptor-table registers access. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/svm/svm.c | 26 ++++++++++++++++++++++++++ arch/x86/kvm/vmx/vmx.c | 15 +++++++++++++-- 3 files changed, 40 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index b3ca64a70bb5..83dfa0247130 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1124,6 +1124,7 @@ struct kvm_x86_ops { void (*get_gdt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt); void (*set_gdt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt); bool (*desc_ctrl_supported)(void); + void (*control_desc_intercept)(struct kvm_vcpu *vcpu, bool enable); void (*sync_dirty_debug_regs)(struct kvm_vcpu *vcpu); void (*set_dr7)(struct kvm_vcpu *vcpu, unsigned long value); void (*cache_reg)(struct kvm_vcpu *vcpu, enum kvm_reg reg); diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index b540af04b384..c70c14461483 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1528,6 +1528,31 @@ static bool svm_desc_ctrl_supported(void) return true; } +static void svm_control_desc_intercept(struct kvm_vcpu *vcpu, bool enable) +{ + struct vcpu_svm *svm = to_svm(vcpu); + + if (enable) { + set_intercept(svm, INTERCEPT_STORE_IDTR); + set_intercept(svm, INTERCEPT_STORE_GDTR); + set_intercept(svm, INTERCEPT_STORE_LDTR); + set_intercept(svm, INTERCEPT_STORE_TR); + set_intercept(svm, INTERCEPT_LOAD_IDTR); + set_intercept(svm, INTERCEPT_LOAD_GDTR); + set_intercept(svm, INTERCEPT_LOAD_LDTR); + set_intercept(svm, INTERCEPT_LOAD_TR); + } else { + clr_intercept(svm, INTERCEPT_STORE_IDTR); + clr_intercept(svm, INTERCEPT_STORE_GDTR); + clr_intercept(svm, INTERCEPT_STORE_LDTR); + clr_intercept(svm, INTERCEPT_STORE_TR); + clr_intercept(svm, INTERCEPT_LOAD_IDTR); + clr_intercept(svm, INTERCEPT_LOAD_GDTR); + clr_intercept(svm, INTERCEPT_LOAD_LDTR); + clr_intercept(svm, INTERCEPT_LOAD_TR); + } +} + static void update_cr0_intercept(struct vcpu_svm *svm) { ulong gcr0 = svm->vcpu.arch.cr0; @@ -4056,6 +4081,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .get_gdt = svm_get_gdt, .set_gdt = svm_set_gdt, .desc_ctrl_supported = svm_desc_ctrl_supported, + .control_desc_intercept = svm_control_desc_intercept, .set_dr7 = svm_set_dr7, .sync_dirty_debug_regs = svm_sync_dirty_debug_regs, .cache_reg = svm_cache_reg, diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index ecd4c50bf1a2..199ffd318145 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3151,6 +3151,16 @@ void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd) vmcs_writel(GUEST_CR3, guest_cr3); } +static void vmx_control_desc_intercept(struct kvm_vcpu *vcpu, bool enable) +{ + struct vcpu_vmx *vmx = to_vmx(vcpu); + + if (enable) + secondary_exec_controls_setbit(vmx, SECONDARY_EXEC_DESC); + else + secondary_exec_controls_clearbit(vmx, SECONDARY_EXEC_DESC); +} + int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -3171,11 +3181,11 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) if (!boot_cpu_has(X86_FEATURE_UMIP) && vmx_umip_emulated()) { if (cr4 & X86_CR4_UMIP) { - secondary_exec_controls_setbit(vmx, SECONDARY_EXEC_DESC); + vmx_control_desc_intercept(vcpu, true); hw_cr4 &= ~X86_CR4_UMIP; } else if (!is_guest_mode(vcpu) || !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC)) { - secondary_exec_controls_clearbit(vmx, SECONDARY_EXEC_DESC); + vmx_control_desc_intercept(vcpu, false); } } @@ -7904,6 +7914,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .get_gdt = vmx_get_gdt, .set_gdt = vmx_set_gdt, .desc_ctrl_supported = vmx_desc_ctrl_supported, + .control_desc_intercept = vmx_control_desc_intercept, .set_dr7 = vmx_set_dr7, .sync_dirty_debug_regs = vmx_sync_dirty_debug_regs, .cache_reg = vmx_cache_reg,
From: Nicu?or C??u <ncitu at bitdefender.com> This function will be used to test if the descriptor-table registers access is already tracked by another user. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/svm/svm.c | 15 +++++++++++++++ arch/x86/kvm/svm/svm.h | 7 +++++++ arch/x86/kvm/vmx/vmx.c | 8 ++++++++ 4 files changed, 31 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 83dfa0247130..2ed1e5621ccf 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1125,6 +1125,7 @@ struct kvm_x86_ops { void (*set_gdt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt); bool (*desc_ctrl_supported)(void); void (*control_desc_intercept)(struct kvm_vcpu *vcpu, bool enable); + bool (*desc_intercepted)(struct kvm_vcpu *vcpu); void (*sync_dirty_debug_regs)(struct kvm_vcpu *vcpu); void (*set_dr7)(struct kvm_vcpu *vcpu, unsigned long value); void (*cache_reg)(struct kvm_vcpu *vcpu, enum kvm_reg reg); diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index c70c14461483..cc55c571fe86 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1553,6 +1553,20 @@ static void svm_control_desc_intercept(struct kvm_vcpu *vcpu, bool enable) } } +static inline bool svm_desc_intercepted(struct kvm_vcpu *vcpu) +{ + struct vcpu_svm *svm = to_svm(vcpu); + + return (get_intercept(svm, INTERCEPT_STORE_IDTR) || + get_intercept(svm, INTERCEPT_STORE_GDTR) || + get_intercept(svm, INTERCEPT_STORE_LDTR) || + get_intercept(svm, INTERCEPT_STORE_TR) || + get_intercept(svm, INTERCEPT_LOAD_IDTR) || + get_intercept(svm, INTERCEPT_LOAD_GDTR) || + get_intercept(svm, INTERCEPT_LOAD_LDTR) || + get_intercept(svm, INTERCEPT_LOAD_TR)); +} + static void update_cr0_intercept(struct vcpu_svm *svm) { ulong gcr0 = svm->vcpu.arch.cr0; @@ -4082,6 +4096,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .set_gdt = svm_set_gdt, .desc_ctrl_supported = svm_desc_ctrl_supported, .control_desc_intercept = svm_control_desc_intercept, + .desc_intercepted = svm_desc_intercepted, .set_dr7 = svm_set_dr7, .sync_dirty_debug_regs = svm_sync_dirty_debug_regs, .cache_reg = svm_cache_reg, diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index d5c956e07c12..f86fababe413 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -318,6 +318,13 @@ static inline void clr_intercept(struct vcpu_svm *svm, int bit) recalc_intercepts(svm); } +static inline bool get_intercept(struct vcpu_svm *svm, int bit) +{ + struct vmcb *vmcb = get_host_vmcb(svm); + + return (vmcb->control.intercept & (1ULL << bit)); +} + static inline bool is_intercept(struct vcpu_svm *svm, int bit) { return (svm->vmcb->control.intercept & (1ULL << bit)) != 0; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 199ffd318145..3b5778003b58 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3388,6 +3388,13 @@ static void vmx_set_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt) vmcs_writel(GUEST_GDTR_BASE, dt->address); } +static bool vmx_desc_intercepted(struct kvm_vcpu *vcpu) +{ + struct vcpu_vmx *vmx = to_vmx(vcpu); + + return !!(secondary_exec_controls_get(vmx) & SECONDARY_EXEC_DESC); +} + static bool rmode_segment_valid(struct kvm_vcpu *vcpu, int seg) { struct kvm_segment var; @@ -7915,6 +7922,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .set_gdt = vmx_set_gdt, .desc_ctrl_supported = vmx_desc_ctrl_supported, .control_desc_intercept = vmx_control_desc_intercept, + .desc_intercepted = vmx_desc_intercepted, .set_dr7 = vmx_set_dr7, .sync_dirty_debug_regs = vmx_sync_dirty_debug_regs, .cache_reg = vmx_cache_reg,
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 16/84] KVM: x86: export .msr_write_intercepted()
From: Nicu?or C??u <ncitu at bitdefender.com> This function will be used to check if the access for a specific MSR is already intercepted. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/svm/svm.c | 1 + arch/x86/kvm/vmx/vmx.c | 1 + 3 files changed, 3 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 2ed1e5621ccf..6be832ba9c97 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1106,6 +1106,7 @@ struct kvm_x86_ops { void (*update_bp_intercept)(struct kvm_vcpu *vcpu); int (*get_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr); int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr); + bool (*msr_write_intercepted)(struct kvm_vcpu *vcpu, u32 msr); u64 (*get_segment_base)(struct kvm_vcpu *vcpu, int seg); void (*get_segment)(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index cc55c571fe86..4e5b07606891 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4080,6 +4080,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .get_msr_feature = svm_get_msr_feature, .get_msr = svm_get_msr, .set_msr = svm_set_msr, + .msr_write_intercepted = msr_write_intercepted, .get_segment_base = svm_get_segment_base, .get_segment = svm_get_segment, .set_segment = svm_set_segment, diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 3b5778003b58..cf07db129670 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7906,6 +7906,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .get_msr_feature = vmx_get_msr_feature, .get_msr = vmx_get_msr, .set_msr = vmx_set_msr, + .msr_write_intercepted = msr_write_intercepted, .get_segment_base = vmx_get_segment_base, .get_segment = vmx_get_segment, .set_segment = vmx_set_segment,
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 17/84] KVM: x86: use MSR_TYPE_R, MSR_TYPE_W and MSR_TYPE_RW with AMD
From: Nicu?or C??u <ncitu at bitdefender.com> This is a preparatory patch in order to use a common interface to enable/disable the MSR interception. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 4 +++ arch/x86/kvm/svm/svm.c | 43 ++++++++++++++++++++++----------- arch/x86/kvm/vmx/vmx.h | 4 --- 3 files changed, 33 insertions(+), 18 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 6be832ba9c97..a3230ab377db 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -140,6 +140,10 @@ static inline gfn_t gfn_to_index(gfn_t gfn, gfn_t base_gfn, int level) #define CR_TYPE_W 2 #define CR_TYPE_RW 3 +#define MSR_TYPE_R 1 +#define MSR_TYPE_W 2 +#define MSR_TYPE_RW 3 + #define ASYNC_PF_PER_VCPU 64 enum kvm_reg { diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 4e5b07606891..e16be80edd7e 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -584,7 +584,7 @@ static bool msr_write_intercepted(struct kvm_vcpu *vcpu, unsigned msr) } static void set_msr_interception(u32 *msrpm, unsigned msr, - int read, int write) + int type, bool value) { u8 bit_read, bit_write; unsigned long tmp; @@ -603,8 +603,10 @@ static void set_msr_interception(u32 *msrpm, unsigned msr, BUG_ON(offset == MSR_INVALID); - read ? clear_bit(bit_read, &tmp) : set_bit(bit_read, &tmp); - write ? clear_bit(bit_write, &tmp) : set_bit(bit_write, &tmp); + if (type & MSR_TYPE_R) + value ? clear_bit(bit_read, &tmp) : set_bit(bit_read, &tmp); + if (type & MSR_TYPE_W) + value ? clear_bit(bit_write, &tmp) : set_bit(bit_write, &tmp); msrpm[offset] = tmp; } @@ -619,7 +621,8 @@ static void svm_vcpu_init_msrpm(u32 *msrpm) if (!direct_access_msrs[i].always) continue; - set_msr_interception(msrpm, direct_access_msrs[i].index, 1, 1); + set_msr_interception(msrpm, direct_access_msrs[i].index, + MSR_TYPE_RW, 1); } } @@ -671,10 +674,14 @@ static void svm_enable_lbrv(struct vcpu_svm *svm) u32 *msrpm = svm->msrpm; svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK; - set_msr_interception(msrpm, MSR_IA32_LASTBRANCHFROMIP, 1, 1); - set_msr_interception(msrpm, MSR_IA32_LASTBRANCHTOIP, 1, 1); - set_msr_interception(msrpm, MSR_IA32_LASTINTFROMIP, 1, 1); - set_msr_interception(msrpm, MSR_IA32_LASTINTTOIP, 1, 1); + set_msr_interception(msrpm, MSR_IA32_LASTBRANCHFROMIP, + MSR_TYPE_RW, 1); + set_msr_interception(msrpm, MSR_IA32_LASTBRANCHTOIP, + MSR_TYPE_RW, 1); + set_msr_interception(msrpm, MSR_IA32_LASTINTFROMIP, + MSR_TYPE_RW, 1); + set_msr_interception(msrpm, MSR_IA32_LASTINTTOIP, + MSR_TYPE_RW, 1); } static void svm_disable_lbrv(struct vcpu_svm *svm) @@ -682,10 +689,14 @@ static void svm_disable_lbrv(struct vcpu_svm *svm) u32 *msrpm = svm->msrpm; svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK; - set_msr_interception(msrpm, MSR_IA32_LASTBRANCHFROMIP, 0, 0); - set_msr_interception(msrpm, MSR_IA32_LASTBRANCHTOIP, 0, 0); - set_msr_interception(msrpm, MSR_IA32_LASTINTFROMIP, 0, 0); - set_msr_interception(msrpm, MSR_IA32_LASTINTTOIP, 0, 0); + set_msr_interception(msrpm, MSR_IA32_LASTBRANCHFROMIP, + MSR_TYPE_RW, 0); + set_msr_interception(msrpm, MSR_IA32_LASTBRANCHTOIP, + MSR_TYPE_RW, 0); + set_msr_interception(msrpm, MSR_IA32_LASTINTFROMIP, + MSR_TYPE_RW, 0); + set_msr_interception(msrpm, MSR_IA32_LASTINTTOIP, + MSR_TYPE_RW, 0); } void disable_nmi_singlestep(struct vcpu_svm *svm) @@ -2618,7 +2629,8 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) * We update the L1 MSR bit as well since it will end up * touching the MSR anyway now. */ - set_msr_interception(svm->msrpm, MSR_IA32_SPEC_CTRL, 1, 1); + set_msr_interception(svm->msrpm, MSR_IA32_SPEC_CTRL, + MSR_TYPE_RW, 1); break; case MSR_IA32_PRED_CMD: if (!msr->host_initiated && @@ -2633,7 +2645,10 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) break; wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB); - set_msr_interception(svm->msrpm, MSR_IA32_PRED_CMD, 0, 1); + set_msr_interception(svm->msrpm, MSR_IA32_PRED_CMD, + MSR_TYPE_R, 0); + set_msr_interception(svm->msrpm, MSR_IA32_PRED_CMD, + MSR_TYPE_W, 1); break; case MSR_AMD64_VIRT_SPEC_CTRL: if (!msr->host_initiated && diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index 639798e4a6ca..aa0c7ffd588b 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -14,10 +14,6 @@ extern const u32 vmx_msr_index[]; -#define MSR_TYPE_R 1 -#define MSR_TYPE_W 2 -#define MSR_TYPE_RW 3 - #define X2APIC_MSR(r) (APIC_BASE_MSR + ((r) >> 4)) #ifdef CONFIG_X86_64
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 18/84] KVM: svm: pass struct kvm_vcpu to set_msr_interception()
From: Nicu?or C??u <ncitu at bitdefender.com> This is preparatory patch to mediate the MSR interception between the introspection tool and the device manager (one must not disable the interception if the other one has enabled the interception). Passing NULL during initialization is OK because a vCPU can be introspected only after initialization. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/kvm/svm/svm.c | 27 ++++++++++++++------------- 1 file changed, 14 insertions(+), 13 deletions(-) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index e16be80edd7e..dfa1a6e74bf7 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -583,7 +583,8 @@ static bool msr_write_intercepted(struct kvm_vcpu *vcpu, unsigned msr) return !!test_bit(bit_write, &tmp); } -static void set_msr_interception(u32 *msrpm, unsigned msr, +static void set_msr_interception(struct kvm_vcpu *vcpu, + u32 *msrpm, unsigned msr, int type, bool value) { u8 bit_read, bit_write; @@ -621,7 +622,7 @@ static void svm_vcpu_init_msrpm(u32 *msrpm) if (!direct_access_msrs[i].always) continue; - set_msr_interception(msrpm, direct_access_msrs[i].index, + set_msr_interception(NULL, msrpm, direct_access_msrs[i].index, MSR_TYPE_RW, 1); } } @@ -674,13 +675,13 @@ static void svm_enable_lbrv(struct vcpu_svm *svm) u32 *msrpm = svm->msrpm; svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK; - set_msr_interception(msrpm, MSR_IA32_LASTBRANCHFROMIP, + set_msr_interception(&svm->vcpu, msrpm, MSR_IA32_LASTBRANCHFROMIP, MSR_TYPE_RW, 1); - set_msr_interception(msrpm, MSR_IA32_LASTBRANCHTOIP, + set_msr_interception(&svm->vcpu, msrpm, MSR_IA32_LASTBRANCHTOIP, MSR_TYPE_RW, 1); - set_msr_interception(msrpm, MSR_IA32_LASTINTFROMIP, + set_msr_interception(&svm->vcpu, msrpm, MSR_IA32_LASTINTFROMIP, MSR_TYPE_RW, 1); - set_msr_interception(msrpm, MSR_IA32_LASTINTTOIP, + set_msr_interception(&svm->vcpu, msrpm, MSR_IA32_LASTINTTOIP, MSR_TYPE_RW, 1); } @@ -689,13 +690,13 @@ static void svm_disable_lbrv(struct vcpu_svm *svm) u32 *msrpm = svm->msrpm; svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK; - set_msr_interception(msrpm, MSR_IA32_LASTBRANCHFROMIP, + set_msr_interception(&svm->vcpu, msrpm, MSR_IA32_LASTBRANCHFROMIP, MSR_TYPE_RW, 0); - set_msr_interception(msrpm, MSR_IA32_LASTBRANCHTOIP, + set_msr_interception(&svm->vcpu, msrpm, MSR_IA32_LASTBRANCHTOIP, MSR_TYPE_RW, 0); - set_msr_interception(msrpm, MSR_IA32_LASTINTFROMIP, + set_msr_interception(&svm->vcpu, msrpm, MSR_IA32_LASTINTFROMIP, MSR_TYPE_RW, 0); - set_msr_interception(msrpm, MSR_IA32_LASTINTTOIP, + set_msr_interception(&svm->vcpu, msrpm, MSR_IA32_LASTINTTOIP, MSR_TYPE_RW, 0); } @@ -2629,7 +2630,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) * We update the L1 MSR bit as well since it will end up * touching the MSR anyway now. */ - set_msr_interception(svm->msrpm, MSR_IA32_SPEC_CTRL, + set_msr_interception(&svm->vcpu, svm->msrpm, MSR_IA32_SPEC_CTRL, MSR_TYPE_RW, 1); break; case MSR_IA32_PRED_CMD: @@ -2645,9 +2646,9 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) break; wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB); - set_msr_interception(svm->msrpm, MSR_IA32_PRED_CMD, + set_msr_interception(&svm->vcpu, svm->msrpm, MSR_IA32_PRED_CMD, MSR_TYPE_R, 0); - set_msr_interception(svm->msrpm, MSR_IA32_PRED_CMD, + set_msr_interception(&svm->vcpu, svm->msrpm, MSR_IA32_PRED_CMD, MSR_TYPE_W, 1); break; case MSR_AMD64_VIRT_SPEC_CTRL:
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 19/84] KVM: vmx: pass struct kvm_vcpu to the intercept msr related functions
From: Nicu?or C??u <ncitu at bitdefender.com> This is preparatory patch to mediate the MSR interception between the introspection tool and the device manager (one must not disable the interception if the other one has enabled the interception). Passing NULL during initialization is OK because a vCPU can be introspected only after initialization. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/kvm/vmx/vmx.c | 74 ++++++++++++++++++++++++------------------ 1 file changed, 42 insertions(+), 32 deletions(-) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index cf07db129670..ecf7fb21b812 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -342,7 +342,8 @@ module_param_cb(vmentry_l1d_flush, &vmentry_l1d_flush_ops, NULL, 0644); static bool guest_state_valid(struct kvm_vcpu *vcpu); static u32 vmx_segment_access_rights(struct kvm_segment *var); -static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, +static __always_inline void vmx_disable_intercept_for_msr(struct kvm_vcpu *vcpu, + unsigned long *msr_bitmap, u32 msr, int type); void vmx_vmexit(void); @@ -2086,7 +2087,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) * in the merging. We update the vmcs01 here for L1 as well * since it will end up touching the MSR anyway now. */ - vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap, + vmx_disable_intercept_for_msr(vcpu, vmx->vmcs01.msr_bitmap, MSR_IA32_SPEC_CTRL, MSR_TYPE_RW); break; @@ -2122,8 +2123,8 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) * vmcs02.msr_bitmap here since it gets completely overwritten * in the merging. */ - vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap, MSR_IA32_PRED_CMD, - MSR_TYPE_W); + vmx_disable_intercept_for_msr(vcpu, vmx->vmcs01.msr_bitmap, + MSR_IA32_PRED_CMD, MSR_TYPE_W); break; case MSR_IA32_CR_PAT: if (!kvm_pat_valid(data)) @@ -3733,7 +3734,8 @@ void free_vpid(int vpid) spin_unlock(&vmx_vpid_lock); } -static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, +static __always_inline void vmx_disable_intercept_for_msr(struct kvm_vcpu *vcpu, + unsigned long *msr_bitmap, u32 msr, int type) { int f = sizeof(unsigned long); @@ -3771,7 +3773,8 @@ static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bit } } -static __always_inline void vmx_enable_intercept_for_msr(unsigned long *msr_bitmap, +static __always_inline void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, + unsigned long *msr_bitmap, u32 msr, int type) { int f = sizeof(unsigned long); @@ -3809,13 +3812,14 @@ static __always_inline void vmx_enable_intercept_for_msr(unsigned long *msr_bitm } } -static __always_inline void vmx_set_intercept_for_msr(unsigned long *msr_bitmap, +static __always_inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, + unsigned long *msr_bitmap, u32 msr, int type, bool value) { if (value) - vmx_enable_intercept_for_msr(msr_bitmap, msr, type); + vmx_enable_intercept_for_msr(vcpu, msr_bitmap, msr, type); else - vmx_disable_intercept_for_msr(msr_bitmap, msr, type); + vmx_disable_intercept_for_msr(vcpu, msr_bitmap, msr, type); } static u8 vmx_msr_bitmap_mode(struct kvm_vcpu *vcpu) @@ -3833,7 +3837,8 @@ static u8 vmx_msr_bitmap_mode(struct kvm_vcpu *vcpu) return mode; } -static void vmx_update_msr_bitmap_x2apic(unsigned long *msr_bitmap, +static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu, + unsigned long *msr_bitmap, u8 mode) { int msr; @@ -3849,11 +3854,11 @@ static void vmx_update_msr_bitmap_x2apic(unsigned long *msr_bitmap, * TPR reads and writes can be virtualized even if virtual interrupt * delivery is not in use. */ - vmx_disable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_TASKPRI), MSR_TYPE_RW); + vmx_disable_intercept_for_msr(vcpu, msr_bitmap, X2APIC_MSR(APIC_TASKPRI), MSR_TYPE_RW); if (mode & MSR_BITMAP_MODE_X2APIC_APICV) { - vmx_enable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R); - vmx_disable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_EOI), MSR_TYPE_W); - vmx_disable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W); + vmx_enable_intercept_for_msr(vcpu, msr_bitmap, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R); + vmx_disable_intercept_for_msr(vcpu, msr_bitmap, X2APIC_MSR(APIC_EOI), MSR_TYPE_W); + vmx_disable_intercept_for_msr(vcpu, msr_bitmap, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W); } } } @@ -3869,7 +3874,7 @@ void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu) return; if (changed & (MSR_BITMAP_MODE_X2APIC | MSR_BITMAP_MODE_X2APIC_APICV)) - vmx_update_msr_bitmap_x2apic(msr_bitmap, mode); + vmx_update_msr_bitmap_x2apic(vcpu, msr_bitmap, mode); vmx->msr_bitmap_mode = mode; } @@ -3878,20 +3883,21 @@ void pt_update_intercept_for_msr(struct vcpu_vmx *vmx) { unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap; bool flag = !(vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN); + struct kvm_vcpu *vcpu = &vmx->vcpu; u32 i; - vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_STATUS, + vmx_set_intercept_for_msr(vcpu, msr_bitmap, MSR_IA32_RTIT_STATUS, MSR_TYPE_RW, flag); - vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_OUTPUT_BASE, + vmx_set_intercept_for_msr(vcpu, msr_bitmap, MSR_IA32_RTIT_OUTPUT_BASE, MSR_TYPE_RW, flag); - vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_OUTPUT_MASK, + vmx_set_intercept_for_msr(vcpu, msr_bitmap, MSR_IA32_RTIT_OUTPUT_MASK, MSR_TYPE_RW, flag); - vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_CR3_MATCH, + vmx_set_intercept_for_msr(vcpu, msr_bitmap, MSR_IA32_RTIT_CR3_MATCH, MSR_TYPE_RW, flag); for (i = 0; i < vmx->pt_desc.addr_range; i++) { - vmx_set_intercept_for_msr(msr_bitmap, + vmx_set_intercept_for_msr(vcpu, msr_bitmap, MSR_IA32_RTIT_ADDR0_A + i * 2, MSR_TYPE_RW, flag); - vmx_set_intercept_for_msr(msr_bitmap, + vmx_set_intercept_for_msr(vcpu, msr_bitmap, MSR_IA32_RTIT_ADDR0_B + i * 2, MSR_TYPE_RW, flag); } } @@ -6947,18 +6953,22 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu) goto free_pml; msr_bitmap = vmx->vmcs01.msr_bitmap; - vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_TSC, MSR_TYPE_R); - vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW); - vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW); - vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW); - vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_CS, MSR_TYPE_RW); - vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW); - vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW); + vmx_disable_intercept_for_msr(NULL, msr_bitmap, MSR_IA32_TSC, MSR_TYPE_R); + vmx_disable_intercept_for_msr(NULL, msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW); + vmx_disable_intercept_for_msr(NULL, msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW); + vmx_disable_intercept_for_msr(NULL, msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW); + vmx_disable_intercept_for_msr(NULL, msr_bitmap, MSR_IA32_SYSENTER_CS, MSR_TYPE_RW); + vmx_disable_intercept_for_msr(NULL, msr_bitmap, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW); + vmx_disable_intercept_for_msr(NULL, msr_bitmap, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW); if (kvm_cstate_in_guest(vcpu->kvm)) { - vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C1_RES, MSR_TYPE_R); - vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C3_RESIDENCY, MSR_TYPE_R); - vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C6_RESIDENCY, MSR_TYPE_R); - vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C7_RESIDENCY, MSR_TYPE_R); + vmx_disable_intercept_for_msr(NULL, msr_bitmap, MSR_CORE_C1_RES, + MSR_TYPE_R); + vmx_disable_intercept_for_msr(NULL, msr_bitmap, MSR_CORE_C3_RESIDENCY, + MSR_TYPE_R); + vmx_disable_intercept_for_msr(NULL, msr_bitmap, MSR_CORE_C6_RESIDENCY, + MSR_TYPE_R); + vmx_disable_intercept_for_msr(NULL, msr_bitmap, MSR_CORE_C7_RESIDENCY, + MSR_TYPE_R); } vmx->msr_bitmap_mode = 0;
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 20/84] KVM: x86: add .control_msr_intercept()
From: Mihai Don?u <mdontu at bitdefender.com> This is needed for the KVMI_EVENT_MSR event, which is used notify the introspection tool about any change made to a MSR of interest. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 2 ++ arch/x86/kvm/svm/svm.c | 11 +++++++++++ arch/x86/kvm/vmx/vmx.c | 10 ++++++++++ 3 files changed, 23 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index a3230ab377db..f04a01dac423 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1110,6 +1110,8 @@ struct kvm_x86_ops { void (*update_bp_intercept)(struct kvm_vcpu *vcpu); int (*get_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr); int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr); + void (*control_msr_intercept)(struct kvm_vcpu *vcpu, unsigned int msr, + int type, bool enable); bool (*msr_write_intercepted)(struct kvm_vcpu *vcpu, u32 msr); u64 (*get_segment_base)(struct kvm_vcpu *vcpu, int seg); void (*get_segment)(struct kvm_vcpu *vcpu, diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index dfa1a6e74bf7..9c8e77193f98 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -612,6 +612,16 @@ static void set_msr_interception(struct kvm_vcpu *vcpu, msrpm[offset] = tmp; } +static void svm_control_msr_intercept(struct kvm_vcpu *vcpu, unsigned int msr, + int type, bool enable) +{ + const struct vcpu_svm *svm = to_svm(vcpu); + u32 *msrpm = is_guest_mode(vcpu) ? svm->nested.msrpm : + svm->msrpm; + + set_msr_interception(vcpu, msrpm, msr, type, !enable); +} + static void svm_vcpu_init_msrpm(u32 *msrpm) { int i; @@ -4096,6 +4106,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .get_msr_feature = svm_get_msr_feature, .get_msr = svm_get_msr, .set_msr = svm_set_msr, + .control_msr_intercept = svm_control_msr_intercept, .msr_write_intercepted = msr_write_intercepted, .get_segment_base = svm_get_segment_base, .get_segment = svm_get_segment, diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index ecf7fb21b812..fed661eb65a7 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3822,6 +3822,15 @@ static __always_inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, vmx_disable_intercept_for_msr(vcpu, msr_bitmap, msr, type); } +static void vmx_control_msr_intercept(struct kvm_vcpu *vcpu, unsigned int msr, + int type, bool enable) +{ + struct vcpu_vmx *vmx = to_vmx(vcpu); + unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap; + + vmx_set_intercept_for_msr(vcpu, msr_bitmap, msr, type, enable); +} + static u8 vmx_msr_bitmap_mode(struct kvm_vcpu *vcpu) { u8 mode = 0; @@ -7916,6 +7925,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .get_msr_feature = vmx_get_msr_feature, .get_msr = vmx_get_msr, .set_msr = vmx_set_msr, + .control_msr_intercept = vmx_control_msr_intercept, .msr_write_intercepted = msr_write_intercepted, .get_segment_base = vmx_get_segment_base, .get_segment = vmx_get_segment,
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 21/84] KVM: x86: vmx: use a symbolic constant when checking the exit qualifications
From: Mihai Don?u <mdontu at bitdefender.com> This should make the code more readable. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/kvm/vmx/vmx.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index fed661eb65a7..cd498ece8b52 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -5361,8 +5361,8 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu) EPT_VIOLATION_EXECUTABLE)) ? PFERR_PRESENT_MASK : 0; - error_code |= (exit_qualification & 0x100) != 0 ? - PFERR_GUEST_FINAL_MASK : PFERR_GUEST_PAGE_MASK; + error_code |= (exit_qualification & EPT_VIOLATION_GVA_TRANSLATED) + ? PFERR_GUEST_FINAL_MASK : PFERR_GUEST_PAGE_MASK; vcpu->arch.exit_qualification = exit_qualification; return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0);
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 22/84] KVM: x86: save the error code during EPT/NPF exits handling
From: Mihai Don?u <mdontu at bitdefender.com> This is needed for kvm_page_track_emulation_failure(). When the introspection tool {read,write,exec}-protect a guest memory page, it is notified from the read/write/fetch callbacks used by the KVM emulator. If the emulation fails it is possible that the read/write callbacks were not used. In such cases, the emulator will call kvm_page_track_emulation_failure() to ensure that the introspection tool is notified of the read/write #PF (based on this saved error code), which in turn can emulate the instruction or unprotect the memory page (and let the guest execute the instruction). Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 3 +++ arch/x86/kvm/svm/svm.c | 2 ++ arch/x86/kvm/vmx/vmx.c | 1 + 3 files changed, 6 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index f04a01dac423..2530af4420cf 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -837,6 +837,9 @@ struct kvm_vcpu_arch { /* AMD MSRC001_0015 Hardware Configuration */ u64 msr_hwcr; + + /* #PF translated error code from EPT/NPT exit reason */ + u64 error_code; }; struct kvm_lpage_info { diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 9c8e77193f98..1ec88ff241ab 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1799,6 +1799,8 @@ static int npf_interception(struct vcpu_svm *svm) u64 fault_address = __sme_clr(svm->vmcb->control.exit_info_2); u64 error_code = svm->vmcb->control.exit_info_1; + svm->vcpu.arch.error_code = error_code; + trace_kvm_page_fault(fault_address, error_code); return kvm_mmu_page_fault(&svm->vcpu, fault_address, error_code, static_cpu_has(X86_FEATURE_DECODEASSISTS) ? diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index cd498ece8b52..6554c2278176 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -5365,6 +5365,7 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu) ? PFERR_GUEST_FINAL_MASK : PFERR_GUEST_PAGE_MASK; vcpu->arch.exit_qualification = exit_qualification; + vcpu->arch.error_code = error_code; return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0); }
From: Mihai Don?u <mdontu at bitdefender.com> This function is needed for kvmi_update_ad_flags() and kvm_page_track_emulation_failure(). kvmi_update_ad_flags() uses the the existing guest page table walk code to update the A/D bits and return to guest (on SPT page faults caused by guest page table walks when the introspection tool write-protects the guest page tables). kvm_page_track_emulation_failure() calls the page tracking code, which will be changed with a following patch to receive the GVA in addition to the GPA. Both might be needed by the introspection tool. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 2 ++ arch/x86/include/asm/vmx.h | 2 ++ arch/x86/kvm/svm/svm.c | 9 +++++++++ arch/x86/kvm/vmx/vmx.c | 9 +++++++++ 4 files changed, 22 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 2530af4420cf..ccf2804f46b9 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1290,6 +1290,8 @@ struct kvm_x86_ops { int (*enable_direct_tlbflush)(struct kvm_vcpu *vcpu); void (*migrate_timers)(struct kvm_vcpu *vcpu); + + u64 (*fault_gla)(struct kvm_vcpu *vcpu); }; struct kvm_x86_nested_ops { diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index cd7de4b401fe..04487eb38b5c 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -543,6 +543,7 @@ enum vm_entry_failure_code { #define EPT_VIOLATION_READABLE_BIT 3 #define EPT_VIOLATION_WRITABLE_BIT 4 #define EPT_VIOLATION_EXECUTABLE_BIT 5 +#define EPT_VIOLATION_GLA_VALID_BIT 7 #define EPT_VIOLATION_GVA_TRANSLATED_BIT 8 #define EPT_VIOLATION_ACC_READ (1 << EPT_VIOLATION_ACC_READ_BIT) #define EPT_VIOLATION_ACC_WRITE (1 << EPT_VIOLATION_ACC_WRITE_BIT) @@ -550,6 +551,7 @@ enum vm_entry_failure_code { #define EPT_VIOLATION_READABLE (1 << EPT_VIOLATION_READABLE_BIT) #define EPT_VIOLATION_WRITABLE (1 << EPT_VIOLATION_WRITABLE_BIT) #define EPT_VIOLATION_EXECUTABLE (1 << EPT_VIOLATION_EXECUTABLE_BIT) +#define EPT_VIOLATION_GLA_VALID (1 << EPT_VIOLATION_GLA_VALID_BIT) #define EPT_VIOLATION_GVA_TRANSLATED (1 << EPT_VIOLATION_GVA_TRANSLATED_BIT) /* diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 1ec88ff241ab..86b670ff33dd 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4082,6 +4082,13 @@ static int svm_vm_init(struct kvm *kvm) return 0; } +static u64 svm_fault_gla(struct kvm_vcpu *vcpu) +{ + const struct vcpu_svm *svm = to_svm(vcpu); + + return svm->vcpu.arch.cr2 ? svm->vcpu.arch.cr2 : ~0ull; +} + static struct kvm_x86_ops svm_x86_ops __initdata = { .hardware_unsetup = svm_hardware_teardown, .hardware_enable = svm_hardware_enable, @@ -4208,6 +4215,8 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .need_emulation_on_page_fault = svm_need_emulation_on_page_fault, .apic_init_signal_blocked = svm_apic_init_signal_blocked, + + .fault_gla = svm_fault_gla, }; static struct kvm_x86_init_ops svm_init_ops __initdata = { diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 6554c2278176..a04c46cde5b3 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7902,6 +7902,13 @@ static bool vmx_check_apicv_inhibit_reasons(ulong bit) return supported & BIT(bit); } +static u64 vmx_fault_gla(struct kvm_vcpu *vcpu) +{ + if (vcpu->arch.exit_qualification & EPT_VIOLATION_GLA_VALID) + return vmcs_readl(GUEST_LINEAR_ADDRESS); + return ~0ull; +} + static struct kvm_x86_ops vmx_x86_ops __initdata = { .hardware_unsetup = hardware_unsetup, @@ -8038,6 +8045,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .need_emulation_on_page_fault = vmx_need_emulation_on_page_fault, .apic_init_signal_blocked = vmx_apic_init_signal_blocked, .migrate_timers = vmx_migrate_timers, + + .fault_gla = vmx_fault_gla, }; static __init int hardware_setup(void)
From: Mihai Don?u <mdontu at bitdefender.com> This function is needed for the KVMI_EVENT_PF event, to avoid sending such events to the introspection tool if not caused by a SPT page fault. The code path is: emulator -> {read,write,fetch} callbacks -> page tracking -> page tracking callbacks -> KVMI_EVENT_PF. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/svm/svm.c | 9 +++++++++ arch/x86/kvm/vmx/vmx.c | 8 ++++++++ 3 files changed, 18 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index ccf2804f46b9..fb41199b33fc 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1292,6 +1292,7 @@ struct kvm_x86_ops { void (*migrate_timers)(struct kvm_vcpu *vcpu); u64 (*fault_gla)(struct kvm_vcpu *vcpu); + bool (*spt_fault)(struct kvm_vcpu *vcpu); }; struct kvm_x86_nested_ops { diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 86b670ff33dd..7ecfa10dce5d 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4089,6 +4089,14 @@ static u64 svm_fault_gla(struct kvm_vcpu *vcpu) return svm->vcpu.arch.cr2 ? svm->vcpu.arch.cr2 : ~0ull; } +static bool svm_spt_fault(struct kvm_vcpu *vcpu) +{ + struct vcpu_svm *svm = to_svm(vcpu); + struct vmcb *vmcb = get_host_vmcb(svm); + + return (vmcb->control.exit_code == SVM_EXIT_NPF); +} + static struct kvm_x86_ops svm_x86_ops __initdata = { .hardware_unsetup = svm_hardware_teardown, .hardware_enable = svm_hardware_enable, @@ -4217,6 +4225,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .apic_init_signal_blocked = svm_apic_init_signal_blocked, .fault_gla = svm_fault_gla, + .spt_fault = svm_spt_fault, }; static struct kvm_x86_init_ops svm_init_ops __initdata = { diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index a04c46cde5b3..17b88345dfb5 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7909,6 +7909,13 @@ static u64 vmx_fault_gla(struct kvm_vcpu *vcpu) return ~0ull; } +static bool vmx_spt_fault(struct kvm_vcpu *vcpu) +{ + const struct vcpu_vmx *vmx = to_vmx(vcpu); + + return (vmx->exit_reason == EXIT_REASON_EPT_VIOLATION); +} + static struct kvm_x86_ops vmx_x86_ops __initdata = { .hardware_unsetup = hardware_unsetup, @@ -8047,6 +8054,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .migrate_timers = vmx_migrate_timers, .fault_gla = vmx_fault_gla, + .spt_fault = vmx_spt_fault, }; static __init int hardware_setup(void)
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 25/84] KVM: x86: add .gpt_translation_fault()
From: Mihai Don?u <mdontu at bitdefender.com> This function is needed for the KVMI_EVENT_PF event, to avoid sending such events to the introspection tool if caused by a guest page table walk. The code path is: emulator -> {read,write,fetch} callbacks -> page tracking -> page tracking callbacks -> KVMI_EVENT_PF. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/svm/svm.c | 12 ++++++++++++ arch/x86/kvm/vmx/vmx.c | 8 ++++++++ 3 files changed, 21 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index fb41199b33fc..a905e14e4c75 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1293,6 +1293,7 @@ struct kvm_x86_ops { u64 (*fault_gla)(struct kvm_vcpu *vcpu); bool (*spt_fault)(struct kvm_vcpu *vcpu); + bool (*gpt_translation_fault)(struct kvm_vcpu *vcpu); }; struct kvm_x86_nested_ops { diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 7ecfa10dce5d..580997701b1c 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4097,6 +4097,17 @@ static bool svm_spt_fault(struct kvm_vcpu *vcpu) return (vmcb->control.exit_code == SVM_EXIT_NPF); } +static bool svm_gpt_translation_fault(struct kvm_vcpu *vcpu) +{ + struct vcpu_svm *svm = to_svm(vcpu); + struct vmcb *vmcb = get_host_vmcb(svm); + + if (vmcb->control.exit_info_1 & PFERR_GUEST_PAGE_MASK) + return true; + + return false; +} + static struct kvm_x86_ops svm_x86_ops __initdata = { .hardware_unsetup = svm_hardware_teardown, .hardware_enable = svm_hardware_enable, @@ -4226,6 +4237,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .fault_gla = svm_fault_gla, .spt_fault = svm_spt_fault, + .gpt_translation_fault = svm_gpt_translation_fault, }; static struct kvm_x86_init_ops svm_init_ops __initdata = { diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 17b88345dfb5..a043e3e7d09a 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7916,6 +7916,13 @@ static bool vmx_spt_fault(struct kvm_vcpu *vcpu) return (vmx->exit_reason == EXIT_REASON_EPT_VIOLATION); } +static bool vmx_gpt_translation_fault(struct kvm_vcpu *vcpu) +{ + if (vcpu->arch.exit_qualification & EPT_VIOLATION_GVA_TRANSLATED) + return false; + return true; +} + static struct kvm_x86_ops vmx_x86_ops __initdata = { .hardware_unsetup = hardware_unsetup, @@ -8055,6 +8062,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .fault_gla = vmx_fault_gla, .spt_fault = vmx_spt_fault, + .gpt_translation_fault = vmx_gpt_translation_fault, }; static __init int hardware_setup(void)
From: Nicu?or C??u <ncitu at bitdefender.com> This function is needed for KVMI_VCPU_CONTROL_SINGLESTEP. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/vmx/vmx.c | 11 +++++++++++ 2 files changed, 12 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index a905e14e4c75..487d1fa6e76d 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1294,6 +1294,7 @@ struct kvm_x86_ops { u64 (*fault_gla)(struct kvm_vcpu *vcpu); bool (*spt_fault)(struct kvm_vcpu *vcpu); bool (*gpt_translation_fault)(struct kvm_vcpu *vcpu); + void (*control_singlestep)(struct kvm_vcpu *vcpu, bool enable); }; struct kvm_x86_nested_ops { diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index a043e3e7d09a..4ef4f3c1b78a 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7923,6 +7923,16 @@ static bool vmx_gpt_translation_fault(struct kvm_vcpu *vcpu) return true; } +static void vmx_control_singlestep(struct kvm_vcpu *vcpu, bool enable) +{ + if (enable) + exec_controls_setbit(to_vmx(vcpu), + CPU_BASED_MONITOR_TRAP_FLAG); + else + exec_controls_clearbit(to_vmx(vcpu), + CPU_BASED_MONITOR_TRAP_FLAG); +} + static struct kvm_x86_ops vmx_x86_ops __initdata = { .hardware_unsetup = hardware_unsetup, @@ -8063,6 +8073,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .fault_gla = vmx_fault_gla, .spt_fault = vmx_spt_fault, .gpt_translation_fault = vmx_gpt_translation_fault, + .control_singlestep = vmx_control_singlestep, }; static __init int hardware_setup(void)
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 27/84] KVM: x86: export kvm_arch_vcpu_set_guest_debug()
From: Nicu?or C??u <ncitu at bitdefender.com> This function is needed in order to notify the introspection tool through KVMI_EVENT_BP events on guest breakpoints. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/kvm/x86.c | 18 +++++++++++++----- include/linux/kvm_host.h | 2 ++ 2 files changed, 15 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 23bce3ef26d8..5611b6cd6d19 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9254,14 +9254,12 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu, return ret; } -int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, - struct kvm_guest_debug *dbg) +int kvm_arch_vcpu_set_guest_debug(struct kvm_vcpu *vcpu, + struct kvm_guest_debug *dbg) { unsigned long rflags; int i, r; - vcpu_load(vcpu); - if (dbg->control & (KVM_GUESTDBG_INJECT_DB | KVM_GUESTDBG_INJECT_BP)) { r = -EBUSY; if (vcpu->arch.exception.pending) @@ -9307,10 +9305,20 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, r = 0; out: - vcpu_put(vcpu); return r; } +int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, + struct kvm_guest_debug *dbg) +{ + int ret; + + vcpu_load(vcpu); + ret = kvm_arch_vcpu_set_guest_debug(vcpu, dbg); + vcpu_put(vcpu); + return ret; +} + /* * Translate a guest virtual address to a guest physical address. */ diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 49cbd175f45b..01628f7bcbcd 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -881,6 +881,8 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg); int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu); +int kvm_arch_vcpu_set_guest_debug(struct kvm_vcpu *vcpu, + struct kvm_guest_debug *dbg); int kvm_arch_init(void *opaque); void kvm_arch_exit(void);
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 28/84] KVM: x86: extend kvm_mmu_gva_to_gpa_system() with the 'access' parameter
From: Mihai Don?u <mdontu at bitdefender.com> This is needed for kvmi_update_ad_flags() to emulate a guest page table walk on SPT violations due to A/D bit updates. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/x86.c | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 487d1fa6e76d..e92a12647f4d 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1556,7 +1556,7 @@ gpa_t kvm_mmu_gva_to_gpa_fetch(struct kvm_vcpu *vcpu, gva_t gva, gpa_t kvm_mmu_gva_to_gpa_write(struct kvm_vcpu *vcpu, gva_t gva, struct x86_exception *exception); gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva, - struct x86_exception *exception); + u32 access, struct x86_exception *exception); bool kvm_apicv_activated(struct kvm *kvm); void kvm_apicv_init(struct kvm *kvm, bool enable); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 5611b6cd6d19..0bfa800d0ca8 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5498,9 +5498,9 @@ gpa_t kvm_mmu_gva_to_gpa_write(struct kvm_vcpu *vcpu, gva_t gva, /* uses this to access any guest's mapped memory without checking CPL */ gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva, - struct x86_exception *exception) + u32 access, struct x86_exception *exception) { - return vcpu->arch.walk_mmu->gva_to_gpa(vcpu, gva, 0, exception); + return vcpu->arch.walk_mmu->gva_to_gpa(vcpu, gva, access, exception); } static int kvm_read_guest_virt_helper(gva_t addr, void *val, unsigned int bytes, @@ -9332,7 +9332,7 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu, vcpu_load(vcpu); idx = srcu_read_lock(&vcpu->kvm->srcu); - gpa = kvm_mmu_gva_to_gpa_system(vcpu, vaddr, NULL); + gpa = kvm_mmu_gva_to_gpa_system(vcpu, vaddr, 0, NULL); srcu_read_unlock(&vcpu->kvm->srcu, idx); tr->physical_address = gpa; tr->valid = gpa != UNMAPPED_GVA;
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 29/84] KVM: x86: export kvm_inject_pending_exception()
From: Nicu?or C??u <ncitu at bitdefender.com> This function is needed for the KVMI_VCPU_INJECT_EXCEPTION command. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/x86.c | 53 +++++++++++++++++++-------------- 2 files changed, 32 insertions(+), 22 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index e92a12647f4d..4992afc19cf6 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1502,6 +1502,7 @@ unsigned long kvm_get_rflags(struct kvm_vcpu *vcpu); void kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags); bool kvm_rdpmc(struct kvm_vcpu *vcpu); +bool kvm_inject_pending_exception(struct kvm_vcpu *vcpu); void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr); void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code); void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr, unsigned long payload); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 0bfa800d0ca8..52181eb131dd 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7770,6 +7770,36 @@ static void update_cr8_intercept(struct kvm_vcpu *vcpu) kvm_x86_ops.update_cr8_intercept(vcpu, tpr, max_irr); } +bool kvm_inject_pending_exception(struct kvm_vcpu *vcpu) +{ + if (vcpu->arch.exception.pending) { + trace_kvm_inj_exception(vcpu->arch.exception.nr, + vcpu->arch.exception.has_error_code, + vcpu->arch.exception.error_code); + + WARN_ON_ONCE(vcpu->arch.exception.injected); + vcpu->arch.exception.pending = false; + vcpu->arch.exception.injected = true; + + if (exception_type(vcpu->arch.exception.nr) == EXCPT_FAULT) + __kvm_set_rflags(vcpu, kvm_get_rflags(vcpu) | + X86_EFLAGS_RF); + + if (vcpu->arch.exception.nr == DB_VECTOR) { + kvm_deliver_exception_payload(vcpu); + if (vcpu->arch.dr7 & DR7_GD) { + vcpu->arch.dr7 &= ~DR7_GD; + kvm_update_dr7(vcpu); + } + } + + kvm_x86_ops.queue_exception(vcpu); + return true; + } + + return false; +} + static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit) { int r; @@ -7821,29 +7851,8 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit } /* try to inject new event if pending */ - if (vcpu->arch.exception.pending) { - trace_kvm_inj_exception(vcpu->arch.exception.nr, - vcpu->arch.exception.has_error_code, - vcpu->arch.exception.error_code); - - vcpu->arch.exception.pending = false; - vcpu->arch.exception.injected = true; - - if (exception_type(vcpu->arch.exception.nr) == EXCPT_FAULT) - __kvm_set_rflags(vcpu, kvm_get_rflags(vcpu) | - X86_EFLAGS_RF); - - if (vcpu->arch.exception.nr == DB_VECTOR) { - kvm_deliver_exception_payload(vcpu); - if (vcpu->arch.dr7 & DR7_GD) { - vcpu->arch.dr7 &= ~DR7_GD; - kvm_update_dr7(vcpu); - } - } - - kvm_x86_ops.queue_exception(vcpu); + if (kvm_inject_pending_exception(vcpu)) can_inject = false; - } /* * Finally, inject interrupt events. If an event cannot be injected
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 30/84] KVM: x86: export kvm_vcpu_ioctl_x86_get_xsave()
From: Nicu?or C??u <ncitu at bitdefender.com> This function is needed for the KVMI_VCPU_GET_XSAVE command. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/kvm/x86.c | 4 ++-- include/linux/kvm_host.h | 2 ++ 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 52181eb131dd..4d5be48b5239 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4207,8 +4207,8 @@ static void load_xsave(struct kvm_vcpu *vcpu, u8 *src) } } -static void kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu, - struct kvm_xsave *guest_xsave) +void kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu, + struct kvm_xsave *guest_xsave) { if (boot_cpu_has(X86_FEATURE_XSAVE)) { memset(guest_xsave, 0, sizeof(struct kvm_xsave)); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 01628f7bcbcd..f138d56450c0 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -883,6 +883,8 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu); int kvm_arch_vcpu_set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg); +void kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu, + struct kvm_xsave *guest_xsave); int kvm_arch_init(void *opaque); void kvm_arch_exit(void);
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 31/84] KVM: x86: export kvm_vcpu_ioctl_x86_set_xsave()
This function is needed for the KVMI_VCPU_SET_XSAVE command. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/kvm/x86.c | 4 ++-- include/linux/kvm_host.h | 2 ++ 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 4d5be48b5239..b7eb223dc1aa 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4224,8 +4224,8 @@ void kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu, #define XSAVE_MXCSR_OFFSET 24 -static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu, - struct kvm_xsave *guest_xsave) +int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu, + struct kvm_xsave *guest_xsave) { u64 xstate_bv *(u64 *)&guest_xsave->region[XSAVE_HDR_OFFSET / sizeof(u32)]; diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index f138d56450c0..5b6f1338de74 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -885,6 +885,8 @@ int kvm_arch_vcpu_set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg); void kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu, struct kvm_xsave *guest_xsave); +int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu, + struct kvm_xsave *guest_xsave); int kvm_arch_init(void *opaque); void kvm_arch_exit(void);
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 32/84] KVM: x86: page track: provide all callbacks with the guest virtual address
From: Mihai Don?u <mdontu at bitdefender.com> This is needed because the emulator calls the page tracking code irrespective of the current VM-exit reason or available information. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/include/asm/kvm_page_track.h | 10 ++++++---- arch/x86/kvm/mmu/mmu.c | 2 +- arch/x86/kvm/mmu/page_track.c | 6 +++--- arch/x86/kvm/x86.c | 16 ++++++++-------- drivers/gpu/drm/i915/gvt/kvmgt.c | 2 +- 6 files changed, 20 insertions(+), 18 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 4992afc19cf6..b6a1704e0f89 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1384,7 +1384,7 @@ void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned long kvm_nr_mmu_pages); int load_pdptrs(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, unsigned long cr3); bool pdptrs_changed(struct kvm_vcpu *vcpu); -int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa, +int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, const void *val, int bytes); struct kvm_irq_mask_notifier { diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h index 87bd6025d91d..9a261e463eb3 100644 --- a/arch/x86/include/asm/kvm_page_track.h +++ b/arch/x86/include/asm/kvm_page_track.h @@ -28,12 +28,14 @@ struct kvm_page_track_notifier_node { * * @vcpu: the vcpu where the write access happened. * @gpa: the physical address written by guest. + * @gva: the virtual address written by guest. * @new: the data was written to the address. * @bytes: the written length. * @node: this node */ - void (*track_write)(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new, - int bytes, struct kvm_page_track_notifier_node *node); + void (*track_write)(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + const u8 *new, int bytes, + struct kvm_page_track_notifier_node *node); /* * It is called when memory slot is being moved or removed * users can drop write-protection for the pages in that memory slot @@ -68,7 +70,7 @@ kvm_page_track_register_notifier(struct kvm *kvm, void kvm_page_track_unregister_notifier(struct kvm *kvm, struct kvm_page_track_notifier_node *n); -void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new, - int bytes); +void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + const u8 *new, int bytes); void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot); #endif diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 6d6a0ae7800c..038a0e028e77 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5318,7 +5318,7 @@ static const union kvm_mmu_page_role role_ign = { .invalid = 0x1, }; -static void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, +static void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, const u8 *new, int bytes, struct kvm_page_track_notifier_node *node) { diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c index a7bcde34d1f2..9642af1b2c21 100644 --- a/arch/x86/kvm/mmu/page_track.c +++ b/arch/x86/kvm/mmu/page_track.c @@ -216,8 +216,8 @@ EXPORT_SYMBOL_GPL(kvm_page_track_unregister_notifier); * The node should figure out if the written page is the one that node is * interested in by itself. */ -void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new, - int bytes) +void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + const u8 *new, int bytes) { struct kvm_page_track_notifier_head *head; struct kvm_page_track_notifier_node *n; @@ -231,7 +231,7 @@ void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new, idx = srcu_read_lock(&head->track_srcu); hlist_for_each_entry_rcu(n, &head->track_notifier_list, node) if (n->track_write) - n->track_write(vcpu, gpa, new, bytes, n); + n->track_write(vcpu, gpa, gva, new, bytes, n); srcu_read_unlock(&head->track_srcu, idx); } diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index b7eb223dc1aa..a59c935f4bbe 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5720,7 +5720,7 @@ static int vcpu_mmio_gva_to_gpa(struct kvm_vcpu *vcpu, unsigned long gva, return vcpu_is_mmio_gpa(vcpu, gva, *gpa, write); } -int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa, +int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, const void *val, int bytes) { int ret; @@ -5728,14 +5728,14 @@ int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa, ret = kvm_vcpu_write_guest(vcpu, gpa, val, bytes); if (ret < 0) return 0; - kvm_page_track_write(vcpu, gpa, val, bytes); + kvm_page_track_write(vcpu, gpa, gva, val, bytes); return 1; } struct read_write_emulator_ops { int (*read_write_prepare)(struct kvm_vcpu *vcpu, void *val, int bytes); - int (*read_write_emulate)(struct kvm_vcpu *vcpu, gpa_t gpa, + int (*read_write_emulate)(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, void *val, int bytes); int (*read_write_mmio)(struct kvm_vcpu *vcpu, gpa_t gpa, int bytes, void *val); @@ -5756,16 +5756,16 @@ static int read_prepare(struct kvm_vcpu *vcpu, void *val, int bytes) return 0; } -static int read_emulate(struct kvm_vcpu *vcpu, gpa_t gpa, +static int read_emulate(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, void *val, int bytes) { return !kvm_vcpu_read_guest(vcpu, gpa, val, bytes); } -static int write_emulate(struct kvm_vcpu *vcpu, gpa_t gpa, +static int write_emulate(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, void *val, int bytes) { - return emulator_write_phys(vcpu, gpa, val, bytes); + return emulator_write_phys(vcpu, gpa, gva, val, bytes); } static int write_mmio(struct kvm_vcpu *vcpu, gpa_t gpa, int bytes, void *val) @@ -5833,7 +5833,7 @@ static int emulator_read_write_onepage(unsigned long addr, void *val, return X86EMUL_PROPAGATE_FAULT; } - if (!ret && ops->read_write_emulate(vcpu, gpa, val, bytes)) + if (!ret && ops->read_write_emulate(vcpu, gpa, addr, val, bytes)) return X86EMUL_CONTINUE; /* @@ -6002,7 +6002,7 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt, if (!exchanged) return X86EMUL_CMPXCHG_FAILED; - kvm_page_track_write(vcpu, gpa, new, bytes); + kvm_page_track_write(vcpu, gpa, addr, new, bytes); return X86EMUL_CONTINUE; diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index ad8a9df49f29..4e370b216365 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -1749,7 +1749,7 @@ static int kvmgt_page_track_remove(unsigned long handle, u64 gfn) return 0; } -static void kvmgt_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, +static void kvmgt_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, const u8 *val, int len, struct kvm_page_track_notifier_node *node) {
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 33/84] KVM: x86: page track: add track_create_slot() callback
From: Mihai Don?u <mdontu at bitdefender.com> This is used to add page access notifications as soon as a slot appears. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_page_track.h | 13 ++++++++++++- arch/x86/kvm/mmu/page_track.c | 16 +++++++++++++++- arch/x86/kvm/x86.c | 7 ++++--- 3 files changed, 31 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h index 9a261e463eb3..00a66c4d4d3c 100644 --- a/arch/x86/include/asm/kvm_page_track.h +++ b/arch/x86/include/asm/kvm_page_track.h @@ -36,6 +36,17 @@ struct kvm_page_track_notifier_node { void (*track_write)(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, const u8 *new, int bytes, struct kvm_page_track_notifier_node *node); + /* + * It is called when memory slot is being created + * + * @kvm: the kvm where memory slot being moved or removed + * @slot: the memory slot being moved or removed + * @npages: the number of pages + * @node: this node + */ + void (*track_create_slot)(struct kvm *kvm, struct kvm_memory_slot *slot, + unsigned long npages, + struct kvm_page_track_notifier_node *node); /* * It is called when memory slot is being moved or removed * users can drop write-protection for the pages in that memory slot @@ -52,7 +63,7 @@ void kvm_page_track_init(struct kvm *kvm); void kvm_page_track_cleanup(struct kvm *kvm); void kvm_page_track_free_memslot(struct kvm_memory_slot *slot); -int kvm_page_track_create_memslot(struct kvm_memory_slot *slot, +int kvm_page_track_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, unsigned long npages); void kvm_slot_page_track_add_page(struct kvm *kvm, diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c index 9642af1b2c21..02759b81a04c 100644 --- a/arch/x86/kvm/mmu/page_track.c +++ b/arch/x86/kvm/mmu/page_track.c @@ -28,9 +28,12 @@ void kvm_page_track_free_memslot(struct kvm_memory_slot *slot) } } -int kvm_page_track_create_memslot(struct kvm_memory_slot *slot, +int kvm_page_track_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, unsigned long npages) { + struct kvm_page_track_notifier_head *head; + struct kvm_page_track_notifier_node *n; + int idx; int i; for (i = 0; i < KVM_PAGE_TRACK_MAX; i++) { @@ -41,6 +44,17 @@ int kvm_page_track_create_memslot(struct kvm_memory_slot *slot, goto track_free; } + head = &kvm->arch.track_notifier_head; + + if (hlist_empty(&head->track_notifier_list)) + return 0; + + idx = srcu_read_lock(&head->track_srcu); + hlist_for_each_entry_rcu(n, &head->track_notifier_list, node) + if (n->track_create_slot) + n->track_create_slot(kvm, slot, npages, n); + srcu_read_unlock(&head->track_srcu, idx); + return 0; track_free: diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index a59c935f4bbe..83424339ea9d 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -10070,7 +10070,8 @@ void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *slot) kvm_page_track_free_memslot(slot); } -static int kvm_alloc_memslot_metadata(struct kvm_memory_slot *slot, +static int kvm_alloc_memslot_metadata(struct kvm *kvm, + struct kvm_memory_slot *slot, unsigned long npages) { int i; @@ -10122,7 +10123,7 @@ static int kvm_alloc_memslot_metadata(struct kvm_memory_slot *slot, } } - if (kvm_page_track_create_memslot(slot, npages)) + if (kvm_page_track_create_memslot(kvm, slot, npages)) goto out_free; return 0; @@ -10162,7 +10163,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, enum kvm_mr_change change) { if (change == KVM_MR_CREATE || change == KVM_MR_MOVE) - return kvm_alloc_memslot_metadata(memslot, + return kvm_alloc_memslot_metadata(kvm, memslot, mem->memory_size >> PAGE_SHIFT); return 0; }
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 34/84] KVM: x86: page_track: add support for preread, prewrite and preexec
From: Mihai Don?u <mdontu at bitdefender.com> The access to a tracked memory page leads to two types of actions from the introspection tool: either the access is allowed (maybe with different data for the source operand) or the vCPU should re-enter in guest (the page is not tracked anymore, the instruction was skipped/emulated by the introspection tool, etc.). These new callbacks must return 'true' for the first case and 'false' for the second. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_page_track.h | 48 ++++++++++- arch/x86/kvm/mmu.h | 4 + arch/x86/kvm/mmu/mmu.c | 81 +++++++++++++++++ arch/x86/kvm/mmu/page_track.c | 120 ++++++++++++++++++++++++-- 4 files changed, 243 insertions(+), 10 deletions(-) diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h index 00a66c4d4d3c..c10f0f65c77a 100644 --- a/arch/x86/include/asm/kvm_page_track.h +++ b/arch/x86/include/asm/kvm_page_track.h @@ -3,7 +3,10 @@ #define _ASM_X86_KVM_PAGE_TRACK_H enum kvm_page_track_mode { + KVM_PAGE_TRACK_PREREAD, + KVM_PAGE_TRACK_PREWRITE, KVM_PAGE_TRACK_WRITE, + KVM_PAGE_TRACK_PREEXEC, KVM_PAGE_TRACK_MAX, }; @@ -22,6 +25,33 @@ struct kvm_page_track_notifier_head { struct kvm_page_track_notifier_node { struct hlist_node node; + /* + * It is called when guest is reading the read-tracked page + * and the read emulation is about to happen. + * + * @vcpu: the vcpu where the read access happened. + * @gpa: the physical address read by guest. + * @gva: the virtual address read by guest. + * @bytes: the read length. + * @node: this node. + */ + bool (*track_preread)(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + int bytes, + struct kvm_page_track_notifier_node *node); + /* + * It is called when guest is writing the write-tracked page + * and the write emulation didn't happened yet. + * + * @vcpu: the vcpu where the write access happened. + * @gpa: the physical address written by guest. + * @gva: the virtual address written by guest. + * @new: the data was written to the address. + * @bytes: the written length. + * @node: this node + */ + bool (*track_prewrite)(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + const u8 *new, int bytes, + struct kvm_page_track_notifier_node *node); /* * It is called when guest is writing the write-tracked page * and write emulation is finished at that time. @@ -36,6 +66,17 @@ struct kvm_page_track_notifier_node { void (*track_write)(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, const u8 *new, int bytes, struct kvm_page_track_notifier_node *node); + /* + * It is called when guest is fetching from a exec-tracked page + * and the fetch emulation is about to happen. + * + * @vcpu: the vcpu where the fetch access happened. + * @gpa: the physical address fetched by guest. + * @gva: the virtual address fetched by guest. + * @node: this node. + */ + bool (*track_preexec)(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + struct kvm_page_track_notifier_node *node); /* * It is called when memory slot is being created * @@ -49,7 +90,7 @@ struct kvm_page_track_notifier_node { struct kvm_page_track_notifier_node *node); /* * It is called when memory slot is being moved or removed - * users can drop write-protection for the pages in that memory slot + * users can drop active protection for the pages in that memory slot * * @kvm: the kvm where memory slot being moved or removed * @slot: the memory slot being moved or removed @@ -81,7 +122,12 @@ kvm_page_track_register_notifier(struct kvm *kvm, void kvm_page_track_unregister_notifier(struct kvm *kvm, struct kvm_page_track_notifier_node *n); +bool kvm_page_track_preread(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + int bytes); +bool kvm_page_track_prewrite(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + const u8 *new, int bytes); void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, const u8 *new, int bytes); +bool kvm_page_track_preexec(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva); void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot); #endif diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 444bb9c54548..e2c0518af750 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -222,6 +222,10 @@ void kvm_mmu_gfn_disallow_lpage(struct kvm_memory_slot *slot, gfn_t gfn); void kvm_mmu_gfn_allow_lpage(struct kvm_memory_slot *slot, gfn_t gfn); bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm, struct kvm_memory_slot *slot, u64 gfn); +bool kvm_mmu_slot_gfn_read_protect(struct kvm *kvm, + struct kvm_memory_slot *slot, u64 gfn); +bool kvm_mmu_slot_gfn_exec_protect(struct kvm *kvm, + struct kvm_memory_slot *slot, u64 gfn); int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu, gpa_t l2_gpa); int kvm_mmu_post_init_vm(struct kvm *kvm); diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 038a0e028e77..ede8ef6d1e34 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1579,6 +1579,31 @@ static bool spte_write_protect(u64 *sptep, bool pt_protect) return mmu_spte_update(sptep, spte); } +static bool spte_read_protect(u64 *sptep) +{ + u64 spte = *sptep; + bool exec_only_supported = (shadow_present_mask == 0ull); + + rmap_printk("rmap_read_protect: spte %p %llx\n", sptep, *sptep); + + WARN_ON_ONCE(!exec_only_supported); + + spte = spte & ~(PT_WRITABLE_MASK | PT_PRESENT_MASK); + + return mmu_spte_update(sptep, spte); +} + +static bool spte_exec_protect(u64 *sptep) +{ + u64 spte = *sptep; + + rmap_printk("rmap_exec_protect: spte %p %llx\n", sptep, *sptep); + + spte = spte & ~PT_USER_MASK; + + return mmu_spte_update(sptep, spte); +} + static bool __rmap_write_protect(struct kvm *kvm, struct kvm_rmap_head *rmap_head, bool pt_protect) @@ -1593,6 +1618,32 @@ static bool __rmap_write_protect(struct kvm *kvm, return flush; } +static bool __rmap_read_protect(struct kvm *kvm, + struct kvm_rmap_head *rmap_head) +{ + struct rmap_iterator iter; + bool flush = false; + u64 *sptep; + + for_each_rmap_spte(rmap_head, &iter, sptep) + flush |= spte_read_protect(sptep); + + return flush; +} + +static bool __rmap_exec_protect(struct kvm *kvm, + struct kvm_rmap_head *rmap_head) +{ + struct rmap_iterator iter; + bool flush = false; + u64 *sptep; + + for_each_rmap_spte(rmap_head, &iter, sptep) + flush |= spte_exec_protect(sptep); + + return flush; +} + static bool spte_clear_dirty(u64 *sptep) { u64 spte = *sptep; @@ -1768,6 +1819,36 @@ bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm, return write_protected; } +bool kvm_mmu_slot_gfn_read_protect(struct kvm *kvm, + struct kvm_memory_slot *slot, u64 gfn) +{ + struct kvm_rmap_head *rmap_head; + bool read_protected = false; + int i; + + for (i = PG_LEVEL_4K; i <= KVM_MAX_HUGEPAGE_LEVEL; ++i) { + rmap_head = __gfn_to_rmap(gfn, i, slot); + read_protected |= __rmap_read_protect(kvm, rmap_head); + } + + return read_protected; +} + +bool kvm_mmu_slot_gfn_exec_protect(struct kvm *kvm, + struct kvm_memory_slot *slot, u64 gfn) +{ + struct kvm_rmap_head *rmap_head; + bool exec_protected = false; + int i; + + for (i = PG_LEVEL_4K; i <= KVM_MAX_HUGEPAGE_LEVEL; ++i) { + rmap_head = __gfn_to_rmap(gfn, i, slot); + exec_protected |= __rmap_exec_protect(kvm, rmap_head); + } + + return exec_protected; +} + static bool rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn) { struct kvm_memory_slot *slot; diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c index 02759b81a04c..b593bcf80be0 100644 --- a/arch/x86/kvm/mmu/page_track.c +++ b/arch/x86/kvm/mmu/page_track.c @@ -1,6 +1,6 @@ // SPDX-License-Identifier: GPL-2.0-only /* - * Support KVM gust page tracking + * Support KVM guest page tracking * * This feature allows us to track page access in guest. Currently, only * write access is tracked. @@ -95,7 +95,7 @@ static void update_gfn_track(struct kvm_memory_slot *slot, gfn_t gfn, * @kvm: the guest instance we are interested in. * @slot: the @gfn belongs to. * @gfn: the guest page. - * @mode: tracking mode, currently only write track is supported. + * @mode: tracking mode. */ void kvm_slot_page_track_add_page(struct kvm *kvm, struct kvm_memory_slot *slot, gfn_t gfn, @@ -113,9 +113,16 @@ void kvm_slot_page_track_add_page(struct kvm *kvm, */ kvm_mmu_gfn_disallow_lpage(slot, gfn); - if (mode == KVM_PAGE_TRACK_WRITE) + if (mode == KVM_PAGE_TRACK_PREWRITE || mode == KVM_PAGE_TRACK_WRITE) { if (kvm_mmu_slot_gfn_write_protect(kvm, slot, gfn)) kvm_flush_remote_tlbs(kvm); + } else if (mode == KVM_PAGE_TRACK_PREREAD) { + if (kvm_mmu_slot_gfn_read_protect(kvm, slot, gfn)) + kvm_flush_remote_tlbs(kvm); + } else if (mode == KVM_PAGE_TRACK_PREEXEC) { + if (kvm_mmu_slot_gfn_exec_protect(kvm, slot, gfn)) + kvm_flush_remote_tlbs(kvm); + } } EXPORT_SYMBOL_GPL(kvm_slot_page_track_add_page); @@ -130,7 +137,7 @@ EXPORT_SYMBOL_GPL(kvm_slot_page_track_add_page); * @kvm: the guest instance we are interested in. * @slot: the @gfn belongs to. * @gfn: the guest page. - * @mode: tracking mode, currently only write track is supported. + * @mode: tracking mode. */ void kvm_slot_page_track_remove_page(struct kvm *kvm, struct kvm_memory_slot *slot, gfn_t gfn, @@ -223,12 +230,78 @@ kvm_page_track_unregister_notifier(struct kvm *kvm, } EXPORT_SYMBOL_GPL(kvm_page_track_unregister_notifier); +/* + * Notify the node that a read access is about to happen. Returning false + * doesn't stop the other nodes from being called, but it will stop + * the emulation. + * + * The node should figure out if the read page is the one that the node + * is interested in by itself. + * + * The nodes will always be in conflict if they track the same page: + * - accepting a read won't guarantee that the next node will not override + * the data (filling new/bytes and setting data_ready) + * - filling new/bytes with custom data won't guarantee that the next node + * will not override that + */ +bool kvm_page_track_preread(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + int bytes) +{ + struct kvm_page_track_notifier_head *head; + struct kvm_page_track_notifier_node *n; + int idx; + bool ret = true; + + head = &vcpu->kvm->arch.track_notifier_head; + + if (hlist_empty(&head->track_notifier_list)) + return ret; + + idx = srcu_read_lock(&head->track_srcu); + hlist_for_each_entry_rcu(n, &head->track_notifier_list, node) + if (n->track_preread) + if (!n->track_preread(vcpu, gpa, gva, bytes, n)) + ret = false; + srcu_read_unlock(&head->track_srcu, idx); + return ret; +} + +/* + * Notify the node that a write access is about to happen. Returning false + * doesn't stop the other nodes from being called, but it will stop + * the emulation. + * + * The node should figure out if the written page is the one that the node + * is interested in by itself. + */ +bool kvm_page_track_prewrite(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + const u8 *new, int bytes) +{ + struct kvm_page_track_notifier_head *head; + struct kvm_page_track_notifier_node *n; + int idx; + bool ret = true; + + head = &vcpu->kvm->arch.track_notifier_head; + + if (hlist_empty(&head->track_notifier_list)) + return ret; + + idx = srcu_read_lock(&head->track_srcu); + hlist_for_each_entry_rcu(n, &head->track_notifier_list, node) + if (n->track_prewrite) + if (!n->track_prewrite(vcpu, gpa, gva, new, bytes, n)) + ret = false; + srcu_read_unlock(&head->track_srcu, idx); + return ret; +} + /* * Notify the node that write access is intercepted and write emulation is * finished at this time. * - * The node should figure out if the written page is the one that node is - * interested in by itself. + * The node should figure out if the written page is the one that the node + * is interested in by itself. */ void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, const u8 *new, int bytes) @@ -249,12 +322,41 @@ void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, srcu_read_unlock(&head->track_srcu, idx); } +/* + * Notify the node that an instruction is about to be executed. + * Returning false doesn't stop the other nodes from being called, + * but it will stop the emulation with X86EMUL_RETRY_INSTR. + * + * The node should figure out if the page is the one that the node + * is interested in by itself. + */ +bool kvm_page_track_preexec(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva) +{ + struct kvm_page_track_notifier_head *head; + struct kvm_page_track_notifier_node *n; + int idx; + bool ret = true; + + head = &vcpu->kvm->arch.track_notifier_head; + + if (hlist_empty(&head->track_notifier_list)) + return ret; + + idx = srcu_read_lock(&head->track_srcu); + hlist_for_each_entry_rcu(n, &head->track_notifier_list, node) + if (n->track_preexec) + if (!n->track_preexec(vcpu, gpa, gva, n)) + ret = false; + srcu_read_unlock(&head->track_srcu, idx); + return ret; +} + /* * Notify the node that memory slot is being removed or moved so that it can - * drop write-protection for the pages in the memory slot. + * drop active protection for the pages in the memory slot. * - * The node should figure out it has any write-protected pages in this slot - * by itself. + * The node should figure out if the page is the one that the node + * is interested in by itself. */ void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot) {
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 35/84] KVM: x86: wire in the preread/prewrite/preexec page trackers
From: Mihai Don?u <mdontu at bitdefender.com> These are needed in order to notify the introspection tool when read/write/execute access happens on one of the tracked memory pages. Also, this patch adds the case when the introspection tool requests that the vCPU re-enter in guest (and abort the emulation of the current instruction). Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Marian Rotariu <marian.c.rotariu at gmail.com> Signed-off-by: Marian Rotariu <marian.c.rotariu at gmail.com> Co-developed-by: Stefan Sicleru <ssicleru at bitdefender.com> Signed-off-by: Stefan Sicleru <ssicleru at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/kvm/emulate.c | 4 +++ arch/x86/kvm/kvm_emulate.h | 1 + arch/x86/kvm/mmu/mmu.c | 58 +++++++++++++++++++++++++++++--------- arch/x86/kvm/x86.c | 45 +++++++++++++++++++++++------ 4 files changed, 85 insertions(+), 23 deletions(-) diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index d0e2825ae617..15a005a3b3f5 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -5442,6 +5442,8 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len) ctxt->memopp->addr.mem.ea + ctxt->_eip); done: + if (rc == X86EMUL_RETRY_INSTR) + return EMULATION_RETRY_INSTR; if (rc == X86EMUL_PROPAGATE_FAULT) ctxt->have_exception = true; return (rc != X86EMUL_CONTINUE) ? EMULATION_FAILED : EMULATION_OK; @@ -5813,6 +5815,8 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt) if (rc == X86EMUL_INTERCEPTED) return EMULATION_INTERCEPTED; + if (rc == X86EMUL_RETRY_INSTR) + return EMULATION_RETRY_INSTR; if (rc == X86EMUL_CONTINUE) writeback_registers(ctxt); diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h index 43c93ffa76ed..5bfab8d65cd1 100644 --- a/arch/x86/kvm/kvm_emulate.h +++ b/arch/x86/kvm/kvm_emulate.h @@ -496,6 +496,7 @@ bool x86_page_table_writing_insn(struct x86_emulate_ctxt *ctxt); #define EMULATION_OK 0 #define EMULATION_RESTART 1 #define EMULATION_INTERCEPTED 2 +#define EMULATION_RETRY_INSTR 3 void init_decode_cache(struct x86_emulate_ctxt *ctxt); int x86_emulate_insn(struct x86_emulate_ctxt *ctxt); int emulator_task_switch(struct x86_emulate_ctxt *ctxt, diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index ede8ef6d1e34..da57321e0cec 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1226,9 +1226,13 @@ static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp) slot = __gfn_to_memslot(slots, gfn); /* the non-leaf shadow pages are keeping readonly. */ - if (sp->role.level > PG_LEVEL_4K) - return kvm_slot_page_track_add_page(kvm, slot, gfn, - KVM_PAGE_TRACK_WRITE); + if (sp->role.level > PG_LEVEL_4K) { + kvm_slot_page_track_add_page(kvm, slot, gfn, + KVM_PAGE_TRACK_PREWRITE); + kvm_slot_page_track_add_page(kvm, slot, gfn, + KVM_PAGE_TRACK_WRITE); + return; + } kvm_mmu_gfn_disallow_lpage(slot, gfn); } @@ -1254,9 +1258,13 @@ static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp) gfn = sp->gfn; slots = kvm_memslots_for_spte_role(kvm, sp->role); slot = __gfn_to_memslot(slots, gfn); - if (sp->role.level > PG_LEVEL_4K) - return kvm_slot_page_track_remove_page(kvm, slot, gfn, - KVM_PAGE_TRACK_WRITE); + if (sp->role.level > PG_LEVEL_4K) { + kvm_slot_page_track_remove_page(kvm, slot, gfn, + KVM_PAGE_TRACK_PREWRITE); + kvm_slot_page_track_remove_page(kvm, slot, gfn, + KVM_PAGE_TRACK_WRITE); + return; + } kvm_mmu_gfn_allow_lpage(slot, gfn); } @@ -2987,7 +2995,8 @@ static bool mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn, { struct kvm_mmu_page *sp; - if (kvm_page_track_is_active(vcpu, gfn, KVM_PAGE_TRACK_WRITE)) + if (kvm_page_track_is_active(vcpu, gfn, KVM_PAGE_TRACK_PREWRITE) || + kvm_page_track_is_active(vcpu, gfn, KVM_PAGE_TRACK_WRITE)) return true; for_each_gfn_indirect_valid_sp(vcpu->kvm, sp, gfn) { @@ -3405,6 +3414,21 @@ static void disallowed_hugepage_adjust(struct kvm_shadow_walk_iterator it, } } +static unsigned int kvm_mmu_apply_introspection_access(struct kvm_vcpu *vcpu, + gfn_t gfn, + unsigned int acc) +{ + if (kvm_page_track_is_active(vcpu, gfn, KVM_PAGE_TRACK_PREREAD)) + acc &= ~ACC_USER_MASK; + if (kvm_page_track_is_active(vcpu, gfn, KVM_PAGE_TRACK_PREWRITE) || + kvm_page_track_is_active(vcpu, gfn, KVM_PAGE_TRACK_WRITE)) + acc &= ~ACC_WRITE_MASK; + if (kvm_page_track_is_active(vcpu, gfn, KVM_PAGE_TRACK_PREEXEC)) + acc &= ~ACC_EXEC_MASK; + + return acc; +} + static int __direct_map(struct kvm_vcpu *vcpu, gpa_t gpa, int write, int map_writable, int max_level, kvm_pfn_t pfn, bool prefault, bool account_disallowed_nx_lpage) @@ -3414,6 +3438,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t gpa, int write, int level, ret; gfn_t gfn = gpa >> PAGE_SHIFT; gfn_t base_gfn = gfn; + unsigned int acc; if (WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root_hpa))) return RET_PF_RETRY; @@ -3443,7 +3468,9 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t gpa, int write, } } - ret = mmu_set_spte(vcpu, it.sptep, ACC_ALL, + acc = kvm_mmu_apply_introspection_access(vcpu, base_gfn, ACC_ALL); + + ret = mmu_set_spte(vcpu, it.sptep, acc, write, level, base_gfn, pfn, prefault, map_writable); direct_pte_prefetch(vcpu, it.sptep); @@ -4098,15 +4125,18 @@ static bool page_fault_handle_page_track(struct kvm_vcpu *vcpu, if (unlikely(error_code & PFERR_RSVD_MASK)) return false; - if (!(error_code & PFERR_PRESENT_MASK) || - !(error_code & PFERR_WRITE_MASK)) - return false; - /* - * guest is writing the page which is write tracked which can + * guest is reading/writing/fetching the page which is + * read/write/execute tracked which can * not be fixed by page fault handler. */ - if (kvm_page_track_is_active(vcpu, gfn, KVM_PAGE_TRACK_WRITE)) + if (((error_code & PFERR_USER_MASK) + && kvm_page_track_is_active(vcpu, gfn, KVM_PAGE_TRACK_PREREAD)) + || ((error_code & PFERR_WRITE_MASK) + && (kvm_page_track_is_active(vcpu, gfn, KVM_PAGE_TRACK_PREWRITE) + || kvm_page_track_is_active(vcpu, gfn, KVM_PAGE_TRACK_WRITE))) + || ((error_code & PFERR_FETCH_MASK) + && kvm_page_track_is_active(vcpu, gfn, KVM_PAGE_TRACK_PREEXEC))) return true; return false; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 83424339ea9d..7668ca5b8a7a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5519,6 +5519,8 @@ static int kvm_read_guest_virt_helper(gva_t addr, void *val, unsigned int bytes, if (gpa == UNMAPPED_GVA) return X86EMUL_PROPAGATE_FAULT; + if (!kvm_page_track_preread(vcpu, gpa, addr, toread)) + return X86EMUL_RETRY_INSTR; ret = kvm_vcpu_read_guest_page(vcpu, gpa >> PAGE_SHIFT, data, offset, toread); if (ret < 0) { @@ -5550,6 +5552,9 @@ static int kvm_fetch_guest_virt(struct x86_emulate_ctxt *ctxt, if (unlikely(gpa == UNMAPPED_GVA)) return X86EMUL_PROPAGATE_FAULT; + if (!kvm_page_track_preexec(vcpu, gpa, addr)) + return X86EMUL_RETRY_INSTR; + offset = addr & (PAGE_SIZE-1); if (WARN_ON(offset + bytes > PAGE_SIZE)) bytes = (unsigned)PAGE_SIZE - offset; @@ -5618,11 +5623,14 @@ static int kvm_write_guest_virt_helper(gva_t addr, void *val, unsigned int bytes if (gpa == UNMAPPED_GVA) return X86EMUL_PROPAGATE_FAULT; + if (!kvm_page_track_prewrite(vcpu, gpa, addr, data, towrite)) + return X86EMUL_RETRY_INSTR; ret = kvm_vcpu_write_guest(vcpu, gpa, data, towrite); if (ret < 0) { r = X86EMUL_IO_NEEDED; goto out; } + kvm_page_track_write(vcpu, gpa, addr, data, towrite); bytes -= towrite; data += towrite; @@ -5723,13 +5731,22 @@ static int vcpu_mmio_gva_to_gpa(struct kvm_vcpu *vcpu, unsigned long gva, int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, const void *val, int bytes) { - int ret; - - ret = kvm_vcpu_write_guest(vcpu, gpa, val, bytes); - if (ret < 0) - return 0; + if (!kvm_page_track_prewrite(vcpu, gpa, gva, val, bytes)) + return X86EMUL_RETRY_INSTR; + if (kvm_vcpu_write_guest(vcpu, gpa, val, bytes) < 0) + return X86EMUL_UNHANDLEABLE; kvm_page_track_write(vcpu, gpa, gva, val, bytes); - return 1; + return X86EMUL_CONTINUE; +} + +static int emulator_read_phys(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + void *val, int bytes) +{ + if (!kvm_page_track_preread(vcpu, gpa, gva, bytes)) + return X86EMUL_RETRY_INSTR; + if (kvm_vcpu_read_guest(vcpu, gpa, val, bytes) < 0) + return X86EMUL_UNHANDLEABLE; + return X86EMUL_CONTINUE; } struct read_write_emulator_ops { @@ -5759,7 +5776,7 @@ static int read_prepare(struct kvm_vcpu *vcpu, void *val, int bytes) static int read_emulate(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, void *val, int bytes) { - return !kvm_vcpu_read_guest(vcpu, gpa, val, bytes); + return emulator_read_phys(vcpu, gpa, gva, val, bytes); } static int write_emulate(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, @@ -5833,8 +5850,11 @@ static int emulator_read_write_onepage(unsigned long addr, void *val, return X86EMUL_PROPAGATE_FAULT; } - if (!ret && ops->read_write_emulate(vcpu, gpa, addr, val, bytes)) - return X86EMUL_CONTINUE; + if (!ret) { + ret = ops->read_write_emulate(vcpu, gpa, addr, val, bytes); + if (ret == X86EMUL_CONTINUE || ret == X86EMUL_RETRY_INSTR) + return ret; + } /* * Is this MMIO handled locally? @@ -5978,6 +5998,9 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt, if (kvm_vcpu_map(vcpu, gpa_to_gfn(gpa), &map)) goto emul_write; + if (!kvm_page_track_prewrite(vcpu, gpa, addr, new, bytes)) + return X86EMUL_RETRY_INSTR; + kaddr = map.hva + offset_in_page(gpa); switch (bytes) { @@ -6904,6 +6927,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, trace_kvm_emulate_insn_start(vcpu); ++vcpu->stat.insn_emulation; + if (r == EMULATION_RETRY_INSTR) + return 1; if (r != EMULATION_OK) { if ((emulation_type & EMULTYPE_TRAP_UD) || (emulation_type & EMULTYPE_TRAP_UD_FORCED)) { @@ -6973,6 +6998,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, r = x86_emulate_insn(ctxt); + if (r == EMULATION_RETRY_INSTR) + return 1; if (r == EMULATION_INTERCEPTED) return 1;
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 36/84] KVM: x86: disable gpa_available optimization for fetch and page-walk SPT violations
From: Mircea C?rjaliu <mcirjaliu at bitdefender.com> This change is needed because the introspection tool can write-protect guest page tables or exec-protect heap/stack pages. Signed-off-by: Mircea C?rjaliu <mcirjaliu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 5 +++++ arch/x86/kvm/mmu/mmu.c | 8 ++++++++ arch/x86/kvm/x86.c | 2 +- 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index b6a1704e0f89..8a119fb7c623 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1436,6 +1436,10 @@ extern u64 kvm_mce_cap_supported; * retry native execution under certain conditions, * Can only be set in conjunction with EMULTYPE_PF. * + * EMULTYPE_GPA_AVAILABLE_PF - Set when the emulator can avoid a page walk + * to get the GPA. + * Can only be set in conjunction with EMULTYPE_PF. + * * EMULTYPE_TRAP_UD_FORCED - Set when emulating an intercepted #UD that was * triggered by KVM's magic "force emulation" prefix, * which is opt in via module param (off by default). @@ -1458,6 +1462,7 @@ extern u64 kvm_mce_cap_supported; #define EMULTYPE_TRAP_UD_FORCED (1 << 4) #define EMULTYPE_VMWARE_GP (1 << 5) #define EMULTYPE_PF (1 << 6) +#define EMULTYPE_GPA_AVAILABLE_PF (1 << 7) int kvm_emulate_instruction(struct kvm_vcpu *vcpu, int emulation_type); int kvm_emulate_instruction_from_buffer(struct kvm_vcpu *vcpu, diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index da57321e0cec..4df5b729e2c5 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5566,6 +5566,14 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code, */ if (!mmio_info_in_cache(vcpu, cr2_or_gpa, direct) && !is_guest_mode(vcpu)) emulation_type |= EMULTYPE_ALLOW_RETRY_PF; + + /* + * With shadow page tables, fault_address contains a GVA or nGPA. + * On a fetch fault, fault_address contains the instruction pointer. + */ + if (direct && likely(!(error_code & PFERR_FETCH_MASK)) && + (error_code & PFERR_GUEST_FINAL_MASK)) + emulation_type |= EMULTYPE_GPA_AVAILABLE_PF; emulate: /* * On AMD platforms, under certain conditions insn_len may be zero on #NPF. diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 7668ca5b8a7a..ffcf09e9bf78 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -6987,7 +6987,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, ctxt->exception.address = cr2_or_gpa; /* With shadow page tables, cr2 contains a GVA or nGPA. */ - if (vcpu->arch.mmu->direct_map) { + if (emulation_type & EMULTYPE_GPA_AVAILABLE_PF) { ctxt->gpa_available = true; ctxt->gpa_val = cr2_or_gpa; }
From: Mihai Don?u <mdontu at bitdefender.com> The KVM introspection subsystem provides a facility for applications to control the execution of any running VMs (pause, resume, shutdown), query the state of the vCPUs (GPRs, MSRs etc.), alter the page access bits in the shadow page tables and receive notifications when events of interest have taken place (shadow page table level faults, key MSR writes, hypercalls etc.). Some notifications can be responded to with an action (like preventing an MSR from being written), others are mere informative (like breakpoint events which can be used for execution tracing). Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Marian Rotariu <marian.c.rotariu at gmail.com> Signed-off-by: Marian Rotariu <marian.c.rotariu at gmail.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 140 ++++++++++++++++++++++++++++++ arch/x86/kvm/Kconfig | 13 +++ arch/x86/kvm/Makefile | 2 + include/linux/kvmi_host.h | 21 +++++ virt/kvm/introspection/kvmi.c | 25 ++++++ virt/kvm/introspection/kvmi_int.h | 7 ++ virt/kvm/kvm_main.c | 15 ++++ 7 files changed, 223 insertions(+) create mode 100644 Documentation/virt/kvm/kvmi.rst create mode 100644 include/linux/kvmi_host.h create mode 100644 virt/kvm/introspection/kvmi.c create mode 100644 virt/kvm/introspection/kvmi_int.h diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst new file mode 100644 index 000000000000..3a1b6c655de7 --- /dev/null +++ b/Documentation/virt/kvm/kvmi.rst @@ -0,0 +1,140 @@ +.. SPDX-License-Identifier: GPL-2.0 + +========================================================+KVMI - The kernel virtual machine introspection subsystem +========================================================+ +The KVM introspection subsystem provides a facility for applications running +on the host or in a separate VM, to control the execution of any running VMs +(pause, resume, shutdown), query the state of the vCPUs (GPRs, MSRs etc.), +alter the page access bits in the shadow page tables (only for the hardware +backed ones, eg. Intel's EPT) and receive notifications when events of +interest have taken place (shadow page table level faults, key MSR writes, +hypercalls etc.). Some notifications can be responded to with an action +(like preventing an MSR from being written), others are mere informative +(like breakpoint events which can be used for execution tracing). +With few exceptions, all events are optional. An application using this +subsystem will explicitly register for them. + +The use case that gave way for the creation of this subsystem is to monitor +the guest OS and as such the ABI/API is highly influenced by how the guest +software (kernel, applications) sees the world. For example, some events +provide information specific for the host CPU architecture +(eg. MSR_IA32_SYSENTER_EIP) merely because its leveraged by guest software +to implement a critical feature (fast system calls). + +At the moment, the target audience for KVMI are security software authors +that wish to perform forensics on newly discovered threats (exploits) or +to implement another layer of security like preventing a large set of +kernel rootkits simply by "locking" the kernel image in the shadow page +tables (ie. enforce .text r-x, .rodata rw- etc.). It's the latter case that +made KVMI a separate subsystem, even though many of these features are +available in the device manager (eg. QEMU). The ability to build a security +application that does not interfere (in terms of performance) with the +guest software asks for a specialized interface that is designed for minimum +overhead. + +API/ABI +======+ +This chapter describes the VMI interface used to monitor and control local +guests from a user application. + +Overview +-------- + +The interface is socket based, one connection for every VM. One end is in the +host kernel while the other is held by the user application (introspection +tool). + +The initial connection is established by an application running on the host +(eg. QEMU) that connects to the introspection tool and after a handshake +the socket is passed to the host kernel making all further communication +take place between it and the introspection tool. + +The socket protocol allows for commands and events to be multiplexed over +the same connection. As such, it is possible for the introspection tool to +receive an event while waiting for the result of a command. Also, it can +send a command while the host kernel is waiting for a reply to an event. + +The kernel side of the socket communication is blocking and will wait +for an answer from its peer indefinitely or until the guest is powered +off (killed), restarted or the peer goes away, at which point it will +wake up and properly cleanup as if the introspection subsystem has never +been used on that guest (if requested). Obviously, whether the guest can +really continue normal execution depends on whether the introspection +tool has made any modifications that require an active KVMI channel. + +Handshake +--------- + +Although this falls out of the scope of the introspection subsystem, below +is a proposal of a handshake that can be used by implementors. + +Based on the system administration policies, the management tool +(eg. libvirt) starts device managers (eg. QEMU) with some extra arguments: +what introspection tool could monitor/control that specific guest (and +how to connect to) and what introspection commands/events are allowed. + +The device manager will connect to the introspection tool and wait for a +cryptographic hash of a cookie that should be known by both peers. If the +hash is correct (the destination has been "authenticated"), the device +manager will send another cryptographic hash and random salt. The peer +recomputes the hash of the cookie bytes including the salt and if they match, +the device manager has been "authenticated" too. This is a rather crude +system that makes it difficult for device manager exploits to trick the +introspection tool into believing its working OK. + +The cookie would normally be generated by a management tool (eg. libvirt) +and make it available to the device manager and to a properly authenticated +client. It is the job of a third party to retrieve the cookie from the +management application and pass it over a secure channel to the introspection +tool. + +Once the basic "authentication" has taken place, the introspection tool +can receive information on the guest (its UUID) and other flags (endianness +or features supported by the host kernel). + +In the end, the device manager will pass the file handle (plus the allowed +commands/events) to KVM. It will detect when the socket is shutdown +and it will reinitiate the handshake. + +Unhooking +--------- + +During a VMI session it is possible for the guest to be patched and for +some of these patches to "talk" with the introspection tool. It thus +becomes necessary to remove them before the guest is suspended, moved +(migrated) or a snapshot with memory is created. + +The actions are normally performed by the device manager. In the case +of QEMU, it will use another ioctl to notify the introspection tool and +wait for a limited amount of time (a few seconds) for a confirmation that +is OK to proceed (it is enough for the introspection tool to close +the connection). + +Live migrations +--------------- + +Before the live migration takes place, the introspection tool has to be +notified and have a chance to unhook (see **Unhooking**). + +The QEMU instance on the receiving end, if configured for KVMI, will need +to establish a connection to the introspection tool after the migration +has been completed. + +Obviously, this creates a window in which the guest is not introspected. +The user has to be aware of this detail. Future introspection technologies +can choose not to disconnect and instead transfer the necessary context +to the introspection tool at the migration destination via a separate +channel. + +Memory access safety +-------------------- + +The KVMI API gives access to the entire guest physical address space but +provides no information on which parts of it are system RAM and which are +device-specific memory (DMA, emulated MMIO, reserved by a passthrough +device etc.). It is up to the user to determine, using the guest operating +system data structures, the areas that are safe to access (code, stack, heap +etc.). diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index b277a2db6267..34d0b1bbab95 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -107,4 +107,17 @@ config KVM_MMU_AUDIT This option adds a R/W kVM module parameter 'mmu_audit', which allows auditing of KVM MMU events at runtime. +config KVM_INTROSPECTION + bool "KVM Introspection" + depends on KVM && (KVM_INTEL || KVM_AMD) + default n + help + Provides the introspection interface, which allows the control + of any running VM. It must be explicitly enabled by setting + the module parameter 'kvm.introspection'. + +# OK, it's a little counter-intuitive to do this, but it puts it neatly under +# the virtualization menu. +source "drivers/vhost/Kconfig" + endif # VIRTUALIZATION diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile index 4a3081e9f4b5..880b028c7f86 100644 --- a/arch/x86/kvm/Makefile +++ b/arch/x86/kvm/Makefile @@ -8,10 +8,12 @@ OBJECT_FILES_NON_STANDARD_vmenter.o := y endif KVM := ../../../virt/kvm +KVMI := $(KVM)/introspection kvm-y += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \ $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o +kvm-$(CONFIG_KVM_INTROSPECTION) += $(KVMI)/kvmi.o kvm-y += x86.o emulate.o i8259.o irq.o lapic.o \ i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \ diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h new file mode 100644 index 000000000000..1e0a73c2a190 --- /dev/null +++ b/include/linux/kvmi_host.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef __KVMI_HOST_H +#define __KVMI_HOST_H + +#ifdef CONFIG_KVM_INTROSPECTION + +int kvmi_init(void); +void kvmi_uninit(void); +void kvmi_create_vm(struct kvm *kvm); +void kvmi_destroy_vm(struct kvm *kvm); + +#else + +static inline int kvmi_init(void) { return 0; } +static inline void kvmi_uninit(void) { } +static inline void kvmi_create_vm(struct kvm *kvm) { } +static inline void kvmi_destroy_vm(struct kvm *kvm) { } + +#endif /* CONFIG_KVM_INTROSPECTION */ + +#endif diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c new file mode 100644 index 000000000000..af53bdcb7ec8 --- /dev/null +++ b/virt/kvm/introspection/kvmi.c @@ -0,0 +1,25 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * KVM Introspection + * + * Copyright (C) 2017-2020 Bitdefender S.R.L. + * + */ +#include "kvmi_int.h" + +int kvmi_init(void) +{ + return 0; +} + +void kvmi_uninit(void) +{ +} + +void kvmi_create_vm(struct kvm *kvm) +{ +} + +void kvmi_destroy_vm(struct kvm *kvm) +{ +} diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h new file mode 100644 index 000000000000..34af926f9838 --- /dev/null +++ b/virt/kvm/introspection/kvmi_int.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __KVMI_INT_H__ +#define __KVMI_INT_H__ + +#include <linux/kvm_host.h> + +#endif diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 8c4bccf33c8c..a2b424fd2efd 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -51,6 +51,7 @@ #include <linux/io.h> #include <linux/lockdep.h> #include <linux/kthread.h> +#include <linux/kvmi_host.h> #include <asm/processor.h> #include <asm/ioctl.h> @@ -89,6 +90,9 @@ unsigned int halt_poll_ns_shrink; module_param(halt_poll_ns_shrink, uint, 0644); EXPORT_SYMBOL_GPL(halt_poll_ns_shrink); +static bool enable_introspection; +module_param_named(introspection, enable_introspection, bool, 0644); + /* * Ordering of locks: * @@ -745,6 +749,9 @@ static struct kvm *kvm_create_vm(unsigned long type) if (r) goto out_err; + if (enable_introspection) + kvmi_create_vm(kvm); + mutex_lock(&kvm_lock); list_add(&kvm->vm_list, &vm_list); mutex_unlock(&kvm_lock); @@ -797,6 +804,8 @@ static void kvm_destroy_vm(struct kvm *kvm) int i; struct mm_struct *mm = kvm->mm; + if (enable_introspection) + kvmi_destroy_vm(kvm); kvm_uevent_notify_change(KVM_EVENT_DESTROY_VM, kvm); kvm_destroy_vm_debugfs(kvm); kvm_arch_sync_events(kvm); @@ -4811,6 +4820,11 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align, r = kvm_vfio_ops_init(); WARN_ON(r); + if (enable_introspection) { + r = kvmi_init(); + WARN_ON(r); + } + return 0; out_unreg: @@ -4835,6 +4849,7 @@ EXPORT_SYMBOL_GPL(kvm_init); void kvm_exit(void) { + kvmi_uninit(); debugfs_remove_recursive(kvm_debugfs_dir); misc_deregister(&kvm_dev); kmem_cache_destroy(kvm_vcpu_cache);
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 38/84] KVM: introspection: add hook/unhook ioctls
On hook, a new thread is created to handle the messages coming from the introspection tool (commands or event replies). The VM related commands are handled by this thread, while the vCPU commands and events replies are dispatched to the vCPU threads. On unhook, the socket is shut down, which will signal: the receiving thread to quit (because it might be blocked in recvmsg()) and the introspection tool to clean up. The mutex is used to protect the 'kvm->kvmi' pointer when accessed through ioctls. The reference counter is used by the receiving thread (for its entire life time) and by the vCPU threads while sending introspection events or handling introspection commands. The completion objects is set when the reference counter reaches zero and the unhook process is waiting for it in order to free the introspection structures. Co-developed-by: Mircea C?rjaliu <mcirjaliu at bitdefender.com> Signed-off-by: Mircea C?rjaliu <mcirjaliu at bitdefender.com> Co-developed-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/api.rst | 55 ++++++ arch/x86/include/asm/kvmi_host.h | 8 + arch/x86/kvm/Makefile | 2 +- arch/x86/kvm/x86.c | 6 + include/linux/kvm_host.h | 5 + include/linux/kvmi_host.h | 17 ++ include/uapi/linux/kvm.h | 10 ++ include/uapi/linux/kvmi.h | 13 ++ tools/testing/selftests/kvm/Makefile | 1 + .../testing/selftests/kvm/x86_64/kvmi_test.c | 94 ++++++++++ virt/kvm/introspection/kvmi.c | 162 ++++++++++++++++++ virt/kvm/introspection/kvmi_int.h | 22 +++ virt/kvm/introspection/kvmi_msg.c | 39 +++++ virt/kvm/kvm_main.c | 19 ++ 14 files changed, 452 insertions(+), 1 deletion(-) create mode 100644 arch/x86/include/asm/kvmi_host.h create mode 100644 include/uapi/linux/kvmi.h create mode 100644 tools/testing/selftests/kvm/x86_64/kvmi_test.c create mode 100644 virt/kvm/introspection/kvmi_msg.c diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 320788f81a05..e34f20430eb1 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -4697,6 +4697,61 @@ KVM_PV_VM_VERIFY Verify the integrity of the unpacked image. Only if this succeeds, KVM is allowed to start protected VCPUs. +4.126 KVM_INTROSPECTION_HOOK +---------------------------- + +:Capability: KVM_CAP_INTROSPECTION +:Architectures: x86 +:Type: vm ioctl +:Parameters: struct kvm_introspection (in) +:Returns: 0 on success, a negative value on error + +Errors: + + ====== =========================================================+ ENOMEM the memory allocation failed + EEXIST the VM is already introspected + EINVAL the file descriptor doesn't correspond to an active socket + EINVAL the padding is not zero + EPERM the introspection is disabled (kvm.introspection=0) + ====== =========================================================+ +This ioctl is used to enable the introspection of the current VM. + +:: + + struct kvm_introspection { + __s32 fd; + __u32 padding; + __u8 uuid[16]; + }; + +fd is the file descriptor of a socket connected to the introspection tool, + +padding must be zero (it might be used in the future), + +uuid is used for debug and error messages. + +The KVMI version can be retrieved using the KVM_CAP_INTROSPECTION of +the KVM_CHECK_EXTENSION ioctl() at run-time. + +4.127 KVM_INTROSPECTION_UNHOOK +------------------------------ + +:Capability: KVM_CAP_INTROSPECTION +:Architectures: x86 +:Type: vm ioctl +:Parameters: none +:Returns: 0 on success, a negative value on error + +Errors: + + ====== =========================================================+ EPERM the introspection is disabled (kvm.introspection=0) + ====== =========================================================+ +This ioctl is used to free all introspection structures +related to this VM. 5. The kvm_run structure =======================diff --git a/arch/x86/include/asm/kvmi_host.h b/arch/x86/include/asm/kvmi_host.h new file mode 100644 index 000000000000..38c398262913 --- /dev/null +++ b/arch/x86/include/asm/kvmi_host.h @@ -0,0 +1,8 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_KVMI_HOST_H +#define _ASM_X86_KVMI_HOST_H + +struct kvm_arch_introspection { +}; + +#endif /* _ASM_X86_KVMI_HOST_H */ diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile index 880b028c7f86..fb0242032cd1 100644 --- a/arch/x86/kvm/Makefile +++ b/arch/x86/kvm/Makefile @@ -13,7 +13,7 @@ KVMI := $(KVM)/introspection kvm-y += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \ $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o -kvm-$(CONFIG_KVM_INTROSPECTION) += $(KVMI)/kvmi.o +kvm-$(CONFIG_KVM_INTROSPECTION) += $(KVMI)/kvmi.o $(KVMI)/kvmi_msg.o kvm-y += x86.o emulate.o i8259.o irq.o lapic.o \ i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ffcf09e9bf78..ff0d3c82de64 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -56,6 +56,7 @@ #include <linux/sched/stat.h> #include <linux/sched/isolation.h> #include <linux/mem_encrypt.h> +#include <linux/kvmi_host.h> #include <trace/events/kvm.h> @@ -3538,6 +3539,11 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_HYPERV_ENLIGHTENED_VMCS: r = kvm_x86_ops.nested_ops->enable_evmcs != NULL; break; +#ifdef CONFIG_KVM_INTROSPECTION + case KVM_CAP_INTROSPECTION: + r = kvmi_version(); + break; +#endif default: break; } diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 5b6f1338de74..c82c55085604 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -504,6 +504,11 @@ struct kvm { struct srcu_struct irq_srcu; pid_t userspace_pid; unsigned int max_halt_poll_ns; + + struct mutex kvmi_lock; + refcount_t kvmi_ref; + struct completion kvmi_complete; + struct kvm_introspection *kvmi; }; #define kvm_err(fmt, ...) \ diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index 1e0a73c2a190..55ff571db40d 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -4,11 +4,28 @@ #ifdef CONFIG_KVM_INTROSPECTION +#include <asm/kvmi_host.h> + +struct kvm_introspection { + struct kvm_arch_introspection arch; + struct kvm *kvm; + + uuid_t uuid; + + struct socket *sock; + struct task_struct *recv; +}; + +int kvmi_version(void); int kvmi_init(void); void kvmi_uninit(void); void kvmi_create_vm(struct kvm *kvm); void kvmi_destroy_vm(struct kvm *kvm); +int kvmi_ioctl_hook(struct kvm *kvm, + const struct kvm_introspection_hook *hook); +int kvmi_ioctl_unhook(struct kvm *kvm); + #else static inline int kvmi_init(void) { return 0; } diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 4fdf30316582..dd84ebdfcd6d 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1031,6 +1031,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_PPC_SECURE_GUEST 181 #define KVM_CAP_HALT_POLL 182 #define KVM_CAP_ASYNC_PF_INT 183 +#define KVM_CAP_INTROSPECTION 184 #ifdef KVM_CAP_IRQ_ROUTING @@ -1612,6 +1613,15 @@ struct kvm_sev_dbg { __u32 len; }; +struct kvm_introspection_hook { + __s32 fd; + __u32 padding; + __u8 uuid[16]; +}; + +#define KVM_INTROSPECTION_HOOK _IOW(KVMIO, 0xc3, struct kvm_introspection_hook) +#define KVM_INTROSPECTION_UNHOOK _IO(KVMIO, 0xc4) + #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0) #define KVM_DEV_ASSIGN_PCI_2_3 (1 << 1) #define KVM_DEV_ASSIGN_MASK_INTX (1 << 2) diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h new file mode 100644 index 000000000000..34dda91016db --- /dev/null +++ b/include/uapi/linux/kvmi.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef _UAPI__LINUX_KVMI_H +#define _UAPI__LINUX_KVMI_H + +/* + * KVMI structures and definitions + */ + +enum { + KVMI_VERSION = 0x00000001 +}; + +#endif /* _UAPI__LINUX_KVMI_H */ diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 4a166588d99f..ea8a6b08e87e 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -41,6 +41,7 @@ LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c TEST_GEN_PROGS_x86_64 = x86_64/cr4_cpuid_sync_test TEST_GEN_PROGS_x86_64 += x86_64/evmcs_test TEST_GEN_PROGS_x86_64 += x86_64/hyperv_cpuid +TEST_GEN_PROGS_x86_64 += x86_64/kvmi_test TEST_GEN_PROGS_x86_64 += x86_64/mmio_warning_test TEST_GEN_PROGS_x86_64 += x86_64/platform_info_test TEST_GEN_PROGS_x86_64 += x86_64/set_sregs_test diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c new file mode 100644 index 000000000000..08ca4701c440 --- /dev/null +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -0,0 +1,94 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * KVM introspection tests + * + * Copyright (C) 2020, Bitdefender S.R.L. + */ + +#define _GNU_SOURCE /* for program_invocation_short_name */ +#include <sys/types.h> +#include <sys/socket.h> + +#include "test_util.h" + +#include "kvm_util.h" +#include "processor.h" +#include "../lib/kvm_util_internal.h" + +#include "linux/kvmi.h" + +#define VCPU_ID 5 + +static int socket_pair[2]; +#define Kvm_socket socket_pair[0] +#define Userspace_socket socket_pair[1] + +void setup_socket(void) +{ + int r; + + r = socketpair(AF_UNIX, SOCK_STREAM, 0, socket_pair); + TEST_ASSERT(r == 0, + "socketpair() failed, errno %d (%s)\n", + errno, strerror(errno)); +} + +static void do_hook_ioctl(struct kvm_vm *vm, __s32 fd, __u32 padding, + int expected_err) +{ + struct kvm_introspection_hook hook = { + .fd = fd, + .padding = padding + }; + int r; + + r = ioctl(vm->fd, KVM_INTROSPECTION_HOOK, &hook); + TEST_ASSERT(r == 0 || errno == expected_err, + "KVM_INTROSPECTION_HOOK failed, errno %d (%s), expected %d, fd %d, padding %d\n", + errno, strerror(errno), expected_err, fd, padding); +} + +static void hook_introspection(struct kvm_vm *vm) +{ + __u32 padding = 1, no_padding = 0; + + do_hook_ioctl(vm, Kvm_socket, padding, EINVAL); + do_hook_ioctl(vm, -1, no_padding, EINVAL); + do_hook_ioctl(vm, Kvm_socket, no_padding, 0); + do_hook_ioctl(vm, Kvm_socket, no_padding, EEXIST); +} + +static void unhook_introspection(struct kvm_vm *vm) +{ + int r; + + r = ioctl(vm->fd, KVM_INTROSPECTION_UNHOOK, NULL); + TEST_ASSERT(r == 0, + "KVM_INTROSPECTION_UNHOOK failed, errno %d (%s)\n", + errno, strerror(errno)); +} + +static void test_introspection(struct kvm_vm *vm) +{ + setup_socket(); + hook_introspection(vm); + unhook_introspection(vm); +} + +int main(int argc, char *argv[]) +{ + struct kvm_vm *vm; + + if (!kvm_check_cap(KVM_CAP_INTROSPECTION)) { + print_skip("KVM_CAP_INTROSPECTION not available"); + exit(KSFT_SKIP); + } + + vm = vm_create_default(VCPU_ID, 0, NULL); + vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid()); + + test_introspection(vm); + + kvm_vm_free(vm); + return 0; +} diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index af53bdcb7ec8..5d9bc4ed5060 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -5,6 +5,7 @@ * Copyright (C) 2017-2020 Bitdefender S.R.L. * */ +#include <linux/kthread.h> #include "kvmi_int.h" int kvmi_init(void) @@ -12,14 +13,175 @@ int kvmi_init(void) return 0; } +int kvmi_version(void) +{ + return KVMI_VERSION; +} + void kvmi_uninit(void) { } +static void free_kvmi(struct kvm *kvm) +{ + kfree(kvm->kvmi); + kvm->kvmi = NULL; +} + +static struct kvm_introspection * +alloc_kvmi(struct kvm *kvm, const struct kvm_introspection_hook *hook) +{ + struct kvm_introspection *kvmi; + + kvmi = kzalloc(sizeof(*kvmi), GFP_KERNEL); + if (!kvmi) + return NULL; + + BUILD_BUG_ON(sizeof(hook->uuid) != sizeof(kvmi->uuid)); + memcpy(&kvmi->uuid, &hook->uuid, sizeof(kvmi->uuid)); + + kvmi->kvm = kvm; + + return kvmi; +} + +static void kvmi_destroy(struct kvm_introspection *kvmi) +{ + struct kvm *kvm = kvmi->kvm; + + free_kvmi(kvm); +} + +static void kvmi_stop_recv_thread(struct kvm_introspection *kvmi) +{ + kvmi_sock_shutdown(kvmi); +} + +static void __kvmi_unhook(struct kvm *kvm) +{ + struct kvm_introspection *kvmi = KVMI(kvm); + + wait_for_completion_killable(&kvm->kvmi_complete); + kvmi_sock_put(kvmi); +} + +static void kvmi_unhook(struct kvm *kvm) +{ + struct kvm_introspection *kvmi; + + mutex_lock(&kvm->kvmi_lock); + + kvmi = KVMI(kvm); + if (kvmi) { + kvmi_stop_recv_thread(kvmi); + __kvmi_unhook(kvm); + kvmi_destroy(kvmi); + } + + mutex_unlock(&kvm->kvmi_lock); +} + +int kvmi_ioctl_unhook(struct kvm *kvm) +{ + kvmi_unhook(kvm); + return 0; +} + +void kvmi_put(struct kvm *kvm) +{ + if (refcount_dec_and_test(&kvm->kvmi_ref)) + complete(&kvm->kvmi_complete); +} + +static int __kvmi_hook(struct kvm *kvm, + const struct kvm_introspection_hook *hook) +{ + struct kvm_introspection *kvmi = KVMI(kvm); + + if (!kvmi_sock_get(kvmi, hook->fd)) + return -EINVAL; + + return 0; +} + +static int kvmi_recv_thread(void *arg) +{ + struct kvm_introspection *kvmi = arg; + + while (kvmi_msg_process(kvmi)) + ; + + /* + * Signal userspace (which might wait for POLLHUP only) + * and prevent the vCPUs from sending other events. + */ + kvmi_sock_shutdown(kvmi); + + kvmi_put(kvmi->kvm); + return 0; +} + +int kvmi_hook(struct kvm *kvm, const struct kvm_introspection_hook *hook) +{ + struct kvm_introspection *kvmi; + int err = 0; + + mutex_lock(&kvm->kvmi_lock); + + if (kvm->kvmi) { + err = -EEXIST; + goto out; + } + + kvmi = alloc_kvmi(kvm, hook); + if (!kvmi) { + err = -ENOMEM; + goto out; + } + + kvm->kvmi = kvmi; + + err = __kvmi_hook(kvm, hook); + if (err) + goto destroy; + + init_completion(&kvm->kvmi_complete); + + refcount_set(&kvm->kvmi_ref, 1); + + kvmi->recv = kthread_run(kvmi_recv_thread, kvmi, "kvmi-recv"); + if (IS_ERR(kvmi->recv)) { + err = -ENOMEM; + kvmi_put(kvm); + goto unhook; + } + + goto out; + +unhook: + __kvmi_unhook(kvm); +destroy: + kvmi_destroy(kvmi); +out: + mutex_unlock(&kvm->kvmi_lock); + return err; +} + +int kvmi_ioctl_hook(struct kvm *kvm, + const struct kvm_introspection_hook *hook) +{ + if (hook->padding) + return -EINVAL; + + return kvmi_hook(kvm, hook); +} + void kvmi_create_vm(struct kvm *kvm) { + mutex_init(&kvm->kvmi_lock); } void kvmi_destroy_vm(struct kvm *kvm) { + kvmi_unhook(kvm); } diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 34af926f9838..f0a8d653d79b 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -3,5 +3,27 @@ #define __KVMI_INT_H__ #include <linux/kvm_host.h> +#include <linux/kvmi_host.h> +#include <uapi/linux/kvmi.h> + +#define kvmi_warn(kvmi, fmt, ...) \ + kvm_info("%pU WARNING: " fmt, &kvmi->uuid, ## __VA_ARGS__) +#define kvmi_warn_once(kvmi, fmt, ...) ({ \ + static bool __section(.data.once) __warned; \ + if (!__warned) { \ + __warned = true; \ + kvmi_warn(kvmi, fmt, ## __VA_ARGS__); \ + } \ + }) +#define kvmi_err(kvmi, fmt, ...) \ + kvm_info("%pU ERROR: " fmt, &kvmi->uuid, ## __VA_ARGS__) + +#define KVMI(kvm) ((kvm)->kvmi) + +/* kvmi_msg.c */ +bool kvmi_sock_get(struct kvm_introspection *kvmi, int fd); +void kvmi_sock_shutdown(struct kvm_introspection *kvmi); +void kvmi_sock_put(struct kvm_introspection *kvmi); +bool kvmi_msg_process(struct kvm_introspection *kvmi); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c new file mode 100644 index 000000000000..3ae52c61f861 --- /dev/null +++ b/virt/kvm/introspection/kvmi_msg.c @@ -0,0 +1,39 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * KVM introspection (message handling) + * + * Copyright (C) 2017-2020 Bitdefender S.R.L. + * + */ +#include <linux/net.h> +#include "kvmi_int.h" + +bool kvmi_sock_get(struct kvm_introspection *kvmi, int fd) +{ + struct socket *sock; + int err; + + sock = sockfd_lookup(fd, &err); + if (!sock) + return false; + + kvmi->sock = sock; + + return true; +} + +void kvmi_sock_put(struct kvm_introspection *kvmi) +{ + if (kvmi->sock) + sockfd_put(kvmi->sock); +} + +void kvmi_sock_shutdown(struct kvm_introspection *kvmi) +{ + kernel_sock_shutdown(kvmi->sock, SHUT_RDWR); +} + +bool kvmi_msg_process(struct kvm_introspection *kvmi) +{ + return false; +} diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index a2b424fd2efd..0d2da77ccb12 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -3829,6 +3829,25 @@ static long kvm_vm_ioctl(struct file *filp, case KVM_CHECK_EXTENSION: r = kvm_vm_ioctl_check_extension_generic(kvm, arg); break; +#ifdef CONFIG_KVM_INTROSPECTION + case KVM_INTROSPECTION_HOOK: + r = -EPERM; + if (enable_introspection) { + struct kvm_introspection_hook hook; + + if (copy_from_user(&hook, argp, sizeof(hook))) + r = -EFAULT; + else + r = kvmi_ioctl_hook(kvm, &hook); + } + break; + case KVM_INTROSPECTION_UNHOOK: + if (enable_introspection) + r = kvmi_ioctl_unhook(kvm); + else + r = -EPERM; + break; +#endif /* CONFIG_KVM_INTROSPECTION */ default: r = kvm_arch_vm_ioctl(filp, ioctl, arg); }
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 39/84] KVM: introspection: add permission access ioctls
KVM_INTROSPECTION_COMMAND and KVM_INTROSPECTION_EVENTS ioctls are used by the device manager to allow/disallow access to specific (or all) introspection commands and events. The introspection tool will get the KVM_EPERM error code on any attempt to use a disallowed command. By default, all events and almost all commands are disallowed. Some commands, those querying the introspection capabilities, are always allowed. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/api.rst | 66 ++++++++++ include/linux/kvmi_host.h | 7 ++ include/uapi/linux/kvm.h | 8 ++ include/uapi/linux/kvmi.h | 8 ++ .../testing/selftests/kvm/x86_64/kvmi_test.c | 48 +++++++ virt/kvm/introspection/kvmi.c | 119 ++++++++++++++++++ virt/kvm/kvm_main.c | 14 +++ 7 files changed, 270 insertions(+) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index e34f20430eb1..174f13f2389d 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -4753,6 +4753,72 @@ Errors: This ioctl is used to free all introspection structures related to this VM. +4.128 KVM_INTROSPECTION_COMMAND +------------------------------- + +:Capability: KVM_CAP_INTROSPECTION +:Architectures: x86 +:Type: vm ioctl +:Parameters: struct kvm_introspection_feature (in) +:Returns: 0 on success, a negative value on error + +Errors: + + ====== ==========================================================+ EFAULT the VM is not introspected yet (use KVM_INTROSPECTION_HOOK) + EINVAL the command is unknown + EPERM the command can't be disallowed (e.g. KVMI_GET_VERSION) + ====== ==========================================================+ +This ioctl is used to allow or disallow introspection commands +for the current VM. By default, almost all commands are disallowed +except for those used to query the API. + +:: + + struct kvm_introspection_feature { + __u32 allow; + __s32 id; + }; + +If allow is 1, the command specified by id is allowed. If allow is 0, +the command is disallowed. + +Unless set to -1 (meaning all commands), id must be a command ID +(e.g. KVMI_GET_VERSION) + +4.129 KVM_INTROSPECTION_EVENT +----------------------------- + +:Capability: KVM_CAP_INTROSPECTION +:Architectures: x86 +:Type: vm ioctl +:Parameters: struct kvm_introspection_feature (in) +:Returns: 0 on success, a negative value on error + +Errors: + + ====== ==========================================================+ EFAULT the VM is not introspected yet (use KVM_INTROSPECTION_HOOK) + EINVAL the event is unknown + ====== ==========================================================+ +This ioctl is used to allow or disallow introspection events +for the current VM. By default, all events are disallowed. + +:: + + struct kvm_introspection_feature { + __u32 allow; + __s32 id; + }; + +If allow is 1, the event specified by id is allowed. If allow is 0, +the event is disallowed. + +Unless set to -1 (meaning all events), id must be a event ID +(e.g. KVMI_EVENT_UNHOOK, KVMI_EVENT_CR, etc.) + 5. The kvm_run structure ======================= diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index 55ff571db40d..7efd071e398d 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -14,6 +14,9 @@ struct kvm_introspection { struct socket *sock; struct task_struct *recv; + + unsigned long *cmd_allow_mask; + unsigned long *event_allow_mask; }; int kvmi_version(void); @@ -25,6 +28,10 @@ void kvmi_destroy_vm(struct kvm *kvm); int kvmi_ioctl_hook(struct kvm *kvm, const struct kvm_introspection_hook *hook); int kvmi_ioctl_unhook(struct kvm *kvm); +int kvmi_ioctl_command(struct kvm *kvm, + const struct kvm_introspection_feature *feat); +int kvmi_ioctl_event(struct kvm *kvm, + const struct kvm_introspection_feature *feat); #else diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index dd84ebdfcd6d..17df03ceb483 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1622,6 +1622,14 @@ struct kvm_introspection_hook { #define KVM_INTROSPECTION_HOOK _IOW(KVMIO, 0xc3, struct kvm_introspection_hook) #define KVM_INTROSPECTION_UNHOOK _IO(KVMIO, 0xc4) +struct kvm_introspection_feature { + __u32 allow; + __s32 id; +}; + +#define KVM_INTROSPECTION_COMMAND _IOW(KVMIO, 0xc5, struct kvm_introspection_feature) +#define KVM_INTROSPECTION_EVENT _IOW(KVMIO, 0xc6, struct kvm_introspection_feature) + #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0) #define KVM_DEV_ASSIGN_PCI_2_3 (1 << 1) #define KVM_DEV_ASSIGN_MASK_INTX (1 << 2) diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 34dda91016db..d7b18ffef4fa 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -10,4 +10,12 @@ enum { KVMI_VERSION = 0x00000001 }; +enum { + KVMI_NUM_MESSAGES +}; + +enum { + KVMI_NUM_EVENTS +}; + #endif /* _UAPI__LINUX_KVMI_H */ diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 08ca4701c440..09b8989317d7 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -48,14 +48,62 @@ static void do_hook_ioctl(struct kvm_vm *vm, __s32 fd, __u32 padding, errno, strerror(errno), expected_err, fd, padding); } +static void set_perm(struct kvm_vm *vm, __s32 id, __u32 allow, + int expected_err, int ioctl_id, + const char *ioctl_str) +{ + struct kvm_introspection_feature feat = { + .allow = allow, + .id = id + }; + int r; + + r = ioctl(vm->fd, ioctl_id, &feat); + TEST_ASSERT(r == 0 || errno == expected_err, + "%s failed, id %d, errno %d (%s), expected %d\n", + ioctl_str, id, errno, strerror(errno), expected_err); +} + +static void set_event_perm(struct kvm_vm *vm, __s32 id, __u32 allow, + int expected_err) +{ + set_perm(vm, id, allow, expected_err, KVM_INTROSPECTION_EVENT, + "KVM_INTROSPECTION_EVENT"); +} + +static void allow_event(struct kvm_vm *vm, __s32 event_id) +{ + set_event_perm(vm, event_id, 1, 0); +} + +static void set_command_perm(struct kvm_vm *vm, __s32 id, __u32 allow, + int expected_err) +{ + set_perm(vm, id, allow, expected_err, KVM_INTROSPECTION_COMMAND, + "KVM_INTROSPECTION_COMMAND"); +} + static void hook_introspection(struct kvm_vm *vm) { + __u32 allow = 1, disallow = 0, allow_inval = 2; __u32 padding = 1, no_padding = 0; + __s32 all_IDs = -1; + + set_command_perm(vm, all_IDs, allow, EFAULT); + set_event_perm(vm, all_IDs, allow, EFAULT); do_hook_ioctl(vm, Kvm_socket, padding, EINVAL); do_hook_ioctl(vm, -1, no_padding, EINVAL); do_hook_ioctl(vm, Kvm_socket, no_padding, 0); do_hook_ioctl(vm, Kvm_socket, no_padding, EEXIST); + + set_command_perm(vm, all_IDs, allow_inval, EINVAL); + set_command_perm(vm, all_IDs, disallow, 0); + set_command_perm(vm, all_IDs, allow, 0); + + set_event_perm(vm, all_IDs, allow_inval, EINVAL); + set_event_perm(vm, all_IDs, disallow, 0); + allow_event(vm, all_IDs); } static void unhook_introspection(struct kvm_vm *vm) diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index 5d9bc4ed5060..b1ea39f35481 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -8,6 +8,8 @@ #include <linux/kthread.h> #include "kvmi_int.h" +#define KVMI_NUM_COMMANDS KVMI_NUM_MESSAGES + int kvmi_init(void) { return 0; @@ -24,6 +26,9 @@ void kvmi_uninit(void) static void free_kvmi(struct kvm *kvm) { + bitmap_free(kvm->kvmi->cmd_allow_mask); + bitmap_free(kvm->kvmi->event_allow_mask); + kfree(kvm->kvmi); kvm->kvmi = NULL; } @@ -37,6 +42,15 @@ alloc_kvmi(struct kvm *kvm, const struct kvm_introspection_hook *hook) if (!kvmi) return NULL; + kvmi->cmd_allow_mask = bitmap_zalloc(KVMI_NUM_COMMANDS, GFP_KERNEL); + kvmi->event_allow_mask = bitmap_zalloc(KVMI_NUM_EVENTS, GFP_KERNEL); + if (!kvmi->cmd_allow_mask || !kvmi->event_allow_mask) { + bitmap_free(kvmi->cmd_allow_mask); + bitmap_free(kvmi->event_allow_mask); + kfree(kvmi); + return NULL; + } + BUILD_BUG_ON(sizeof(hook->uuid) != sizeof(kvmi->uuid)); memcpy(&kvmi->uuid, &hook->uuid, sizeof(kvmi->uuid)); @@ -185,3 +199,108 @@ void kvmi_destroy_vm(struct kvm *kvm) { kvmi_unhook(kvm); } + +static int +kvmi_ioctl_get_feature(const struct kvm_introspection_feature *feat, + bool *allow, s32 *id, unsigned int nbits) +{ + s32 all_bits = -1; + + if (feat->id < 0 && feat->id != all_bits) + return -EINVAL; + + if (feat->id > 0 && feat->id >= nbits) + return -EINVAL; + + if (feat->allow > 1) + return -EINVAL; + + *allow = feat->allow == 1; + *id = feat->id; + + return 0; +} + +static void kvmi_control_allowed_events(struct kvm_introspection *kvmi, + s32 id, bool allow) +{ + s32 all_events = -1; + + if (allow) { + if (id == all_events) + bitmap_fill(kvmi->event_allow_mask, KVMI_NUM_EVENTS); + else + set_bit(id, kvmi->event_allow_mask); + } else { + if (id == all_events) + bitmap_zero(kvmi->event_allow_mask, KVMI_NUM_EVENTS); + else + clear_bit(id, kvmi->event_allow_mask); + } +} + +int kvmi_ioctl_event(struct kvm *kvm, + const struct kvm_introspection_feature *feat) +{ + struct kvm_introspection *kvmi; + bool allow; + int err; + s32 id; + + err = kvmi_ioctl_get_feature(feat, &allow, &id, KVMI_NUM_EVENTS); + if (err) + return err; + + mutex_lock(&kvm->kvmi_lock); + + kvmi = KVMI(kvm); + if (kvmi) + kvmi_control_allowed_events(kvmi, id, allow); + else + err = -EFAULT; + + mutex_unlock(&kvm->kvmi_lock); + return err; +} + +static void kvmi_control_allowed_commands(struct kvm_introspection *kvmi, + s32 id, bool allow) +{ + s32 all_commands = -1; + + if (allow) { + if (id == all_commands) + bitmap_fill(kvmi->cmd_allow_mask, KVMI_NUM_COMMANDS); + else + set_bit(id, kvmi->cmd_allow_mask); + } else { + if (id == all_commands) + bitmap_zero(kvmi->cmd_allow_mask, KVMI_NUM_COMMANDS); + else + clear_bit(id, kvmi->cmd_allow_mask); + } +} + +int kvmi_ioctl_command(struct kvm *kvm, + const struct kvm_introspection_feature *feat) +{ + struct kvm_introspection *kvmi; + bool allow; + int err; + s32 id; + + err = kvmi_ioctl_get_feature(feat, &allow, &id, KVMI_NUM_COMMANDS); + if (err) + return err; + + mutex_lock(&kvm->kvmi_lock); + + kvmi = KVMI(kvm); + if (kvmi) + kvmi_control_allowed_commands(kvmi, id, allow); + else + err = -EFAULT; + + mutex_unlock(&kvm->kvmi_lock); + return err; +} diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 0d2da77ccb12..9aeb57e93165 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -3847,6 +3847,20 @@ static long kvm_vm_ioctl(struct file *filp, else r = -EPERM; break; + case KVM_INTROSPECTION_COMMAND: + case KVM_INTROSPECTION_EVENT: + r = -EPERM; + if (enable_introspection) { + struct kvm_introspection_feature feat; + + if (copy_from_user(&feat, argp, sizeof(feat))) + r = -EFAULT; + else if (ioctl == KVM_INTROSPECTION_EVENT) + r = kvmi_ioctl_event(kvm, &feat); + else + r = kvmi_ioctl_command(kvm, &feat); + } + break; #endif /* CONFIG_KVM_INTROSPECTION */ default: r = kvm_arch_vm_ioctl(filp, ioctl, arg);
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 40/84] KVM: introspection: add the read/dispatch message function
Based on the common header (struct kvmi_msg_hdr), the receiving thread will read/validate all messages, execute the VM introspection commands (eg. KVMI_VM_GET_INFO) and dispatch the vCPU introspection commands (eg. KVMI_VCPU_GET_REGISTERS) to the vCPU threads. The vCPU threads will reply to vCPU introspection commands without the help of the receiving thread. Same for sending vCPU events, but the vCPU thread will wait for the receiving thread to get the event reply. Meanwhile, it will execute any queued vCPU introspection command. The receiving thread will end when the socket is closed or on the first API error (eg. wrong message size). Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 86 ++++++++++ include/uapi/linux/kvmi.h | 21 +++ .../testing/selftests/kvm/x86_64/kvmi_test.c | 100 ++++++++++++ virt/kvm/introspection/kvmi.c | 42 ++++- virt/kvm/introspection/kvmi_int.h | 5 + virt/kvm/introspection/kvmi_msg.c | 149 +++++++++++++++++- 6 files changed, 401 insertions(+), 2 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 3a1b6c655de7..f3d16971ba2b 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -65,6 +65,85 @@ been used on that guest (if requested). Obviously, whether the guest can really continue normal execution depends on whether the introspection tool has made any modifications that require an active KVMI channel. +All messages (commands or events) have a common header:: + + struct kvmi_msg_hdr { + __u16 id; + __u16 size; + __u32 seq; + }; + +The replies have the same header, with the sequence number (``seq``) +and message id (``id``) matching the command/event. + +After ``kvmi_msg_hdr``, ``id`` specific data of ``size`` bytes will +follow. + +The message header and its data must be sent with one ``sendmsg()`` call +to the socket. This simplifies the receiver loop and avoids +the reconstruction of messages on the other side. + +The wire protocol uses the host native byte-order. The introspection tool +must check this during the handshake and do the necessary conversion. + +A command reply begins with:: + + struct kvmi_error_code { + __s32 err; + __u32 padding; + } + +followed by the command specific data if the error code ``err`` is zero. + +The error code -KVM_ENOSYS is returned for unsupported commands. + +The error code -KVM_EPERM is returned for disallowed commands (see **Hooking**). + +The error code is related to the message processing, including unsupported +commands. For all the other errors (incomplete messages, wrong sequence +numbers, socket errors etc.) the socket will be closed. The device +manager should reconnect. + +While all commands will have a reply as soon as possible, the replies +to events will probably be delayed until a set of (new) commands will +complete:: + + Host kernel Introspection tool + ----------- ------------------ + event 1 -> + <- command 1 + command 1 reply -> + <- command 2 + command 2 reply -> + <- event 1 reply + +If both ends send a message at the same time:: + + Host kernel Tool + ----------- ---- + event X -> <- command X + +the host kernel will reply to 'command X', regardless of the receive time +(before or after the 'event X' was sent). + +As it can be seen below, the wire protocol specifies occasional padding. This +is to permit working with the data by directly using C structures or to round +the structure size to a multiple of 8 bytes (64bit) to improve the copy +operations that happen during ``recvmsg()`` or ``sendmsg()``. The members +should have the native alignment of the host. All padding must be +initialized with zero otherwise the respective command will fail with +-KVM_EINVAL. + +To describe the commands/events, we reuse some conventions from api.txt: + + - Architectures: which instruction set architectures provide this command/event + + - Versions: which versions provide this command/event + + - Parameters: incoming message data + + - Returns: outgoing/reply message data + Handshake --------- @@ -99,6 +178,13 @@ In the end, the device manager will pass the file handle (plus the allowed commands/events) to KVM. It will detect when the socket is shutdown and it will reinitiate the handshake. +Once the file handle reaches KVM, the introspection tool should +use the *KVMI_GET_VERSION* command to get the API version and/or the +*KVMI_VM_CHECK_COMMAND* and *KVMI_VM_CHECK_EVENT* commands to see which +commands/events are allowed for this guest. The error code -KVM_EPERM +will be returned if the introspection tool uses a command or enables an +event which is disallowed. + Unhooking --------- diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index d7b18ffef4fa..9bfff484fd6f 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -18,4 +18,25 @@ enum { KVMI_NUM_EVENTS }; +struct kvmi_msg_hdr { + __u16 id; + __u16 size; + __u32 seq; +}; + +/* + * The kernel side will close the socket if kvmi_msg_hdr.size + * is bigger than KVMI_MSG_SIZE. + * This limit is used to accommodate the biggest known message, + * the commands to read/write a 4K page from/to guest memory. + */ +enum { + KVMI_MSG_SIZE = (4096 * 2 - sizeof(struct kvmi_msg_hdr)) +}; + +struct kvmi_error_code { + __s32 err; + __u32 padding; +}; + #endif /* _UAPI__LINUX_KVMI_H */ diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 09b8989317d7..9c591e0d9c8a 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -15,6 +15,7 @@ #include "processor.h" #include "../lib/kvm_util_internal.h" +#include "linux/kvm_para.h" #include "linux/kvmi.h" #define VCPU_ID 5 @@ -116,10 +117,109 @@ static void unhook_introspection(struct kvm_vm *vm) errno, strerror(errno)); } +static void receive_data(void *dest, size_t size) +{ + ssize_t r; + + r = recv(Userspace_socket, dest, size, MSG_WAITALL); + TEST_ASSERT(r == size, + "recv() failed, expected %zd, result %zd, errno %d (%s)\n", + size, r, errno, strerror(errno)); +} + +static int receive_cmd_reply(struct kvmi_msg_hdr *req, void *rpl, + size_t rpl_size) +{ + struct kvmi_msg_hdr hdr; + struct kvmi_error_code ec; + + receive_data(&hdr, sizeof(hdr)); + + TEST_ASSERT(hdr.seq == req->seq, + "Unexpected messages sequence 0x%x, expected 0x%x\n", + hdr.seq, req->seq); + + TEST_ASSERT(hdr.size >= sizeof(ec), + "Invalid message size %d, expected %zd bytes (at least)\n", + hdr.size, sizeof(ec)); + + receive_data(&ec, sizeof(ec)); + + if (ec.err) { + TEST_ASSERT(hdr.size == sizeof(ec), + "Invalid command reply on error\n"); + } else { + TEST_ASSERT(hdr.size == sizeof(ec) + rpl_size, + "Invalid command reply\n"); + + if (rpl && rpl_size) + receive_data(rpl, rpl_size); + } + + return ec.err; +} + +static unsigned int new_seq(void) +{ + static unsigned int seq; + + return seq++; +} + +static void send_message(int msg_id, struct kvmi_msg_hdr *hdr, size_t size) +{ + ssize_t r; + + hdr->id = msg_id; + hdr->seq = new_seq(); + hdr->size = size - sizeof(*hdr); + + r = send(Userspace_socket, hdr, size, 0); + TEST_ASSERT(r == size, + "send() failed, sending %zd, result %zd, errno %d (%s)\n", + size, r, errno, strerror(errno)); +} + +static const char *kvm_strerror(int error) +{ + switch (error) { + case KVM_ENOSYS: + return "Invalid system call number"; + case KVM_EOPNOTSUPP: + return "Operation not supported on transport endpoint"; + case KVM_EAGAIN: + return "Try again"; + default: + return strerror(error); + } +} + +static int do_command(int cmd_id, struct kvmi_msg_hdr *req, + size_t req_size, void *rpl, size_t rpl_size) +{ + send_message(cmd_id, req, req_size); + return receive_cmd_reply(req, rpl, rpl_size); +} + +static void test_cmd_invalid(void) +{ + int invalid_msg_id = 0xffff; + struct kvmi_msg_hdr req; + int r; + + r = do_command(invalid_msg_id, &req, sizeof(req), NULL, 0); + TEST_ASSERT(r == -KVM_ENOSYS, + "Invalid command didn't failed with KVM_ENOSYS, error %d (%s)\n", + -r, kvm_strerror(-r)); +} + static void test_introspection(struct kvm_vm *vm) { setup_socket(); hook_introspection(vm); + + test_cmd_invalid(); + unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index b1ea39f35481..547d3388ff8a 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -9,10 +9,49 @@ #include "kvmi_int.h" #define KVMI_NUM_COMMANDS KVMI_NUM_MESSAGES +#define KVMI_MSG_SIZE_ALLOC (sizeof(struct kvmi_msg_hdr) + KVMI_MSG_SIZE) + +static struct kmem_cache *msg_cache; + +void *kvmi_msg_alloc(void) +{ + return kmem_cache_zalloc(msg_cache, GFP_KERNEL); +} + +void kvmi_msg_free(void *addr) +{ + if (addr) + kmem_cache_free(msg_cache, addr); +} + +static void kvmi_cache_destroy(void) +{ + kmem_cache_destroy(msg_cache); + msg_cache = NULL; +} + +static int kvmi_cache_create(void) +{ + msg_cache = kmem_cache_create("kvmi_msg", KVMI_MSG_SIZE_ALLOC, + 4096, SLAB_ACCOUNT, NULL); + + if (!msg_cache) { + kvmi_cache_destroy(); + + return -1; + } + + return 0; +} + +bool kvmi_is_command_allowed(struct kvm_introspection *kvmi, u16 id) +{ + return id < KVMI_NUM_COMMANDS && test_bit(id, kvmi->cmd_allow_mask); +} int kvmi_init(void) { - return 0; + return kvmi_cache_create(); } int kvmi_version(void) @@ -22,6 +61,7 @@ int kvmi_version(void) void kvmi_uninit(void) { + kvmi_cache_destroy(); } static void free_kvmi(struct kvm *kvm) diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index f0a8d653d79b..5e4eabeefc5b 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -26,4 +26,9 @@ void kvmi_sock_shutdown(struct kvm_introspection *kvmi); void kvmi_sock_put(struct kvm_introspection *kvmi); bool kvmi_msg_process(struct kvm_introspection *kvmi); +/* kvmi.c */ +void *kvmi_msg_alloc(void); +void kvmi_msg_free(void *addr); +bool kvmi_is_command_allowed(struct kvm_introspection *kvmi, u16 id); + #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 3ae52c61f861..4e7b55ec7071 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -33,7 +33,154 @@ void kvmi_sock_shutdown(struct kvm_introspection *kvmi) kernel_sock_shutdown(kvmi->sock, SHUT_RDWR); } +static int kvmi_sock_read(struct kvm_introspection *kvmi, void *buf, + size_t size) +{ + struct kvec vec = { + .iov_base = buf, + .iov_len = size, + }; + struct msghdr m = { }; + int rc; + + rc = kernel_recvmsg(kvmi->sock, &m, &vec, 1, size, MSG_WAITALL); + + if (unlikely(rc != size && rc >= 0)) + rc = -EPIPE; + + return rc >= 0 ? 0 : rc; +} + +static int kvmi_sock_write(struct kvm_introspection *kvmi, struct kvec *vec, + size_t n, size_t size) +{ + struct msghdr m = { }; + int rc; + + rc = kernel_sendmsg(kvmi->sock, &m, vec, n, size); + + if (unlikely(rc != size && rc >= 0)) + rc = -EPIPE; + + return rc >= 0 ? 0 : rc; +} + +static int kvmi_msg_reply(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg, int err, + const void *rpl, size_t rpl_size) +{ + struct kvmi_error_code ec; + struct kvmi_msg_hdr h; + struct kvec vec[3] = { + { .iov_base = &h, .iov_len = sizeof(h) }, + { .iov_base = &ec, .iov_len = sizeof(ec) }, + { .iov_base = (void *)rpl, .iov_len = rpl_size }, + }; + size_t size = sizeof(h) + sizeof(ec) + (err ? 0 : rpl_size); + size_t n = ARRAY_SIZE(vec) - (err ? 1 : 0); + + memset(&h, 0, sizeof(h)); + h.id = msg->id; + h.seq = msg->seq; + h.size = size - sizeof(h); + + memset(&ec, 0, sizeof(ec)); + ec.err = err; + + return kvmi_sock_write(kvmi, vec, n, size); +} + +static int kvmi_msg_vm_reply(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg, + int err, const void *rpl, + size_t rpl_size) +{ + return kvmi_msg_reply(kvmi, msg, err, rpl, rpl_size); +} + +/* + * These commands are executed by the receiving thread. + */ +static int(*const msg_vm[])(struct kvm_introspection *, + const struct kvmi_msg_hdr *, const void *) = { +}; + +static bool is_vm_command(u16 id) +{ + return id < ARRAY_SIZE(msg_vm) && !!msg_vm[id]; +} + +static struct kvmi_msg_hdr *kvmi_msg_recv(struct kvm_introspection *kvmi) +{ + struct kvmi_msg_hdr *msg; + int err; + + msg = kvmi_msg_alloc(); + if (!msg) + goto out; + + err = kvmi_sock_read(kvmi, msg, sizeof(*msg)); + if (err) + goto out_err; + + if (msg->size) { + if (msg->size > KVMI_MSG_SIZE) + goto out_err; + + err = kvmi_sock_read(kvmi, msg + 1, msg->size); + if (err) + goto out_err; + } + + return msg; + +out_err: + kvmi_msg_free(msg); +out: + return NULL; +} + +static int kvmi_msg_do_vm_cmd(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg) +{ + return msg_vm[msg->id](kvmi, msg, msg + 1); +} + +static bool is_message_allowed(struct kvm_introspection *kvmi, u16 id) +{ + return kvmi_is_command_allowed(kvmi, id); +} + +static int kvmi_msg_vm_reply_ec(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg, int ec) +{ + return kvmi_msg_vm_reply(kvmi, msg, ec, NULL, 0); +} + +static int kvmi_msg_handle_vm_cmd(struct kvm_introspection *kvmi, + struct kvmi_msg_hdr *msg) +{ + if (!is_message_allowed(kvmi, msg->id)) + return kvmi_msg_vm_reply_ec(kvmi, msg, -KVM_EPERM); + + return kvmi_msg_do_vm_cmd(kvmi, msg); +} + bool kvmi_msg_process(struct kvm_introspection *kvmi) { - return false; + struct kvmi_msg_hdr *msg; + int err = -1; + + msg = kvmi_msg_recv(kvmi); + if (!msg) + goto out; + + if (is_vm_command(msg->id)) + err = kvmi_msg_handle_vm_cmd(kvmi, msg); + else + err = kvmi_msg_vm_reply_ec(kvmi, msg, -KVM_ENOSYS); + + kvmi_msg_free(msg); +out: + return err == 0; }
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 41/84] KVM: introspection: add KVMI_GET_VERSION
The kernel side will accept older and newer versions of an introspection command (having a smaller/larger message size), but it will not accept newer versions for event replies (larger messages). Even if the introspection tool can use the KVMI_GET_VERSION command to check the supported features of the introspection API, the most important usage of this command is to avoid sending newer versions of event replies that the kernel side doesn't know. Any attempt from the device manager to explicitly disallow this command through the KVM_INTROSPECTION_COMMAND ioctl will get -EPERM, unless all commands are disallowed (using id=-1) in which case KVMI_GET_VERSION is silently allowed, without error. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 37 +++++++++++++++++++ include/uapi/linux/kvmi.h | 10 +++++ .../testing/selftests/kvm/x86_64/kvmi_test.c | 26 +++++++++++++ virt/kvm/introspection/kvmi.c | 27 ++++++++++++-- virt/kvm/introspection/kvmi_msg.c | 12 ++++++ 5 files changed, 108 insertions(+), 4 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index f3d16971ba2b..41fd48222bcb 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -224,3 +224,40 @@ device-specific memory (DMA, emulated MMIO, reserved by a passthrough device etc.). It is up to the user to determine, using the guest operating system data structures, the areas that are safe to access (code, stack, heap etc.). + +Commands +-------- + +The following C structures are meant to be used directly when communicating +over the wire. The peer that detects any size mismatch should simply close +the connection and report the error. + +1. KVMI_GET_VERSION +------------------- + +:Architectures: all +:Versions: >= 1 +:Parameters: none +:Returns: + +:: + + struct kvmi_error_code; + struct kvmi_get_version_reply { + __u32 version; + __u32 padding; + }; + +Returns the introspection API version. + +This command is always allowed and successful. + +The messages used for introspection commands/events might be extended +in future versions and while the kernel will accept commands with +shorter messages (older versions) or larger messages (newer versions, +ignoring the extra information), it will not accept event replies with +larger/newer messages. + +The introspection tool should use this command to identify the features +supported by the kernel side and what messages must be used for event +replies. diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 9bfff484fd6f..896fcb6abf2c 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -6,11 +6,16 @@ * KVMI structures and definitions */ +#include <linux/kernel.h> +#include <linux/types.h> + enum { KVMI_VERSION = 0x00000001 }; enum { + KVMI_GET_VERSION = 1, + KVMI_NUM_MESSAGES }; @@ -39,4 +44,9 @@ struct kvmi_error_code { __u32 padding; }; +struct kvmi_get_version_reply { + __u32 version; + __u32 padding; +}; + #endif /* _UAPI__LINUX_KVMI_H */ diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 9c591e0d9c8a..d15eccc330e5 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -98,6 +98,7 @@ static void hook_introspection(struct kvm_vm *vm) do_hook_ioctl(vm, Kvm_socket, no_padding, 0); do_hook_ioctl(vm, Kvm_socket, no_padding, EEXIST); + set_command_perm(vm, KVMI_GET_VERSION, disallow, EPERM); set_command_perm(vm, all_IDs, allow_inval, EINVAL); set_command_perm(vm, all_IDs, disallow, 0); set_command_perm(vm, all_IDs, allow, 0); @@ -213,12 +214,37 @@ static void test_cmd_invalid(void) -r, kvm_strerror(-r)); } +static void test_vm_command(int cmd_id, struct kvmi_msg_hdr *req, + size_t req_size, void *rpl, size_t rpl_size) +{ + int r; + + r = do_command(cmd_id, req, req_size, rpl, rpl_size); + TEST_ASSERT(r == 0, + "Command %d failed, error %d (%s)\n", + cmd_id, -r, kvm_strerror(-r)); +} + +static void test_cmd_get_version(void) +{ + struct kvmi_get_version_reply rpl; + struct kvmi_msg_hdr req; + + test_vm_command(KVMI_GET_VERSION, &req, sizeof(req), &rpl, sizeof(rpl)); + TEST_ASSERT(rpl.version == KVMI_VERSION, + "Unexpected KVMI version %d, expecting %d\n", + rpl.version, KVMI_VERSION); + + pr_info("KVMI version: %u\n", rpl.version); +} + static void test_introspection(struct kvm_vm *vm) { setup_socket(); hook_introspection(vm); test_cmd_invalid(); + test_cmd_get_version(); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index 547d3388ff8a..c44aa49dc6b5 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -11,6 +11,8 @@ #define KVMI_NUM_COMMANDS KVMI_NUM_MESSAGES #define KVMI_MSG_SIZE_ALLOC (sizeof(struct kvmi_msg_hdr) + KVMI_MSG_SIZE) +static DECLARE_BITMAP(Kvmi_always_allowed_commands, KVMI_NUM_COMMANDS); + static struct kmem_cache *msg_cache; void *kvmi_msg_alloc(void) @@ -49,8 +51,16 @@ bool kvmi_is_command_allowed(struct kvm_introspection *kvmi, u16 id) return id < KVMI_NUM_COMMANDS && test_bit(id, kvmi->cmd_allow_mask); } +static void setup_always_allowed_commands(void) +{ + bitmap_zero(Kvmi_always_allowed_commands, KVMI_NUM_COMMANDS); + set_bit(KVMI_GET_VERSION, Kvmi_always_allowed_commands); +} + int kvmi_init(void) { + setup_always_allowed_commands(); + return kvmi_cache_create(); } @@ -94,6 +104,9 @@ alloc_kvmi(struct kvm *kvm, const struct kvm_introspection_hook *hook) BUILD_BUG_ON(sizeof(hook->uuid) != sizeof(kvmi->uuid)); memcpy(&kvmi->uuid, &hook->uuid, sizeof(kvmi->uuid)); + bitmap_copy(kvmi->cmd_allow_mask, Kvmi_always_allowed_commands, + KVMI_NUM_COMMANDS); + kvmi->kvm = kvm; return kvmi; @@ -303,8 +316,8 @@ int kvmi_ioctl_event(struct kvm *kvm, return err; } -static void kvmi_control_allowed_commands(struct kvm_introspection *kvmi, - s32 id, bool allow) +static int kvmi_control_allowed_commands(struct kvm_introspection *kvmi, + s32 id, bool allow) { s32 all_commands = -1; @@ -315,10 +328,16 @@ static void kvmi_control_allowed_commands(struct kvm_introspection *kvmi, set_bit(id, kvmi->cmd_allow_mask); } else { if (id == all_commands) - bitmap_zero(kvmi->cmd_allow_mask, KVMI_NUM_COMMANDS); + bitmap_copy(kvmi->cmd_allow_mask, + Kvmi_always_allowed_commands, + KVMI_NUM_COMMANDS); + else if (test_bit(id, Kvmi_always_allowed_commands)) + return -EPERM; else clear_bit(id, kvmi->cmd_allow_mask); } + + return 0; } int kvmi_ioctl_command(struct kvm *kvm, @@ -337,7 +356,7 @@ int kvmi_ioctl_command(struct kvm *kvm, kvmi = KVMI(kvm); if (kvmi) - kvmi_control_allowed_commands(kvmi, id, allow); + err = kvmi_control_allowed_commands(kvmi, id, allow); else err = -EFAULT; diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 4e7b55ec7071..386636aa9832 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -98,11 +98,23 @@ static int kvmi_msg_vm_reply(struct kvm_introspection *kvmi, return kvmi_msg_reply(kvmi, msg, err, rpl, rpl_size); } +static int handle_get_version(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg, const void *req) +{ + struct kvmi_get_version_reply rpl; + + memset(&rpl, 0, sizeof(rpl)); + rpl.version = kvmi_version(); + + return kvmi_msg_vm_reply(kvmi, msg, 0, &rpl, sizeof(rpl)); +} + /* * These commands are executed by the receiving thread. */ static int(*const msg_vm[])(struct kvm_introspection *, const struct kvmi_msg_hdr *, const void *) = { + [KVMI_GET_VERSION] = handle_get_version, }; static bool is_vm_command(u16 id)
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 42/84] KVM: introspection: add KVMI_VM_CHECK_COMMAND and KVMI_VM_CHECK_EVENT
These commands are used to check what introspection commands and events are supported (kernel) and allowed (device manager). These are alternative methods to KVMI_GET_VERSION in checking if the introspection supports a specific command/event. As with the KVMI_GET_VERSION command, these two commands can never be disallowed by the device manager. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 62 +++++++++++++++++++ include/uapi/linux/kvmi.h | 16 ++++- .../testing/selftests/kvm/x86_64/kvmi_test.c | 59 ++++++++++++++++++ virt/kvm/introspection/kvmi.c | 14 +++++ virt/kvm/introspection/kvmi_int.h | 1 + virt/kvm/introspection/kvmi_msg.c | 45 +++++++++++++- 6 files changed, 195 insertions(+), 2 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 41fd48222bcb..a2cda3268da0 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -261,3 +261,65 @@ larger/newer messages. The introspection tool should use this command to identify the features supported by the kernel side and what messages must be used for event replies. + +2. KVMI_VM_CHECK_COMMAND +------------------------ + +:Architectures: all +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vm_check_command { + __u16 id; + __u16 padding1; + __u32 padding2; + }; + +:Returns: + +:: + + struct kvmi_error_code; + +Checks if the command specified by ``id`` is supported and allowed. + +This command is always allowed. + +:Errors: + +* -KVM_ENOENT - the command specified by ``id`` is unsupported +* -KVM_EPERM - the command specified by ``id`` is disallowed +* -KVM_EINVAL - the padding is not zero + +3. KVMI_VM_CHECK_EVENT +---------------------- + +:Architectures: all +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vm_check_event { + __u16 id; + __u16 padding1; + __u32 padding2; + }; + +:Returns: + +:: + + struct kvmi_error_code; + +Checks if the event specified by ``id`` is supported and allowed. + +This command is always allowed. + +:Errors: + +* -KVM_ENOENT - the event specified by ``id`` is unsupported +* -KVM_EPERM - the event specified by ``id`` is disallowed +* -KVM_EINVAL - the padding is not zero diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 896fcb6abf2c..e55a0fa66ac5 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -14,7 +14,9 @@ enum { }; enum { - KVMI_GET_VERSION = 1, + KVMI_GET_VERSION = 1, + KVMI_VM_CHECK_COMMAND = 2, + KVMI_VM_CHECK_EVENT = 3, KVMI_NUM_MESSAGES }; @@ -49,4 +51,16 @@ struct kvmi_get_version_reply { __u32 padding; }; +struct kvmi_vm_check_command { + __u16 id; + __u16 padding1; + __u32 padding2; +}; + +struct kvmi_vm_check_event { + __u16 id; + __u16 padding1; + __u32 padding2; +}; + #endif /* _UAPI__LINUX_KVMI_H */ diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index d15eccc330e5..28216c4e8b9d 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -99,6 +99,8 @@ static void hook_introspection(struct kvm_vm *vm) do_hook_ioctl(vm, Kvm_socket, no_padding, EEXIST); set_command_perm(vm, KVMI_GET_VERSION, disallow, EPERM); + set_command_perm(vm, KVMI_VM_CHECK_COMMAND, disallow, EPERM); + set_command_perm(vm, KVMI_VM_CHECK_EVENT, disallow, EPERM); set_command_perm(vm, all_IDs, allow_inval, EINVAL); set_command_perm(vm, all_IDs, disallow, 0); set_command_perm(vm, all_IDs, allow, 0); @@ -238,6 +240,61 @@ static void test_cmd_get_version(void) pr_info("KVMI version: %u\n", rpl.version); } +static void cmd_vm_check_command(__u16 id, __u16 padding, int expected_err) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vm_check_command cmd; + } req = {}; + int r; + + req.cmd.id = id; + req.cmd.padding1 = padding; + req.cmd.padding2 = padding; + + r = do_command(KVMI_VM_CHECK_COMMAND, &req.hdr, sizeof(req), NULL, 0); + TEST_ASSERT(r == expected_err, + "KVMI_VM_CHECK_COMMAND failed, error %d (%s), expected %d\n", + -r, kvm_strerror(-r), expected_err); +} + +static void test_cmd_vm_check_command(void) +{ + __u16 valid_id = KVMI_GET_VERSION, invalid_id = 0xffff; + __u16 padding = 1, no_padding = 0; + + cmd_vm_check_command(valid_id, no_padding, 0); + cmd_vm_check_command(valid_id, padding, -KVM_EINVAL); + cmd_vm_check_command(invalid_id, no_padding, -KVM_ENOENT); +} + +static void cmd_vm_check_event(__u16 id, __u16 padding, int expected_err) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vm_check_event cmd; + } req = {}; + int r; + + req.cmd.id = id; + req.cmd.padding1 = padding; + req.cmd.padding2 = padding; + + r = do_command(KVMI_VM_CHECK_EVENT, &req.hdr, sizeof(req), NULL, 0); + TEST_ASSERT(r == expected_err, + "KVMI_VM_CHECK_EVENT failed, error %d (%s), expected %d\n", + -r, kvm_strerror(-r), expected_err); +} + +static void test_cmd_vm_check_event(void) +{ + __u16 invalid_id = 0xffff; + __u16 padding = 1, no_padding = 0; + + cmd_vm_check_event(invalid_id, padding, -KVM_EINVAL); + cmd_vm_check_event(invalid_id, no_padding, -KVM_ENOENT); +} + static void test_introspection(struct kvm_vm *vm) { setup_socket(); @@ -245,6 +302,8 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_invalid(); test_cmd_get_version(); + test_cmd_vm_check_command(); + test_cmd_vm_check_event(); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index c44aa49dc6b5..f5ca49167f70 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -12,6 +12,7 @@ #define KVMI_MSG_SIZE_ALLOC (sizeof(struct kvmi_msg_hdr) + KVMI_MSG_SIZE) static DECLARE_BITMAP(Kvmi_always_allowed_commands, KVMI_NUM_COMMANDS); +static DECLARE_BITMAP(Kvmi_known_events, KVMI_NUM_EVENTS); static struct kmem_cache *msg_cache; @@ -51,15 +52,28 @@ bool kvmi_is_command_allowed(struct kvm_introspection *kvmi, u16 id) return id < KVMI_NUM_COMMANDS && test_bit(id, kvmi->cmd_allow_mask); } +bool kvmi_is_known_event(u8 id) +{ + return id < KVMI_NUM_EVENTS && test_bit(id, Kvmi_known_events); +} + static void setup_always_allowed_commands(void) { bitmap_zero(Kvmi_always_allowed_commands, KVMI_NUM_COMMANDS); set_bit(KVMI_GET_VERSION, Kvmi_always_allowed_commands); + set_bit(KVMI_VM_CHECK_COMMAND, Kvmi_always_allowed_commands); + set_bit(KVMI_VM_CHECK_EVENT, Kvmi_always_allowed_commands); +} + +static void setup_known_events(void) +{ + bitmap_zero(Kvmi_known_events, KVMI_NUM_EVENTS); } int kvmi_init(void) { setup_always_allowed_commands(); + setup_known_events(); return kvmi_cache_create(); } diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 5e4eabeefc5b..0bca4bd0a415 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -30,5 +30,6 @@ bool kvmi_msg_process(struct kvm_introspection *kvmi); void *kvmi_msg_alloc(void); void kvmi_msg_free(void *addr); bool kvmi_is_command_allowed(struct kvm_introspection *kvmi, u16 id); +bool kvmi_is_known_event(u8 id); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 386636aa9832..86c356afc154 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -8,6 +8,8 @@ #include <linux/net.h> #include "kvmi_int.h" +static bool is_vm_command(u16 id); + bool kvmi_sock_get(struct kvm_introspection *kvmi, int fd) { struct socket *sock; @@ -109,12 +111,53 @@ static int handle_get_version(struct kvm_introspection *kvmi, return kvmi_msg_vm_reply(kvmi, msg, 0, &rpl, sizeof(rpl)); } +static int handle_vm_check_command(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg, + const void *_req) +{ + const struct kvmi_vm_check_command *req = _req; + int ec = 0; + + if (req->padding1 || req->padding2) + ec = -KVM_EINVAL; + else if (!is_vm_command(req->id)) + ec = -KVM_ENOENT; + else if (!kvmi_is_command_allowed(kvmi, req->id)) + ec = -KVM_EPERM; + + return kvmi_msg_vm_reply(kvmi, msg, ec, NULL, 0); +} + +static bool is_event_allowed(struct kvm_introspection *kvmi, u16 id) +{ + return id < KVMI_NUM_EVENTS && test_bit(id, kvmi->event_allow_mask); +} + +static int handle_vm_check_event(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg, + const void *_req) +{ + const struct kvmi_vm_check_event *req = _req; + int ec = 0; + + if (req->padding1 || req->padding2) + ec = -KVM_EINVAL; + else if (!kvmi_is_known_event(req->id)) + ec = -KVM_ENOENT; + else if (!is_event_allowed(kvmi, req->id)) + ec = -KVM_EPERM; + + return kvmi_msg_vm_reply(kvmi, msg, ec, NULL, 0); +} + /* * These commands are executed by the receiving thread. */ static int(*const msg_vm[])(struct kvm_introspection *, const struct kvmi_msg_hdr *, const void *) = { - [KVMI_GET_VERSION] = handle_get_version, + [KVMI_GET_VERSION] = handle_get_version, + [KVMI_VM_CHECK_COMMAND] = handle_vm_check_command, + [KVMI_VM_CHECK_EVENT] = handle_vm_check_event, }; static bool is_vm_command(u16 id)
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 43/84] KVM: introspection: add KVMI_VM_GET_INFO
From: Mihai Don?u <mdontu at bitdefender.com> For now, this command returns only the number of online vCPUs. The introspection tool uses the vCPU index to specify to which vCPU the introspection command applies to. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 18 ++++++++++ include/uapi/linux/kvmi.h | 6 ++++ .../testing/selftests/kvm/x86_64/kvmi_test.c | 35 +++++++++++++++++-- virt/kvm/introspection/kvmi_msg.c | 13 +++++++ 4 files changed, 69 insertions(+), 3 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index a2cda3268da0..a81f22cb8c18 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -323,3 +323,21 @@ This command is always allowed. * -KVM_ENOENT - the event specified by ``id`` is unsupported * -KVM_EPERM - the event specified by ``id`` is disallowed * -KVM_EINVAL - the padding is not zero + +4. KVMI_VM_GET_INFO +------------------- + +:Architectures: all +:Versions: >= 1 +:Parameters: none +:Returns: + +:: + + struct kvmi_error_code; + struct kvmi_vm_get_info_reply { + __u32 vcpu_count; + __u32 padding[3]; + }; + +Returns the number of online vCPUs. diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index e55a0fa66ac5..eabaf7cea1df 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -17,6 +17,7 @@ enum { KVMI_GET_VERSION = 1, KVMI_VM_CHECK_COMMAND = 2, KVMI_VM_CHECK_EVENT = 3, + KVMI_VM_GET_INFO = 4, KVMI_NUM_MESSAGES }; @@ -63,4 +64,9 @@ struct kvmi_vm_check_event { __u32 padding2; }; +struct kvmi_vm_get_info_reply { + __u32 vcpu_count; + __u32 padding[3]; +}; + #endif /* _UAPI__LINUX_KVMI_H */ diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 28216c4e8b9d..1f4a165ab640 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -84,6 +84,16 @@ static void set_command_perm(struct kvm_vm *vm, __s32 id, __u32 allow, "KVM_INTROSPECTION_COMMAND"); } +static void disallow_command(struct kvm_vm *vm, __s32 id) +{ + set_command_perm(vm, id, 0, 0); +} + +static void allow_command(struct kvm_vm *vm, __s32 id) +{ + set_command_perm(vm, id, 1, 0); +} + static void hook_introspection(struct kvm_vm *vm) { __u32 allow = 1, disallow = 0, allow_inval = 2; @@ -258,14 +268,18 @@ static void cmd_vm_check_command(__u16 id, __u16 padding, int expected_err) -r, kvm_strerror(-r), expected_err); } -static void test_cmd_vm_check_command(void) +static void test_cmd_vm_check_command(struct kvm_vm *vm) { - __u16 valid_id = KVMI_GET_VERSION, invalid_id = 0xffff; + __u16 valid_id = KVMI_VM_GET_INFO, invalid_id = 0xffff; __u16 padding = 1, no_padding = 0; cmd_vm_check_command(valid_id, no_padding, 0); cmd_vm_check_command(valid_id, padding, -KVM_EINVAL); cmd_vm_check_command(invalid_id, no_padding, -KVM_ENOENT); + + disallow_command(vm, valid_id); + cmd_vm_check_command(valid_id, no_padding, -KVM_EPERM); + allow_command(vm, valid_id); } static void cmd_vm_check_event(__u16 id, __u16 padding, int expected_err) @@ -295,6 +309,20 @@ static void test_cmd_vm_check_event(void) cmd_vm_check_event(invalid_id, no_padding, -KVM_ENOENT); } +static void test_cmd_vm_get_info(void) +{ + struct kvmi_vm_get_info_reply rpl; + struct kvmi_msg_hdr req; + + test_vm_command(KVMI_VM_GET_INFO, &req, sizeof(req), &rpl, + sizeof(rpl)); + TEST_ASSERT(rpl.vcpu_count == 1, + "Unexpected number of vCPU count %u\n", + rpl.vcpu_count); + + pr_info("vcpu count: %u\n", rpl.vcpu_count); +} + static void test_introspection(struct kvm_vm *vm) { setup_socket(); @@ -302,8 +330,9 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_invalid(); test_cmd_get_version(); - test_cmd_vm_check_command(); + test_cmd_vm_check_command(vm); test_cmd_vm_check_event(); + test_cmd_vm_get_info(); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 86c356afc154..3df18f7965c0 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -150,6 +150,18 @@ static int handle_vm_check_event(struct kvm_introspection *kvmi, return kvmi_msg_vm_reply(kvmi, msg, ec, NULL, 0); } +static int handle_vm_get_info(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg, + const void *req) +{ + struct kvmi_vm_get_info_reply rpl; + + memset(&rpl, 0, sizeof(rpl)); + rpl.vcpu_count = atomic_read(&kvmi->kvm->online_vcpus); + + return kvmi_msg_vm_reply(kvmi, msg, 0, &rpl, sizeof(rpl)); +} + /* * These commands are executed by the receiving thread. */ @@ -158,6 +170,7 @@ static int(*const msg_vm[])(struct kvm_introspection *, [KVMI_GET_VERSION] = handle_get_version, [KVMI_VM_CHECK_COMMAND] = handle_vm_check_command, [KVMI_VM_CHECK_EVENT] = handle_vm_check_event, + [KVMI_VM_GET_INFO] = handle_vm_get_info, }; static bool is_vm_command(u16 id)
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 44/84] KVM: introspection: add KVMI_EVENT_UNHOOK
In certain situations (when the guest has to be paused, suspended, migrated, etc.), the device manager will use the KVM_INTROSPECTION_PREUNHOOK ioctl in order to trigger the KVMI_EVENT_UNHOOK event. If the event is sent successfully (the VM has an active introspection channel), the device manager should delay the action (pause/suspend/...) to give the introspection tool the chance to remove its hooks (eg. breakpoints) while the guest is still running. Once a timeout is reached or the introspection tool has closed the socket, the device manager should resume the action. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/api.rst | 28 ++++++++ Documentation/virt/kvm/kvmi.rst | 70 +++++++++++++++++-- arch/x86/include/asm/kvmi_host.h | 2 + arch/x86/include/uapi/asm/kvmi.h | 29 ++++++++ include/linux/kvmi_host.h | 3 + include/uapi/linux/kvm.h | 2 + include/uapi/linux/kvmi.h | 13 ++++ .../testing/selftests/kvm/x86_64/kvmi_test.c | 65 ++++++++++++++++- virt/kvm/introspection/kvmi.c | 42 ++++++++++- virt/kvm/introspection/kvmi_int.h | 1 + virt/kvm/introspection/kvmi_msg.c | 38 ++++++++++ virt/kvm/kvm_main.c | 6 ++ 12 files changed, 291 insertions(+), 8 deletions(-) create mode 100644 arch/x86/include/uapi/asm/kvmi.h diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 174f13f2389d..fb3ccafac90d 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -4819,6 +4819,34 @@ the event is disallowed. Unless set to -1 (meaning all events), id must be a event ID (e.g. KVMI_EVENT_UNHOOK, KVMI_EVENT_CR, etc.) +4.130 KVM_INTROSPECTION_PREUNHOOK +--------------------------------- + +:Capability: KVM_CAP_INTROSPECTION +:Architectures: x86 +:Type: vm ioctl +:Parameters: none +:Returns: 0 on success, a negative value on error + +Errors: + + ====== ===========================================================+ EFAULT the VM is not introspected yet (use KVM_INTROSPECTION_HOOK) + ENOENT the socket (passed with KVM_INTROSPECTION_HOOK) had an error + ENOENT the introspection tool didn't subscribed + to this type of introspection event (unhook) + ====== ===========================================================+ +This ioctl is used to inform that the current VM is +paused/suspended/migrated/etc. + +KVM should send an 'unhook' introspection event to the introspection tool. + +If this ioctl is successful, the userspace should give the +introspection tool a chance to unhook the VM and then it should use +KVM_INTROSPECTION_UNHOOK to make sure all the introspection structures +are freed. + 5. The kvm_run structure ======================= diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index a81f22cb8c18..4174c969cb47 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -194,10 +194,10 @@ becomes necessary to remove them before the guest is suspended, moved (migrated) or a snapshot with memory is created. The actions are normally performed by the device manager. In the case -of QEMU, it will use another ioctl to notify the introspection tool and -wait for a limited amount of time (a few seconds) for a confirmation that -is OK to proceed (it is enough for the introspection tool to close -the connection). +of QEMU, it will use the *KVM_INTROSPECTION_PREUNHOOK* ioctl to trigger +the *KVMI_EVENT_UNHOOK* event and wait for a limited amount of time +(a few seconds) for a confirmation that is OK to proceed (it is enough +for the introspection tool to close the connection). Live migrations --------------- @@ -341,3 +341,65 @@ This command is always allowed. }; Returns the number of online vCPUs. + +Events +=====+ +All introspection events (VM or vCPU related) are sent +using the *KVMI_EVENT* message id. + +The *KVMI_EVENT_UNHOOK* event doesn't have a reply and share the kvmi_event +structure, for consistency with the vCPU events. + +The message data begins with a common structure, having the size of the +structure, the vCPU index and the event id:: + + struct kvmi_event { + __u16 size; + __u16 vcpu; + __u8 event; + __u8 padding[3]; + struct kvmi_event_arch arch; + } + +On x86 the structure looks like this:: + + struct kvmi_event_arch { + __u8 mode; + __u8 padding[7]; + struct kvm_regs regs; + struct kvm_sregs sregs; + struct { + __u64 sysenter_cs; + __u64 sysenter_esp; + __u64 sysenter_eip; + __u64 efer; + __u64 star; + __u64 lstar; + __u64 cstar; + __u64 pat; + __u64 shadow_gs; + } msrs; + }; + +It contains information about the vCPU state at the time of the event. + +Specific event data can follow these common structures. + +1. KVMI_EVENT_UNHOOK +-------------------- + +:Architecture: all +:Versions: >= 1 +:Actions: none +:Parameters: + +:: + + struct kvmi_event; + +:Returns: none + +This event is sent when the device manager has to pause/stop/migrate the +guest (see **Unhooking**). The introspection tool has a chance to unhook +and close the KVMI channel (signaling that the operation can proceed). diff --git a/arch/x86/include/asm/kvmi_host.h b/arch/x86/include/asm/kvmi_host.h index 38c398262913..747f3d779e15 100644 --- a/arch/x86/include/asm/kvmi_host.h +++ b/arch/x86/include/asm/kvmi_host.h @@ -2,6 +2,8 @@ #ifndef _ASM_X86_KVMI_HOST_H #define _ASM_X86_KVMI_HOST_H +#include <asm/kvmi.h> + struct kvm_arch_introspection { }; diff --git a/arch/x86/include/uapi/asm/kvmi.h b/arch/x86/include/uapi/asm/kvmi.h new file mode 100644 index 000000000000..551f9ed1ed9c --- /dev/null +++ b/arch/x86/include/uapi/asm/kvmi.h @@ -0,0 +1,29 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef _UAPI_ASM_X86_KVMI_H +#define _UAPI_ASM_X86_KVMI_H + +/* + * KVM introspection - x86 specific structures and definitions + */ + +#include <asm/kvm.h> + +struct kvmi_event_arch { + __u8 mode; /* 2, 4 or 8 */ + __u8 padding[7]; + struct kvm_regs regs; + struct kvm_sregs sregs; + struct { + __u64 sysenter_cs; + __u64 sysenter_esp; + __u64 sysenter_eip; + __u64 efer; + __u64 star; + __u64 lstar; + __u64 cstar; + __u64 pat; + __u64 shadow_gs; + } msrs; +}; + +#endif /* _UAPI_ASM_X86_KVMI_H */ diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index 7efd071e398d..8d21e031788e 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -17,6 +17,8 @@ struct kvm_introspection { unsigned long *cmd_allow_mask; unsigned long *event_allow_mask; + + atomic_t ev_seq; }; int kvmi_version(void); @@ -32,6 +34,7 @@ int kvmi_ioctl_command(struct kvm *kvm, const struct kvm_introspection_feature *feat); int kvmi_ioctl_event(struct kvm *kvm, const struct kvm_introspection_feature *feat); +int kvmi_ioctl_preunhook(struct kvm *kvm); #else diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 17df03ceb483..06d88157de20 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1630,6 +1630,8 @@ struct kvm_introspection_feature { #define KVM_INTROSPECTION_COMMAND _IOW(KVMIO, 0xc5, struct kvm_introspection_feature) #define KVM_INTROSPECTION_EVENT _IOW(KVMIO, 0xc6, struct kvm_introspection_feature) +#define KVM_INTROSPECTION_PREUNHOOK _IO(KVMIO, 0xc7) + #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0) #define KVM_DEV_ASSIGN_PCI_2_3 (1 << 1) #define KVM_DEV_ASSIGN_MASK_INTX (1 << 2) diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index eabaf7cea1df..9fbe52caf96c 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -8,12 +8,15 @@ #include <linux/kernel.h> #include <linux/types.h> +#include <asm/kvmi.h> enum { KVMI_VERSION = 0x00000001 }; enum { + KVMI_EVENT = 0, + KVMI_GET_VERSION = 1, KVMI_VM_CHECK_COMMAND = 2, KVMI_VM_CHECK_EVENT = 3, @@ -23,6 +26,8 @@ enum { }; enum { + KVMI_EVENT_UNHOOK = 0, + KVMI_NUM_EVENTS }; @@ -69,4 +74,12 @@ struct kvmi_vm_get_info_reply { __u32 padding[3]; }; +struct kvmi_event { + __u16 size; + __u16 vcpu; + __u8 event; + __u8 padding[3]; + struct kvmi_event_arch arch; +}; + #endif /* _UAPI__LINUX_KVMI_H */ diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 1f4a165ab640..3d46d6e6b38c 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -72,6 +72,11 @@ static void set_event_perm(struct kvm_vm *vm, __s32 id, __u32 allow, "KVM_INTROSPECTION_EVENT"); } +static void disallow_event(struct kvm_vm *vm, __s32 event_id) +{ + set_event_perm(vm, event_id, 0, 0); +} + static void allow_event(struct kvm_vm *vm, __s32 event_id) { set_event_perm(vm, event_id, 1, 0); @@ -300,13 +305,20 @@ static void cmd_vm_check_event(__u16 id, __u16 padding, int expected_err) -r, kvm_strerror(-r), expected_err); } -static void test_cmd_vm_check_event(void) +static void test_cmd_vm_check_event(struct kvm_vm *vm) { - __u16 invalid_id = 0xffff; + __u16 valid_id = KVMI_EVENT_UNHOOK, invalid_id = 0xffff; __u16 padding = 1, no_padding = 0; cmd_vm_check_event(invalid_id, padding, -KVM_EINVAL); cmd_vm_check_event(invalid_id, no_padding, -KVM_ENOENT); + + cmd_vm_check_event(valid_id, no_padding, 0); + cmd_vm_check_event(valid_id, padding, -KVM_EINVAL); + + disallow_event(vm, valid_id); + cmd_vm_check_event(valid_id, 0, -KVM_EPERM); + allow_event(vm, valid_id); } static void test_cmd_vm_get_info(void) @@ -323,6 +335,52 @@ static void test_cmd_vm_get_info(void) pr_info("vcpu count: %u\n", rpl.vcpu_count); } +static void trigger_event_unhook_notification(struct kvm_vm *vm) +{ + int r; + + r = ioctl(vm->fd, KVM_INTROSPECTION_PREUNHOOK, NULL); + TEST_ASSERT(r == 0, + "KVM_INTROSPECTION_PREUNHOOK failed, errno %d (%s)\n", + errno, strerror(errno)); +} + +static void receive_event(struct kvmi_msg_hdr *hdr, struct kvmi_event *ev, + size_t ev_size, int event_id) +{ + size_t to_read = ev_size; + + receive_data(hdr, sizeof(*hdr)); + + TEST_ASSERT(hdr->id == KVMI_EVENT, + "Unexpected messages id %d, expected %d\n", + hdr->id, KVMI_EVENT); + + if (to_read > hdr->size) + to_read = hdr->size; + + receive_data(ev, to_read); + + TEST_ASSERT(ev->event == event_id, + "Unexpected event %d, expected %d\n", + ev->event, event_id); + + TEST_ASSERT(hdr->size == ev_size, + "Invalid event size %d, expected %zd bytes\n", + hdr->size, ev_size); +} + +static void test_event_unhook(struct kvm_vm *vm) +{ + __u16 id = KVMI_EVENT_UNHOOK; + struct kvmi_msg_hdr hdr; + struct kvmi_event ev; + + trigger_event_unhook_notification(vm); + + receive_event(&hdr, &ev, sizeof(ev), id); +} + static void test_introspection(struct kvm_vm *vm) { setup_socket(); @@ -331,8 +389,9 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_invalid(); test_cmd_get_version(); test_cmd_vm_check_command(vm); - test_cmd_vm_check_event(); + test_cmd_vm_check_event(vm); test_cmd_vm_get_info(); + test_event_unhook(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index f5ca49167f70..f128b1407c84 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -13,6 +13,8 @@ static DECLARE_BITMAP(Kvmi_always_allowed_commands, KVMI_NUM_COMMANDS); static DECLARE_BITMAP(Kvmi_known_events, KVMI_NUM_EVENTS); +static DECLARE_BITMAP(Kvmi_known_vm_events, KVMI_NUM_EVENTS); +static DECLARE_BITMAP(Kvmi_known_vcpu_events, KVMI_NUM_EVENTS); static struct kmem_cache *msg_cache; @@ -67,7 +69,13 @@ static void setup_always_allowed_commands(void) static void setup_known_events(void) { - bitmap_zero(Kvmi_known_events, KVMI_NUM_EVENTS); + bitmap_zero(Kvmi_known_vm_events, KVMI_NUM_EVENTS); + set_bit(KVMI_EVENT_UNHOOK, Kvmi_known_vm_events); + + bitmap_zero(Kvmi_known_vcpu_events, KVMI_NUM_EVENTS); + + bitmap_or(Kvmi_known_events, Kvmi_known_vm_events, + Kvmi_known_vcpu_events, KVMI_NUM_EVENTS); } int kvmi_init(void) @@ -121,6 +129,8 @@ alloc_kvmi(struct kvm *kvm, const struct kvm_introspection_hook *hook) bitmap_copy(kvmi->cmd_allow_mask, Kvmi_always_allowed_commands, KVMI_NUM_COMMANDS); + atomic_set(&kvmi->ev_seq, 0); + kvmi->kvm = kvm; return kvmi; @@ -377,3 +387,33 @@ int kvmi_ioctl_command(struct kvm *kvm, mutex_unlock(&kvm->kvmi_lock); return err; } + +static bool kvmi_unhook_event(struct kvm_introspection *kvmi) +{ + int err; + + err = kvmi_msg_send_unhook(kvmi); + + return !err; +} + +int kvmi_ioctl_preunhook(struct kvm *kvm) +{ + struct kvm_introspection *kvmi; + int err = 0; + + mutex_lock(&kvm->kvmi_lock); + + kvmi = KVMI(kvm); + if (!kvmi) { + err = -EFAULT; + goto out; + } + + if (!kvmi_unhook_event(kvmi)) + err = -ENOENT; + +out: + mutex_unlock(&kvm->kvmi_lock); + return err; +} diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 0bca4bd0a415..c385dc0eb708 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -25,6 +25,7 @@ bool kvmi_sock_get(struct kvm_introspection *kvmi, int fd); void kvmi_sock_shutdown(struct kvm_introspection *kvmi); void kvmi_sock_put(struct kvm_introspection *kvmi); bool kvmi_msg_process(struct kvm_introspection *kvmi); +int kvmi_msg_send_unhook(struct kvm_introspection *kvmi); /* kvmi.c */ void *kvmi_msg_alloc(void); diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 3df18f7965c0..596f5c02bb8c 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -252,3 +252,41 @@ bool kvmi_msg_process(struct kvm_introspection *kvmi) out: return err == 0; } + +static void kvmi_setup_event_msg_hdr(struct kvm_introspection *kvmi, + struct kvmi_msg_hdr *hdr, + size_t msg_size) +{ + memset(hdr, 0, sizeof(*hdr)); + + hdr->id = KVMI_EVENT; + hdr->seq = atomic_inc_return(&kvmi->ev_seq); + hdr->size = msg_size - sizeof(*hdr); +} + +static void kvmi_setup_event_common(struct kvmi_event *ev, u32 ev_id, + u16 vcpu_idx) +{ + memset(ev, 0, sizeof(*ev)); + + ev->vcpu = vcpu_idx; + ev->event = ev_id; + ev->size = sizeof(*ev); +} + +int kvmi_msg_send_unhook(struct kvm_introspection *kvmi) +{ + struct kvmi_msg_hdr hdr; + struct kvmi_event common; + struct kvec vec[] = { + {.iov_base = &hdr, .iov_len = sizeof(hdr) }, + {.iov_base = &common, .iov_len = sizeof(common)}, + }; + size_t msg_size = sizeof(hdr) + sizeof(common); + size_t n = ARRAY_SIZE(vec); + + kvmi_setup_event_msg_hdr(kvmi, &hdr, msg_size); + kvmi_setup_event_common(&common, KVMI_EVENT_UNHOOK, 0); + + return kvmi_sock_write(kvmi, vec, n, msg_size); +} diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 9aeb57e93165..86910e6c8d93 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -3861,6 +3861,12 @@ static long kvm_vm_ioctl(struct file *filp, r = kvmi_ioctl_command(kvm, &feat); } break; + case KVM_INTROSPECTION_PREUNHOOK: + if (enable_introspection) + r = kvmi_ioctl_preunhook(kvm); + else + r = -EPERM; + break; #endif /* CONFIG_KVM_INTROSPECTION */ default: r = kvm_arch_vm_ioctl(filp, ioctl, arg);
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 45/84] KVM: introspection: add KVMI_VM_CONTROL_EVENTS
By default, all introspection VM events are disabled. The introspection tool must explicitly enable the VM events it wants to receive. With this command (KVMI_VM_CONTROL_EVENTS) it can enable/disable any VM event (e.g. KVMI_EVENT_UNHOOK) if allowed by the device manager. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 44 +++++++++++++-- include/linux/kvmi_host.h | 2 + include/uapi/linux/kvmi.h | 16 ++++-- .../testing/selftests/kvm/x86_64/kvmi_test.c | 54 +++++++++++++++++++ virt/kvm/introspection/kvmi.c | 30 ++++++++++- virt/kvm/introspection/kvmi_int.h | 3 ++ virt/kvm/introspection/kvmi_msg.c | 29 ++++++++-- 7 files changed, 165 insertions(+), 13 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 4174c969cb47..4ec0046b4138 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -342,11 +342,45 @@ This command is always allowed. Returns the number of online vCPUs. +5. KVMI_VM_CONTROL_EVENTS +------------------------- + +:Architectures: all +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vm_control_events { + __u16 event_id; + __u8 enable; + __u8 padding1; + __u32 padding2; + }; + +:Returns: + +:: + + struct kvmi_error_code + +Enables/disables VM introspection events. This command can be used with +the following events:: + + KVMI_EVENT_UNHOOK + +:Errors: + +* -KVM_EINVAL - the padding is not zero +* -KVM_EINVAL - the event ID is unknown (use *KVMI_VM_CHECK_EVENT* first) +* -KVM_EPERM - the access is disallowed (use *KVMI_VM_CHECK_EVENT* first) + Events ===== All introspection events (VM or vCPU related) are sent -using the *KVMI_EVENT* message id. +using the *KVMI_EVENT* message id. No event will be sent unless +it is explicitly enabled. The *KVMI_EVENT_UNHOOK* event doesn't have a reply and share the kvmi_event structure, for consistency with the vCPU events. @@ -400,6 +434,8 @@ Specific event data can follow these common structures. :Returns: none -This event is sent when the device manager has to pause/stop/migrate the -guest (see **Unhooking**). The introspection tool has a chance to unhook -and close the KVMI channel (signaling that the operation can proceed). +This event is sent when the device manager has to pause/stop/migrate +the guest (see **Unhooking**) and the introspection has been enabled +for this event (see **KVMI_VM_CONTROL_EVENTS**). The introspection tool +has a chance to unhook and close the KVMI channel (signaling that the +operation can proceed). diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index 8d21e031788e..8e142096ba47 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -18,6 +18,8 @@ struct kvm_introspection { unsigned long *cmd_allow_mask; unsigned long *event_allow_mask; + unsigned long *vm_event_enable_mask; + atomic_t ev_seq; }; diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 9fbe52caf96c..f9e2cb8a2c5e 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -17,10 +17,11 @@ enum { enum { KVMI_EVENT = 0, - KVMI_GET_VERSION = 1, - KVMI_VM_CHECK_COMMAND = 2, - KVMI_VM_CHECK_EVENT = 3, - KVMI_VM_GET_INFO = 4, + KVMI_GET_VERSION = 1, + KVMI_VM_CHECK_COMMAND = 2, + KVMI_VM_CHECK_EVENT = 3, + KVMI_VM_GET_INFO = 4, + KVMI_VM_CONTROL_EVENTS = 5, KVMI_NUM_MESSAGES }; @@ -74,6 +75,13 @@ struct kvmi_vm_get_info_reply { __u32 padding[3]; }; +struct kvmi_vm_control_events { + __u16 event_id; + __u8 enable; + __u8 padding1; + __u32 padding2; +}; + struct kvmi_event { __u16 size; __u16 vcpu; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 3d46d6e6b38c..bb2daaca0291 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -370,15 +370,68 @@ static void receive_event(struct kvmi_msg_hdr *hdr, struct kvmi_event *ev, hdr->size, ev_size); } +static void cmd_vm_control_events(__u16 event_id, __u8 enable, __u16 padding, + int expected_err) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vm_control_events cmd; + } req = {}; + int r; + + req.cmd.event_id = event_id; + req.cmd.enable = enable; + req.cmd.padding1 = padding; + req.cmd.padding2 = padding; + + r = do_command(KVMI_VM_CONTROL_EVENTS, &req.hdr, sizeof(req), + NULL, 0); + TEST_ASSERT(r == expected_err, + "KVMI_VM_CONTROL_EVENTS failed to enable VM event %d, error %d (%s), expected error %d\n", + event_id, -r, kvm_strerror(-r), expected_err); +} + +static void enable_vm_event(__u16 event_id) +{ + cmd_vm_control_events(event_id, 1, 0, 0); +} + +static void disable_vm_event(__u16 event_id) +{ + cmd_vm_control_events(event_id, 0, 0, 0); +} + static void test_event_unhook(struct kvm_vm *vm) { __u16 id = KVMI_EVENT_UNHOOK; struct kvmi_msg_hdr hdr; struct kvmi_event ev; + enable_vm_event(id); + trigger_event_unhook_notification(vm); receive_event(&hdr, &ev, sizeof(ev), id); + + disable_vm_event(id); +} + +static void test_cmd_vm_control_events(struct kvm_vm *vm) +{ + __u16 id = KVMI_EVENT_UNHOOK, invalid_id = 0xffff; + __u16 padding = 1, no_padding = 0; + __u8 enable = 1, enable_inval = 2; + + enable_vm_event(id); + disable_vm_event(id); + + cmd_vm_control_events(id, enable, padding, -KVM_EINVAL); + cmd_vm_control_events(id, enable_inval, no_padding, -KVM_EINVAL); + cmd_vm_control_events(invalid_id, enable, no_padding, -KVM_EINVAL); + + disallow_event(vm, id); + cmd_vm_control_events(id, enable, no_padding, -KVM_EPERM); + allow_event(vm, id); } static void test_introspection(struct kvm_vm *vm) @@ -392,6 +445,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vm_check_event(vm); test_cmd_vm_get_info(); test_event_unhook(vm); + test_cmd_vm_control_events(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index f128b1407c84..5af6ea041035 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -59,6 +59,16 @@ bool kvmi_is_known_event(u8 id) return id < KVMI_NUM_EVENTS && test_bit(id, Kvmi_known_events); } +bool kvmi_is_known_vm_event(u8 id) +{ + return id < KVMI_NUM_EVENTS && test_bit(id, Kvmi_known_vm_events); +} + +static bool is_vm_event_enabled(struct kvm_introspection *kvmi, int event) +{ + return test_bit(event, kvmi->vm_event_enable_mask); +} + static void setup_always_allowed_commands(void) { bitmap_zero(Kvmi_always_allowed_commands, KVMI_NUM_COMMANDS); @@ -100,6 +110,7 @@ static void free_kvmi(struct kvm *kvm) { bitmap_free(kvm->kvmi->cmd_allow_mask); bitmap_free(kvm->kvmi->event_allow_mask); + bitmap_free(kvm->kvmi->vm_event_enable_mask); kfree(kvm->kvmi); kvm->kvmi = NULL; @@ -116,9 +127,12 @@ alloc_kvmi(struct kvm *kvm, const struct kvm_introspection_hook *hook) kvmi->cmd_allow_mask = bitmap_zalloc(KVMI_NUM_COMMANDS, GFP_KERNEL); kvmi->event_allow_mask = bitmap_zalloc(KVMI_NUM_EVENTS, GFP_KERNEL); - if (!kvmi->cmd_allow_mask || !kvmi->event_allow_mask) { + kvmi->vm_event_enable_mask = bitmap_zalloc(KVMI_NUM_EVENTS, GFP_KERNEL); + if (!kvmi->cmd_allow_mask || !kvmi->event_allow_mask || + !kvmi->vm_event_enable_mask) { bitmap_free(kvmi->cmd_allow_mask); bitmap_free(kvmi->event_allow_mask); + bitmap_free(kvmi->vm_event_enable_mask); kfree(kvmi); return NULL; } @@ -392,6 +406,9 @@ static bool kvmi_unhook_event(struct kvm_introspection *kvmi) { int err; + if (!is_vm_event_enabled(kvmi, KVMI_EVENT_UNHOOK)) + return false; + err = kvmi_msg_send_unhook(kvmi); return !err; @@ -417,3 +434,14 @@ int kvmi_ioctl_preunhook(struct kvm *kvm) mutex_unlock(&kvm->kvmi_lock); return err; } + +int kvmi_cmd_vm_control_events(struct kvm_introspection *kvmi, + unsigned int event_id, bool enable) +{ + if (enable) + set_bit(event_id, kvmi->vm_event_enable_mask); + else + clear_bit(event_id, kvmi->vm_event_enable_mask); + + return 0; +} diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index c385dc0eb708..7c503b8ca043 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -32,5 +32,8 @@ void *kvmi_msg_alloc(void); void kvmi_msg_free(void *addr); bool kvmi_is_command_allowed(struct kvm_introspection *kvmi, u16 id); bool kvmi_is_known_event(u8 id); +bool kvmi_is_known_vm_event(u8 id); +int kvmi_cmd_vm_control_events(struct kvm_introspection *kvmi, + unsigned int event_id, bool enable); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 596f5c02bb8c..a148ed1e767c 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -162,15 +162,36 @@ static int handle_vm_get_info(struct kvm_introspection *kvmi, return kvmi_msg_vm_reply(kvmi, msg, 0, &rpl, sizeof(rpl)); } +static int handle_vm_control_events(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg, + const void *_req) +{ + const struct kvmi_vm_control_events *req = _req; + int ec; + + if (req->padding1 || req->padding2 || req->enable > 1) + ec = -KVM_EINVAL; + else if (!kvmi_is_known_vm_event(req->event_id)) + ec = -KVM_EINVAL; + else if (!is_event_allowed(kvmi, req->event_id)) + ec = -KVM_EPERM; + else + ec = kvmi_cmd_vm_control_events(kvmi, req->event_id, + req->enable == 1); + + return kvmi_msg_vm_reply(kvmi, msg, ec, NULL, 0); +} + /* * These commands are executed by the receiving thread. */ static int(*const msg_vm[])(struct kvm_introspection *, const struct kvmi_msg_hdr *, const void *) = { - [KVMI_GET_VERSION] = handle_get_version, - [KVMI_VM_CHECK_COMMAND] = handle_vm_check_command, - [KVMI_VM_CHECK_EVENT] = handle_vm_check_event, - [KVMI_VM_GET_INFO] = handle_vm_get_info, + [KVMI_GET_VERSION] = handle_get_version, + [KVMI_VM_CHECK_COMMAND] = handle_vm_check_command, + [KVMI_VM_CHECK_EVENT] = handle_vm_check_event, + [KVMI_VM_CONTROL_EVENTS] = handle_vm_control_events, + [KVMI_VM_GET_INFO] = handle_vm_get_info, }; static bool is_vm_command(u16 id)
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 46/84] KVM: introspection: add KVMI_VM_READ_PHYSICAL/KVMI_VM_WRITE_PHYSICAL
From: Mihai Don?u <mdontu at bitdefender.com> These commands allow the introspection tool to read/write from/to the guest memory. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 68 +++++++ include/uapi/linux/kvmi.h | 17 ++ .../testing/selftests/kvm/x86_64/kvmi_test.c | 170 ++++++++++++++++++ virt/kvm/introspection/kvmi.c | 108 +++++++++++ virt/kvm/introspection/kvmi_int.h | 7 + virt/kvm/introspection/kvmi_msg.c | 44 +++++ 6 files changed, 414 insertions(+) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 4ec0046b4138..be5a92e20173 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -375,6 +375,74 @@ the following events:: * -KVM_EINVAL - the event ID is unknown (use *KVMI_VM_CHECK_EVENT* first) * -KVM_EPERM - the access is disallowed (use *KVMI_VM_CHECK_EVENT* first) +6. KVMI_VM_READ_PHYSICAL +------------------------ + +:Architectures: all +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vm_read_physical { + __u64 gpa; + __u16 size; + __u16 padding1; + __u32 padding2; + }; + +:Returns: + +:: + + struct kvmi_error_code; + __u8 data[0]; + +Reads from the guest memory. + +Currently, the size must be non-zero and the read must be restricted to +one page (offset + size <= PAGE_SIZE). + +:Errors: + +* -KVM_ENOENT - the guest page doesn't exists +* -KVM_EINVAL - the specified gpa/size pair is invalid +* -KVM_EINVAL - the padding is not zero + +7. KVMI_VM_WRITE_PHYSICAL +------------------------- + +:Architectures: all +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vm_write_physical { + __u64 gpa; + __u16 size; + __u16 padding1; + __u32 padding2; + __u8 data[0]; + }; + +:Returns: + +:: + + struct kvmi_error_code + +Writes into the guest memory. + +Currently, the size must be non-zero and the write must be restricted to +one page (offset + size <= PAGE_SIZE). + +:Errors: + +* -KVM_ENOENT - the guest page doesn't exists +* -KVM_EINVAL - the specified gpa/size pair is invalid +* -KVM_EINVAL - the padding is not zero + Events ===== diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index f9e2cb8a2c5e..9b2428963994 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -22,6 +22,8 @@ enum { KVMI_VM_CHECK_EVENT = 3, KVMI_VM_GET_INFO = 4, KVMI_VM_CONTROL_EVENTS = 5, + KVMI_VM_READ_PHYSICAL = 6, + KVMI_VM_WRITE_PHYSICAL = 7, KVMI_NUM_MESSAGES }; @@ -82,6 +84,21 @@ struct kvmi_vm_control_events { __u32 padding2; }; +struct kvmi_vm_read_physical { + __u64 gpa; + __u16 size; + __u16 padding1; + __u32 padding2; +}; + +struct kvmi_vm_write_physical { + __u64 gpa; + __u16 size; + __u16 padding1; + __u32 padding2; + __u8 data[0]; +}; + struct kvmi_event { __u16 size; __u16 vcpu; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index bb2daaca0291..97dec49d52b7 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -8,6 +8,7 @@ #define _GNU_SOURCE /* for program_invocation_short_name */ #include <sys/types.h> #include <sys/socket.h> +#include <time.h> #include "test_util.h" @@ -24,6 +25,13 @@ static int socket_pair[2]; #define Kvm_socket socket_pair[0] #define Userspace_socket socket_pair[1] +static vm_vaddr_t test_gva; +static void *test_hva; +static vm_paddr_t test_gpa; + +static uint8_t test_write_pattern; +static int page_size; + void setup_socket(void) { int r; @@ -434,8 +442,154 @@ static void test_cmd_vm_control_events(struct kvm_vm *vm) allow_event(vm, id); } +static void cmd_vm_write_page(__u64 gpa, __u64 size, void *p, __u16 padding, + int expected_err) +{ + struct kvmi_vm_write_physical *cmd; + struct kvmi_msg_hdr *req; + size_t req_size; + int r; + + req_size = sizeof(*req) + sizeof(*cmd) + size; + + req = calloc(1, req_size); + TEST_ASSERT(req, "Insufficient Memory\n"); + + cmd = (struct kvmi_vm_write_physical *)(req + 1); + cmd->gpa = gpa; + cmd->size = size; + cmd->padding1 = padding; + cmd->padding2 = padding; + + memcpy(cmd + 1, p, size); + + r = do_command(KVMI_VM_WRITE_PHYSICAL, req, req_size, NULL, 0); + + free(req); + + TEST_ASSERT(r == expected_err, + "KVMI_VM_WRITE_PHYSICAL failed, gpa 0x%llx, error %d (%s), expected error %d\n", + gpa, -r, kvm_strerror(-r), expected_err); +} + +static void write_guest_page(__u64 gpa, void *p) +{ + cmd_vm_write_page(gpa, page_size, p, 0, 0); +} + +static void write_with_invalid_arguments(__u64 gpa, __u64 size, void *p) +{ + cmd_vm_write_page(gpa, size, p, 0, -KVM_EINVAL); +} + +static void write_with_invalid_padding(__u64 gpa, void *p) +{ + __u16 padding = 1; + + cmd_vm_write_page(gpa, page_size, p, padding, -KVM_EINVAL); +} + +static void write_invalid_guest_page(struct kvm_vm *vm, void *p) +{ + __u64 gpa = vm->max_gfn << vm->page_shift; + __u64 size = 1; + + cmd_vm_write_page(gpa, size, p, 0, -KVM_ENOENT); +} + +static void cmd_vm_read_page(__u64 gpa, __u64 size, void *p, __u16 padding, + int expected_err) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vm_read_physical cmd; + } req = { }; + int r; + + req.cmd.gpa = gpa; + req.cmd.size = size; + req.cmd.padding1 = padding; + req.cmd.padding2 = padding; + + r = do_command(KVMI_VM_READ_PHYSICAL, &req.hdr, sizeof(req), p, size); + TEST_ASSERT(r == expected_err, + "KVMI_VM_READ_PHYSICAL failed, gpa 0x%llx, error %d (%s), expected error %d\n", + gpa, -r, kvm_strerror(-r), expected_err); +} + +static void read_guest_page(__u64 gpa, void *p) +{ + cmd_vm_read_page(gpa, page_size, p, 0, 0); +} + +static void read_with_invalid_arguments(__u64 gpa, __u64 size, void *p) +{ + cmd_vm_read_page(gpa, size, p, 0, -KVM_EINVAL); +} + +static void read_with_invalid_padding(__u64 gpa, void *p) +{ + __u16 padding = 1; + + cmd_vm_read_page(gpa, page_size, p, padding, -KVM_EINVAL); +} + +static void read_invalid_guest_page(struct kvm_vm *vm) +{ + __u64 gpa = vm->max_gfn << vm->page_shift; + __u64 size = 1; + + cmd_vm_read_page(gpa, size, NULL, 0, -KVM_ENOENT); +} + +static void new_test_write_pattern(struct kvm_vm *vm) +{ + uint8_t n; + + do { + n = random(); + } while (n == 0 || n == test_write_pattern); + + test_write_pattern = n; + sync_global_to_guest(vm, test_write_pattern); +} + +static void test_memory_access(struct kvm_vm *vm) +{ + void *pw, *pr; + + new_test_write_pattern(vm); + + pw = malloc(page_size); + TEST_ASSERT(pw, "Insufficient Memory\n"); + + memset(pw, test_write_pattern, page_size); + + write_guest_page(test_gpa, pw); + TEST_ASSERT(memcmp(pw, test_hva, page_size) == 0, + "Write page test failed"); + + pr = malloc(page_size); + TEST_ASSERT(pr, "Insufficient Memory\n"); + + read_guest_page(test_gpa, pr); + TEST_ASSERT(memcmp(pw, pr, page_size) == 0, + "Read page test failed"); + + read_with_invalid_arguments(test_gpa, 0, pr); + read_with_invalid_padding(test_gpa, pr); + write_with_invalid_arguments(test_gpa, 0, pw); + write_with_invalid_padding(test_gpa, pw); + write_invalid_guest_page(vm, pw); + + free(pw); + free(pr); + + read_invalid_guest_page(vm); +} static void test_introspection(struct kvm_vm *vm) { + srandom(time(0)); setup_socket(); hook_introspection(vm); @@ -446,10 +600,23 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vm_get_info(); test_event_unhook(vm); test_cmd_vm_control_events(vm); + test_memory_access(vm); unhook_introspection(vm); } +static void setup_test_pages(struct kvm_vm *vm) +{ + test_gva = vm_vaddr_alloc(vm, page_size, KVM_UTIL_MIN_VADDR, 0, 0); + + sync_global_to_guest(vm, test_gva); + + test_hva = addr_gva2hva(vm, test_gva); + memset(test_hva, 0, page_size); + + test_gpa = addr_gva2gpa(vm, test_gva); +} + int main(int argc, char *argv[]) { struct kvm_vm *vm; @@ -462,6 +629,9 @@ int main(int argc, char *argv[]) vm = vm_create_default(VCPU_ID, 0, NULL); vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid()); + page_size = getpagesize(); + setup_test_pages(vm); + test_introspection(vm); kvm_vm_free(vm); diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index 5af6ea041035..354dea276347 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -5,6 +5,7 @@ * Copyright (C) 2017-2020 Bitdefender S.R.L. * */ +#include <linux/mmu_context.h> #include <linux/kthread.h> #include "kvmi_int.h" @@ -445,3 +446,110 @@ int kvmi_cmd_vm_control_events(struct kvm_introspection *kvmi, return 0; } + +static unsigned long gfn_to_hva_safe(struct kvm *kvm, gfn_t gfn) +{ + unsigned long hva; + int srcu_idx; + + srcu_idx = srcu_read_lock(&kvm->srcu); + hva = gfn_to_hva(kvm, gfn); + srcu_read_unlock(&kvm->srcu, srcu_idx); + + return hva; +} + +static long +get_user_pages_remote_unlocked(struct mm_struct *mm, unsigned long start, + unsigned long nr_pages, unsigned int gup_flags, + struct page **pages) +{ + struct vm_area_struct **vmas = NULL; + struct task_struct *tsk = NULL; + int locked = 1; + long r; + + mmap_read_lock(mm); + r = get_user_pages_remote(tsk, mm, start, nr_pages, gup_flags, + pages, vmas, &locked); + if (locked) + mmap_read_unlock(mm); + + return r; +} + +static void *get_page_ptr(struct kvm *kvm, gpa_t gpa, struct page **page, + bool write) +{ + unsigned int flags = write ? FOLL_WRITE : 0; + unsigned long hva; + + *page = NULL; + + hva = gfn_to_hva_safe(kvm, gpa_to_gfn(gpa)); + + if (kvm_is_error_hva(hva)) + return NULL; + + if (get_user_pages_remote_unlocked(kvm->mm, hva, 1, flags, page) != 1) + return NULL; + + return write ? kmap_atomic(*page) : kmap(*page); +} + +static void put_page_ptr(void *ptr, struct page *page, bool write) +{ + if (ptr) { + if (write) + kunmap_atomic(ptr); + else + kunmap(ptr); + } + if (page) + put_page(page); +} + +int kvmi_cmd_read_physical(struct kvm *kvm, u64 gpa, size_t size, + int (*send)(struct kvm_introspection *, + const struct kvmi_msg_hdr *, + int err, const void *buf, size_t), + const struct kvmi_msg_hdr *ctx) +{ + void *ptr_page = NULL, *ptr; + struct page *page = NULL; + size_t ptr_size; + int err, ec; + + ptr_page = get_page_ptr(kvm, gpa, &page, false); + if (ptr_page) { + ptr = ptr_page + (gpa & ~PAGE_MASK); + ptr_size = size; + ec = 0; + } else { + ptr = NULL; + ptr_size = 0; + ec = -KVM_ENOENT; + } + + err = send(KVMI(kvm), ctx, ec, ptr, ptr_size); + + put_page_ptr(ptr_page, page, false); + return err; +} + +int kvmi_cmd_write_physical(struct kvm *kvm, u64 gpa, size_t size, + const void *buf) +{ + struct page *page; + void *ptr; + + ptr = get_page_ptr(kvm, gpa, &page, true); + if (!ptr) + return -KVM_ENOENT; + + memcpy(ptr + (gpa & ~PAGE_MASK), buf, size); + + put_page_ptr(ptr, page, true); + + return 0; +} diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 7c503b8ca043..40e8647a6fd4 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -35,5 +35,12 @@ bool kvmi_is_known_event(u8 id); bool kvmi_is_known_vm_event(u8 id); int kvmi_cmd_vm_control_events(struct kvm_introspection *kvmi, unsigned int event_id, bool enable); +int kvmi_cmd_read_physical(struct kvm *kvm, u64 gpa, size_t size, + int (*send)(struct kvm_introspection *, + const struct kvmi_msg_hdr*, + int err, const void *buf, size_t), + const struct kvmi_msg_hdr *ctx); +int kvmi_cmd_write_physical(struct kvm *kvm, u64 gpa, size_t size, + const void *buf); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index a148ed1e767c..de9e38e8e24b 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -182,6 +182,48 @@ static int handle_vm_control_events(struct kvm_introspection *kvmi, return kvmi_msg_vm_reply(kvmi, msg, ec, NULL, 0); } +static bool invalid_page_access(u64 gpa, u64 size) +{ + u64 off = gpa & ~PAGE_MASK; + + return (size == 0 || size > PAGE_SIZE || off + size > PAGE_SIZE); +} + +static int handle_vm_read_physical(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg, + const void *_req) +{ + const struct kvmi_vm_read_physical *req = _req; + + if (invalid_page_access(req->gpa, req->size) + || req->padding1 || req->padding2) + return kvmi_msg_vm_reply(kvmi, msg, -KVM_EINVAL, NULL, 0); + + return kvmi_cmd_read_physical(kvmi->kvm, req->gpa, req->size, + kvmi_msg_vm_reply, msg); +} + +static int handle_vm_write_physical(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg, + const void *_req) +{ + const struct kvmi_vm_write_physical *req = _req; + int ec; + + if (msg->size < struct_size(req, data, req->size)) + return -EINVAL; + + if (invalid_page_access(req->gpa, req->size)) + ec = -KVM_EINVAL; + else if (req->padding1 || req->padding2) + ec = -KVM_EINVAL; + else + ec = kvmi_cmd_write_physical(kvmi->kvm, req->gpa, + req->size, req->data); + + return kvmi_msg_vm_reply(kvmi, msg, ec, NULL, 0); +} + /* * These commands are executed by the receiving thread. */ @@ -192,6 +234,8 @@ static int(*const msg_vm[])(struct kvm_introspection *, [KVMI_VM_CHECK_EVENT] = handle_vm_check_event, [KVMI_VM_CONTROL_EVENTS] = handle_vm_control_events, [KVMI_VM_GET_INFO] = handle_vm_get_info, + [KVMI_VM_READ_PHYSICAL] = handle_vm_read_physical, + [KVMI_VM_WRITE_PHYSICAL] = handle_vm_write_physical, }; static bool is_vm_command(u16 id)
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 47/84] KVM: introspection: add vCPU related data
From: Mircea C?rjaliu <mcirjaliu at bitdefender.com> Add an introspection structure to all vCPUs when the VM is hooked. Signed-off-by: Mircea C?rjaliu <mcirjaliu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvmi_host.h | 3 ++ include/linux/kvm_host.h | 1 + include/linux/kvmi_host.h | 6 ++++ virt/kvm/introspection/kvmi.c | 51 ++++++++++++++++++++++++++++++++ virt/kvm/kvm_main.c | 2 ++ 5 files changed, 63 insertions(+) diff --git a/arch/x86/include/asm/kvmi_host.h b/arch/x86/include/asm/kvmi_host.h index 747f3d779e15..05ade3a16b24 100644 --- a/arch/x86/include/asm/kvmi_host.h +++ b/arch/x86/include/asm/kvmi_host.h @@ -4,6 +4,9 @@ #include <asm/kvmi.h> +struct kvm_vcpu_arch_introspection { +}; + struct kvm_arch_introspection { }; diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index c82c55085604..8509d8272466 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -319,6 +319,7 @@ struct kvm_vcpu { bool preempted; bool ready; struct kvm_vcpu_arch arch; + struct kvm_vcpu_introspection *kvmi; }; static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu) diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index 8e142096ba47..f96a9a3cfdd4 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -6,6 +6,10 @@ #include <asm/kvmi_host.h> +struct kvm_vcpu_introspection { + struct kvm_vcpu_arch_introspection arch; +}; + struct kvm_introspection { struct kvm_arch_introspection arch; struct kvm *kvm; @@ -28,6 +32,7 @@ int kvmi_init(void); void kvmi_uninit(void); void kvmi_create_vm(struct kvm *kvm); void kvmi_destroy_vm(struct kvm *kvm); +void kvmi_vcpu_uninit(struct kvm_vcpu *vcpu); int kvmi_ioctl_hook(struct kvm *kvm, const struct kvm_introspection_hook *hook); @@ -44,6 +49,7 @@ static inline int kvmi_init(void) { return 0; } static inline void kvmi_uninit(void) { } static inline void kvmi_create_vm(struct kvm *kvm) { } static inline void kvmi_destroy_vm(struct kvm *kvm) { } +static inline void kvmi_vcpu_uninit(struct kvm_vcpu *vcpu) { } #endif /* CONFIG_KVM_INTROSPECTION */ diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index 354dea276347..a51e7342f837 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -107,8 +107,41 @@ void kvmi_uninit(void) kvmi_cache_destroy(); } +static bool alloc_vcpui(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_introspection *vcpui; + + vcpui = kzalloc(sizeof(*vcpui), GFP_KERNEL); + if (!vcpui) + return false; + + vcpu->kvmi = vcpui; + + return true; +} + +static int create_vcpui(struct kvm_vcpu *vcpu) +{ + if (!alloc_vcpui(vcpu)) + return -ENOMEM; + + return 0; +} + +static void free_vcpui(struct kvm_vcpu *vcpu) +{ + kfree(vcpu->kvmi); + vcpu->kvmi = NULL; +} + static void free_kvmi(struct kvm *kvm) { + struct kvm_vcpu *vcpu; + int i; + + kvm_for_each_vcpu(i, vcpu, kvm) + free_vcpui(vcpu); + bitmap_free(kvm->kvmi->cmd_allow_mask); bitmap_free(kvm->kvmi->event_allow_mask); bitmap_free(kvm->kvmi->vm_event_enable_mask); @@ -117,10 +150,19 @@ static void free_kvmi(struct kvm *kvm) kvm->kvmi = NULL; } +void kvmi_vcpu_uninit(struct kvm_vcpu *vcpu) +{ + mutex_lock(&vcpu->kvm->kvmi_lock); + free_vcpui(vcpu); + mutex_unlock(&vcpu->kvm->kvmi_lock); +} + static struct kvm_introspection * alloc_kvmi(struct kvm *kvm, const struct kvm_introspection_hook *hook) { struct kvm_introspection *kvmi; + struct kvm_vcpu *vcpu; + int i; kvmi = kzalloc(sizeof(*kvmi), GFP_KERNEL); if (!kvmi) @@ -146,6 +188,15 @@ alloc_kvmi(struct kvm *kvm, const struct kvm_introspection_hook *hook) atomic_set(&kvmi->ev_seq, 0); + kvm_for_each_vcpu(i, vcpu, kvm) { + int err = create_vcpui(vcpu); + + if (err) { + free_kvmi(kvm); + return NULL; + } + } + kvmi->kvm = kvm; return kvmi; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 86910e6c8d93..1f199da03aa4 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -366,6 +366,7 @@ static void kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id) void kvm_vcpu_destroy(struct kvm_vcpu *vcpu) { + kvmi_vcpu_uninit(vcpu); kvm_arch_vcpu_destroy(vcpu); /* @@ -3137,6 +3138,7 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) unlock_vcpu_destroy: mutex_unlock(&kvm->lock); + kvmi_vcpu_uninit(vcpu); kvm_arch_vcpu_destroy(vcpu); vcpu_free_run_page: free_page((unsigned long)vcpu->run);
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 48/84] KVM: introspection: add a jobs list to every introspected vCPU
Every vCPU has a lock-protected list in which the receiving thread places the jobs that has to be done by the vCPU thread once it is kicked out of guest (KVM_REQ_INTROSPECTION). Co-developed-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- include/linux/kvmi_host.h | 10 +++++ virt/kvm/introspection/kvmi.c | 72 ++++++++++++++++++++++++++++++- virt/kvm/introspection/kvmi_int.h | 1 + 3 files changed, 81 insertions(+), 2 deletions(-) diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index f96a9a3cfdd4..d3242a99f891 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -6,8 +6,18 @@ #include <asm/kvmi_host.h> +struct kvmi_job { + struct list_head link; + void *ctx; + void (*fct)(struct kvm_vcpu *vcpu, void *ctx); + void (*free_fct)(void *ctx); +}; + struct kvm_vcpu_introspection { struct kvm_vcpu_arch_introspection arch; + + struct list_head job_list; + spinlock_t job_lock; }; struct kvm_introspection { diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index a51e7342f837..b6595bca99f7 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -18,6 +18,7 @@ static DECLARE_BITMAP(Kvmi_known_vm_events, KVMI_NUM_EVENTS); static DECLARE_BITMAP(Kvmi_known_vcpu_events, KVMI_NUM_EVENTS); static struct kmem_cache *msg_cache; +static struct kmem_cache *job_cache; void *kvmi_msg_alloc(void) { @@ -34,14 +35,19 @@ static void kvmi_cache_destroy(void) { kmem_cache_destroy(msg_cache); msg_cache = NULL; + kmem_cache_destroy(job_cache); + job_cache = NULL; } static int kvmi_cache_create(void) { msg_cache = kmem_cache_create("kvmi_msg", KVMI_MSG_SIZE_ALLOC, 4096, SLAB_ACCOUNT, NULL); + job_cache = kmem_cache_create("kvmi_job", + sizeof(struct kvmi_job), + 0, SLAB_ACCOUNT, NULL); - if (!msg_cache) { + if (!msg_cache || !job_cache) { kvmi_cache_destroy(); return -1; @@ -107,6 +113,48 @@ void kvmi_uninit(void) kvmi_cache_destroy(); } +static int __kvmi_add_job(struct kvm_vcpu *vcpu, + void (*fct)(struct kvm_vcpu *vcpu, void *ctx), + void *ctx, void (*free_fct)(void *ctx)) +{ + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + struct kvmi_job *job; + + job = kmem_cache_zalloc(job_cache, GFP_KERNEL); + if (unlikely(!job)) + return -ENOMEM; + + INIT_LIST_HEAD(&job->link); + job->fct = fct; + job->ctx = ctx; + job->free_fct = free_fct; + + spin_lock(&vcpui->job_lock); + list_add_tail(&job->link, &vcpui->job_list); + spin_unlock(&vcpui->job_lock); + + return 0; +} + +int kvmi_add_job(struct kvm_vcpu *vcpu, + void (*fct)(struct kvm_vcpu *vcpu, void *ctx), + void *ctx, void (*free_fct)(void *ctx)) +{ + int err; + + err = __kvmi_add_job(vcpu, fct, ctx, free_fct); + + return err; +} + +static void kvmi_free_job(struct kvmi_job *job) +{ + if (job->free_fct) + job->free_fct(job->ctx); + + kmem_cache_free(job_cache, job); +} + static bool alloc_vcpui(struct kvm_vcpu *vcpu) { struct kvm_vcpu_introspection *vcpui; @@ -115,6 +163,9 @@ static bool alloc_vcpui(struct kvm_vcpu *vcpu) if (!vcpui) return false; + INIT_LIST_HEAD(&vcpui->job_list); + spin_lock_init(&vcpui->job_lock); + vcpu->kvmi = vcpui; return true; @@ -128,9 +179,26 @@ static int create_vcpui(struct kvm_vcpu *vcpu) return 0; } +static void free_vcpu_jobs(struct kvm_vcpu_introspection *vcpui) +{ + struct kvmi_job *cur, *next; + + list_for_each_entry_safe(cur, next, &vcpui->job_list, link) { + list_del(&cur->link); + kvmi_free_job(cur); + } +} + static void free_vcpui(struct kvm_vcpu *vcpu) { - kfree(vcpu->kvmi); + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + + if (!vcpui) + return; + + free_vcpu_jobs(vcpui); + + kfree(vcpui); vcpu->kvmi = NULL; } diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 40e8647a6fd4..ceed50722dc1 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -19,6 +19,7 @@ kvm_info("%pU ERROR: " fmt, &kvmi->uuid, ## __VA_ARGS__) #define KVMI(kvm) ((kvm)->kvmi) +#define VCPUI(vcpu) ((vcpu)->kvmi) /* kvmi_msg.c */ bool kvmi_sock_get(struct kvm_introspection *kvmi, int fd);
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 49/84] KVM: introspection: handle vCPU introspection requests
From: Mihai Don?u <mdontu at bitdefender.com> The receiving thread dispatches the vCPU introspection commands by adding them to the vCPU's jobs list and kicking the vCPU. Before entering in guest, the vCPU thread checks the introspection request (KVM_REQ_INTROSPECTION) and runs its queued jobs. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Mircea C?rjaliu <mcirjaliu at bitdefender.com> Signed-off-by: Mircea C?rjaliu <mcirjaliu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/kvm/x86.c | 3 ++ include/linux/kvm_host.h | 1 + include/linux/kvmi_host.h | 4 +++ virt/kvm/introspection/kvmi.c | 58 +++++++++++++++++++++++++++++++++++ virt/kvm/kvm_main.c | 2 ++ 5 files changed, 68 insertions(+) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ff0d3c82de64..7f56e2149f18 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8706,6 +8706,9 @@ static int vcpu_run(struct kvm_vcpu *vcpu) vcpu->arch.l1tf_flush_l1d = true; for (;;) { + if (kvm_check_request(KVM_REQ_INTROSPECTION, vcpu)) + kvmi_handle_requests(vcpu); + if (kvm_vcpu_running(vcpu)) { r = vcpu_enter_guest(vcpu); } else { diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 8509d8272466..296b59ecc540 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -146,6 +146,7 @@ static inline bool is_error_page(struct page *page) #define KVM_REQ_MMU_RELOAD (1 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) #define KVM_REQ_PENDING_TIMER 2 #define KVM_REQ_UNHALT 3 +#define KVM_REQ_INTROSPECTION 4 #define KVM_REQUEST_ARCH_BASE 8 #define KVM_ARCH_REQ_FLAGS(nr, flags) ({ \ diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index d3242a99f891..956b8d5c51e3 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -53,6 +53,8 @@ int kvmi_ioctl_event(struct kvm *kvm, const struct kvm_introspection_feature *feat); int kvmi_ioctl_preunhook(struct kvm *kvm); +void kvmi_handle_requests(struct kvm_vcpu *vcpu); + #else static inline int kvmi_init(void) { return 0; } @@ -61,6 +63,8 @@ static inline void kvmi_create_vm(struct kvm *kvm) { } static inline void kvmi_destroy_vm(struct kvm *kvm) { } static inline void kvmi_vcpu_uninit(struct kvm_vcpu *vcpu) { } +static inline void kvmi_handle_requests(struct kvm_vcpu *vcpu) { } + #endif /* CONFIG_KVM_INTROSPECTION */ #endif diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index b6595bca99f7..a9d406f276f5 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -113,6 +113,12 @@ void kvmi_uninit(void) kvmi_cache_destroy(); } +static void kvmi_make_request(struct kvm_vcpu *vcpu) +{ + kvm_make_request(KVM_REQ_INTROSPECTION, vcpu); + kvm_vcpu_kick(vcpu); +} + static int __kvmi_add_job(struct kvm_vcpu *vcpu, void (*fct)(struct kvm_vcpu *vcpu, void *ctx), void *ctx, void (*free_fct)(void *ctx)) @@ -144,6 +150,9 @@ int kvmi_add_job(struct kvm_vcpu *vcpu, err = __kvmi_add_job(vcpu, fct, ctx, free_fct); + if (!err) + kvmi_make_request(vcpu); + return err; } @@ -312,6 +321,14 @@ int kvmi_ioctl_unhook(struct kvm *kvm) return 0; } +struct kvm_introspection * __must_check kvmi_get(struct kvm *kvm) +{ + if (refcount_inc_not_zero(&kvm->kvmi_ref)) + return kvm->kvmi; + + return NULL; +} + void kvmi_put(struct kvm *kvm) { if (refcount_dec_and_test(&kvm->kvmi_ref)) @@ -373,6 +390,10 @@ int kvmi_hook(struct kvm *kvm, const struct kvm_introspection_hook *hook) init_completion(&kvm->kvmi_complete); refcount_set(&kvm->kvmi_ref, 1); + /* + * Paired with refcount_inc_not_zero() from kvmi_get(). + */ + smp_wmb(); kvmi->recv = kthread_run(kvmi_recv_thread, kvmi, "kvmi-recv"); if (IS_ERR(kvmi->recv)) { @@ -672,3 +693,40 @@ int kvmi_cmd_write_physical(struct kvm *kvm, u64 gpa, size_t size, return 0; } + +static struct kvmi_job *kvmi_pull_job(struct kvm_vcpu_introspection *vcpui) +{ + struct kvmi_job *job = NULL; + + spin_lock(&vcpui->job_lock); + job = list_first_entry_or_null(&vcpui->job_list, typeof(*job), link); + if (job) + list_del(&job->link); + spin_unlock(&vcpui->job_lock); + + return job; +} + +void kvmi_run_jobs(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + struct kvmi_job *job; + + while ((job = kvmi_pull_job(vcpui))) { + job->fct(vcpu, job->ctx); + kvmi_free_job(job); + } +} + +void kvmi_handle_requests(struct kvm_vcpu *vcpu) +{ + struct kvm_introspection *kvmi; + + kvmi = kvmi_get(vcpu->kvm); + if (!kvmi) + return; + + kvmi_run_jobs(vcpu); + + kvmi_put(vcpu->kvm); +} diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 1f199da03aa4..92d662e7269f 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -2710,6 +2710,8 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu) goto out; if (signal_pending(current)) goto out; + if (kvm_test_request(KVM_REQ_INTROSPECTION, vcpu)) + goto out; ret = 0; out:
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 50/84] KVM: introspection: handle vCPU commands
From: Mihai Don?u <mdontu at bitdefender.com> Based on the common structure (kvmi_vcpu_hdr) used for all vCPU commands, the receiving thread validates and dispatches the message to the proper vCPU (adding the handling function to its jobs list). Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 8 ++ include/uapi/linux/kvmi.h | 6 ++ virt/kvm/introspection/kvmi_int.h | 3 + virt/kvm/introspection/kvmi_msg.c | 155 +++++++++++++++++++++++++++++- 4 files changed, 170 insertions(+), 2 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index be5a92e20173..383bf39ec1e4 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -232,6 +232,14 @@ The following C structures are meant to be used directly when communicating over the wire. The peer that detects any size mismatch should simply close the connection and report the error. +The vCPU commands start with:: + + struct kvmi_vcpu_hdr { + __u16 vcpu; + __u16 padding1; + __u32 padding2; + } + 1. KVMI_GET_VERSION ------------------- diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 9b2428963994..b206b7441859 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -99,6 +99,12 @@ struct kvmi_vm_write_physical { __u8 data[0]; }; +struct kvmi_vcpu_hdr { + __u16 vcpu; + __u16 padding1; + __u32 padding2; +}; + struct kvmi_event { __u16 size; __u16 vcpu; diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index ceed50722dc1..fe5190ab31d6 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -34,6 +34,9 @@ void kvmi_msg_free(void *addr); bool kvmi_is_command_allowed(struct kvm_introspection *kvmi, u16 id); bool kvmi_is_known_event(u8 id); bool kvmi_is_known_vm_event(u8 id); +int kvmi_add_job(struct kvm_vcpu *vcpu, + void (*fct)(struct kvm_vcpu *vcpu, void *ctx), + void *ctx, void (*free_fct)(void *ctx)); int kvmi_cmd_vm_control_events(struct kvm_introspection *kvmi, unsigned int event_id, bool enable); int kvmi_cmd_read_physical(struct kvm *kvm, u64 gpa, size_t size, diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index de9e38e8e24b..31a471df4b12 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -9,6 +9,15 @@ #include "kvmi_int.h" static bool is_vm_command(u16 id); +static bool is_vcpu_command(u16 id); + +struct kvmi_vcpu_msg_job { + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + } *msg; + struct kvm_vcpu *vcpu; +}; bool kvmi_sock_get(struct kvm_introspection *kvmi, int fd) { @@ -100,6 +109,28 @@ static int kvmi_msg_vm_reply(struct kvm_introspection *kvmi, return kvmi_msg_reply(kvmi, msg, err, rpl, rpl_size); } +static bool invalid_vcpu_hdr(const struct kvmi_vcpu_hdr *hdr) +{ + return hdr->padding1 || hdr->padding2; +} + +static int kvmi_get_vcpu(struct kvm_introspection *kvmi, unsigned int vcpu_idx, + struct kvm_vcpu **dest) +{ + struct kvm *kvm = kvmi->kvm; + struct kvm_vcpu *vcpu; + + if (vcpu_idx >= atomic_read(&kvm->online_vcpus)) + return -KVM_EINVAL; + + vcpu = kvm_get_vcpu(kvm, vcpu_idx); + if (!vcpu) + return -KVM_EINVAL; + + *dest = vcpu; + return 0; +} + static int handle_get_version(struct kvm_introspection *kvmi, const struct kvmi_msg_hdr *msg, const void *req) { @@ -120,7 +151,7 @@ static int handle_vm_check_command(struct kvm_introspection *kvmi, if (req->padding1 || req->padding2) ec = -KVM_EINVAL; - else if (!is_vm_command(req->id)) + else if (!is_vm_command(req->id) && !is_vcpu_command(req->id)) ec = -KVM_ENOENT; else if (!kvmi_is_command_allowed(kvmi, req->id)) ec = -KVM_EPERM; @@ -243,6 +274,60 @@ static bool is_vm_command(u16 id) return id < ARRAY_SIZE(msg_vm) && !!msg_vm[id]; } +/* + * These functions are executed from the vCPU thread. The receiving thread + * passes the messages using a newly allocated 'struct kvmi_vcpu_msg_job' + * and signals the vCPU to handle the message (which includes + * sending back the reply if needed). + */ +static int(*const msg_vcpu[])(const struct kvmi_vcpu_msg_job *, + const struct kvmi_msg_hdr *, const void *) = { +}; + +static bool is_vcpu_command(u16 id) +{ + return id < ARRAY_SIZE(msg_vcpu) && !!msg_vcpu[id]; +} + +static void kvmi_job_vcpu_msg(struct kvm_vcpu *vcpu, void *ctx) +{ + struct kvmi_vcpu_msg_job *job = ctx; + size_t id = job->msg->hdr.id; + int err; + + job->vcpu = vcpu; + + err = msg_vcpu[id](job, &job->msg->hdr, job->msg + 1); + + /* + * This is running from the vCPU thread. + * Any error that is not sent with the reply + * will shut down the socket. + */ + if (err) + kvmi_sock_shutdown(KVMI(vcpu->kvm)); +} + +static void kvmi_free_ctx(void *_ctx) +{ + const struct kvmi_vcpu_msg_job *ctx = _ctx; + + kvmi_msg_free(ctx->msg); + kfree(ctx); +} + +static int kvmi_msg_queue_to_vcpu(struct kvm_vcpu *vcpu, + const struct kvmi_vcpu_msg_job *cmd) +{ + return kvmi_add_job(vcpu, kvmi_job_vcpu_msg, (void *)cmd, + kvmi_free_ctx); +} + +static bool is_vcpu_message(u16 id) +{ + return is_vcpu_command(id); +} + static struct kvmi_msg_hdr *kvmi_msg_recv(struct kvm_introspection *kvmi) { struct kvmi_msg_hdr *msg; @@ -299,9 +384,72 @@ static int kvmi_msg_handle_vm_cmd(struct kvm_introspection *kvmi, return kvmi_msg_do_vm_cmd(kvmi, msg); } +static bool vcpu_can_handle_messages(struct kvm_vcpu *vcpu) +{ + return vcpu->arch.mp_state != KVM_MP_STATE_UNINITIALIZED; +} + +static int kvmi_get_vcpu_if_ready(struct kvm_introspection *kvmi, + unsigned int vcpu_idx, + struct kvm_vcpu **vcpu) +{ + int err; + + err = kvmi_get_vcpu(kvmi, vcpu_idx, vcpu); + + if (!err && !vcpu_can_handle_messages(*vcpu)) + err = -KVM_EAGAIN; + + return err; +} + +static int kvmi_msg_dispatch_vcpu_msg(struct kvm_introspection *kvmi, + struct kvmi_msg_hdr *msg, + struct kvm_vcpu *vcpu) +{ + struct kvmi_vcpu_msg_job *job_cmd; + int err; + + job_cmd = kzalloc(sizeof(*job_cmd), GFP_KERNEL); + if (!job_cmd) + return -ENOMEM; + + job_cmd->msg = (void *)msg; + + err = kvmi_msg_queue_to_vcpu(vcpu, job_cmd); + if (err) + kfree(job_cmd); + + return err; +} + +static int kvmi_msg_handle_vcpu_msg(struct kvm_introspection *kvmi, + struct kvmi_msg_hdr *msg, + bool *queued) +{ + struct kvmi_vcpu_hdr *vcpu_hdr = (struct kvmi_vcpu_hdr *)(msg + 1); + struct kvm_vcpu *vcpu = NULL; + int err, ec; + + if (!is_message_allowed(kvmi, msg->id)) + return kvmi_msg_vm_reply_ec(kvmi, msg, -KVM_EPERM); + + if (invalid_vcpu_hdr(vcpu_hdr)) + return kvmi_msg_vm_reply_ec(kvmi, msg, -KVM_EINVAL); + + ec = kvmi_get_vcpu_if_ready(kvmi, vcpu_hdr->vcpu, &vcpu); + if (ec) + return kvmi_msg_vm_reply_ec(kvmi, msg, ec); + + err = kvmi_msg_dispatch_vcpu_msg(kvmi, msg, vcpu); + *queued = err == 0; + return err; +} + bool kvmi_msg_process(struct kvm_introspection *kvmi) { struct kvmi_msg_hdr *msg; + bool queued = false; int err = -1; msg = kvmi_msg_recv(kvmi); @@ -310,10 +458,13 @@ bool kvmi_msg_process(struct kvm_introspection *kvmi) if (is_vm_command(msg->id)) err = kvmi_msg_handle_vm_cmd(kvmi, msg); + else if (is_vcpu_message(msg->id)) + err = kvmi_msg_handle_vcpu_msg(kvmi, msg, &queued); else err = kvmi_msg_vm_reply_ec(kvmi, msg, -KVM_ENOSYS); - kvmi_msg_free(msg); + if (!queued) + kvmi_msg_free(msg); out: return err == 0; }
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 51/84] KVM: introspection: add KVMI_VCPU_GET_INFO
From: Mihai Don?u <mdontu at bitdefender.com> For now, this command returns the TSC frequency (in HZ) for the specified vCPU if available (otherwise it returns zero). Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 29 ++++ arch/x86/include/uapi/asm/kvmi.h | 4 + arch/x86/kvm/Makefile | 2 +- arch/x86/kvm/kvmi.c | 19 +++ include/uapi/linux/kvmi.h | 2 + .../testing/selftests/kvm/x86_64/kvmi_test.c | 152 +++++++++++++++++- virt/kvm/introspection/kvmi_int.h | 4 + virt/kvm/introspection/kvmi_msg.c | 22 +++ 8 files changed, 232 insertions(+), 2 deletions(-) create mode 100644 arch/x86/kvm/kvmi.c diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 383bf39ec1e4..5ead29a7b2a7 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -451,6 +451,35 @@ one page (offset + size <= PAGE_SIZE). * -KVM_EINVAL - the specified gpa/size pair is invalid * -KVM_EINVAL - the padding is not zero +8. KVMI_VCPU_GET_INFO +--------------------- + +:Architectures: all +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vcpu_hdr; + +:Returns: + +:: + + struct kvmi_error_code; + struct kvmi_vcpu_get_info_reply { + __u64 tsc_speed; + }; + +Returns the TSC frequency (in HZ) for the specified vCPU if available +(otherwise it returns zero). + +:Errors: + +* -KVM_EINVAL - the padding is not zero +* -KVM_EINVAL - the selected vCPU is invalid +* -KVM_EAGAIN - the selected vCPU can't be introspected yet + Events ===== diff --git a/arch/x86/include/uapi/asm/kvmi.h b/arch/x86/include/uapi/asm/kvmi.h index 551f9ed1ed9c..89adf84cefe4 100644 --- a/arch/x86/include/uapi/asm/kvmi.h +++ b/arch/x86/include/uapi/asm/kvmi.h @@ -26,4 +26,8 @@ struct kvmi_event_arch { } msrs; }; +struct kvmi_vcpu_get_info_reply { + __u64 tsc_speed; +}; + #endif /* _UAPI_ASM_X86_KVMI_H */ diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile index fb0242032cd1..3cfe76299dee 100644 --- a/arch/x86/kvm/Makefile +++ b/arch/x86/kvm/Makefile @@ -13,7 +13,7 @@ KVMI := $(KVM)/introspection kvm-y += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \ $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o -kvm-$(CONFIG_KVM_INTROSPECTION) += $(KVMI)/kvmi.o $(KVMI)/kvmi_msg.o +kvm-$(CONFIG_KVM_INTROSPECTION) += $(KVMI)/kvmi.o $(KVMI)/kvmi_msg.o kvmi.o kvm-y += x86.o emulate.o i8259.o irq.o lapic.o \ i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c new file mode 100644 index 000000000000..cf7bfff6c8c5 --- /dev/null +++ b/arch/x86/kvm/kvmi.c @@ -0,0 +1,19 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * KVM Introspection - x86 + * + * Copyright (C) 2019-2020 Bitdefender S.R.L. + */ + +#include "../../../virt/kvm/introspection/kvmi_int.h" + +int kvmi_arch_cmd_vcpu_get_info(struct kvm_vcpu *vcpu, + struct kvmi_vcpu_get_info_reply *rpl) +{ + if (kvm_has_tsc_control) + rpl->tsc_speed = 1000ul * vcpu->arch.virtual_tsc_khz; + else + rpl->tsc_speed = 0; + + return 0; +} diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index b206b7441859..a3dca420c887 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -25,6 +25,8 @@ enum { KVMI_VM_READ_PHYSICAL = 6, KVMI_VM_WRITE_PHYSICAL = 7, + KVMI_VCPU_GET_INFO = 8, + KVMI_NUM_MESSAGES }; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 97dec49d52b7..107661fbe52f 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -9,6 +9,7 @@ #include <sys/types.h> #include <sys/socket.h> #include <time.h> +#include <pthread.h> #include "test_util.h" @@ -25,6 +26,7 @@ static int socket_pair[2]; #define Kvm_socket socket_pair[0] #define Userspace_socket socket_pair[1] +static int test_id; static vm_vaddr_t test_gva; static void *test_hva; static vm_paddr_t test_gpa; @@ -32,6 +34,39 @@ static vm_paddr_t test_gpa; static uint8_t test_write_pattern; static int page_size; +struct vcpu_worker_data { + struct kvm_vm *vm; + int vcpu_id; + int test_id; + bool stop; +}; + +enum { + GUEST_TEST_NOOP = 0, +}; + +#define GUEST_REQUEST_TEST() GUEST_SYNC(0) +#define GUEST_SIGNAL_TEST_DONE() GUEST_SYNC(1) + +#define HOST_SEND_TEST(uc) (uc.cmd == UCALL_SYNC && uc.args[1] == 0) + +static int guest_test_id(void) +{ + GUEST_REQUEST_TEST(); + return READ_ONCE(test_id); +} + +static void guest_code(void) +{ + while (true) { + switch (guest_test_id()) { + case GUEST_TEST_NOOP: + break; + } + GUEST_SIGNAL_TEST_DONE(); + } +} + void setup_socket(void) { int r; @@ -587,6 +622,120 @@ static void test_memory_access(struct kvm_vm *vm) read_invalid_guest_page(vm); } + +static void *vcpu_worker(void *data) +{ + struct vcpu_worker_data *ctx = data; + struct kvm_run *run; + + run = vcpu_state(ctx->vm, ctx->vcpu_id); + + while (!READ_ONCE(ctx->stop)) { + struct ucall uc; + + vcpu_run(ctx->vm, ctx->vcpu_id); + + TEST_ASSERT(run->exit_reason == KVM_EXIT_IO, + "vcpu_run() failed, test_id %d, exit reason %u (%s)\n", + ctx->test_id, run->exit_reason, + exit_reason_str(run->exit_reason)); + + TEST_ASSERT(get_ucall(ctx->vm, ctx->vcpu_id, &uc), + "No guest request\n"); + + if (HOST_SEND_TEST(uc)) { + test_id = READ_ONCE(ctx->test_id); + sync_global_to_guest(ctx->vm, test_id); + } + } + + return NULL; +} + +static pthread_t start_vcpu_worker(struct vcpu_worker_data *data) +{ + pthread_t thread_id; + + pthread_create(&thread_id, NULL, vcpu_worker, data); + + return thread_id; +} + +static void wait_vcpu_worker(pthread_t vcpu_thread) +{ + pthread_join(vcpu_thread, NULL); +} + +static void stop_vcpu_worker(pthread_t vcpu_thread, + struct vcpu_worker_data *data) +{ + WRITE_ONCE(data->stop, true); + + wait_vcpu_worker(vcpu_thread); +} + +static int do_vcpu_command(struct kvm_vm *vm, int cmd_id, + struct kvmi_msg_hdr *req, size_t req_size, + void *rpl, size_t rpl_size) +{ + struct vcpu_worker_data data = {.vm = vm, .vcpu_id = VCPU_ID }; + pthread_t vcpu_thread; + int r; + + vcpu_thread = start_vcpu_worker(&data); + + send_message(cmd_id, req, req_size); + r = receive_cmd_reply(req, rpl, rpl_size); + + stop_vcpu_worker(vcpu_thread, &data); + return r; +} + +static int do_vcpu0_command(struct kvm_vm *vm, int cmd_id, + struct kvmi_msg_hdr *req, size_t req_size, + void *rpl, size_t rpl_size) +{ + struct kvmi_vcpu_hdr *vcpu_hdr = (struct kvmi_vcpu_hdr *)(req + 1); + + vcpu_hdr->vcpu = 0; + + return do_vcpu_command(vm, cmd_id, req, req_size, rpl, rpl_size); +} + +static void test_vcpu0_command(struct kvm_vm *vm, int cmd_id, + struct kvmi_msg_hdr *req, size_t req_size, + void *rpl, size_t rpl_size) +{ + int r; + + r = do_vcpu0_command(vm, cmd_id, req, req_size, rpl, rpl_size); + TEST_ASSERT(r == 0, + "Command %d failed, error %d (%s)\n", + cmd_id, -r, kvm_strerror(-r)); +} + +static void test_cmd_vcpu_get_info(struct kvm_vm *vm) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + } req = {}; + struct kvmi_vcpu_get_info_reply rpl; + int cmd_id = KVMI_VCPU_GET_INFO; + int r; + + test_vcpu0_command(vm, cmd_id, &req.hdr, sizeof(req), + &rpl, sizeof(rpl)); + + pr_info("tsc_speed: %llu HZ\n", rpl.tsc_speed); + + req.vcpu_hdr.vcpu = 99; + r = do_command(cmd_id, &req.hdr, sizeof(req), &rpl, sizeof(rpl)); + TEST_ASSERT(r == -KVM_EINVAL, + "KVMI_VCPU_GET_INFO didn't failed with -KVM_EINVAL, error %d (%s)\n", + -r, kvm_strerror(-r)); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -601,6 +750,7 @@ static void test_introspection(struct kvm_vm *vm) test_event_unhook(vm); test_cmd_vm_control_events(vm); test_memory_access(vm); + test_cmd_vcpu_get_info(vm); unhook_introspection(vm); } @@ -626,7 +776,7 @@ int main(int argc, char *argv[]) exit(KSFT_SKIP); } - vm = vm_create_default(VCPU_ID, 0, NULL); + vm = vm_create_default(VCPU_ID, 0, guest_code); vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid()); page_size = getpagesize(); diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index fe5190ab31d6..42803b6d0e81 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -47,4 +47,8 @@ int kvmi_cmd_read_physical(struct kvm *kvm, u64 gpa, size_t size, int kvmi_cmd_write_physical(struct kvm *kvm, u64 gpa, size_t size, const void *buf); +/* arch */ +int kvmi_arch_cmd_vcpu_get_info(struct kvm_vcpu *vcpu, + struct kvmi_vcpu_get_info_reply *rpl); + #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 31a471df4b12..ee25ba44fb0b 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -109,6 +109,15 @@ static int kvmi_msg_vm_reply(struct kvm_introspection *kvmi, return kvmi_msg_reply(kvmi, msg, err, rpl, rpl_size); } +static int kvmi_msg_vcpu_reply(const struct kvmi_vcpu_msg_job *job, + const struct kvmi_msg_hdr *msg, int err, + const void *rpl, size_t rpl_size) +{ + struct kvm_introspection *kvmi = KVMI(job->vcpu->kvm); + + return kvmi_msg_reply(kvmi, msg, err, rpl, rpl_size); +} + static bool invalid_vcpu_hdr(const struct kvmi_vcpu_hdr *hdr) { return hdr->padding1 || hdr->padding2; @@ -274,6 +283,18 @@ static bool is_vm_command(u16 id) return id < ARRAY_SIZE(msg_vm) && !!msg_vm[id]; } +static int handle_vcpu_get_info(const struct kvmi_vcpu_msg_job *job, + const struct kvmi_msg_hdr *msg, + const void *req) +{ + struct kvmi_vcpu_get_info_reply rpl; + + memset(&rpl, 0, sizeof(rpl)); + kvmi_arch_cmd_vcpu_get_info(job->vcpu, &rpl); + + return kvmi_msg_vcpu_reply(job, msg, 0, &rpl, sizeof(rpl)); +} + /* * These functions are executed from the vCPU thread. The receiving thread * passes the messages using a newly allocated 'struct kvmi_vcpu_msg_job' @@ -282,6 +303,7 @@ static bool is_vm_command(u16 id) */ static int(*const msg_vcpu[])(const struct kvmi_vcpu_msg_job *, const struct kvmi_msg_hdr *, const void *) = { + [KVMI_VCPU_GET_INFO] = handle_vcpu_get_info, }; static bool is_vcpu_command(u16 id)
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 52/84] KVM: introspection: add KVMI_VCPU_PAUSE
This is the only vCPU command handled by the receiving thread. It increments a pause requests counter and kicks the vCPU out of guest. The introspection tool can pause a VM by sending this command to all vCPUs. If it sets 'wait=1', it can consider that the VM is paused when it receives the reply for last KVMI_VCPU_PAUSE command. Usually, a vCPU command is dispatched to the vCPU thread after being read from socket. This new command only signals the vCPU. Once out of guest, the vCPU will send the event that caused the VM-exist (if it is the case), handle the queued commands and only then checks its pause counter in order to send the pause events requested by the introspection tool. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 66 ++++++++++++++++++- include/linux/kvmi_host.h | 2 + include/uapi/linux/kvmi.h | 11 +++- .../testing/selftests/kvm/x86_64/kvmi_test.c | 53 +++++++++++++++ virt/kvm/introspection/kvmi.c | 63 ++++++++++++++++-- virt/kvm/introspection/kvmi_int.h | 1 + virt/kvm/introspection/kvmi_msg.c | 42 ++++++++++++ 7 files changed, 232 insertions(+), 6 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 5ead29a7b2a7..502ee06d5e77 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -480,12 +480,52 @@ Returns the TSC frequency (in HZ) for the specified vCPU if available * -KVM_EINVAL - the selected vCPU is invalid * -KVM_EAGAIN - the selected vCPU can't be introspected yet +9. KVMI_VCPU_PAUSE +------------------ + +:Architecture: all +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_vcpu_pause { + __u8 wait; + __u8 padding1; + __u16 padding2; + __u32 padding3; + }; + +:Returns: + +:: + + struct kvmi_error_code; + +Kicks the vCPU out of guest. + +If `wait` is 1, the command will wait for vCPU to acknowledge the IPI. + +The vCPU will handle the pending commands/events and send the +*KVMI_EVENT_PAUSE_VCPU* event (one for every successful *KVMI_VCPU_PAUSE* +command) before returning to guest. + +:Errors: + +* -KVM_EINVAL - the padding is not zero +* -KVM_EINVAL - the selected vCPU is invalid +* -KVM_EAGAIN - the selected vCPU can't be introspected yet +* -KVM_EBUSY - the selected vCPU has too many queued + *KVMI_EVENT_PAUSE_VCPU* events +* -KVM_EPERM - the *KVMI_EVENT_PAUSE_VCPU* event is disallowed + Events ===== All introspection events (VM or vCPU related) are sent using the *KVMI_EVENT* message id. No event will be sent unless -it is explicitly enabled. +it is explicitly enabled or requested (eg. *KVMI_EVENT_PAUSE_VCPU*). The *KVMI_EVENT_UNHOOK* event doesn't have a reply and share the kvmi_event structure, for consistency with the vCPU events. @@ -544,3 +584,27 @@ the guest (see **Unhooking**) and the introspection has been enabled for this event (see **KVMI_VM_CONTROL_EVENTS**). The introspection tool has a chance to unhook and close the KVMI channel (signaling that the operation can proceed). + +2. KVMI_EVENT_PAUSE_VCPU +------------------------ + +:Architectures: all +:Versions: >= 1 +:Actions: CONTINUE, CRASH +:Parameters: + +:: + + struct kvmi_event; + +:Returns: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_event_reply; + +This event is sent in response to a *KVMI_VCPU_PAUSE* command. +Because it has a low priority, it will be sent after any other vCPU +introspection event and when no other vCPU introspection command is +queued. diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index 956b8d5c51e3..fdb8ce6fe6a5 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -18,6 +18,8 @@ struct kvm_vcpu_introspection { struct list_head job_list; spinlock_t job_lock; + + atomic_t pause_requests; }; struct kvm_introspection { diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index a3dca420c887..3ded22020bef 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -26,12 +26,14 @@ enum { KVMI_VM_WRITE_PHYSICAL = 7, KVMI_VCPU_GET_INFO = 8, + KVMI_VCPU_PAUSE = 9, KVMI_NUM_MESSAGES }; enum { - KVMI_EVENT_UNHOOK = 0, + KVMI_EVENT_UNHOOK = 0, + KVMI_EVENT_PAUSE_VCPU = 1, KVMI_NUM_EVENTS }; @@ -107,6 +109,13 @@ struct kvmi_vcpu_hdr { __u32 padding2; }; +struct kvmi_vcpu_pause { + __u8 wait; + __u8 padding1; + __u16 padding2; + __u32 padding3; +}; + struct kvmi_event { __u16 size; __u16 vcpu; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 107661fbe52f..0df890b4b440 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -691,6 +691,17 @@ static int do_vcpu_command(struct kvm_vm *vm, int cmd_id, return r; } +static int __do_vcpu0_command(int cmd_id, struct kvmi_msg_hdr *req, + size_t req_size, void *rpl, size_t rpl_size) +{ + struct kvmi_vcpu_hdr *vcpu_hdr = (struct kvmi_vcpu_hdr *)(req + 1); + + vcpu_hdr->vcpu = 0; + + send_message(cmd_id, req, req_size); + return receive_cmd_reply(req, rpl, rpl_size); +} + static int do_vcpu0_command(struct kvm_vm *vm, int cmd_id, struct kvmi_msg_hdr *req, size_t req_size, void *rpl, size_t rpl_size) @@ -736,6 +747,47 @@ static void test_cmd_vcpu_get_info(struct kvm_vm *vm) -r, kvm_strerror(-r)); } +static void cmd_vcpu_pause(__u8 wait, __u8 padding, int expected_err) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvmi_vcpu_pause cmd; + } req = {}; + int r; + + req.cmd.wait = wait; + req.cmd.padding1 = padding; + req.cmd.padding2 = padding; + req.cmd.padding3 = padding; + + r = __do_vcpu0_command(KVMI_VCPU_PAUSE, &req.hdr, sizeof(req), NULL, 0); + TEST_ASSERT(r == expected_err, + "KVMI_VCPU_PAUSE failed, error %d (%s), expected error %d\n", + -r, kvm_strerror(-r), expected_err); +} + +static void pause_vcpu(void) +{ + cmd_vcpu_pause(1, 0, 0); +} + +static void test_pause(struct kvm_vm *vm) +{ + __u8 no_wait = 0, wait = 1, wait_inval = 2; + __u8 padding = 1, no_padding = 0; + + pause_vcpu(); + + cmd_vcpu_pause(wait, no_padding, 0); + cmd_vcpu_pause(wait_inval, no_padding, -KVM_EINVAL); + cmd_vcpu_pause(no_wait, padding, -KVM_EINVAL); + + disallow_event(vm, KVMI_EVENT_PAUSE_VCPU); + cmd_vcpu_pause(no_wait, no_padding, -KVM_EPERM); + allow_event(vm, KVMI_EVENT_PAUSE_VCPU); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -751,6 +803,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vm_control_events(vm); test_memory_access(vm); test_cmd_vcpu_get_info(vm); + test_pause(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index a9d406f276f5..a704e05b3184 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -12,6 +12,8 @@ #define KVMI_NUM_COMMANDS KVMI_NUM_MESSAGES #define KVMI_MSG_SIZE_ALLOC (sizeof(struct kvmi_msg_hdr) + KVMI_MSG_SIZE) +#define MAX_PAUSE_REQUESTS 1001 + static DECLARE_BITMAP(Kvmi_always_allowed_commands, KVMI_NUM_COMMANDS); static DECLARE_BITMAP(Kvmi_known_events, KVMI_NUM_EVENTS); static DECLARE_BITMAP(Kvmi_known_vm_events, KVMI_NUM_EVENTS); @@ -90,6 +92,7 @@ static void setup_known_events(void) set_bit(KVMI_EVENT_UNHOOK, Kvmi_known_vm_events); bitmap_zero(Kvmi_known_vcpu_events, KVMI_NUM_EVENTS); + set_bit(KVMI_EVENT_PAUSE_VCPU, Kvmi_known_vcpu_events); bitmap_or(Kvmi_known_events, Kvmi_known_vm_events, Kvmi_known_vcpu_events, KVMI_NUM_EVENTS); @@ -113,10 +116,14 @@ void kvmi_uninit(void) kvmi_cache_destroy(); } -static void kvmi_make_request(struct kvm_vcpu *vcpu) +static void kvmi_make_request(struct kvm_vcpu *vcpu, bool wait) { kvm_make_request(KVM_REQ_INTROSPECTION, vcpu); - kvm_vcpu_kick(vcpu); + + if (wait) + kvm_vcpu_kick_and_wait(vcpu); + else + kvm_vcpu_kick(vcpu); } static int __kvmi_add_job(struct kvm_vcpu *vcpu, @@ -151,7 +158,7 @@ int kvmi_add_job(struct kvm_vcpu *vcpu, err = __kvmi_add_job(vcpu, fct, ctx, free_fct); if (!err) - kvmi_make_request(vcpu); + kvmi_make_request(vcpu, false); return err; } @@ -346,6 +353,22 @@ static int __kvmi_hook(struct kvm *kvm, return 0; } +static void kvmi_job_release_vcpu(struct kvm_vcpu *vcpu, void *ctx) +{ + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + + atomic_set(&vcpui->pause_requests, 0); +} + +static void kvmi_release_vcpus(struct kvm *kvm) +{ + struct kvm_vcpu *vcpu; + int i; + + kvm_for_each_vcpu(i, vcpu, kvm) + kvmi_add_job(vcpu, kvmi_job_release_vcpu, NULL, NULL); +} + static int kvmi_recv_thread(void *arg) { struct kvm_introspection *kvmi = arg; @@ -359,6 +382,8 @@ static int kvmi_recv_thread(void *arg) */ kvmi_sock_shutdown(kvmi); + kvmi_release_vcpus(kvmi->kvm); + kvmi_put(kvmi->kvm); return 0; } @@ -718,15 +743,45 @@ void kvmi_run_jobs(struct kvm_vcpu *vcpu) } } +static void kvmi_vcpu_pause_event(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + + atomic_dec(&vcpui->pause_requests); + /* to be implemented */ +} + void kvmi_handle_requests(struct kvm_vcpu *vcpu) { + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); struct kvm_introspection *kvmi; kvmi = kvmi_get(vcpu->kvm); if (!kvmi) return; - kvmi_run_jobs(vcpu); + for (;;) { + kvmi_run_jobs(vcpu); + + if (atomic_read(&vcpui->pause_requests)) + kvmi_vcpu_pause_event(vcpu); + else + break; + } kvmi_put(vcpu->kvm); } + +int kvmi_cmd_vcpu_pause(struct kvm_vcpu *vcpu, bool wait) +{ + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + + if (atomic_read(&vcpui->pause_requests) > MAX_PAUSE_REQUESTS) + return -KVM_EBUSY; + + atomic_inc(&vcpui->pause_requests); + + kvmi_make_request(vcpu, wait); + + return 0; +} diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 42803b6d0e81..cb99cb3db396 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -46,6 +46,7 @@ int kvmi_cmd_read_physical(struct kvm *kvm, u64 gpa, size_t size, const struct kvmi_msg_hdr *ctx); int kvmi_cmd_write_physical(struct kvm *kvm, u64 gpa, size_t size, const void *buf); +int kvmi_cmd_vcpu_pause(struct kvm_vcpu *vcpu, bool wait); /* arch */ int kvmi_arch_cmd_vcpu_get_info(struct kvm_vcpu *vcpu, diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index ee25ba44fb0b..1adec838cddd 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -264,12 +264,54 @@ static int handle_vm_write_physical(struct kvm_introspection *kvmi, return kvmi_msg_vm_reply(kvmi, msg, ec, NULL, 0); } +/* + * This vCPU command is handled by the receiving thread instead of + * the vCPU thread in order to make it easier for the introspection tool + * to implement a 'pause VM' command by sending a 'pause vCPU' command + * for every vCPU. It can consider that the VM has stopped + * once it receives the reply for the last 'pause vCPU' command. + */ +static int handle_vcpu_pause(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg, + const void *_req) +{ + const struct kvmi_vcpu_hdr *vcpu_hdr = _req; + const struct kvmi_vcpu_pause *vcpu_req; + struct kvm_vcpu *vcpu = NULL; + int err; + + vcpu_req = (const struct kvmi_vcpu_pause *) (vcpu_hdr + 1); + + if (invalid_vcpu_hdr(vcpu_hdr) || vcpu_req->wait > 1) { + err = -KVM_EINVAL; + goto reply; + } + + if (vcpu_req->padding1 || vcpu_req->padding2 || vcpu_req->padding3) { + err = -KVM_EINVAL; + goto reply; + } + + if (!is_event_allowed(kvmi, KVMI_EVENT_PAUSE_VCPU)) { + err = -KVM_EPERM; + goto reply; + } + + err = kvmi_get_vcpu(kvmi, vcpu_hdr->vcpu, &vcpu); + if (!err) + err = kvmi_cmd_vcpu_pause(vcpu, vcpu_req->wait == 1); + +reply: + return kvmi_msg_vm_reply(kvmi, msg, err, NULL, 0); +} + /* * These commands are executed by the receiving thread. */ static int(*const msg_vm[])(struct kvm_introspection *, const struct kvmi_msg_hdr *, const void *) = { [KVMI_GET_VERSION] = handle_get_version, + [KVMI_VCPU_PAUSE] = handle_vcpu_pause, [KVMI_VM_CHECK_COMMAND] = handle_vm_check_command, [KVMI_VM_CHECK_EVENT] = handle_vm_check_event, [KVMI_VM_CONTROL_EVENTS] = handle_vm_control_events,
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 53/84] KVM: introspection: add KVMI_EVENT_PAUSE_VCPU
This event is sent by the vCPU thread as a response to the KVMI_VCPU_PAUSE command, but it has a lower priority, being sent after any other introspection event and when no other introspection command is queued. The number of KVMI_EVENT_PAUSE_VCPU will match the number of successful KVMI_VCPU_PAUSE commands. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 22 ++- arch/x86/kvm/kvmi.c | 81 +++++++++ include/linux/kvmi_host.h | 11 ++ include/uapi/linux/kvmi.h | 13 ++ .../testing/selftests/kvm/x86_64/kvmi_test.c | 46 ++++++ virt/kvm/introspection/kvmi.c | 24 ++- virt/kvm/introspection/kvmi_int.h | 3 + virt/kvm/introspection/kvmi_msg.c | 155 +++++++++++++++++- 8 files changed, 351 insertions(+), 4 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 502ee06d5e77..06c1cb34209e 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -563,6 +563,25 @@ On x86 the structure looks like this:: It contains information about the vCPU state at the time of the event. +An event reply begins with two common structures:: + + struct kvmi_vcpu_hdr; + struct kvmi_event_reply { + __u8 action; + __u8 event; + __u16 padding1; + __u32 padding2; + }; + +All events accept the KVMI_EVENT_ACTION_CRASH action, which stops the +guest ungracefully, but as soon as possible. + +Most of the events accept the KVMI_EVENT_ACTION_CONTINUE action, which +lets the instruction that caused the event to continue. + +Some of the events accept the KVMI_EVENT_ACTION_RETRY action, to continue +by re-entering in guest. + Specific event data can follow these common structures. 1. KVMI_EVENT_UNHOOK @@ -604,7 +623,8 @@ operation can proceed). struct kvmi_vcpu_hdr; struct kvmi_event_reply; -This event is sent in response to a *KVMI_VCPU_PAUSE* command. +This event is sent in response to a *KVMI_VCPU_PAUSE* command and +cannot be controlled with *KVMI_VCPU_CONTROL_EVENTS*. Because it has a low priority, it will be sent after any other vCPU introspection event and when no other vCPU introspection command is queued. diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index cf7bfff6c8c5..ce7e2d5f2ab4 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -5,8 +5,89 @@ * Copyright (C) 2019-2020 Bitdefender S.R.L. */ +#include "linux/kvm_host.h" +#include "x86.h" #include "../../../virt/kvm/introspection/kvmi_int.h" +static unsigned int kvmi_vcpu_mode(const struct kvm_vcpu *vcpu, + const struct kvm_sregs *sregs) +{ + unsigned int mode = 0; + + if (is_long_mode((struct kvm_vcpu *) vcpu)) { + if (sregs->cs.l) + mode = 8; + else if (!sregs->cs.db) + mode = 2; + else + mode = 4; + } else if (sregs->cr0 & X86_CR0_PE) { + if (!sregs->cs.db) + mode = 2; + else + mode = 4; + } else if (!sregs->cs.db) { + mode = 2; + } else { + mode = 4; + } + + return mode; +} + +static void kvmi_get_msrs(struct kvm_vcpu *vcpu, struct kvmi_event_arch *event) +{ + struct msr_data msr; + + msr.host_initiated = true; + + msr.index = MSR_IA32_SYSENTER_CS; + kvm_x86_ops.get_msr(vcpu, &msr); + event->msrs.sysenter_cs = msr.data; + + msr.index = MSR_IA32_SYSENTER_ESP; + kvm_x86_ops.get_msr(vcpu, &msr); + event->msrs.sysenter_esp = msr.data; + + msr.index = MSR_IA32_SYSENTER_EIP; + kvm_x86_ops.get_msr(vcpu, &msr); + event->msrs.sysenter_eip = msr.data; + + msr.index = MSR_EFER; + kvm_x86_ops.get_msr(vcpu, &msr); + event->msrs.efer = msr.data; + + msr.index = MSR_STAR; + kvm_x86_ops.get_msr(vcpu, &msr); + event->msrs.star = msr.data; + + msr.index = MSR_LSTAR; + kvm_x86_ops.get_msr(vcpu, &msr); + event->msrs.lstar = msr.data; + + msr.index = MSR_CSTAR; + kvm_x86_ops.get_msr(vcpu, &msr); + event->msrs.cstar = msr.data; + + msr.index = MSR_IA32_CR_PAT; + kvm_x86_ops.get_msr(vcpu, &msr); + event->msrs.pat = msr.data; + + msr.index = MSR_KERNEL_GS_BASE; + kvm_x86_ops.get_msr(vcpu, &msr); + event->msrs.shadow_gs = msr.data; +} + +void kvmi_arch_setup_event(struct kvm_vcpu *vcpu, struct kvmi_event *ev) +{ + struct kvmi_event_arch *event = &ev->arch; + + kvm_arch_vcpu_get_regs(vcpu, &event->regs); + kvm_arch_vcpu_get_sregs(vcpu, &event->sregs); + ev->arch.mode = kvmi_vcpu_mode(vcpu, &event->sregs); + kvmi_get_msrs(vcpu, event); +} + int kvmi_arch_cmd_vcpu_get_info(struct kvm_vcpu *vcpu, struct kvmi_vcpu_get_info_reply *rpl) { diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index fdb8ce6fe6a5..a87f0322c584 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -6,6 +6,14 @@ #include <asm/kvmi_host.h> +struct kvmi_vcpu_reply { + int error; + int action; + u32 seq; + void *data; + size_t size; +}; + struct kvmi_job { struct list_head link; void *ctx; @@ -20,6 +28,9 @@ struct kvm_vcpu_introspection { spinlock_t job_lock; atomic_t pause_requests; + + struct kvmi_vcpu_reply reply; + bool waiting_for_reply; }; struct kvm_introspection { diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 3ded22020bef..5a5b01df7e3e 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -38,6 +38,12 @@ enum { KVMI_NUM_EVENTS }; +enum { + KVMI_EVENT_ACTION_CONTINUE = 0, + KVMI_EVENT_ACTION_RETRY = 1, + KVMI_EVENT_ACTION_CRASH = 2, +}; + struct kvmi_msg_hdr { __u16 id; __u16 size; @@ -124,4 +130,11 @@ struct kvmi_event { struct kvmi_event_arch arch; }; +struct kvmi_event_reply { + __u8 action; + __u8 event; + __u16 padding1; + __u32 padding2; +}; + #endif /* _UAPI__LINUX_KVMI_H */ diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 0df890b4b440..5c5c5018832d 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -34,6 +34,12 @@ static vm_paddr_t test_gpa; static uint8_t test_write_pattern; static int page_size; +struct vcpu_reply { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvmi_event_reply reply; +}; + struct vcpu_worker_data { struct kvm_vm *vm; int vcpu_id; @@ -772,14 +778,54 @@ static void pause_vcpu(void) cmd_vcpu_pause(1, 0, 0); } +static void reply_to_event(struct kvmi_msg_hdr *ev_hdr, struct kvmi_event *ev, + __u8 action, struct vcpu_reply *rpl, size_t rpl_size) +{ + ssize_t r; + + rpl->hdr.id = ev_hdr->id; + rpl->hdr.seq = ev_hdr->seq; + rpl->hdr.size = rpl_size - sizeof(rpl->hdr); + + rpl->vcpu_hdr.vcpu = ev->vcpu; + + rpl->reply.action = action; + rpl->reply.event = ev->event; + + r = send(Userspace_socket, rpl, rpl_size, 0); + TEST_ASSERT(r == rpl_size, + "send() failed, sending %zd, result %zd, errno %d (%s)\n", + rpl_size, r, errno, strerror(errno)); +} + +static void discard_pause_event(struct kvm_vm *vm) +{ + struct vcpu_worker_data data = {.vm = vm, .vcpu_id = VCPU_ID}; + struct vcpu_reply rpl = {}; + struct kvmi_msg_hdr hdr; + pthread_t vcpu_thread; + struct kvmi_event ev; + + vcpu_thread = start_vcpu_worker(&data); + + receive_event(&hdr, &ev, sizeof(ev), KVMI_EVENT_PAUSE_VCPU); + + reply_to_event(&hdr, &ev, KVMI_EVENT_ACTION_CONTINUE, + &rpl, sizeof(rpl)); + + stop_vcpu_worker(vcpu_thread, &data); +} + static void test_pause(struct kvm_vm *vm) { __u8 no_wait = 0, wait = 1, wait_inval = 2; __u8 padding = 1, no_padding = 0; pause_vcpu(); + discard_pause_event(vm); cmd_vcpu_pause(wait, no_padding, 0); + discard_pause_event(vm); cmd_vcpu_pause(wait_inval, no_padding, -KVM_EINVAL); cmd_vcpu_pause(no_wait, padding, -KVM_EINVAL); diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index a704e05b3184..02a866ca8d8c 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -358,6 +358,7 @@ static void kvmi_job_release_vcpu(struct kvm_vcpu *vcpu, void *ctx) struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); atomic_set(&vcpui->pause_requests, 0); + vcpui->waiting_for_reply = false; } static void kvmi_release_vcpus(struct kvm *kvm) @@ -743,12 +744,33 @@ void kvmi_run_jobs(struct kvm_vcpu *vcpu) } } +static void kvmi_handle_unsupported_event_action(struct kvm *kvm) +{ + kvmi_sock_shutdown(KVMI(kvm)); +} + +void kvmi_handle_common_event_actions(struct kvm *kvm, u32 action) +{ + switch (action) { + default: + kvmi_handle_unsupported_event_action(kvm); + } +} + static void kvmi_vcpu_pause_event(struct kvm_vcpu *vcpu) { struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + u32 action; atomic_dec(&vcpui->pause_requests); - /* to be implemented */ + + action = kvmi_msg_send_vcpu_pause(vcpu); + switch (action) { + case KVMI_EVENT_ACTION_CONTINUE: + break; + default: + kvmi_handle_common_event_actions(vcpu->kvm, action); + } } void kvmi_handle_requests(struct kvm_vcpu *vcpu) diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index cb99cb3db396..f73596032883 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -27,6 +27,7 @@ void kvmi_sock_shutdown(struct kvm_introspection *kvmi); void kvmi_sock_put(struct kvm_introspection *kvmi); bool kvmi_msg_process(struct kvm_introspection *kvmi); int kvmi_msg_send_unhook(struct kvm_introspection *kvmi); +u32 kvmi_msg_send_vcpu_pause(struct kvm_vcpu *vcpu); /* kvmi.c */ void *kvmi_msg_alloc(void); @@ -37,6 +38,7 @@ bool kvmi_is_known_vm_event(u8 id); int kvmi_add_job(struct kvm_vcpu *vcpu, void (*fct)(struct kvm_vcpu *vcpu, void *ctx), void *ctx, void (*free_fct)(void *ctx)); +void kvmi_run_jobs(struct kvm_vcpu *vcpu); int kvmi_cmd_vm_control_events(struct kvm_introspection *kvmi, unsigned int event_id, bool enable); int kvmi_cmd_read_physical(struct kvm *kvm, u64 gpa, size_t size, @@ -51,5 +53,6 @@ int kvmi_cmd_vcpu_pause(struct kvm_vcpu *vcpu, bool wait); /* arch */ int kvmi_arch_cmd_vcpu_get_info(struct kvm_vcpu *vcpu, struct kvmi_vcpu_get_info_reply *rpl); +void kvmi_arch_setup_event(struct kvm_vcpu *vcpu, struct kvmi_event *ev); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 1adec838cddd..1dcd3db75ff1 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -337,6 +337,66 @@ static int handle_vcpu_get_info(const struct kvmi_vcpu_msg_job *job, return kvmi_msg_vcpu_reply(job, msg, 0, &rpl, sizeof(rpl)); } +static int check_event_reply(const struct kvmi_msg_hdr *msg, + const struct kvmi_event_reply *reply, + const struct kvmi_vcpu_reply *expected, + u8 *action, size_t *received) +{ + size_t msg_size, common, event_size; + int err = -EINVAL; + + if (unlikely(msg->seq != expected->seq)) + return err; + + msg_size = msg->size; + common = sizeof(struct kvmi_vcpu_hdr) + sizeof(*reply); + + if (check_sub_overflow(msg_size, common, &event_size)) + return err; + + if (unlikely(event_size > expected->size)) + return err; + + if (unlikely(reply->padding1 || reply->padding2)) + return err; + + *received = event_size; + *action = reply->action; + return 0; +} + +static int handle_vcpu_event_reply(const struct kvmi_vcpu_msg_job *job, + const struct kvmi_msg_hdr *msg, + const void *rpl) +{ + struct kvm_vcpu_introspection *vcpui = VCPUI(job->vcpu); + struct kvmi_vcpu_reply *expected = &vcpui->reply; + const struct kvmi_event_reply *reply = rpl; + const void *reply_data = reply + 1; + size_t useful, received; + u8 action; + + expected->error = check_event_reply(msg, reply, expected, &action, + &received); + if (unlikely(expected->error)) + goto out; + + useful = min(received, expected->size); + if (useful) + memcpy(expected->data, reply_data, useful); + + if (expected->size > useful) + memset((char *)expected->data + useful, 0, + expected->size - useful); + + expected->action = action; + expected->error = 0; + +out: + vcpui->waiting_for_reply = false; + return expected->error; +} + /* * These functions are executed from the vCPU thread. The receiving thread * passes the messages using a newly allocated 'struct kvmi_vcpu_msg_job' @@ -345,6 +405,7 @@ static int handle_vcpu_get_info(const struct kvmi_vcpu_msg_job *job, */ static int(*const msg_vcpu[])(const struct kvmi_vcpu_msg_job *, const struct kvmi_msg_hdr *, const void *) = { + [KVMI_EVENT] = handle_vcpu_event_reply, [KVMI_VCPU_GET_INFO] = handle_vcpu_get_info, }; @@ -430,7 +491,7 @@ static int kvmi_msg_do_vm_cmd(struct kvm_introspection *kvmi, static bool is_message_allowed(struct kvm_introspection *kvmi, u16 id) { - return kvmi_is_command_allowed(kvmi, id); + return id == KVMI_EVENT || kvmi_is_command_allowed(kvmi, id); } static int kvmi_msg_vm_reply_ec(struct kvm_introspection *kvmi, @@ -450,7 +511,8 @@ static int kvmi_msg_handle_vm_cmd(struct kvm_introspection *kvmi, static bool vcpu_can_handle_messages(struct kvm_vcpu *vcpu) { - return vcpu->arch.mp_state != KVM_MP_STATE_UNINITIALIZED; + return VCPUI(vcpu)->waiting_for_reply + || vcpu->arch.mp_state != KVM_MP_STATE_UNINITIALIZED; } static int kvmi_get_vcpu_if_ready(struct kvm_introspection *kvmi, @@ -554,6 +616,13 @@ static void kvmi_setup_event_common(struct kvmi_event *ev, u32 ev_id, ev->size = sizeof(*ev); } +static void kvmi_setup_event(struct kvm_vcpu *vcpu, struct kvmi_event *ev, + u32 ev_id) +{ + kvmi_setup_event_common(ev, ev_id, kvm_vcpu_get_idx(vcpu)); + kvmi_arch_setup_event(vcpu, ev); +} + int kvmi_msg_send_unhook(struct kvm_introspection *kvmi) { struct kvmi_msg_hdr hdr; @@ -570,3 +639,85 @@ int kvmi_msg_send_unhook(struct kvm_introspection *kvmi) return kvmi_sock_write(kvmi, vec, n, msg_size); } + +static int kvmi_wait_for_reply(struct kvm_vcpu *vcpu) +{ + struct rcuwait *waitp = kvm_arch_vcpu_get_wait(vcpu); + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + int err = 0; + + while (vcpui->waiting_for_reply && !err) { + kvmi_run_jobs(vcpu); + + err = rcuwait_wait_event(waitp, + !vcpui->waiting_for_reply || + !list_empty(&vcpui->job_list), + TASK_KILLABLE); + } + + return err; +} + +static void kvmi_setup_vcpu_reply(struct kvm_vcpu_introspection *vcpui, + u32 event_seq, void *rpl, size_t rpl_size) +{ + memset(&vcpui->reply, 0, sizeof(vcpui->reply)); + + vcpui->reply.seq = event_seq; + vcpui->reply.data = rpl; + vcpui->reply.size = rpl_size; + vcpui->reply.error = -EINTR; + vcpui->waiting_for_reply = true; +} + +static int kvmi_send_event(struct kvm_vcpu *vcpu, u32 ev_id, + void *ev, size_t ev_size, + void *rpl, size_t rpl_size, int *action) +{ + struct kvmi_msg_hdr hdr; + struct kvmi_event common; + struct kvec vec[] = { + {.iov_base = &hdr, .iov_len = sizeof(hdr) }, + {.iov_base = &common, .iov_len = sizeof(common)}, + {.iov_base = ev, .iov_len = ev_size }, + }; + size_t msg_size = sizeof(hdr) + sizeof(common) + ev_size; + size_t n = ARRAY_SIZE(vec) - (ev_size == 0 ? 1 : 0); + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + struct kvm_introspection *kvmi = KVMI(vcpu->kvm); + int err; + + kvmi_setup_event_msg_hdr(kvmi, &hdr, msg_size); + kvmi_setup_event(vcpu, &common, ev_id); + kvmi_setup_vcpu_reply(vcpui, hdr.seq, rpl, rpl_size); + + err = kvmi_sock_write(kvmi, vec, n, msg_size); + if (err) + goto out; + + err = kvmi_wait_for_reply(vcpu); + if (err) + goto out; + + err = vcpui->reply.error; + + if (!err) + *action = vcpui->reply.action; + +out: + if (err) + kvmi_sock_shutdown(kvmi); + return err; +} + +u32 kvmi_msg_send_vcpu_pause(struct kvm_vcpu *vcpu) +{ + int err, action; + + err = kvmi_send_event(vcpu, KVMI_EVENT_PAUSE_VCPU, NULL, 0, + NULL, 0, &action); + if (err) + return KVMI_EVENT_ACTION_CONTINUE; + + return action; +}
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 54/84] KVM: introspection: add the crash action handling on the event reply
From: Mihai Don?u <mdontu at bitdefender.com> This action is used in extreme cases such as blocking the spread of malware as fast as possible. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- virt/kvm/introspection/kvmi.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index 02a866ca8d8c..9e0014bbf9a6 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -749,9 +749,37 @@ static void kvmi_handle_unsupported_event_action(struct kvm *kvm) kvmi_sock_shutdown(KVMI(kvm)); } +static int kvmi_vcpu_kill(int sig, struct kvm_vcpu *vcpu) +{ + struct kernel_siginfo siginfo[1] = {}; + int err = -ESRCH; + struct pid *pid; + + rcu_read_lock(); + pid = rcu_dereference(vcpu->pid); + if (pid) + err = kill_pid_info(sig, siginfo, pid); + rcu_read_unlock(); + + return err; +} + +static void kvmi_vm_shutdown(struct kvm *kvm) +{ + struct kvm_vcpu *vcpu; + int i; + + kvm_for_each_vcpu(i, vcpu, kvm) + kvmi_vcpu_kill(SIGTERM, vcpu); +} + void kvmi_handle_common_event_actions(struct kvm *kvm, u32 action) { switch (action) { + case KVMI_EVENT_ACTION_CRASH: + kvmi_vm_shutdown(kvm); + break; + default: kvmi_handle_unsupported_event_action(kvm); }
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 55/84] KVM: introspection: add KVMI_VCPU_CONTROL_EVENTS
From: Mihai Don?u <mdontu at bitdefender.com> By default, all introspection vCPU events are disabled. The introspection tool must explicitly enable the vCPU events it wants to receive. With this command (KVMI_VCPU_CONTROL_EVENTS) it can enable/disable any vCPU event if allowed by the device manager. Some vCPU events doesn't have to be explicitly enabled (and can't be disabled) with this command because they are implicitly enabled/requested by the use of certain commands. For example, if the introspection tool uses the KVMI_VCPU_PAUSE command, it wants to receive an KVMI_EVENT_PAUSE_VCPU event. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 51 +++++++++++++++++- include/linux/kvmi_host.h | 2 + include/uapi/linux/kvmi.h | 12 ++++- .../testing/selftests/kvm/x86_64/kvmi_test.c | 54 +++++++++++++++++++ virt/kvm/introspection/kvmi.c | 26 +++++++++ virt/kvm/introspection/kvmi_int.h | 3 ++ virt/kvm/introspection/kvmi_msg.c | 26 ++++++++- 7 files changed, 169 insertions(+), 5 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 06c1cb34209e..4393ce89b2fa 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -377,6 +377,9 @@ the following events:: KVMI_EVENT_UNHOOK +The vCPU events (e.g. *KVMI_EVENT_PAUSE_VCPU*) are controlled with +the *KVMI_VCPU_CONTROL_EVENTS* command. + :Errors: * -KVM_EINVAL - the padding is not zero @@ -520,12 +523,58 @@ command) before returning to guest. *KVMI_EVENT_PAUSE_VCPU* events * -KVM_EPERM - the *KVMI_EVENT_PAUSE_VCPU* event is disallowed +10. KVMI_VCPU_CONTROL_EVENTS +---------------------------- + +:Architectures: all +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_vcpu_control_events { + __u16 event_id; + __u8 enable; + __u8 padding1; + __u32 padding2; + }; + +:Returns: + +:: + + struct kvmi_error_code + +Enables/disables vCPU introspection events. + +When an event is enabled, the introspection tool is notified and +must reply with: continue, retry, crash, etc. (see **Events** below). + +The following vCPU events doesn't have to be enabled and can't be disabled, +because these are sent as a result of certain commands (but they can be +disallowed by the device manager) :: + + KVMI_EVENT_PAUSE_VCPU + +The VM events (e.g. *KVMI_EVENT_UNHOOK*) are controlled with +the *KVMI_VM_CONTROL_EVENTS* command. + +:Errors: + +* -KVM_EINVAL - the padding is not zero +* -KVM_EINVAL - the selected vCPU is invalid +* -KVM_EINVAL - the event ID is unknown (use *KVMI_VM_CHECK_EVENT* first) +* -KVM_EPERM - the access is disallowed (use *KVMI_VM_CHECK_EVENT* first) +* -KVM_EAGAIN - the selected vCPU can't be introspected yet + Events ===== All introspection events (VM or vCPU related) are sent using the *KVMI_EVENT* message id. No event will be sent unless -it is explicitly enabled or requested (eg. *KVMI_EVENT_PAUSE_VCPU*). +it is explicitly enabled (see *KVMI_VM_CONTROL_EVENTS* +and *KVMI_VCPU_CONTROL_EVENTS*) or requested (eg. *KVMI_EVENT_PAUSE_VCPU*). The *KVMI_EVENT_UNHOOK* event doesn't have a reply and share the kvmi_event structure, for consistency with the vCPU events. diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index a87f0322c584..9625c8f19379 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -31,6 +31,8 @@ struct kvm_vcpu_introspection { struct kvmi_vcpu_reply reply; bool waiting_for_reply; + + unsigned long *ev_enable_mask; }; struct kvm_introspection { diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 5a5b01df7e3e..9ebf17fa9564 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -25,8 +25,9 @@ enum { KVMI_VM_READ_PHYSICAL = 6, KVMI_VM_WRITE_PHYSICAL = 7, - KVMI_VCPU_GET_INFO = 8, - KVMI_VCPU_PAUSE = 9, + KVMI_VCPU_GET_INFO = 8, + KVMI_VCPU_PAUSE = 9, + KVMI_VCPU_CONTROL_EVENTS = 10, KVMI_NUM_MESSAGES }; @@ -122,6 +123,13 @@ struct kvmi_vcpu_pause { __u32 padding3; }; +struct kvmi_vcpu_control_events { + __u16 event_id; + __u8 enable; + __u8 padding1; + __u32 padding2; +}; + struct kvmi_event { __u16 size; __u16 vcpu; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 5c5c5018832d..da6a06fa0baa 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -834,6 +834,59 @@ static void test_pause(struct kvm_vm *vm) allow_event(vm, KVMI_EVENT_PAUSE_VCPU); } +static void cmd_vcpu_control_event(struct kvm_vm *vm, __u16 event_id, + __u8 enable, __u16 padding, + int expected_err) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvmi_vcpu_control_events cmd; + } req = {}; + int r; + + req.cmd.event_id = event_id; + req.cmd.enable = enable; + req.cmd.padding1 = padding; + req.cmd.padding2 = padding; + + r = do_vcpu0_command(vm, KVMI_VCPU_CONTROL_EVENTS, + &req.hdr, sizeof(req), NULL, 0); + TEST_ASSERT(r == expected_err, + "KVMI_VCPU_CONTROL_EVENTS failed to enable vCPU event %d, error %d (%s), expected error %d\n", + event_id, -r, kvm_strerror(-r), expected_err); +} + + +static void enable_vcpu_event(struct kvm_vm *vm, __u16 event_id) +{ + cmd_vcpu_control_event(vm, event_id, 1, 0, 0); +} + +static void disable_vcpu_event(struct kvm_vm *vm, __u16 event_id) +{ + cmd_vcpu_control_event(vm, event_id, 0, 0, 0); +} + +static void test_cmd_vcpu_control_events(struct kvm_vm *vm) +{ + __u16 id = KVMI_EVENT_PAUSE_VCPU, invalid_id = 0xffff; + __u16 padding = 1, no_padding = 0; + __u8 enable = 1, enable_inval = 2; + + enable_vcpu_event(vm, id); + disable_vcpu_event(vm, id); + + cmd_vcpu_control_event(vm, id, enable, padding, -KVM_EINVAL); + cmd_vcpu_control_event(vm, id, enable_inval, no_padding, -KVM_EINVAL); + cmd_vcpu_control_event(vm, invalid_id, enable, no_padding, -KVM_EINVAL); + + disallow_event(vm, id); + cmd_vcpu_control_event(vm, id, enable, no_padding, -KVM_EPERM); + allow_event(vm, id); + +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -850,6 +903,7 @@ static void test_introspection(struct kvm_vm *vm) test_memory_access(vm); test_cmd_vcpu_get_info(vm); test_pause(vm); + test_cmd_vcpu_control_events(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index 9e0014bbf9a6..286a81e55d9d 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -73,6 +73,11 @@ bool kvmi_is_known_vm_event(u8 id) return id < KVMI_NUM_EVENTS && test_bit(id, Kvmi_known_vm_events); } +bool kvmi_is_known_vcpu_event(u8 id) +{ + return id < KVMI_NUM_EVENTS && test_bit(id, Kvmi_known_vcpu_events); +} + static bool is_vm_event_enabled(struct kvm_introspection *kvmi, int event) { return test_bit(event, kvmi->vm_event_enable_mask); @@ -179,6 +184,12 @@ static bool alloc_vcpui(struct kvm_vcpu *vcpu) if (!vcpui) return false; + vcpui->ev_enable_mask = bitmap_zalloc(KVMI_NUM_EVENTS, GFP_KERNEL); + if (!vcpui->ev_enable_mask) { + kfree(vcpu); + return false; + } + INIT_LIST_HEAD(&vcpui->job_list); spin_lock_init(&vcpui->job_lock); @@ -214,6 +225,8 @@ static void free_vcpui(struct kvm_vcpu *vcpu) free_vcpu_jobs(vcpui); + bitmap_free(vcpui->ev_enable_mask); + kfree(vcpui); vcpu->kvmi = NULL; } @@ -613,6 +626,19 @@ int kvmi_cmd_vm_control_events(struct kvm_introspection *kvmi, return 0; } +int kvmi_cmd_vcpu_control_events(struct kvm_vcpu *vcpu, + unsigned int event_id, bool enable) +{ + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + + if (enable) + set_bit(event_id, vcpui->ev_enable_mask); + else + clear_bit(event_id, vcpui->ev_enable_mask); + + return 0; +} + static unsigned long gfn_to_hva_safe(struct kvm *kvm, gfn_t gfn) { unsigned long hva; diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index f73596032883..57a62ebadd94 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -35,12 +35,15 @@ void kvmi_msg_free(void *addr); bool kvmi_is_command_allowed(struct kvm_introspection *kvmi, u16 id); bool kvmi_is_known_event(u8 id); bool kvmi_is_known_vm_event(u8 id); +bool kvmi_is_known_vcpu_event(u8 id); int kvmi_add_job(struct kvm_vcpu *vcpu, void (*fct)(struct kvm_vcpu *vcpu, void *ctx), void *ctx, void (*free_fct)(void *ctx)); void kvmi_run_jobs(struct kvm_vcpu *vcpu); int kvmi_cmd_vm_control_events(struct kvm_introspection *kvmi, unsigned int event_id, bool enable); +int kvmi_cmd_vcpu_control_events(struct kvm_vcpu *vcpu, + unsigned int event_id, bool enable); int kvmi_cmd_read_physical(struct kvm *kvm, u64 gpa, size_t size, int (*send)(struct kvm_introspection *, const struct kvmi_msg_hdr*, diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 1dcd3db75ff1..20ef4a44d3a2 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -397,6 +397,27 @@ static int handle_vcpu_event_reply(const struct kvmi_vcpu_msg_job *job, return expected->error; } +static int handle_vcpu_control_events(const struct kvmi_vcpu_msg_job *job, + const struct kvmi_msg_hdr *msg, + const void *_req) +{ + struct kvm_introspection *kvmi = KVMI(job->vcpu->kvm); + const struct kvmi_vcpu_control_events *req = _req; + int ec; + + if (req->padding1 || req->padding2 || req->enable > 1) + ec = -KVM_EINVAL; + else if (!kvmi_is_known_vcpu_event(req->event_id)) + ec = -KVM_EINVAL; + else if (!is_event_allowed(kvmi, req->event_id)) + ec = -KVM_EPERM; + else + ec = kvmi_cmd_vcpu_control_events(job->vcpu, req->event_id, + req->enable == 1); + + return kvmi_msg_vcpu_reply(job, msg, ec, NULL, 0); +} + /* * These functions are executed from the vCPU thread. The receiving thread * passes the messages using a newly allocated 'struct kvmi_vcpu_msg_job' @@ -405,8 +426,9 @@ static int handle_vcpu_event_reply(const struct kvmi_vcpu_msg_job *job, */ static int(*const msg_vcpu[])(const struct kvmi_vcpu_msg_job *, const struct kvmi_msg_hdr *, const void *) = { - [KVMI_EVENT] = handle_vcpu_event_reply, - [KVMI_VCPU_GET_INFO] = handle_vcpu_get_info, + [KVMI_EVENT] = handle_vcpu_event_reply, + [KVMI_VCPU_CONTROL_EVENTS] = handle_vcpu_control_events, + [KVMI_VCPU_GET_INFO] = handle_vcpu_get_info, }; static bool is_vcpu_command(u16 id)
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 56/84] KVM: introspection: add KVMI_VCPU_GET_REGISTERS
From: Mihai Don?u <mdontu at bitdefender.com> This command is used to get kvm_regs and kvm_sregs structures, plus a list of struct kvm_msrs from a specific vCPU. While the kvm_regs and kvm_sregs structures are included with every event, this command allows reading any MSR and can be used as a quick way to read the state of any vCPU. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 44 +++++++++ arch/x86/include/uapi/asm/kvmi.h | 15 +++ arch/x86/kvm/kvmi.c | 93 +++++++++++++++++++ include/uapi/linux/kvmi.h | 1 + .../testing/selftests/kvm/x86_64/kvmi_test.c | 75 +++++++++++++++ virt/kvm/introspection/kvmi_int.h | 7 ++ virt/kvm/introspection/kvmi_msg.c | 20 ++++ 7 files changed, 255 insertions(+) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 4393ce89b2fa..f9095e1a9417 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -568,6 +568,50 @@ the *KVMI_VM_CONTROL_EVENTS* command. * -KVM_EPERM - the access is disallowed (use *KVMI_VM_CHECK_EVENT* first) * -KVM_EAGAIN - the selected vCPU can't be introspected yet +11. KVMI_VCPU_GET_REGISTERS +--------------------------- + +:Architectures: x86 +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_vcpu_get_registers { + __u16 nmsrs; + __u16 padding1; + __u32 padding2; + __u32 msrs_idx[0]; + }; + +:Returns: + +:: + + struct kvmi_error_code; + struct kvmi_vcpu_get_registers_reply { + __u32 mode; + __u32 padding; + struct kvm_regs regs; + struct kvm_sregs sregs; + struct kvm_msrs msrs; + }; + +For the given vCPU and the ``nmsrs`` sized array of MSRs registers, +returns the current vCPU mode (in bytes: 2, 4 or 8), the general purpose +registers, the special registers and the requested set of MSRs. + +:Errors: + +* -KVM_EINVAL - the selected vCPU is invalid +* -KVM_EINVAL - one of the indicated MSRs is invalid +* -KVM_EINVAL - the padding is not zero +* -KVM_EINVAL - the reply size is larger than KVMI_MSG_SIZE + (too many MSRs) +* -KVM_EAGAIN - the selected vCPU can't be introspected yet +* -KVM_ENOMEM - there is not enough memory to allocate the reply + Events ===== diff --git a/arch/x86/include/uapi/asm/kvmi.h b/arch/x86/include/uapi/asm/kvmi.h index 89adf84cefe4..f14674c3c109 100644 --- a/arch/x86/include/uapi/asm/kvmi.h +++ b/arch/x86/include/uapi/asm/kvmi.h @@ -30,4 +30,19 @@ struct kvmi_vcpu_get_info_reply { __u64 tsc_speed; }; +struct kvmi_vcpu_get_registers { + __u16 nmsrs; + __u16 padding1; + __u32 padding2; + __u32 msrs_idx[0]; +}; + +struct kvmi_vcpu_get_registers_reply { + __u32 mode; + __u32 padding; + struct kvm_regs regs; + struct kvm_sregs sregs; + struct kvm_msrs msrs; +}; + #endif /* _UAPI_ASM_X86_KVMI_H */ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index ce7e2d5f2ab4..4fd7a3c17ef5 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -98,3 +98,96 @@ int kvmi_arch_cmd_vcpu_get_info(struct kvm_vcpu *vcpu, return 0; } + +int kvmi_arch_check_get_registers_req(const struct kvmi_msg_hdr *msg, + const struct kvmi_vcpu_get_registers *req) +{ + size_t req_size; + + if (check_add_overflow(sizeof(struct kvmi_vcpu_hdr), + struct_size(req, msrs_idx, req->nmsrs), + &req_size)) + return -1; + + if (msg->size < req_size) + return -1; + + return 0; +} + +static int kvmi_get_registers(struct kvm_vcpu *vcpu, u32 *mode, + struct kvm_regs *regs, + struct kvm_sregs *sregs, + struct kvm_msrs *msrs) +{ + struct kvm_msr_entry *msr = msrs->entries; + struct kvm_msr_entry *end = msrs->entries + msrs->nmsrs; + struct msr_data m = {.host_initiated = true}; + int err = 0; + + kvm_arch_vcpu_get_regs(vcpu, regs); + kvm_arch_vcpu_get_sregs(vcpu, sregs); + *mode = kvmi_vcpu_mode(vcpu, sregs); + + for (; msr < end && !err; msr++) { + m.index = msr->index; + + err = kvm_x86_ops.get_msr(vcpu, &m); + + if (!err) + msr->data = m.data; + } + + return err ? -KVM_EINVAL : 0; +} + +static bool valid_reply_size(size_t rpl_size) +{ + size_t msg_size; + + if (check_add_overflow(sizeof(struct kvmi_error_code), + rpl_size, &msg_size)) + return false; + + if (msg_size > KVMI_MSG_SIZE) + return false; + + return true; +} + +int kvmi_arch_cmd_vcpu_get_registers(struct kvm_vcpu *vcpu, + const struct kvmi_msg_hdr *msg, + const struct kvmi_vcpu_get_registers *req, + struct kvmi_vcpu_get_registers_reply **dest, + size_t *dest_size) +{ + struct kvmi_vcpu_get_registers_reply *rpl; + size_t rpl_size; + int err; + u16 k; + + if (req->padding1 || req->padding2) + return -KVM_EINVAL; + + rpl_size = struct_size(rpl, msrs.entries, req->nmsrs); + + if (!valid_reply_size(rpl_size)) + return -KVM_EINVAL; + + rpl = kvmi_msg_alloc(); + if (!rpl) + return -KVM_ENOMEM; + + rpl->msrs.nmsrs = req->nmsrs; + + for (k = 0; k < req->nmsrs; k++) + rpl->msrs.entries[k].index = req->msrs_idx[k]; + + err = kvmi_get_registers(vcpu, &rpl->mode, &rpl->regs, + &rpl->sregs, &rpl->msrs); + + *dest = rpl; + *dest_size = rpl_size; + + return err; +} diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 9ebf17fa9564..39ff54b4b661 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -28,6 +28,7 @@ enum { KVMI_VCPU_GET_INFO = 8, KVMI_VCPU_PAUSE = 9, KVMI_VCPU_CONTROL_EVENTS = 10, + KVMI_VCPU_GET_REGISTERS = 11, KVMI_NUM_MESSAGES }; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index da6a06fa0baa..73aafc5d959a 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -887,6 +887,80 @@ static void test_cmd_vcpu_control_events(struct kvm_vm *vm) } +static void cmd_vcpu_get_registers(struct kvm_vm *vm, struct kvm_regs *regs) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvmi_vcpu_get_registers cmd; + } req = {}; + struct kvmi_vcpu_get_registers_reply rpl; + + test_vcpu0_command(vm, KVMI_VCPU_GET_REGISTERS, &req.hdr, sizeof(req), + &rpl, sizeof(rpl)); + + memcpy(regs, &rpl.regs, sizeof(*regs)); +} + +static void test_invalid_cmd_vcpu_get_registers(struct kvm_vm *vm, + struct kvmi_msg_hdr *req, + size_t req_size, void *rpl, + size_t rpl_size) +{ + int r; + + r = do_vcpu0_command(vm, KVMI_VCPU_GET_REGISTERS, req, req_size, + rpl, rpl_size); + TEST_ASSERT(r == -KVM_EINVAL, + "KVMI_VCPU_GET_REGISTERS didn't failed with -KVM_EINVAL, error %d (%s)\n", + -r, kvm_strerror(-r)); +} + +static void test_invalid_vcpu_get_registers(struct kvm_vm *vm) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvmi_vcpu_get_registers cmd; + __u32 msrs_idx[1]; + } req = {}; + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvmi_vcpu_get_registers cmd; + } *req_big; + struct kvmi_vcpu_get_registers_reply rpl; + + req.cmd.padding1 = 1; + req.cmd.padding2 = 1; + test_invalid_cmd_vcpu_get_registers(vm, &req.hdr, sizeof(req), + &rpl, sizeof(rpl)); + + req.cmd.padding1 = 0; + req.cmd.padding2 = 0; + req.cmd.nmsrs = 1; + req.cmd.msrs_idx[0] = 0xffffffff; + test_invalid_cmd_vcpu_get_registers(vm, &req.hdr, sizeof(req), + &rpl, sizeof(rpl)); + + req_big = calloc(1, KVMI_MSG_SIZE); + req_big->cmd.nmsrs = (KVMI_MSG_SIZE - sizeof(*req_big)) / sizeof(__u32); + test_invalid_cmd_vcpu_get_registers(vm, &req.hdr, sizeof(req), + &rpl, sizeof(rpl)); + free(req_big); +} + +static void test_cmd_vcpu_get_registers(struct kvm_vm *vm) +{ + struct kvm_regs regs = {}; + + cmd_vcpu_get_registers(vm, ®s); + + pr_info("get_registers rip 0x%llx\n", regs.rip); + + test_invalid_vcpu_get_registers(vm); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -904,6 +978,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vcpu_get_info(vm); test_pause(vm); test_cmd_vcpu_control_events(vm); + test_cmd_vcpu_get_registers(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 57a62ebadd94..740eea3a9531 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -57,5 +57,12 @@ int kvmi_cmd_vcpu_pause(struct kvm_vcpu *vcpu, bool wait); int kvmi_arch_cmd_vcpu_get_info(struct kvm_vcpu *vcpu, struct kvmi_vcpu_get_info_reply *rpl); void kvmi_arch_setup_event(struct kvm_vcpu *vcpu, struct kvmi_event *ev); +int kvmi_arch_check_get_registers_req(const struct kvmi_msg_hdr *msg, + const struct kvmi_vcpu_get_registers *req); +int kvmi_arch_cmd_vcpu_get_registers(struct kvm_vcpu *vcpu, + const struct kvmi_msg_hdr *msg, + const struct kvmi_vcpu_get_registers *req, + struct kvmi_vcpu_get_registers_reply **dest, + size_t *dest_size); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 20ef4a44d3a2..6c7a600dd477 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -418,6 +418,25 @@ static int handle_vcpu_control_events(const struct kvmi_vcpu_msg_job *job, return kvmi_msg_vcpu_reply(job, msg, ec, NULL, 0); } +static int handle_vcpu_get_registers(const struct kvmi_vcpu_msg_job *job, + const struct kvmi_msg_hdr *msg, + const void *req) +{ + struct kvmi_vcpu_get_registers_reply *rpl = NULL; + size_t rpl_size = 0; + int err, ec; + + if (kvmi_arch_check_get_registers_req(msg, req)) + return -EINVAL; + + ec = kvmi_arch_cmd_vcpu_get_registers(job->vcpu, msg, req, + &rpl, &rpl_size); + + err = kvmi_msg_vcpu_reply(job, msg, ec, rpl, rpl_size); + kvmi_msg_free(rpl); + return err; +} + /* * These functions are executed from the vCPU thread. The receiving thread * passes the messages using a newly allocated 'struct kvmi_vcpu_msg_job' @@ -429,6 +448,7 @@ static int(*const msg_vcpu[])(const struct kvmi_vcpu_msg_job *, [KVMI_EVENT] = handle_vcpu_event_reply, [KVMI_VCPU_CONTROL_EVENTS] = handle_vcpu_control_events, [KVMI_VCPU_GET_INFO] = handle_vcpu_get_info, + [KVMI_VCPU_GET_REGISTERS] = handle_vcpu_get_registers, }; static bool is_vcpu_command(u16 id)
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 57/84] KVM: introspection: add KVMI_VCPU_SET_REGISTERS
From: Mihai Don?u <mdontu at bitdefender.com> During an introspection event, the introspection tool might need to change the vCPU state, for example, to skip the current instruction. This command is allowed only during vCPU events and the registers will be set when the reply has been received. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Mircea C?rjaliu <mcirjaliu at bitdefender.com> Signed-off-by: Mircea C?rjaliu <mcirjaliu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 29 ++++++++ include/linux/kvmi_host.h | 3 + include/uapi/linux/kvmi.h | 1 + .../testing/selftests/kvm/x86_64/kvmi_test.c | 73 +++++++++++++++++++ virt/kvm/introspection/kvmi.c | 24 ++++++ virt/kvm/introspection/kvmi_int.h | 3 + virt/kvm/introspection/kvmi_msg.c | 17 ++++- 7 files changed, 149 insertions(+), 1 deletion(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index f9095e1a9417..bd35002c3254 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -612,6 +612,35 @@ registers, the special registers and the requested set of MSRs. * -KVM_EAGAIN - the selected vCPU can't be introspected yet * -KVM_ENOMEM - there is not enough memory to allocate the reply +12. KVMI_VCPU_SET_REGISTERS +--------------------------- + +:Architectures: x86 +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vcpu_hdr; + struct kvm_regs; + +:Returns: + +:: + + struct kvmi_error_code + +Sets the general purpose registers for the given vCPU. The changes become +visible to other threads accessing the KVM vCPU structure after the event +currently being handled is replied to. + +:Errors: + +* -KVM_EINVAL - the selected vCPU is invalid +* -KVM_EINVAL - the padding is not zero +* -KVM_EAGAIN - the selected vCPU can't be introspected yet +* -KVM_EOPNOTSUPP - the command hasn't been received during an introspection event + Events ===== diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index 9625c8f19379..857b75a2664a 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -33,6 +33,9 @@ struct kvm_vcpu_introspection { bool waiting_for_reply; unsigned long *ev_enable_mask; + + struct kvm_regs delayed_regs; + bool have_delayed_regs; }; struct kvm_introspection { diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 39ff54b4b661..5f637a21a907 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -29,6 +29,7 @@ enum { KVMI_VCPU_PAUSE = 9, KVMI_VCPU_CONTROL_EVENTS = 10, KVMI_VCPU_GET_REGISTERS = 11, + KVMI_VCPU_SET_REGISTERS = 12, KVMI_NUM_MESSAGES }; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 73aafc5d959a..ffd0337d0567 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -961,6 +961,78 @@ static void test_cmd_vcpu_get_registers(struct kvm_vm *vm) test_invalid_vcpu_get_registers(vm); } +static int __cmd_vcpu_set_registers(struct kvm_vm *vm, + struct kvm_regs *regs) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvm_regs regs; + } req = {}; + + memcpy(&req.regs, regs, sizeof(req.regs)); + + return __do_vcpu0_command(KVMI_VCPU_SET_REGISTERS, + &req.hdr, sizeof(req), NULL, 0); +} + +static void test_invalid_cmd_vcpu_set_registers(struct kvm_vm *vm) +{ + struct vcpu_worker_data data = {.vm = vm, .vcpu_id = VCPU_ID}; + pthread_t vcpu_thread; + struct kvm_regs regs; + int r; + + vcpu_thread = start_vcpu_worker(&data); + + r = __cmd_vcpu_set_registers(vm, ®s); + + stop_vcpu_worker(vcpu_thread, &data); + + TEST_ASSERT(r == -KVM_EOPNOTSUPP, + "KVMI_VCPU_SET_REGISTERS didn't failed with KVM_EOPNOTSUPP, error %d(%s)\n", + -r, kvm_strerror(-r)); +} + +static void __set_registers(struct kvm_vm *vm, + struct kvm_regs *regs) +{ + int r; + + r = __cmd_vcpu_set_registers(vm, regs); + TEST_ASSERT(r == 0, + "KVMI_VCPU_SET_REGISTERS failed, error %d(%s)\n", + -r, kvm_strerror(-r)); +} + +static void test_cmd_vcpu_set_registers(struct kvm_vm *vm) +{ + struct vcpu_worker_data data = {.vm = vm, .vcpu_id = VCPU_ID}; + __u16 event_id = KVMI_EVENT_PAUSE_VCPU; + struct kvmi_msg_hdr hdr; + pthread_t vcpu_thread; + struct kvmi_event ev; + struct vcpu_reply rpl = {}; + struct kvm_regs regs = {}; + + cmd_vcpu_get_registers(vm, ®s); + + test_invalid_cmd_vcpu_set_registers(vm); + + pause_vcpu(); + + vcpu_thread = start_vcpu_worker(&data); + + receive_event(&hdr, &ev, sizeof(ev), event_id); + + __set_registers(vm, &ev.arch.regs); + + reply_to_event(&hdr, &ev, KVMI_EVENT_ACTION_CONTINUE, + &rpl, sizeof(rpl)); + + stop_vcpu_worker(vcpu_thread, &data); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -979,6 +1051,7 @@ static void test_introspection(struct kvm_vm *vm) test_pause(vm); test_cmd_vcpu_control_events(vm); test_cmd_vcpu_get_registers(vm); + test_cmd_vcpu_set_registers(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index 286a81e55d9d..2bffe9ee5b69 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -861,3 +861,27 @@ int kvmi_cmd_vcpu_pause(struct kvm_vcpu *vcpu, bool wait) return 0; } + +int kvmi_cmd_vcpu_set_registers(struct kvm_vcpu *vcpu, + const struct kvm_regs *regs) +{ + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + + if (!vcpui->waiting_for_reply) + return -KVM_EOPNOTSUPP; + + memcpy(&vcpui->delayed_regs, regs, sizeof(vcpui->delayed_regs)); + vcpui->have_delayed_regs = true; + + return 0; +} + +void kvmi_post_reply(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + + if (vcpui->have_delayed_regs) { + kvm_arch_vcpu_set_regs(vcpu, &vcpui->delayed_regs, false); + vcpui->have_delayed_regs = false; + } +} diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 740eea3a9531..1d5a07277072 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -40,6 +40,7 @@ int kvmi_add_job(struct kvm_vcpu *vcpu, void (*fct)(struct kvm_vcpu *vcpu, void *ctx), void *ctx, void (*free_fct)(void *ctx)); void kvmi_run_jobs(struct kvm_vcpu *vcpu); +void kvmi_post_reply(struct kvm_vcpu *vcpu); int kvmi_cmd_vm_control_events(struct kvm_introspection *kvmi, unsigned int event_id, bool enable); int kvmi_cmd_vcpu_control_events(struct kvm_vcpu *vcpu, @@ -52,6 +53,8 @@ int kvmi_cmd_read_physical(struct kvm *kvm, u64 gpa, size_t size, int kvmi_cmd_write_physical(struct kvm *kvm, u64 gpa, size_t size, const void *buf); int kvmi_cmd_vcpu_pause(struct kvm_vcpu *vcpu, bool wait); +int kvmi_cmd_vcpu_set_registers(struct kvm_vcpu *vcpu, + const struct kvm_regs *regs); /* arch */ int kvmi_arch_cmd_vcpu_get_info(struct kvm_vcpu *vcpu, diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 6c7a600dd477..ed43e4d5f5b2 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -437,6 +437,18 @@ static int handle_vcpu_get_registers(const struct kvmi_vcpu_msg_job *job, return err; } +static int handle_vcpu_set_registers(const struct kvmi_vcpu_msg_job *job, + const struct kvmi_msg_hdr *msg, + const void *_req) +{ + const struct kvm_regs *regs = _req; + int ec; + + ec = kvmi_cmd_vcpu_set_registers(job->vcpu, regs); + + return kvmi_msg_vcpu_reply(job, msg, ec, NULL, 0); +} + /* * These functions are executed from the vCPU thread. The receiving thread * passes the messages using a newly allocated 'struct kvmi_vcpu_msg_job' @@ -449,6 +461,7 @@ static int(*const msg_vcpu[])(const struct kvmi_vcpu_msg_job *, [KVMI_VCPU_CONTROL_EVENTS] = handle_vcpu_control_events, [KVMI_VCPU_GET_INFO] = handle_vcpu_get_info, [KVMI_VCPU_GET_REGISTERS] = handle_vcpu_get_registers, + [KVMI_VCPU_SET_REGISTERS] = handle_vcpu_set_registers, }; static bool is_vcpu_command(u16 id) @@ -743,8 +756,10 @@ static int kvmi_send_event(struct kvm_vcpu *vcpu, u32 ev_id, err = vcpui->reply.error; - if (!err) + if (!err) { + kvmi_post_reply(vcpu); *action = vcpui->reply.action; + } out: if (err)
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 58/84] KVM: introspection: add KVMI_VCPU_GET_CPUID
From: Marian Rotariu <marian.c.rotariu at gmail.com> This command returns a CPUID leaf (as seen by the guest OS). Signed-off-by: Marian Rotariu <marian.c.rotariu at gmail.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 36 +++++++++++++++++++ arch/x86/include/uapi/asm/kvmi.h | 12 +++++++ arch/x86/kvm/kvmi.c | 19 ++++++++++ include/uapi/linux/kvmi.h | 1 + .../testing/selftests/kvm/x86_64/kvmi_test.c | 34 ++++++++++++++++++ virt/kvm/introspection/kvmi_int.h | 3 ++ virt/kvm/introspection/kvmi_msg.c | 15 ++++++++ 7 files changed, 120 insertions(+) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index bd35002c3254..fc2e8c756191 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -641,6 +641,42 @@ currently being handled is replied to. * -KVM_EAGAIN - the selected vCPU can't be introspected yet * -KVM_EOPNOTSUPP - the command hasn't been received during an introspection event +13. KVMI_VCPU_GET_CPUID +----------------------- + +:Architectures: x86 +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_vcpu_get_cpuid { + __u32 function; + __u32 index; + }; + +:Returns: + +:: + + struct kvmi_error_code; + struct kvmi_vcpu_get_cpuid_reply { + __u32 eax; + __u32 ebx; + __u32 ecx; + __u32 edx; + }; + +Returns a CPUID leaf (as seen by the guest OS). + +:Errors: + +* -KVM_EINVAL - the selected vCPU is invalid +* -KVM_EINVAL - the padding is not zero +* -KVM_EAGAIN - the selected vCPU can't be introspected yet +* -KVM_ENOENT - the selected leaf is not present or is invalid + Events ===== diff --git a/arch/x86/include/uapi/asm/kvmi.h b/arch/x86/include/uapi/asm/kvmi.h index f14674c3c109..57c48ace417f 100644 --- a/arch/x86/include/uapi/asm/kvmi.h +++ b/arch/x86/include/uapi/asm/kvmi.h @@ -45,4 +45,16 @@ struct kvmi_vcpu_get_registers_reply { struct kvm_msrs msrs; }; +struct kvmi_vcpu_get_cpuid { + __u32 function; + __u32 index; +}; + +struct kvmi_vcpu_get_cpuid_reply { + __u32 eax; + __u32 ebx; + __u32 ecx; + __u32 edx; +}; + #endif /* _UAPI_ASM_X86_KVMI_H */ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index 4fd7a3c17ef5..53c4a37e10c6 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -7,6 +7,7 @@ #include "linux/kvm_host.h" #include "x86.h" +#include "cpuid.h" #include "../../../virt/kvm/introspection/kvmi_int.h" static unsigned int kvmi_vcpu_mode(const struct kvm_vcpu *vcpu, @@ -191,3 +192,21 @@ int kvmi_arch_cmd_vcpu_get_registers(struct kvm_vcpu *vcpu, return err; } + +int kvmi_arch_cmd_vcpu_get_cpuid(struct kvm_vcpu *vcpu, + const struct kvmi_vcpu_get_cpuid *req, + struct kvmi_vcpu_get_cpuid_reply *rpl) +{ + struct kvm_cpuid_entry2 *e; + + e = kvm_find_cpuid_entry(vcpu, req->function, req->index); + if (!e) + return -KVM_ENOENT; + + rpl->eax = e->eax; + rpl->ebx = e->ebx; + rpl->ecx = e->ecx; + rpl->edx = e->edx; + + return 0; +} diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 5f637a21a907..d7f4360e609e 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -30,6 +30,7 @@ enum { KVMI_VCPU_CONTROL_EVENTS = 10, KVMI_VCPU_GET_REGISTERS = 11, KVMI_VCPU_SET_REGISTERS = 12, + KVMI_VCPU_GET_CPUID = 13, KVMI_NUM_MESSAGES }; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index ffd0337d0567..7269afd4c36d 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -1033,6 +1033,39 @@ static void test_cmd_vcpu_set_registers(struct kvm_vm *vm) stop_vcpu_worker(vcpu_thread, &data); } +static int cmd_vcpu_get_cpuid(struct kvm_vm *vm, + __u32 function, __u32 index, + struct kvmi_vcpu_get_cpuid_reply *rpl) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvmi_vcpu_get_cpuid cmd; + } req = {}; + + req.cmd.function = function; + req.cmd.index = index; + + return do_vcpu0_command(vm, KVMI_VCPU_GET_CPUID, &req.hdr, sizeof(req), + rpl, sizeof(*rpl)); +} + +static void test_cmd_vcpu_get_cpuid(struct kvm_vm *vm) +{ + struct kvmi_vcpu_get_cpuid_reply rpl = {}; + __u32 function = 0; + __u32 index = 0; + int r; + + r = cmd_vcpu_get_cpuid(vm, function, index, &rpl); + TEST_ASSERT(r == 0, + "KVMI_VCPU_GET_CPUID failed, error %d(%s)\n", + -r, kvm_strerror(-r)); + + pr_info("cpuid(%u, %u) => eax 0x%.8x, ebx 0x%.8x, ecx 0x%.8x, edx 0x%.8x\n", + function, index, rpl.eax, rpl.ebx, rpl.ecx, rpl.edx); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -1052,6 +1085,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vcpu_control_events(vm); test_cmd_vcpu_get_registers(vm); test_cmd_vcpu_set_registers(vm); + test_cmd_vcpu_get_cpuid(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 1d5a07277072..bc8b5c03b057 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -67,5 +67,8 @@ int kvmi_arch_cmd_vcpu_get_registers(struct kvm_vcpu *vcpu, const struct kvmi_vcpu_get_registers *req, struct kvmi_vcpu_get_registers_reply **dest, size_t *dest_size); +int kvmi_arch_cmd_vcpu_get_cpuid(struct kvm_vcpu *vcpu, + const struct kvmi_vcpu_get_cpuid *req, + struct kvmi_vcpu_get_cpuid_reply *rpl); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index ed43e4d5f5b2..61c96a24a730 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -449,6 +449,20 @@ static int handle_vcpu_set_registers(const struct kvmi_vcpu_msg_job *job, return kvmi_msg_vcpu_reply(job, msg, ec, NULL, 0); } +static int handle_vcpu_get_cpuid(const struct kvmi_vcpu_msg_job *job, + const struct kvmi_msg_hdr *msg, + const void *req) +{ + struct kvmi_vcpu_get_cpuid_reply rpl; + int ec; + + memset(&rpl, 0, sizeof(rpl)); + + ec = kvmi_arch_cmd_vcpu_get_cpuid(job->vcpu, req, &rpl); + + return kvmi_msg_vcpu_reply(job, msg, ec, &rpl, sizeof(rpl)); +} + /* * These functions are executed from the vCPU thread. The receiving thread * passes the messages using a newly allocated 'struct kvmi_vcpu_msg_job' @@ -459,6 +473,7 @@ static int(*const msg_vcpu[])(const struct kvmi_vcpu_msg_job *, const struct kvmi_msg_hdr *, const void *) = { [KVMI_EVENT] = handle_vcpu_event_reply, [KVMI_VCPU_CONTROL_EVENTS] = handle_vcpu_control_events, + [KVMI_VCPU_GET_CPUID] = handle_vcpu_get_cpuid, [KVMI_VCPU_GET_INFO] = handle_vcpu_get_info, [KVMI_VCPU_GET_REGISTERS] = handle_vcpu_get_registers, [KVMI_VCPU_SET_REGISTERS] = handle_vcpu_set_registers,
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 59/84] KVM: introspection: add KVMI_EVENT_HYPERCALL
From: Mihai Don?u <mdontu at bitdefender.com> This event is sent on a specific hypercall. It is used by the code residing inside the introspected guest to call the introspection tool and to report certain details about its operation. For example, a classic antimalware remediation tool can report what it has found during a scan. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/hypercalls.rst | 35 ++++++++++++++++ Documentation/virt/kvm/kvmi.rst | 36 +++++++++++++++- arch/x86/include/uapi/asm/kvmi.h | 2 + arch/x86/kvm/kvmi.c | 32 ++++++++++++++ arch/x86/kvm/x86.c | 18 ++++++-- include/linux/kvmi_host.h | 2 + include/uapi/linux/kvm_para.h | 1 + include/uapi/linux/kvmi.h | 1 + .../testing/selftests/kvm/x86_64/kvmi_test.c | 42 +++++++++++++++++++ virt/kvm/introspection/kvmi.c | 23 ++++++++++ virt/kvm/introspection/kvmi_int.h | 9 ++++ virt/kvm/introspection/kvmi_msg.c | 12 ++++++ 12 files changed, 208 insertions(+), 5 deletions(-) diff --git a/Documentation/virt/kvm/hypercalls.rst b/Documentation/virt/kvm/hypercalls.rst index 70e77c66b64c..abfbff96b9e3 100644 --- a/Documentation/virt/kvm/hypercalls.rst +++ b/Documentation/virt/kvm/hypercalls.rst @@ -169,3 +169,38 @@ a0: destination APIC ID :Usage example: When sending a call-function IPI-many to vCPUs, yield if any of the IPI target vCPUs was preempted. + +9. KVM_HC_XEN_HVM_OP +-------------------- + +:Architecture: x86 +:Status: active +:Purpose: To enable communication between a guest agent and a VMI application + +Usage: + +An event will be sent to the VMI application (see kvmi.rst) if the following +registers, which differ between 32bit and 64bit, have the following values: + + ======== ===== ====+ 32bit 64bit value + ======== ===== ====+ ebx (a0) rdi KVM_HC_XEN_HVM_OP_GUEST_REQUEST_VM_EVENT + ecx (a1) rsi 0 + ======== ===== ====+ +This specification copies Xen's { __HYPERVISOR_hvm_op, +HVMOP_guest_request_vm_event } hypercall and can originate from kernel or +userspace. + +It returns 0 if successful, or a negative POSIX.1 error code if it fails. The +absence of an active VMI application is not signaled in any way. + +The following registers are clobbered: + + * 32bit: edx, esi, edi, ebp + * 64bit: rdx, r10, r8, r9 + +In particular, for KVM_HC_XEN_HVM_OP_GUEST_REQUEST_VM_EVENT, the last two +registers can be poisoned deliberately and cannot be used for passing +information. diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index fc2e8c756191..d062f2ccf365 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -546,7 +546,10 @@ command) before returning to guest. struct kvmi_error_code -Enables/disables vCPU introspection events. +Enables/disables vCPU introspection events. This command can be used with +the following events:: + + KVMI_EVENT_HYPERCALL When an event is enabled, the introspection tool is notified and must reply with: continue, retry, crash, etc. (see **Events** below). @@ -786,3 +789,34 @@ cannot be controlled with *KVMI_VCPU_CONTROL_EVENTS*. Because it has a low priority, it will be sent after any other vCPU introspection event and when no other vCPU introspection command is queued. + +3. KVMI_EVENT_HYPERCALL +----------------------- + +:Architectures: x86 +:Versions: >= 1 +:Actions: CONTINUE, CRASH +:Parameters: + +:: + + struct kvmi_event; + +:Returns: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_event_reply; + +This event is sent on a specific user hypercall when the introspection has +been enabled for this event (see *KVMI_VCPU_CONTROL_EVENTS*). + +The hypercall number must be ``KVM_HC_XEN_HVM_OP`` with the +``KVM_HC_XEN_HVM_OP_GUEST_REQUEST_VM_EVENT`` sub-function +(see hypercalls.rst). + +It is used by the code residing inside the introspected guest to call the +introspection tool and to report certain details about its operation. For +example, a classic antimalware remediation tool can report what it has +found during a scan. diff --git a/arch/x86/include/uapi/asm/kvmi.h b/arch/x86/include/uapi/asm/kvmi.h index 57c48ace417f..9882e68cab75 100644 --- a/arch/x86/include/uapi/asm/kvmi.h +++ b/arch/x86/include/uapi/asm/kvmi.h @@ -8,6 +8,8 @@ #include <asm/kvm.h> +#define KVM_HC_XEN_HVM_OP_GUEST_REQUEST_VM_EVENT 24 + struct kvmi_event_arch { __u8 mode; /* 2, 4 or 8 */ __u8 padding[7]; diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index 53c4a37e10c6..45f1a45d5c0f 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -210,3 +210,35 @@ int kvmi_arch_cmd_vcpu_get_cpuid(struct kvm_vcpu *vcpu, return 0; } + +bool kvmi_arch_is_agent_hypercall(struct kvm_vcpu *vcpu) +{ + unsigned long subfunc1, subfunc2; + bool longmode = is_64_bit_mode(vcpu); + + if (longmode) { + subfunc1 = kvm_rdi_read(vcpu); + subfunc2 = kvm_rsi_read(vcpu); + } else { + subfunc1 = kvm_rbx_read(vcpu); + subfunc1 &= 0xFFFFFFFF; + subfunc2 = kvm_rcx_read(vcpu); + subfunc2 &= 0xFFFFFFFF; + } + + return (subfunc1 == KVM_HC_XEN_HVM_OP_GUEST_REQUEST_VM_EVENT + && subfunc2 == 0); +} + +void kvmi_arch_hypercall_event(struct kvm_vcpu *vcpu) +{ + u32 action; + + action = kvmi_msg_send_hypercall(vcpu); + switch (action) { + case KVMI_EVENT_ACTION_CONTINUE: + break; + default: + kvmi_handle_common_event_actions(vcpu->kvm, action); + } +} diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 7f56e2149f18..0d5ce07c4164 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7681,11 +7681,14 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) { unsigned long nr, a0, a1, a2, a3, ret; int op_64_bit; - - if (kvm_hv_hypercall_enabled(vcpu->kvm)) - return kvm_hv_hypercall(vcpu); + bool kvmi_hc; nr = kvm_rax_read(vcpu); + kvmi_hc = (u32)nr == KVM_HC_XEN_HVM_OP; + + if (kvm_hv_hypercall_enabled(vcpu->kvm) && !kvmi_hc) + return kvm_hv_hypercall(vcpu); + a0 = kvm_rbx_read(vcpu); a1 = kvm_rcx_read(vcpu); a2 = kvm_rdx_read(vcpu); @@ -7702,7 +7705,7 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) a3 &= 0xFFFFFFFF; } - if (kvm_x86_ops.get_cpl(vcpu) != 0) { + if (kvm_x86_ops.get_cpl(vcpu) != 0 && !kvmi_hc) { ret = -KVM_EPERM; goto out; } @@ -7728,6 +7731,13 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) kvm_sched_yield(vcpu->kvm, a0); ret = 0; break; +#ifdef CONFIG_KVM_INTROSPECTION + case KVM_HC_XEN_HVM_OP: + ret = 0; + if (!kvmi_hypercall_event(vcpu)) + ret = -KVM_ENOSYS; + break; +#endif /* CONFIG_KVM_INTROSPECTION */ default: ret = -KVM_ENOSYS; break; diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index 857b75a2664a..5c4b9c160019 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -72,6 +72,7 @@ int kvmi_ioctl_event(struct kvm *kvm, int kvmi_ioctl_preunhook(struct kvm *kvm); void kvmi_handle_requests(struct kvm_vcpu *vcpu); +bool kvmi_hypercall_event(struct kvm_vcpu *vcpu); #else @@ -82,6 +83,7 @@ static inline void kvmi_destroy_vm(struct kvm *kvm) { } static inline void kvmi_vcpu_uninit(struct kvm_vcpu *vcpu) { } static inline void kvmi_handle_requests(struct kvm_vcpu *vcpu) { } +static inline bool kvmi_hypercall_event(struct kvm_vcpu *vcpu) { return false; } #endif /* CONFIG_KVM_INTROSPECTION */ diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h index 3ce388249682..53cebbe22099 100644 --- a/include/uapi/linux/kvm_para.h +++ b/include/uapi/linux/kvm_para.h @@ -33,6 +33,7 @@ #define KVM_HC_CLOCK_PAIRING 9 #define KVM_HC_SEND_IPI 10 #define KVM_HC_SCHED_YIELD 11 +#define KVM_HC_XEN_HVM_OP 34 /* Xen's __HYPERVISOR_hvm_op */ /* * hypercalls use architecture specific diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index d7f4360e609e..7d1febd671d7 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -38,6 +38,7 @@ enum { enum { KVMI_EVENT_UNHOOK = 0, KVMI_EVENT_PAUSE_VCPU = 1, + KVMI_EVENT_HYPERCALL = 2, KVMI_NUM_EVENTS }; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 7269afd4c36d..3a76842e48c6 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -49,6 +49,7 @@ struct vcpu_worker_data { enum { GUEST_TEST_NOOP = 0, + GUEST_TEST_HYPERCALL, }; #define GUEST_REQUEST_TEST() GUEST_SYNC(0) @@ -62,12 +63,23 @@ static int guest_test_id(void) return READ_ONCE(test_id); } +static void guest_hypercall_test(void) +{ + asm volatile("mov $34, %rax"); + asm volatile("mov $24, %rdi"); + asm volatile("mov $0, %rsi"); + asm volatile(".byte 0x0f,0x01,0xc1"); +} + static void guest_code(void) { while (true) { switch (guest_test_id()) { case GUEST_TEST_NOOP: break; + case GUEST_TEST_HYPERCALL: + guest_hypercall_test(); + break; } GUEST_SIGNAL_TEST_DONE(); } @@ -1066,6 +1078,35 @@ static void test_cmd_vcpu_get_cpuid(struct kvm_vm *vm) function, index, rpl.eax, rpl.ebx, rpl.ecx, rpl.edx); } +static void test_event_hypercall(struct kvm_vm *vm) +{ + struct vcpu_worker_data data = { + .vm = vm, + .vcpu_id = VCPU_ID, + .test_id = GUEST_TEST_HYPERCALL, + }; + struct kvmi_msg_hdr hdr; + struct kvmi_event ev; + struct vcpu_reply rpl = {}; + __u16 event_id = KVMI_EVENT_HYPERCALL; + pthread_t vcpu_thread; + + enable_vcpu_event(vm, event_id); + + vcpu_thread = start_vcpu_worker(&data); + + receive_event(&hdr, &ev, sizeof(ev), event_id); + + pr_info("Hypercall event, rip 0x%llx\n", ev.arch.regs.rip); + + reply_to_event(&hdr, &ev, KVMI_EVENT_ACTION_CONTINUE, + &rpl, sizeof(rpl)); + + stop_vcpu_worker(vcpu_thread, &data); + + disable_vcpu_event(vm, event_id); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -1086,6 +1127,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vcpu_get_registers(vm); test_cmd_vcpu_set_registers(vm); test_cmd_vcpu_get_cpuid(vm); + test_event_hypercall(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index 2bffe9ee5b69..571b79c52353 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -97,6 +97,7 @@ static void setup_known_events(void) set_bit(KVMI_EVENT_UNHOOK, Kvmi_known_vm_events); bitmap_zero(Kvmi_known_vcpu_events, KVMI_NUM_EVENTS); + set_bit(KVMI_EVENT_HYPERCALL, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_PAUSE_VCPU, Kvmi_known_vcpu_events); bitmap_or(Kvmi_known_events, Kvmi_known_vm_events, @@ -885,3 +886,25 @@ void kvmi_post_reply(struct kvm_vcpu *vcpu) vcpui->have_delayed_regs = false; } } + +bool kvmi_hypercall_event(struct kvm_vcpu *vcpu) +{ + struct kvm_introspection *kvmi; + bool ret = false; + + if (!kvmi_arch_is_agent_hypercall(vcpu)) + return ret; + + kvmi = kvmi_get(vcpu->kvm); + if (!kvmi) + return ret; + + if (is_event_enabled(vcpu, KVMI_EVENT_HYPERCALL)) { + kvmi_arch_hypercall_event(vcpu); + ret = true; + } + + kvmi_put(vcpu->kvm); + + return ret; +} diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index bc8b5c03b057..8c02528e3056 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -21,6 +21,11 @@ #define KVMI(kvm) ((kvm)->kvmi) #define VCPUI(vcpu) ((vcpu)->kvmi) +static inline bool is_event_enabled(struct kvm_vcpu *vcpu, int event) +{ + return test_bit(event, VCPUI(vcpu)->ev_enable_mask); +} + /* kvmi_msg.c */ bool kvmi_sock_get(struct kvm_introspection *kvmi, int fd); void kvmi_sock_shutdown(struct kvm_introspection *kvmi); @@ -28,6 +33,7 @@ void kvmi_sock_put(struct kvm_introspection *kvmi); bool kvmi_msg_process(struct kvm_introspection *kvmi); int kvmi_msg_send_unhook(struct kvm_introspection *kvmi); u32 kvmi_msg_send_vcpu_pause(struct kvm_vcpu *vcpu); +u32 kvmi_msg_send_hypercall(struct kvm_vcpu *vcpu); /* kvmi.c */ void *kvmi_msg_alloc(void); @@ -41,6 +47,7 @@ int kvmi_add_job(struct kvm_vcpu *vcpu, void *ctx, void (*free_fct)(void *ctx)); void kvmi_run_jobs(struct kvm_vcpu *vcpu); void kvmi_post_reply(struct kvm_vcpu *vcpu); +void kvmi_handle_common_event_actions(struct kvm *kvm, u32 action); int kvmi_cmd_vm_control_events(struct kvm_introspection *kvmi, unsigned int event_id, bool enable); int kvmi_cmd_vcpu_control_events(struct kvm_vcpu *vcpu, @@ -70,5 +77,7 @@ int kvmi_arch_cmd_vcpu_get_registers(struct kvm_vcpu *vcpu, int kvmi_arch_cmd_vcpu_get_cpuid(struct kvm_vcpu *vcpu, const struct kvmi_vcpu_get_cpuid *req, struct kvmi_vcpu_get_cpuid_reply *rpl); +bool kvmi_arch_is_agent_hypercall(struct kvm_vcpu *vcpu); +void kvmi_arch_hypercall_event(struct kvm_vcpu *vcpu); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 61c96a24a730..80ade6055747 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -793,3 +793,15 @@ u32 kvmi_msg_send_vcpu_pause(struct kvm_vcpu *vcpu) return action; } + +u32 kvmi_msg_send_hypercall(struct kvm_vcpu *vcpu) +{ + int err, action; + + err = kvmi_send_event(vcpu, KVMI_EVENT_HYPERCALL, NULL, 0, + NULL, 0, &action); + if (err) + return KVMI_EVENT_ACTION_CONTINUE; + + return action; +}
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 60/84] KVM: introspection: add KVMI_EVENT_BREAKPOINT
From: Mihai Don?u <mdontu at bitdefender.com> This event is sent when a breakpoint was reached. The introspection tool can place breakpoints and use them as notification for when the OS or an application has reached a certain state or is trying to perform a certain operation (eg. create a process). Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 40 ++++++++++++++++ arch/x86/include/uapi/asm/kvmi.h | 6 +++ arch/x86/kvm/kvmi.c | 48 +++++++++++++++++++ arch/x86/kvm/svm/svm.c | 34 +++++++++++++ arch/x86/kvm/vmx/vmx.c | 17 +++++-- include/linux/kvmi_host.h | 4 ++ include/uapi/linux/kvmi.h | 1 + .../testing/selftests/kvm/x86_64/kvmi_test.c | 46 ++++++++++++++++++ virt/kvm/introspection/kvmi.c | 23 ++++++++- virt/kvm/introspection/kvmi_int.h | 4 ++ virt/kvm/introspection/kvmi_msg.c | 17 +++++++ 11 files changed, 235 insertions(+), 5 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index d062f2ccf365..110a6e7a7d2a 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -549,6 +549,7 @@ command) before returning to guest. Enables/disables vCPU introspection events. This command can be used with the following events:: + KVMI_EVENT_BREAKPOINT KVMI_EVENT_HYPERCALL When an event is enabled, the introspection tool is notified and @@ -570,6 +571,9 @@ the *KVMI_VM_CONTROL_EVENTS* command. * -KVM_EINVAL - the event ID is unknown (use *KVMI_VM_CHECK_EVENT* first) * -KVM_EPERM - the access is disallowed (use *KVMI_VM_CHECK_EVENT* first) * -KVM_EAGAIN - the selected vCPU can't be introspected yet +* -KVM_EBUSY - the event can't be intercepted right now + (e.g. KVMI_EVENT_BREAKPOINT if the #BP event is already intercepted + by userspace) 11. KVMI_VCPU_GET_REGISTERS --------------------------- @@ -820,3 +824,39 @@ It is used by the code residing inside the introspected guest to call the introspection tool and to report certain details about its operation. For example, a classic antimalware remediation tool can report what it has found during a scan. + +4. KVMI_EVENT_BREAKPOINT +------------------------ + +:Architectures: x86 +:Versions: >= 1 +:Actions: CONTINUE, CRASH, RETRY +:Parameters: + +:: + + struct kvmi_event; + struct kvmi_event_breakpoint { + __u64 gpa; + __u8 insn_len; + __u8 padding[7]; + }; + +:Returns: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_event_reply; + +This event is sent when a breakpoint was reached and the introspection has +been enabled for this event (see *KVMI_VCPU_CONTROL_EVENTS*). + +Some of these breakpoints could have been injected by the introspection tool, +placed in the slack space of various functions and used as notification +for when the OS or an application has reached a certain state or is +trying to perform a certain operation (like creating a process). + +``kvmi_event`` and the guest physical address are sent to the introspection tool. + +The *RETRY* action is used by the introspection tool for its own breakpoints. diff --git a/arch/x86/include/uapi/asm/kvmi.h b/arch/x86/include/uapi/asm/kvmi.h index 9882e68cab75..1605777256a3 100644 --- a/arch/x86/include/uapi/asm/kvmi.h +++ b/arch/x86/include/uapi/asm/kvmi.h @@ -59,4 +59,10 @@ struct kvmi_vcpu_get_cpuid_reply { __u32 edx; }; +struct kvmi_event_breakpoint { + __u64 gpa; + __u8 insn_len; + __u8 padding[7]; +}; + #endif /* _UAPI_ASM_X86_KVMI_H */ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index 45f1a45d5c0f..f13272350bc9 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -242,3 +242,51 @@ void kvmi_arch_hypercall_event(struct kvm_vcpu *vcpu) kvmi_handle_common_event_actions(vcpu->kvm, action); } } + +static int kvmi_control_bp_intercept(struct kvm_vcpu *vcpu, bool enable) +{ + struct kvm_guest_debug dbg = {}; + int err = 0; + + if (enable) + dbg.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP; + err = kvm_arch_vcpu_set_guest_debug(vcpu, &dbg); + + return err; +} + +int kvmi_arch_cmd_control_intercept(struct kvm_vcpu *vcpu, + unsigned int event_id, bool enable) +{ + int err = 0; + + switch (event_id) { + case KVMI_EVENT_BREAKPOINT: + err = kvmi_control_bp_intercept(vcpu, enable); + break; + default: + break; + } + + return err; +} + +void kvmi_arch_breakpoint_event(struct kvm_vcpu *vcpu, u64 gva, u8 insn_len) +{ + u32 action; + u64 gpa; + + gpa = kvm_mmu_gva_to_gpa_system(vcpu, gva, 0, NULL); + + action = kvmi_msg_send_bp(vcpu, gpa, insn_len); + switch (action) { + case KVMI_EVENT_ACTION_CONTINUE: + kvm_queue_exception(vcpu, BP_VECTOR); + break; + case KVMI_EVENT_ACTION_RETRY: + /* rip was most likely adjusted past the INT 3 instruction */ + break; + default: + kvmi_handle_common_event_actions(vcpu->kvm, action); + } +} diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 580997701b1c..47c50e2f0f86 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -25,6 +25,7 @@ #include <linux/pagemap.h> #include <linux/swap.h> #include <linux/rwsem.h> +#include <linux/kvmi_host.h> #include <asm/apic.h> #include <asm/perf_event.h> @@ -1841,10 +1842,43 @@ static int db_interception(struct vcpu_svm *svm) return 1; } +static unsigned svm_get_instruction_len(struct kvm_vcpu *vcpu) +{ + struct vcpu_svm *svm = to_svm(vcpu); + unsigned long rip = kvm_rip_read(vcpu); + unsigned long next_rip = 0; + unsigned insn_len; + + if (static_cpu_has(X86_FEATURE_NRIPS)) + next_rip = svm->vmcb->control.next_rip; + + if (!next_rip) { + if (!kvm_emulate_instruction(vcpu, EMULTYPE_SKIP)) + return 0; + + next_rip = kvm_rip_read(vcpu); + kvm_rip_write(vcpu, rip); + } + + insn_len = next_rip - rip; + if (insn_len > MAX_INST_SIZE) { + pr_err("%s: ip 0x%lx next 0x%lx\n", + __func__, rip, next_rip); + return 0; + } + + return insn_len; +} + static int bp_interception(struct vcpu_svm *svm) { struct kvm_run *kvm_run = svm->vcpu.run; + if (!kvmi_breakpoint_event(&svm->vcpu, svm->vmcb->save.cs.base + + svm->vmcb->save.rip, + svm_get_instruction_len(&svm->vcpu))) + return 1; + kvm_run->exit_reason = KVM_EXIT_DEBUG; kvm_run->debug.arch.pc = svm->vmcb->save.cs.base + svm->vmcb->save.rip; kvm_run->debug.arch.exception = BP_VECTOR; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 4ef4f3c1b78a..9e1bea74b74c 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -27,6 +27,7 @@ #include <linux/slab.h> #include <linux/tboot.h> #include <linux/trace_events.h> +#include <linux/kvmi_host.h> #include <asm/apic.h> #include <asm/asm.h> @@ -4798,7 +4799,7 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu) struct vcpu_vmx *vmx = to_vmx(vcpu); struct kvm_run *kvm_run = vcpu->run; u32 intr_info, ex_no, error_code; - unsigned long cr2, rip, dr6; + unsigned long cr2, dr6; u32 vect_info; vect_info = vmx->idt_vectoring_info; @@ -4871,7 +4872,10 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu) kvm_run->debug.arch.dr6 = dr6 | DR6_FIXED_1 | DR6_RTM; kvm_run->debug.arch.dr7 = vmcs_readl(GUEST_DR7); /* fall through */ - case BP_VECTOR: + case BP_VECTOR: { + unsigned long gva = vmcs_readl(GUEST_CS_BASE) + + kvm_rip_read(vcpu); + /* * Update instruction length as we may reinject #BP from * user space while in guest debugging mode. Reading it for @@ -4879,11 +4883,16 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu) */ vmx->vcpu.arch.event_exit_inst_len vmcs_read32(VM_EXIT_INSTRUCTION_LEN); + + if (!kvmi_breakpoint_event(vcpu, gva, + vmx->vcpu.arch.event_exit_inst_len)) + return 1; + kvm_run->exit_reason = KVM_EXIT_DEBUG; - rip = kvm_rip_read(vcpu); - kvm_run->debug.arch.pc = vmcs_readl(GUEST_CS_BASE) + rip; + kvm_run->debug.arch.pc = gva; kvm_run->debug.arch.exception = ex_no; break; + } case AC_VECTOR: if (guest_inject_ac(vcpu)) { kvm_queue_exception_e(vcpu, AC_VECTOR, error_code); diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index 5c4b9c160019..c4fac41bd5c7 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -73,6 +73,7 @@ int kvmi_ioctl_preunhook(struct kvm *kvm); void kvmi_handle_requests(struct kvm_vcpu *vcpu); bool kvmi_hypercall_event(struct kvm_vcpu *vcpu); +bool kvmi_breakpoint_event(struct kvm_vcpu *vcpu, u64 gva, u8 insn_len); #else @@ -84,6 +85,9 @@ static inline void kvmi_vcpu_uninit(struct kvm_vcpu *vcpu) { } static inline void kvmi_handle_requests(struct kvm_vcpu *vcpu) { } static inline bool kvmi_hypercall_event(struct kvm_vcpu *vcpu) { return false; } +static inline bool kvmi_breakpoint_event(struct kvm_vcpu *vcpu, u64 gva, + u8 insn_len) + { return true; } #endif /* CONFIG_KVM_INTROSPECTION */ diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 7d1febd671d7..026ae5911b1c 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -39,6 +39,7 @@ enum { KVMI_EVENT_UNHOOK = 0, KVMI_EVENT_PAUSE_VCPU = 1, KVMI_EVENT_HYPERCALL = 2, + KVMI_EVENT_BREAKPOINT = 3, KVMI_NUM_EVENTS }; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 3a76842e48c6..3b921e3cf958 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -49,6 +49,7 @@ struct vcpu_worker_data { enum { GUEST_TEST_NOOP = 0, + GUEST_TEST_BP, GUEST_TEST_HYPERCALL, }; @@ -63,6 +64,11 @@ static int guest_test_id(void) return READ_ONCE(test_id); } +static void guest_bp_test(void) +{ + asm volatile("int3"); +} + static void guest_hypercall_test(void) { asm volatile("mov $34, %rax"); @@ -77,6 +83,9 @@ static void guest_code(void) switch (guest_test_id()) { case GUEST_TEST_NOOP: break; + case GUEST_TEST_BP: + guest_bp_test(); + break; case GUEST_TEST_HYPERCALL: guest_hypercall_test(); break; @@ -1107,6 +1116,42 @@ static void test_event_hypercall(struct kvm_vm *vm) disable_vcpu_event(vm, event_id); } +static void test_event_breakpoint(struct kvm_vm *vm) +{ + struct vcpu_worker_data data = { + .vm = vm, + .vcpu_id = VCPU_ID, + .test_id = GUEST_TEST_BP, + }; + struct kvmi_msg_hdr hdr; + struct { + struct kvmi_event common; + struct kvmi_event_breakpoint bp; + } ev; + struct vcpu_reply rpl = {}; + __u16 event_id = KVMI_EVENT_BREAKPOINT; + pthread_t vcpu_thread; + + enable_vcpu_event(vm, event_id); + + vcpu_thread = start_vcpu_worker(&data); + + receive_event(&hdr, &ev.common, sizeof(ev), event_id); + + pr_info("Breakpoint event, rip 0x%llx, len %u\n", + ev.common.arch.regs.rip, ev.bp.insn_len); + + ev.common.arch.regs.rip += ev.bp.insn_len; + __set_registers(vm, &ev.common.arch.regs); + + reply_to_event(&hdr, &ev.common, KVMI_EVENT_ACTION_RETRY, + &rpl, sizeof(rpl)); + + stop_vcpu_worker(vcpu_thread, &data); + + disable_vcpu_event(vm, event_id); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -1128,6 +1173,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vcpu_set_registers(vm); test_cmd_vcpu_get_cpuid(vm); test_event_hypercall(vm); + test_event_breakpoint(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index 571b79c52353..a5264696c630 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -97,6 +97,7 @@ static void setup_known_events(void) set_bit(KVMI_EVENT_UNHOOK, Kvmi_known_vm_events); bitmap_zero(Kvmi_known_vcpu_events, KVMI_NUM_EVENTS); + set_bit(KVMI_EVENT_BREAKPOINT, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_HYPERCALL, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_PAUSE_VCPU, Kvmi_known_vcpu_events); @@ -637,7 +638,7 @@ int kvmi_cmd_vcpu_control_events(struct kvm_vcpu *vcpu, else clear_bit(event_id, vcpui->ev_enable_mask); - return 0; + return kvmi_arch_cmd_control_intercept(vcpu, event_id, enable); } static unsigned long gfn_to_hva_safe(struct kvm *kvm, gfn_t gfn) @@ -908,3 +909,23 @@ bool kvmi_hypercall_event(struct kvm_vcpu *vcpu) return ret; } + +bool kvmi_breakpoint_event(struct kvm_vcpu *vcpu, u64 gva, u8 insn_len) +{ + struct kvm_introspection *kvmi; + bool ret = false; + + kvmi = kvmi_get(vcpu->kvm); + if (!kvmi) + return true; + + if (is_event_enabled(vcpu, KVMI_EVENT_BREAKPOINT)) + kvmi_arch_breakpoint_event(vcpu, gva, insn_len); + else + ret = true; + + kvmi_put(vcpu->kvm); + + return ret; +} +EXPORT_SYMBOL(kvmi_breakpoint_event); diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 8c02528e3056..810dde913ad6 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -34,6 +34,7 @@ bool kvmi_msg_process(struct kvm_introspection *kvmi); int kvmi_msg_send_unhook(struct kvm_introspection *kvmi); u32 kvmi_msg_send_vcpu_pause(struct kvm_vcpu *vcpu); u32 kvmi_msg_send_hypercall(struct kvm_vcpu *vcpu); +u32 kvmi_msg_send_bp(struct kvm_vcpu *vcpu, u64 gpa, u8 insn_len); /* kvmi.c */ void *kvmi_msg_alloc(void); @@ -79,5 +80,8 @@ int kvmi_arch_cmd_vcpu_get_cpuid(struct kvm_vcpu *vcpu, struct kvmi_vcpu_get_cpuid_reply *rpl); bool kvmi_arch_is_agent_hypercall(struct kvm_vcpu *vcpu); void kvmi_arch_hypercall_event(struct kvm_vcpu *vcpu); +void kvmi_arch_breakpoint_event(struct kvm_vcpu *vcpu, u64 gva, u8 insn_len); +int kvmi_arch_cmd_control_intercept(struct kvm_vcpu *vcpu, + unsigned int event_id, bool enable); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 80ade6055747..4a03980e0bbb 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -805,3 +805,20 @@ u32 kvmi_msg_send_hypercall(struct kvm_vcpu *vcpu) return action; } + +u32 kvmi_msg_send_bp(struct kvm_vcpu *vcpu, u64 gpa, u8 insn_len) +{ + struct kvmi_event_breakpoint e; + int err, action; + + memset(&e, 0, sizeof(e)); + e.gpa = gpa; + e.insn_len = insn_len; + + err = kvmi_send_event(vcpu, KVMI_EVENT_BREAKPOINT, &e, sizeof(e), + NULL, 0, &action); + if (err) + return KVMI_EVENT_ACTION_CONTINUE; + + return action; +}
Adalbert Lazăr
2020-Jul-21 21:08 UTC
[PATCH v9 61/84] KVM: introspection: add cleanup support for vCPUs
From: Nicu?or C??u <ncitu at bitdefender.com> On unhook the introspection channel is closed. This will signal the receiving thread to call kvmi_put() and exit. There might be vCPU threads handling introspection commands or waiting for event replies. These will also call kvmi_put() and re-enter in guest. Once the reference counter reaches zero, the structures keeping the introspection data will be freed. In order to restore the interception of CRs, MSRs, BP, descriptor-table registers, from all vCPUs (some of which might run from userspace), we keep the needed information in another structure (kvmi_interception) which will be used and freed by each of them before re-entering in guest. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvm_host.h | 3 ++ arch/x86/include/asm/kvmi_host.h | 4 +++ arch/x86/kvm/kvmi.c | 49 +++++++++++++++++++++++++++++++ virt/kvm/introspection/kvmi.c | 32 ++++++++++++++++++-- virt/kvm/introspection/kvmi_int.h | 5 ++++ 5 files changed, 90 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 8a119fb7c623..acfcebce51dd 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -840,6 +840,9 @@ struct kvm_vcpu_arch { /* #PF translated error code from EPT/NPT exit reason */ u64 error_code; + + /* Control the interception for KVM Introspection */ + struct kvmi_interception *kvmi; }; struct kvm_lpage_info { diff --git a/arch/x86/include/asm/kvmi_host.h b/arch/x86/include/asm/kvmi_host.h index 05ade3a16b24..6d274f173fb5 100644 --- a/arch/x86/include/asm/kvmi_host.h +++ b/arch/x86/include/asm/kvmi_host.h @@ -4,6 +4,10 @@ #include <asm/kvmi.h> +struct kvmi_interception { + bool restore_interception; +}; + struct kvm_vcpu_arch_introspection { }; diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index f13272350bc9..ca2ce7498cfe 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -290,3 +290,52 @@ void kvmi_arch_breakpoint_event(struct kvm_vcpu *vcpu, u64 gva, u8 insn_len) kvmi_handle_common_event_actions(vcpu->kvm, action); } } + +static void kvmi_arch_restore_interception(struct kvm_vcpu *vcpu) +{ +} + +bool kvmi_arch_clean_up_interception(struct kvm_vcpu *vcpu) +{ + struct kvmi_interception *arch_vcpui = vcpu->arch.kvmi; + + if (!arch_vcpui) + return false; + + if (!arch_vcpui->restore_interception) + return false; + + kvmi_arch_restore_interception(vcpu); + + return true; +} + +bool kvmi_arch_vcpu_alloc_interception(struct kvm_vcpu *vcpu) +{ + struct kvmi_interception *arch_vcpui; + + arch_vcpui = kzalloc(sizeof(*arch_vcpui), GFP_KERNEL); + if (!arch_vcpui) + return false; + + return true; +} + +void kvmi_arch_vcpu_free_interception(struct kvm_vcpu *vcpu) +{ + kfree(vcpu->arch.kvmi); + WRITE_ONCE(vcpu->arch.kvmi, NULL); +} + +bool kvmi_arch_vcpu_introspected(struct kvm_vcpu *vcpu) +{ + return !!READ_ONCE(vcpu->arch.kvmi); +} + +void kvmi_arch_request_interception_cleanup(struct kvm_vcpu *vcpu) +{ + struct kvmi_interception *arch_vcpui = READ_ONCE(vcpu->arch.kvmi); + + if (arch_vcpui) + arch_vcpui->restore_interception = true; +} diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index a5264696c630..083dd8be9252 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -197,7 +197,7 @@ static bool alloc_vcpui(struct kvm_vcpu *vcpu) vcpu->kvmi = vcpui; - return true; + return kvmi_arch_vcpu_alloc_interception(vcpu); } static int create_vcpui(struct kvm_vcpu *vcpu) @@ -231,6 +231,9 @@ static void free_vcpui(struct kvm_vcpu *vcpu) kfree(vcpui); vcpu->kvmi = NULL; + + kvmi_arch_request_interception_cleanup(vcpu); + kvmi_make_request(vcpu, false); } static void free_kvmi(struct kvm *kvm) @@ -253,6 +256,7 @@ void kvmi_vcpu_uninit(struct kvm_vcpu *vcpu) { mutex_lock(&vcpu->kvm->kvmi_lock); free_vcpui(vcpu); + kvmi_arch_vcpu_free_interception(vcpu); mutex_unlock(&vcpu->kvm->kvmi_lock); } @@ -404,6 +408,21 @@ static int kvmi_recv_thread(void *arg) return 0; } +static bool ready_to_hook(struct kvm *kvm) +{ + struct kvm_vcpu *vcpu; + int i; + + if (kvm->kvmi) + return false; + + kvm_for_each_vcpu(i, vcpu, kvm) + if (kvmi_arch_vcpu_introspected(vcpu)) + return false; + + return true; +} + int kvmi_hook(struct kvm *kvm, const struct kvm_introspection_hook *hook) { struct kvm_introspection *kvmi; @@ -411,7 +430,7 @@ int kvmi_hook(struct kvm *kvm, const struct kvm_introspection_hook *hook) mutex_lock(&kvm->kvmi_lock); - if (kvm->kvmi) { + if (!ready_to_hook(kvm)) { err = -EEXIST; goto out; } @@ -836,7 +855,7 @@ void kvmi_handle_requests(struct kvm_vcpu *vcpu) kvmi = kvmi_get(vcpu->kvm); if (!kvmi) - return; + goto out; for (;;) { kvmi_run_jobs(vcpu); @@ -848,6 +867,13 @@ void kvmi_handle_requests(struct kvm_vcpu *vcpu) } kvmi_put(vcpu->kvm); + +out: + if (kvmi_arch_clean_up_interception(vcpu)) { + mutex_lock(&vcpu->kvm->kvmi_lock); + kvmi_arch_vcpu_free_interception(vcpu); + mutex_unlock(&vcpu->kvm->kvmi_lock); + } } int kvmi_cmd_vcpu_pause(struct kvm_vcpu *vcpu, bool wait) diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 810dde913ad6..05bfde7d7f1a 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -65,6 +65,11 @@ int kvmi_cmd_vcpu_set_registers(struct kvm_vcpu *vcpu, const struct kvm_regs *regs); /* arch */ +bool kvmi_arch_vcpu_alloc_interception(struct kvm_vcpu *vcpu); +void kvmi_arch_vcpu_free_interception(struct kvm_vcpu *vcpu); +bool kvmi_arch_vcpu_introspected(struct kvm_vcpu *vcpu); +void kvmi_arch_request_interception_cleanup(struct kvm_vcpu *vcpu); +bool kvmi_arch_clean_up_interception(struct kvm_vcpu *vcpu); int kvmi_arch_cmd_vcpu_get_info(struct kvm_vcpu *vcpu, struct kvmi_vcpu_get_info_reply *rpl); void kvmi_arch_setup_event(struct kvm_vcpu *vcpu, struct kvmi_event *ev);
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 62/84] KVM: introspection: restore the state of #BP interception on unhook
From: Nicu?or C??u <ncitu at bitdefender.com> This commit also ensures that only the userspace or the introspection tool can control the #BP interception exclusively at one time. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvmi_host.h | 18 ++++++ arch/x86/kvm/kvmi.c | 60 +++++++++++++++++++ arch/x86/kvm/x86.c | 5 ++ .../testing/selftests/kvm/x86_64/kvmi_test.c | 16 +++++ 4 files changed, 99 insertions(+) diff --git a/arch/x86/include/asm/kvmi_host.h b/arch/x86/include/asm/kvmi_host.h index 6d274f173fb5..5f2a968831d3 100644 --- a/arch/x86/include/asm/kvmi_host.h +++ b/arch/x86/include/asm/kvmi_host.h @@ -4,8 +4,15 @@ #include <asm/kvmi.h> +struct kvmi_monitor_interception { + bool kvmi_intercepted; + bool kvm_intercepted; + bool (*monitor_fct)(struct kvm_vcpu *vcpu, bool enable); +}; + struct kvmi_interception { bool restore_interception; + struct kvmi_monitor_interception breakpoint; }; struct kvm_vcpu_arch_introspection { @@ -14,4 +21,15 @@ struct kvm_vcpu_arch_introspection { struct kvm_arch_introspection { }; +#ifdef CONFIG_KVM_INTROSPECTION + +bool kvmi_monitor_bp_intercept(struct kvm_vcpu *vcpu, u32 dbg); + +#else /* CONFIG_KVM_INTROSPECTION */ + +static inline bool kvmi_monitor_bp_intercept(struct kvm_vcpu *vcpu, u32 dbg) + { return false; } + +#endif /* CONFIG_KVM_INTROSPECTION */ + #endif /* _ASM_X86_KVMI_HOST_H */ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index ca2ce7498cfe..56c02dad3b57 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -243,18 +243,71 @@ void kvmi_arch_hypercall_event(struct kvm_vcpu *vcpu) } } +/* + * Returns true if one side (kvm or kvmi) tries to enable/disable the breakpoint + * interception while the other side is still tracking it. + */ +bool kvmi_monitor_bp_intercept(struct kvm_vcpu *vcpu, u32 dbg) +{ + struct kvmi_interception *arch_vcpui = READ_ONCE(vcpu->arch.kvmi); + u32 bp_mask = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP; + bool enable = false; + + if ((dbg & bp_mask) == bp_mask) + enable = true; + + return (arch_vcpui && arch_vcpui->breakpoint.monitor_fct(vcpu, enable)); +} +EXPORT_SYMBOL(kvmi_monitor_bp_intercept); + +static bool monitor_bp_fct_kvmi(struct kvm_vcpu *vcpu, bool enable) +{ + if (enable) { + if (kvm_x86_ops.bp_intercepted(vcpu)) + return true; + } else if (!vcpu->arch.kvmi->breakpoint.kvmi_intercepted) + return true; + + vcpu->arch.kvmi->breakpoint.kvmi_intercepted = enable; + + return false; +} + +static bool monitor_bp_fct_kvm(struct kvm_vcpu *vcpu, bool enable) +{ + if (enable) { + if (kvm_x86_ops.bp_intercepted(vcpu)) + return true; + } else if (!vcpu->arch.kvmi->breakpoint.kvm_intercepted) + return true; + + vcpu->arch.kvmi->breakpoint.kvm_intercepted = enable; + + return false; +} + static int kvmi_control_bp_intercept(struct kvm_vcpu *vcpu, bool enable) { struct kvm_guest_debug dbg = {}; int err = 0; + vcpu->arch.kvmi->breakpoint.monitor_fct = monitor_bp_fct_kvmi; if (enable) dbg.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP; err = kvm_arch_vcpu_set_guest_debug(vcpu, &dbg); + vcpu->arch.kvmi->breakpoint.monitor_fct = monitor_bp_fct_kvm; return err; } +static void kvmi_arch_disable_bp_intercept(struct kvm_vcpu *vcpu) +{ + kvmi_control_bp_intercept(vcpu, false); + + vcpu->arch.kvmi->breakpoint.kvmi_intercepted = false; + vcpu->arch.kvmi->breakpoint.kvm_intercepted = false; +} + int kvmi_arch_cmd_control_intercept(struct kvm_vcpu *vcpu, unsigned int event_id, bool enable) { @@ -293,6 +346,7 @@ void kvmi_arch_breakpoint_event(struct kvm_vcpu *vcpu, u64 gva, u8 insn_len) static void kvmi_arch_restore_interception(struct kvm_vcpu *vcpu) { + kvmi_arch_disable_bp_intercept(vcpu); } bool kvmi_arch_clean_up_interception(struct kvm_vcpu *vcpu) @@ -318,6 +372,12 @@ bool kvmi_arch_vcpu_alloc_interception(struct kvm_vcpu *vcpu) if (!arch_vcpui) return false; + arch_vcpui->breakpoint.monitor_fct = monitor_bp_fct_kvm; + + /* pair with kvmi_monitor_bp_intercept() */ + smp_wmb(); + WRITE_ONCE(vcpu->arch.kvmi, arch_vcpui); + return true; } diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 0d5ce07c4164..9c8b7a3c5758 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9325,6 +9325,11 @@ int kvm_arch_vcpu_set_guest_debug(struct kvm_vcpu *vcpu, kvm_queue_exception(vcpu, BP_VECTOR); } + if (kvmi_monitor_bp_intercept(vcpu, dbg->control)) { + r = -EBUSY; + goto out; + } + /* * Read rflags as long as potentially injected trace flags are still * filtered out. diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 3b921e3cf958..1418e31918be 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -58,6 +58,10 @@ enum { #define HOST_SEND_TEST(uc) (uc.cmd == UCALL_SYNC && uc.args[1] == 0) +static pthread_t start_vcpu_worker(struct vcpu_worker_data *data); +static void stop_vcpu_worker(pthread_t vcpu_thread, + struct vcpu_worker_data *data); + static int guest_test_id(void) { GUEST_REQUEST_TEST(); @@ -171,8 +175,10 @@ static void allow_command(struct kvm_vm *vm, __s32 id) static void hook_introspection(struct kvm_vm *vm) { + struct vcpu_worker_data data = {.vm = vm, .vcpu_id = VCPU_ID }; __u32 allow = 1, disallow = 0, allow_inval = 2; __u32 padding = 1, no_padding = 0; + pthread_t vcpu_thread; __s32 all_IDs = -1; set_command_perm(vm, all_IDs, allow, EFAULT); @@ -180,6 +186,16 @@ static void hook_introspection(struct kvm_vm *vm) do_hook_ioctl(vm, Kvm_socket, padding, EINVAL); do_hook_ioctl(vm, -1, no_padding, EINVAL); + + /* + * The last call failed "too late". + * We have to let the vCPU run and clean up its structures, + * otherwise the next call will fail with EEXIST. + */ + vcpu_thread = start_vcpu_worker(&data); + sleep(1); + stop_vcpu_worker(vcpu_thread, &data); + do_hook_ioctl(vm, Kvm_socket, no_padding, 0); do_hook_ioctl(vm, Kvm_socket, no_padding, EEXIST);
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 63/84] KVM: introspection: add KVMI_VM_CONTROL_CLEANUP
This command will allow more control over the guest state on unhook. However, the memory restrictions (e.g. those set with KVMI_VM_SET_PAGE_ACCESS) will be removed on unhook. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> -- It will be more interesting if the userspace could control the cleanup behavior through the use of the KVM_INTROSPECTION_COMMAND ioctl. Now, by disallowing this command, the userspace can only keep the default behavior (to not automatically clean up). Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 30 ++++++++++++++++ arch/x86/include/asm/kvmi_host.h | 1 + arch/x86/kvm/kvmi.c | 17 +++++----- include/linux/kvmi_host.h | 2 ++ include/uapi/linux/kvmi.h | 9 +++++ .../testing/selftests/kvm/x86_64/kvmi_test.c | 34 +++++++++++++++++++ virt/kvm/introspection/kvmi.c | 14 +++++--- virt/kvm/introspection/kvmi_int.h | 4 ++- virt/kvm/introspection/kvmi_msg.c | 34 ++++++++++++++----- 9 files changed, 124 insertions(+), 21 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 110a6e7a7d2a..f760957b27f4 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -684,6 +684,36 @@ Returns a CPUID leaf (as seen by the guest OS). * -KVM_EAGAIN - the selected vCPU can't be introspected yet * -KVM_ENOENT - the selected leaf is not present or is invalid +14. KVMI_VM_CONTROL_CLEANUP +--------------------------- +:Architectures: all +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vm_control_cleanup { + __u8 enable; + __u8 padding1; + __u16 padding2; + __u32 padding3; + }; + +:Returns: + +:: + + struct kvmi_error_code + +Enables/disables the automatic cleanup of the changes made by +the introspection tool at the hypervisor level (e.g. CR/MSR/BP +interceptions). By default it is disabled. + +:Errors: + +* -KVM_EINVAL - the padding is not zero +* -KVM_EINVAL - 'enabled' is not 1 or 0 + Events ===== diff --git a/arch/x86/include/asm/kvmi_host.h b/arch/x86/include/asm/kvmi_host.h index 5f2a968831d3..3e85ae4fe5f0 100644 --- a/arch/x86/include/asm/kvmi_host.h +++ b/arch/x86/include/asm/kvmi_host.h @@ -11,6 +11,7 @@ struct kvmi_monitor_interception { }; struct kvmi_interception { + bool cleanup; bool restore_interception; struct kvmi_monitor_interception breakpoint; }; diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index 56c02dad3b57..89fa158a6535 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -353,13 +353,11 @@ bool kvmi_arch_clean_up_interception(struct kvm_vcpu *vcpu) { struct kvmi_interception *arch_vcpui = vcpu->arch.kvmi; - if (!arch_vcpui) + if (!arch_vcpui || !arch_vcpui->cleanup) return false; - if (!arch_vcpui->restore_interception) - return false; - - kvmi_arch_restore_interception(vcpu); + if (arch_vcpui->restore_interception) + kvmi_arch_restore_interception(vcpu); return true; } @@ -392,10 +390,13 @@ bool kvmi_arch_vcpu_introspected(struct kvm_vcpu *vcpu) return !!READ_ONCE(vcpu->arch.kvmi); } -void kvmi_arch_request_interception_cleanup(struct kvm_vcpu *vcpu) +void kvmi_arch_request_interception_cleanup(struct kvm_vcpu *vcpu, + bool restore_interception) { struct kvmi_interception *arch_vcpui = READ_ONCE(vcpu->arch.kvmi); - if (arch_vcpui) - arch_vcpui->restore_interception = true; + if (arch_vcpui) { + arch_vcpui->restore_interception = restore_interception; + arch_vcpui->cleanup = true; + } } diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index c4fac41bd5c7..01219c56d042 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -53,6 +53,8 @@ struct kvm_introspection { unsigned long *vm_event_enable_mask; atomic_t ev_seq; + + bool cleanup_on_unhook; }; int kvmi_version(void); diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 026ae5911b1c..20bf5bf194a4 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -32,6 +32,8 @@ enum { KVMI_VCPU_SET_REGISTERS = 12, KVMI_VCPU_GET_CPUID = 13, + KVMI_VM_CONTROL_CLEANUP = 14, + KVMI_NUM_MESSAGES }; @@ -135,6 +137,13 @@ struct kvmi_vcpu_control_events { __u32 padding2; }; +struct kvmi_vm_control_cleanup { + __u8 enable; + __u8 padding1; + __u16 padding2; + __u32 padding3; +}; + struct kvmi_event { __u16 size; __u16 vcpu; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 1418e31918be..d3b7778a64d4 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -1168,6 +1168,39 @@ static void test_event_breakpoint(struct kvm_vm *vm) disable_vcpu_event(vm, event_id); } +static void cmd_vm_control_cleanup(__u8 enable, __u8 padding, + int expected_err) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vm_control_cleanup cmd; + } req = {}; + int r; + + req.cmd.enable = enable; + req.cmd.padding1 = padding; + req.cmd.padding2 = padding; + req.cmd.padding3 = padding; + + r = do_command(KVMI_VM_CONTROL_CLEANUP, &req.hdr, sizeof(req), + NULL, 0); + TEST_ASSERT(r == expected_err, + "KVMI_VM_CONTROL_CLEANUP failed, error %d (%s), expected error %d\n", + -r, kvm_strerror(-r), expected_err); +} + +static void test_cmd_vm_control_cleanup(struct kvm_vm *vm) +{ + __u8 disable = 0, enable = 1, enable_inval = 2; + __u16 padding = 1, no_padding = 0; + + cmd_vm_control_cleanup(enable, padding, -KVM_EINVAL); + cmd_vm_control_cleanup(enable_inval, no_padding, -KVM_EINVAL); + + cmd_vm_control_cleanup(enable, no_padding, 0); + cmd_vm_control_cleanup(disable, no_padding, 0); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -1190,6 +1223,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vcpu_get_cpuid(vm); test_event_hypercall(vm); test_event_breakpoint(vm); + test_cmd_vm_control_cleanup(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index 083dd8be9252..db1f4523cec5 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -218,7 +218,7 @@ static void free_vcpu_jobs(struct kvm_vcpu_introspection *vcpui) } } -static void free_vcpui(struct kvm_vcpu *vcpu) +static void free_vcpui(struct kvm_vcpu *vcpu, bool restore_interception) { struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); @@ -232,17 +232,18 @@ static void free_vcpui(struct kvm_vcpu *vcpu) kfree(vcpui); vcpu->kvmi = NULL; - kvmi_arch_request_interception_cleanup(vcpu); + kvmi_arch_request_interception_cleanup(vcpu, restore_interception); kvmi_make_request(vcpu, false); } static void free_kvmi(struct kvm *kvm) { + bool restore_interception = KVMI(kvm)->cleanup_on_unhook; struct kvm_vcpu *vcpu; int i; kvm_for_each_vcpu(i, vcpu, kvm) - free_vcpui(vcpu); + free_vcpui(vcpu, restore_interception); bitmap_free(kvm->kvmi->cmd_allow_mask); bitmap_free(kvm->kvmi->event_allow_mask); @@ -255,7 +256,7 @@ static void free_kvmi(struct kvm *kvm) void kvmi_vcpu_uninit(struct kvm_vcpu *vcpu) { mutex_lock(&vcpu->kvm->kvmi_lock); - free_vcpui(vcpu); + free_vcpui(vcpu, false); kvmi_arch_vcpu_free_interception(vcpu); mutex_unlock(&vcpu->kvm->kvmi_lock); } @@ -660,6 +661,11 @@ int kvmi_cmd_vcpu_control_events(struct kvm_vcpu *vcpu, return kvmi_arch_cmd_control_intercept(vcpu, event_id, enable); } +void kvmi_cmd_vm_control_cleanup(struct kvm_introspection *kvmi, bool enable) +{ + kvmi->cleanup_on_unhook = enable; +} + static unsigned long gfn_to_hva_safe(struct kvm *kvm, gfn_t gfn) { unsigned long hva; diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 05bfde7d7f1a..831e7e14524f 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -49,6 +49,7 @@ int kvmi_add_job(struct kvm_vcpu *vcpu, void kvmi_run_jobs(struct kvm_vcpu *vcpu); void kvmi_post_reply(struct kvm_vcpu *vcpu); void kvmi_handle_common_event_actions(struct kvm *kvm, u32 action); +void kvmi_cmd_vm_control_cleanup(struct kvm_introspection *kvmi, bool enable); int kvmi_cmd_vm_control_events(struct kvm_introspection *kvmi, unsigned int event_id, bool enable); int kvmi_cmd_vcpu_control_events(struct kvm_vcpu *vcpu, @@ -68,7 +69,8 @@ int kvmi_cmd_vcpu_set_registers(struct kvm_vcpu *vcpu, bool kvmi_arch_vcpu_alloc_interception(struct kvm_vcpu *vcpu); void kvmi_arch_vcpu_free_interception(struct kvm_vcpu *vcpu); bool kvmi_arch_vcpu_introspected(struct kvm_vcpu *vcpu); -void kvmi_arch_request_interception_cleanup(struct kvm_vcpu *vcpu); +void kvmi_arch_request_interception_cleanup(struct kvm_vcpu *vcpu, + bool restore_interception); bool kvmi_arch_clean_up_interception(struct kvm_vcpu *vcpu); int kvmi_arch_cmd_vcpu_get_info(struct kvm_vcpu *vcpu, struct kvmi_vcpu_get_info_reply *rpl); diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 4a03980e0bbb..86cee47d214f 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -305,19 +305,37 @@ static int handle_vcpu_pause(struct kvm_introspection *kvmi, return kvmi_msg_vm_reply(kvmi, msg, err, NULL, 0); } +static int handle_vm_control_cleanup(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg, + const void *_req) +{ + const struct kvmi_vm_control_cleanup *req = _req; + int ec = 0; + + if (req->padding1 || req->padding2 || req->padding3) + ec = -KVM_EINVAL; + else if (req->enable > 1) + ec = -KVM_EINVAL; + else + kvmi_cmd_vm_control_cleanup(kvmi, req->enable == 1); + + return kvmi_msg_vm_reply(kvmi, msg, ec, NULL, 0); +} + /* * These commands are executed by the receiving thread. */ static int(*const msg_vm[])(struct kvm_introspection *, const struct kvmi_msg_hdr *, const void *) = { - [KVMI_GET_VERSION] = handle_get_version, - [KVMI_VCPU_PAUSE] = handle_vcpu_pause, - [KVMI_VM_CHECK_COMMAND] = handle_vm_check_command, - [KVMI_VM_CHECK_EVENT] = handle_vm_check_event, - [KVMI_VM_CONTROL_EVENTS] = handle_vm_control_events, - [KVMI_VM_GET_INFO] = handle_vm_get_info, - [KVMI_VM_READ_PHYSICAL] = handle_vm_read_physical, - [KVMI_VM_WRITE_PHYSICAL] = handle_vm_write_physical, + [KVMI_GET_VERSION] = handle_get_version, + [KVMI_VCPU_PAUSE] = handle_vcpu_pause, + [KVMI_VM_CHECK_COMMAND] = handle_vm_check_command, + [KVMI_VM_CHECK_EVENT] = handle_vm_check_event, + [KVMI_VM_CONTROL_CLEANUP] = handle_vm_control_cleanup, + [KVMI_VM_CONTROL_EVENTS] = handle_vm_control_events, + [KVMI_VM_GET_INFO] = handle_vm_get_info, + [KVMI_VM_READ_PHYSICAL] = handle_vm_read_physical, + [KVMI_VM_WRITE_PHYSICAL] = handle_vm_write_physical, }; static bool is_vm_command(u16 id)
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 64/84] KVM: introspection: add KVMI_VCPU_CONTROL_CR and KVMI_EVENT_CR
From: Mihai Don?u <mdontu at bitdefender.com> Using the KVMI_VCPU_CONTROL_CR command, the introspection tool subscribes to KVMI_EVENT_CR events that will be sent when a control register (CR0, CR3 or CR4) is going to be changed. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 70 +++++++++++ arch/x86/include/asm/kvmi_host.h | 11 ++ arch/x86/include/uapi/asm/kvmi.h | 18 +++ arch/x86/kvm/kvmi.c | 111 ++++++++++++++++++ arch/x86/kvm/vmx/vmx.c | 6 +- arch/x86/kvm/x86.c | 12 +- include/uapi/linux/kvmi.h | 3 + .../testing/selftests/kvm/x86_64/kvmi_test.c | 108 +++++++++++++++++ virt/kvm/introspection/kvmi.c | 1 + virt/kvm/introspection/kvmi_int.h | 7 ++ virt/kvm/introspection/kvmi_msg.c | 18 ++- 11 files changed, 359 insertions(+), 6 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index f760957b27f4..e1f978fc799b 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -550,6 +550,7 @@ Enables/disables vCPU introspection events. This command can be used with the following events:: KVMI_EVENT_BREAKPOINT + KVMI_EVENT_CR KVMI_EVENT_HYPERCALL When an event is enabled, the introspection tool is notified and @@ -714,6 +715,40 @@ interceptions). By default it is disabled. * -KVM_EINVAL - the padding is not zero * -KVM_EINVAL - 'enabled' is not 1 or 0 +15. KVMI_VCPU_CONTROL_CR +------------------------ + +:Architectures: x86 +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_vcpu_control_cr { + __u8 cr; + __u8 enable; + __u16 padding1; + __u32 padding2; + }; + +:Returns: + +:: + + struct kvmi_error_code + +Enables/disables introspection for a specific control register and must +be used in addition to *KVMI_VCPU_CONTROL_EVENTS* with the *KVMI_EVENT_CR* +ID set. + +:Errors: + +* -KVM_EINVAL - the selected vCPU is invalid +* -KVM_EINVAL - the specified control register is not CR0, CR3 or CR4 +* -KVM_EINVAL - the padding is not zero +* -KVM_EAGAIN - the selected vCPU can't be introspected yet + Events ===== @@ -890,3 +925,38 @@ trying to perform a certain operation (like creating a process). ``kvmi_event`` and the guest physical address are sent to the introspection tool. The *RETRY* action is used by the introspection tool for its own breakpoints. + +5. KVMI_EVENT_CR +---------------- + +:Architectures: x86 +:Versions: >= 1 +:Actions: CONTINUE, CRASH +:Parameters: + +:: + + struct kvmi_event; + struct kvmi_event_cr { + __u8 cr; + __u8 padding[7]; + __u64 old_value; + __u64 new_value; + }; + +:Returns: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_event_reply; + struct kvmi_event_cr_reply { + __u64 new_val; + }; + +This event is sent when a control register is going to be changed and the +introspection has been enabled for this event and for this specific +register (see **KVMI_VCPU_CONTROL_EVENTS**). + +``kvmi_event``, the control register number, the old value and the new value +are sent to the introspection tool. The *CONTINUE* action will set the ``new_val``. diff --git a/arch/x86/include/asm/kvmi_host.h b/arch/x86/include/asm/kvmi_host.h index 3e85ae4fe5f0..1aff91ef8475 100644 --- a/arch/x86/include/asm/kvmi_host.h +++ b/arch/x86/include/asm/kvmi_host.h @@ -4,6 +4,8 @@ #include <asm/kvmi.h> +#define KVMI_NUM_CR 5 + struct kvmi_monitor_interception { bool kvmi_intercepted; bool kvm_intercepted; @@ -17,6 +19,7 @@ struct kvmi_interception { }; struct kvm_vcpu_arch_introspection { + DECLARE_BITMAP(cr_mask, KVMI_NUM_CR); }; struct kvm_arch_introspection { @@ -25,11 +28,19 @@ struct kvm_arch_introspection { #ifdef CONFIG_KVM_INTROSPECTION bool kvmi_monitor_bp_intercept(struct kvm_vcpu *vcpu, u32 dbg); +bool kvmi_cr_event(struct kvm_vcpu *vcpu, unsigned int cr, + unsigned long old_value, unsigned long *new_value); +bool kvmi_cr3_intercepted(struct kvm_vcpu *vcpu); #else /* CONFIG_KVM_INTROSPECTION */ static inline bool kvmi_monitor_bp_intercept(struct kvm_vcpu *vcpu, u32 dbg) { return false; } +static inline bool kvmi_cr_event(struct kvm_vcpu *vcpu, unsigned int cr, + unsigned long old_value, + unsigned long *new_value) + { return true; } +static inline bool kvmi_cr3_intercepted(struct kvm_vcpu *vcpu) { return false; } #endif /* CONFIG_KVM_INTROSPECTION */ diff --git a/arch/x86/include/uapi/asm/kvmi.h b/arch/x86/include/uapi/asm/kvmi.h index 1605777256a3..4c59c9fe6b00 100644 --- a/arch/x86/include/uapi/asm/kvmi.h +++ b/arch/x86/include/uapi/asm/kvmi.h @@ -65,4 +65,22 @@ struct kvmi_event_breakpoint { __u8 padding[7]; }; +struct kvmi_vcpu_control_cr { + __u8 cr; + __u8 enable; + __u16 padding1; + __u32 padding2; +}; + +struct kvmi_event_cr { + __u8 cr; + __u8 padding[7]; + __u64 old_value; + __u64 new_value; +}; + +struct kvmi_event_cr_reply { + __u64 new_val; +}; + #endif /* _UAPI_ASM_X86_KVMI_H */ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index 89fa158a6535..e72b2ef5b28a 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -400,3 +400,114 @@ void kvmi_arch_request_interception_cleanup(struct kvm_vcpu *vcpu, arch_vcpui->cleanup = true; } } + +int kvmi_arch_cmd_vcpu_control_cr(struct kvm_vcpu *vcpu, + const struct kvmi_vcpu_control_cr *req) +{ + u32 cr = req->cr; + + if (req->padding1 || req->padding2 || req->enable > 1) + return -KVM_EINVAL; + + switch (cr) { + case 0: + break; + case 3: + kvm_x86_ops.control_cr3_intercept(vcpu, CR_TYPE_W, + req->enable == 1); + break; + case 4: + break; + default: + return -KVM_EINVAL; + } + + if (req->enable) + set_bit(cr, VCPUI(vcpu)->arch.cr_mask); + else + clear_bit(cr, VCPUI(vcpu)->arch.cr_mask); + + return 0; +} + +static u32 kvmi_send_cr(struct kvm_vcpu *vcpu, u32 cr, u64 old_value, + u64 new_value, u64 *ret_value) +{ + struct kvmi_event_cr e; + struct kvmi_event_cr_reply r; + int err, action; + + memset(&e, 0, sizeof(e)); + e.cr = cr; + e.old_value = old_value; + e.new_value = new_value; + + err = kvmi_send_event(vcpu, KVMI_EVENT_CR, &e, sizeof(e), + &r, sizeof(r), &action); + if (err) + return KVMI_EVENT_ACTION_CONTINUE; + + *ret_value = r.new_val; + return action; +} + +static bool __kvmi_cr_event(struct kvm_vcpu *vcpu, unsigned int cr, + u64 old_value, unsigned long *new_value) +{ + u64 ret_value = *new_value; + bool ret = false; + u32 action; + + if (!test_bit(cr, VCPUI(vcpu)->arch.cr_mask)) + return true; + + action = kvmi_send_cr(vcpu, cr, old_value, *new_value, &ret_value); + switch (action) { + case KVMI_EVENT_ACTION_CONTINUE: + *new_value = ret_value; + ret = true; + break; + default: + kvmi_handle_common_event_actions(vcpu->kvm, action); + } + + return ret; +} + +bool kvmi_cr_event(struct kvm_vcpu *vcpu, unsigned int cr, + unsigned long old_value, unsigned long *new_value) +{ + struct kvm_introspection *kvmi; + bool ret = true; + + if (old_value == *new_value) + return true; + + kvmi = kvmi_get(vcpu->kvm); + if (!kvmi) + return true; + + if (is_event_enabled(vcpu, KVMI_EVENT_CR)) + ret = __kvmi_cr_event(vcpu, cr, old_value, new_value); + + kvmi_put(vcpu->kvm); + + return ret; +} + +bool kvmi_cr3_intercepted(struct kvm_vcpu *vcpu) +{ + struct kvm_introspection *kvmi; + bool ret; + + kvmi = kvmi_get(vcpu->kvm); + if (!kvmi) + return false; + + ret = test_bit(3, VCPUI(vcpu)->arch.cr_mask); + + kvmi_put(vcpu->kvm); + + return ret; +} +EXPORT_SYMBOL(kvmi_cr3_intercepted); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 9e1bea74b74c..5d0876420dd9 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -5038,7 +5038,8 @@ static int handle_cr(struct kvm_vcpu *vcpu) err = handle_set_cr0(vcpu, val); return kvm_complete_insn_gp(vcpu, err); case 3: - WARN_ON_ONCE(enable_unrestricted_guest); + WARN_ON_ONCE(enable_unrestricted_guest && + !kvmi_cr3_intercepted(vcpu)); err = kvm_set_cr3(vcpu, val); return kvm_complete_insn_gp(vcpu, err); case 4: @@ -5071,7 +5072,8 @@ static int handle_cr(struct kvm_vcpu *vcpu) case 1: /*mov from cr*/ switch (cr) { case 3: - WARN_ON_ONCE(enable_unrestricted_guest); + WARN_ON_ONCE(enable_unrestricted_guest && + !kvmi_cr3_intercepted(vcpu)); val = kvm_read_cr3(vcpu); kvm_register_write(vcpu, reg, val); trace_kvm_cr_read(cr, val); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9c8b7a3c5758..a12aa8e125d3 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -813,6 +813,9 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) if (!(cr0 & X86_CR0_PG) && kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE)) return 1; + if (!kvmi_cr_event(vcpu, 0, old_cr0, &cr0)) + return 1; + kvm_x86_ops.set_cr0(vcpu, cr0); if ((cr0 ^ old_cr0) & X86_CR0_PG) { @@ -993,6 +996,9 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) return 1; } + if (!kvmi_cr_event(vcpu, 4, old_cr4, &cr4)) + return 1; + if (kvm_x86_ops.set_cr4(vcpu, cr4)) return 1; @@ -1009,6 +1015,7 @@ EXPORT_SYMBOL_GPL(kvm_set_cr4); int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3) { + unsigned long old_cr3 = kvm_read_cr3(vcpu); bool skip_tlb_flush = false; #ifdef CONFIG_X86_64 bool pcid_enabled = kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE); @@ -1019,7 +1026,7 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3) } #endif - if (cr3 == kvm_read_cr3(vcpu) && !pdptrs_changed(vcpu)) { + if (cr3 == old_cr3 && !pdptrs_changed(vcpu)) { if (!skip_tlb_flush) { kvm_mmu_sync_roots(vcpu); kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu); @@ -1034,6 +1041,9 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3) !load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3)) return 1; + if (!kvmi_cr_event(vcpu, 3, old_cr3, &cr3)) + return 1; + kvm_mmu_new_pgd(vcpu, cr3, skip_tlb_flush, skip_tlb_flush); vcpu->arch.cr3 = cr3; kvm_register_mark_available(vcpu, VCPU_EXREG_CR3); diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 20bf5bf194a4..e31b474e3496 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -34,6 +34,8 @@ enum { KVMI_VM_CONTROL_CLEANUP = 14, + KVMI_VCPU_CONTROL_CR = 15, + KVMI_NUM_MESSAGES }; @@ -42,6 +44,7 @@ enum { KVMI_EVENT_PAUSE_VCPU = 1, KVMI_EVENT_HYPERCALL = 2, KVMI_EVENT_BREAKPOINT = 3, + KVMI_EVENT_CR = 4, KVMI_NUM_EVENTS }; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index d3b7778a64d4..7694fa8fef89 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -50,6 +50,7 @@ struct vcpu_worker_data { enum { GUEST_TEST_NOOP = 0, GUEST_TEST_BP, + GUEST_TEST_CR, GUEST_TEST_HYPERCALL, }; @@ -73,6 +74,11 @@ static void guest_bp_test(void) asm volatile("int3"); } +static void guest_cr_test(void) +{ + set_cr4(get_cr4() | X86_CR4_OSXSAVE); +} + static void guest_hypercall_test(void) { asm volatile("mov $34, %rax"); @@ -90,6 +96,9 @@ static void guest_code(void) case GUEST_TEST_BP: guest_bp_test(); break; + case GUEST_TEST_CR: + guest_cr_test(); + break; case GUEST_TEST_HYPERCALL: guest_hypercall_test(); break; @@ -1201,6 +1210,104 @@ static void test_cmd_vm_control_cleanup(struct kvm_vm *vm) cmd_vm_control_cleanup(disable, no_padding, 0); } +static void cmd_vcpu_control_cr(struct kvm_vm *vm, __u8 cr, __u8 enable, + __u8 padding, int expected_err) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvmi_vcpu_control_cr cmd; + } req = {}; + int r; + + req.cmd.cr = cr; + req.cmd.enable = enable; + req.cmd.padding1 = padding; + req.cmd.padding2 = padding; + + r = do_vcpu0_command(vm, KVMI_VCPU_CONTROL_CR, &req.hdr, sizeof(req), + NULL, 0); + TEST_ASSERT(r == expected_err, + "KVMI_VCPU_CONTROL_CR failed, error %d(%s), expected error %d\n", + -r, kvm_strerror(-r), expected_err); +} + +static void enable_cr_events(struct kvm_vm *vm, __u8 cr) +{ + enable_vcpu_event(vm, KVMI_EVENT_CR); + + cmd_vcpu_control_cr(vm, cr, 1, 0, 0); +} + +static void disable_cr_events(struct kvm_vm *vm, __u8 cr) +{ + cmd_vcpu_control_cr(vm, cr, 0, 0, 0); + + disable_vcpu_event(vm, KVMI_EVENT_CR); +} + +static void test_invalid_vcpu_control_cr(struct kvm_vm *vm) +{ + __u8 enable = 1, enable_inval = 2; + __u8 no_padding = 0, padding = 1; + __u8 cr_inval = 99, cr = 0; + + cmd_vcpu_control_cr(vm, cr, enable_inval, no_padding, -KVM_EINVAL); + cmd_vcpu_control_cr(vm, cr, enable, padding, -KVM_EINVAL); + cmd_vcpu_control_cr(vm, cr_inval, enable, no_padding, -KVM_EINVAL); +} + +static void test_cmd_vcpu_control_cr(struct kvm_vm *vm) +{ + struct vcpu_worker_data data = { + .vm = vm, + .vcpu_id = VCPU_ID, + .test_id = GUEST_TEST_CR, + }; + struct kvmi_msg_hdr hdr; + struct { + struct kvmi_event common; + struct kvmi_event_cr cr; + } ev; + struct { + struct vcpu_reply common; + struct kvmi_event_cr_reply cr; + } rpl = {}; + __u16 event_id = KVMI_EVENT_CR; + __u8 cr_no = 4; + struct kvm_sregs sregs; + pthread_t vcpu_thread; + + enable_cr_events(vm, cr_no); + + vcpu_thread = start_vcpu_worker(&data); + + receive_event(&hdr, &ev.common, sizeof(ev), event_id); + + pr_info("CR%u, old 0x%llx, new 0x%llx\n", + ev.cr.cr, ev.cr.old_value, ev.cr.new_value); + + TEST_ASSERT(ev.cr.cr == cr_no, + "Unexpected CR event, received CR%u, expected CR%u", + ev.cr.cr, cr_no); + + rpl.cr.new_val = ev.cr.old_value; + + reply_to_event(&hdr, &ev.common, KVMI_EVENT_ACTION_CONTINUE, + &rpl.common, sizeof(rpl)); + + stop_vcpu_worker(vcpu_thread, &data); + + disable_cr_events(vm, cr_no); + + vcpu_sregs_get(vm, VCPU_ID, &sregs); + TEST_ASSERT(sregs.cr4 == ev.cr.old_value, + "Failed to block CR4 update, CR4 0x%llx, expected 0x%llx", + sregs.cr4, ev.cr.old_value); + + test_invalid_vcpu_control_cr(vm); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -1224,6 +1331,7 @@ static void test_introspection(struct kvm_vm *vm) test_event_hypercall(vm); test_event_breakpoint(vm); test_cmd_vm_control_cleanup(vm); + test_cmd_vcpu_control_cr(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index db1f4523cec5..2dd82aa5e11c 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -98,6 +98,7 @@ static void setup_known_events(void) bitmap_zero(Kvmi_known_vcpu_events, KVMI_NUM_EVENTS); set_bit(KVMI_EVENT_BREAKPOINT, Kvmi_known_vcpu_events); + set_bit(KVMI_EVENT_CR, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_HYPERCALL, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_PAUSE_VCPU, Kvmi_known_vcpu_events); diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 831e7e14524f..c206376eb0ad 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -31,6 +31,9 @@ bool kvmi_sock_get(struct kvm_introspection *kvmi, int fd); void kvmi_sock_shutdown(struct kvm_introspection *kvmi); void kvmi_sock_put(struct kvm_introspection *kvmi); bool kvmi_msg_process(struct kvm_introspection *kvmi); +int kvmi_send_event(struct kvm_vcpu *vcpu, u32 ev_id, + void *ev, size_t ev_size, + void *rpl, size_t rpl_size, int *action); int kvmi_msg_send_unhook(struct kvm_introspection *kvmi); u32 kvmi_msg_send_vcpu_pause(struct kvm_vcpu *vcpu); u32 kvmi_msg_send_hypercall(struct kvm_vcpu *vcpu); @@ -50,6 +53,8 @@ void kvmi_run_jobs(struct kvm_vcpu *vcpu); void kvmi_post_reply(struct kvm_vcpu *vcpu); void kvmi_handle_common_event_actions(struct kvm *kvm, u32 action); void kvmi_cmd_vm_control_cleanup(struct kvm_introspection *kvmi, bool enable); +struct kvm_introspection * __must_check kvmi_get(struct kvm *kvm); +void kvmi_put(struct kvm *kvm); int kvmi_cmd_vm_control_events(struct kvm_introspection *kvmi, unsigned int event_id, bool enable); int kvmi_cmd_vcpu_control_events(struct kvm_vcpu *vcpu, @@ -90,5 +95,7 @@ void kvmi_arch_hypercall_event(struct kvm_vcpu *vcpu); void kvmi_arch_breakpoint_event(struct kvm_vcpu *vcpu, u64 gva, u8 insn_len); int kvmi_arch_cmd_control_intercept(struct kvm_vcpu *vcpu, unsigned int event_id, bool enable); +int kvmi_arch_cmd_vcpu_control_cr(struct kvm_vcpu *vcpu, + const struct kvmi_vcpu_control_cr *req); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 86cee47d214f..330fad27e1df 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -481,6 +481,17 @@ static int handle_vcpu_get_cpuid(const struct kvmi_vcpu_msg_job *job, return kvmi_msg_vcpu_reply(job, msg, ec, &rpl, sizeof(rpl)); } +static int handle_vcpu_control_cr(const struct kvmi_vcpu_msg_job *job, + const struct kvmi_msg_hdr *msg, + const void *req) +{ + int ec; + + ec = kvmi_arch_cmd_vcpu_control_cr(job->vcpu, req); + + return kvmi_msg_vcpu_reply(job, msg, ec, NULL, 0); +} + /* * These functions are executed from the vCPU thread. The receiving thread * passes the messages using a newly allocated 'struct kvmi_vcpu_msg_job' @@ -490,6 +501,7 @@ static int handle_vcpu_get_cpuid(const struct kvmi_vcpu_msg_job *job, static int(*const msg_vcpu[])(const struct kvmi_vcpu_msg_job *, const struct kvmi_msg_hdr *, const void *) = { [KVMI_EVENT] = handle_vcpu_event_reply, + [KVMI_VCPU_CONTROL_CR] = handle_vcpu_control_cr, [KVMI_VCPU_CONTROL_EVENTS] = handle_vcpu_control_events, [KVMI_VCPU_GET_CPUID] = handle_vcpu_get_cpuid, [KVMI_VCPU_GET_INFO] = handle_vcpu_get_info, @@ -758,9 +770,9 @@ static void kvmi_setup_vcpu_reply(struct kvm_vcpu_introspection *vcpui, vcpui->waiting_for_reply = true; } -static int kvmi_send_event(struct kvm_vcpu *vcpu, u32 ev_id, - void *ev, size_t ev_size, - void *rpl, size_t rpl_size, int *action) +int kvmi_send_event(struct kvm_vcpu *vcpu, u32 ev_id, + void *ev, size_t ev_size, + void *rpl, size_t rpl_size, int *action) { struct kvmi_msg_hdr hdr; struct kvmi_event common;
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 65/84] KVM: introspection: restore the state of CR3 interception on unhook
From: Nicu?or C??u <ncitu at bitdefender.com> This commit also ensures that the introspection tool and the userspace do not disable each other the CR3-write VM-exit. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvmi_host.h | 4 ++ arch/x86/kvm/kvmi.c | 64 ++++++++++++++++++++++++++++++-- arch/x86/kvm/svm/svm.c | 5 +++ arch/x86/kvm/vmx/vmx.c | 5 +++ 4 files changed, 75 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/kvmi_host.h b/arch/x86/include/asm/kvmi_host.h index 1aff91ef8475..44580f77e34e 100644 --- a/arch/x86/include/asm/kvmi_host.h +++ b/arch/x86/include/asm/kvmi_host.h @@ -16,6 +16,7 @@ struct kvmi_interception { bool cleanup; bool restore_interception; struct kvmi_monitor_interception breakpoint; + struct kvmi_monitor_interception cr3w; }; struct kvm_vcpu_arch_introspection { @@ -31,6 +32,7 @@ bool kvmi_monitor_bp_intercept(struct kvm_vcpu *vcpu, u32 dbg); bool kvmi_cr_event(struct kvm_vcpu *vcpu, unsigned int cr, unsigned long old_value, unsigned long *new_value); bool kvmi_cr3_intercepted(struct kvm_vcpu *vcpu); +bool kvmi_monitor_cr3w_intercept(struct kvm_vcpu *vcpu, bool enable); #else /* CONFIG_KVM_INTROSPECTION */ @@ -41,6 +43,8 @@ static inline bool kvmi_cr_event(struct kvm_vcpu *vcpu, unsigned int cr, unsigned long *new_value) { return true; } static inline bool kvmi_cr3_intercepted(struct kvm_vcpu *vcpu) { return false; } +static inline bool kvmi_monitor_cr3w_intercept(struct kvm_vcpu *vcpu, + bool enable) { return false; } #endif /* CONFIG_KVM_INTROSPECTION */ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index e72b2ef5b28a..e340a2c3500f 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -308,6 +308,59 @@ static void kvmi_arch_disable_bp_intercept(struct kvm_vcpu *vcpu) vcpu->arch.kvmi->breakpoint.kvm_intercepted = false; } +static bool monitor_cr3w_fct_kvmi(struct kvm_vcpu *vcpu, bool enable) +{ + vcpu->arch.kvmi->cr3w.kvmi_intercepted = enable; + + if (enable) + vcpu->arch.kvmi->cr3w.kvm_intercepted + kvm_x86_ops.cr3_write_intercepted(vcpu); + else if (vcpu->arch.kvmi->cr3w.kvm_intercepted) + return true; + + return false; +} + +static bool monitor_cr3w_fct_kvm(struct kvm_vcpu *vcpu, bool enable) +{ + if (!vcpu->arch.kvmi->cr3w.kvmi_intercepted) + return false; + + vcpu->arch.kvmi->cr3w.kvm_intercepted = enable; + + if (!enable) + return true; + + return false; +} + +/* + * Returns true if one side (kvm or kvmi) tries to disable the CR3 write + * interception while the other side is still tracking it. + */ +bool kvmi_monitor_cr3w_intercept(struct kvm_vcpu *vcpu, bool enable) +{ + struct kvmi_interception *arch_vcpui = READ_ONCE(vcpu->arch.kvmi); + + return (arch_vcpui && arch_vcpui->cr3w.monitor_fct(vcpu, enable)); +} +EXPORT_SYMBOL(kvmi_monitor_cr3w_intercept); + +static void kvmi_control_cr3w_intercept(struct kvm_vcpu *vcpu, bool enable) +{ + vcpu->arch.kvmi->cr3w.monitor_fct = monitor_cr3w_fct_kvmi; + kvm_x86_ops.control_cr3_intercept(vcpu, CR_TYPE_W, enable); + vcpu->arch.kvmi->cr3w.monitor_fct = monitor_cr3w_fct_kvm; +} + +static void kvmi_arch_disable_cr3w_intercept(struct kvm_vcpu *vcpu) +{ + kvmi_control_cr3w_intercept(vcpu, false); + + vcpu->arch.kvmi->cr3w.kvmi_intercepted = false; + vcpu->arch.kvmi->cr3w.kvm_intercepted = false; +} + int kvmi_arch_cmd_control_intercept(struct kvm_vcpu *vcpu, unsigned int event_id, bool enable) { @@ -347,6 +400,7 @@ void kvmi_arch_breakpoint_event(struct kvm_vcpu *vcpu, u64 gva, u8 insn_len) static void kvmi_arch_restore_interception(struct kvm_vcpu *vcpu) { kvmi_arch_disable_bp_intercept(vcpu); + kvmi_arch_disable_cr3w_intercept(vcpu); } bool kvmi_arch_clean_up_interception(struct kvm_vcpu *vcpu) @@ -371,8 +425,13 @@ bool kvmi_arch_vcpu_alloc_interception(struct kvm_vcpu *vcpu) return false; arch_vcpui->breakpoint.monitor_fct = monitor_bp_fct_kvm; + arch_vcpui->cr3w.monitor_fct = monitor_cr3w_fct_kvm; - /* pair with kvmi_monitor_bp_intercept() */ + /* + * paired with: + * - kvmi_monitor_bp_intercept() + * - kvmi_monitor_cr3w_intercept() + */ smp_wmb(); WRITE_ONCE(vcpu->arch.kvmi, arch_vcpui); @@ -413,8 +472,7 @@ int kvmi_arch_cmd_vcpu_control_cr(struct kvm_vcpu *vcpu, case 0: break; case 3: - kvm_x86_ops.control_cr3_intercept(vcpu, CR_TYPE_W, - req->enable == 1); + kvmi_control_cr3w_intercept(vcpu, req->enable == 1); break; case 4: break; diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 47c50e2f0f86..bc886cedf45c 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1668,6 +1668,11 @@ static void svm_control_cr3_intercept(struct kvm_vcpu *vcpu, int type, { struct vcpu_svm *svm = to_svm(vcpu); +#ifdef CONFIG_KVM_INTROSPECTION + if ((type & CR_TYPE_W) && kvmi_monitor_cr3w_intercept(vcpu, enable)) + type &= ~CR_TYPE_W; +#endif /* CONFIG_KVM_INTROSPECTION */ + if (type & CR_TYPE_R) enable ? set_cr_intercept(svm, INTERCEPT_CR3_READ) : clr_cr_intercept(svm, INTERCEPT_CR3_READ); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 5d0876420dd9..69aa1157a0f0 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3011,6 +3011,11 @@ static void vmx_control_cr3_intercept(struct kvm_vcpu *vcpu, int type, struct vcpu_vmx *vmx = to_vmx(vcpu); u32 cr3_exec_control = 0; +#ifdef CONFIG_KVM_INTROSPECTION + if ((type & CR_TYPE_W) && kvmi_monitor_cr3w_intercept(vcpu, enable)) + type &= ~CR_TYPE_W; +#endif /* CONFIG_KVM_INTROSPECTION */ + if (type & CR_TYPE_R) cr3_exec_control |= CPU_BASED_CR3_STORE_EXITING; if (type & CR_TYPE_W)
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 66/84] KVM: introspection: add KVMI_VCPU_INJECT_EXCEPTION + KVMI_EVENT_TRAP
From: Mihai Don?u <mdontu at bitdefender.com> The KVMI_VCPU_INJECT_EXCEPTION command is used by the introspection tool to inject exceptions, for example, to get a page from swap. The exception is queued right before entering in guest unless there is already an exception pending. The introspection tool is notified with an KVMI_EVENT_TRAP event about the success of the injection. In case of failure, the introspecion tool is expected to try again later. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 74 +++++++++++ arch/x86/kvm/kvmi.c | 103 ++++++++++++++++ arch/x86/kvm/x86.c | 3 + include/linux/kvmi_host.h | 12 ++ include/uapi/linux/kvmi.h | 20 ++- .../testing/selftests/kvm/x86_64/kvmi_test.c | 115 +++++++++++++++++- virt/kvm/introspection/kvmi.c | 45 +++++++ virt/kvm/introspection/kvmi_int.h | 8 ++ virt/kvm/introspection/kvmi_msg.c | 50 ++++++-- 9 files changed, 418 insertions(+), 12 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index e1f978fc799b..4263a9ac90e4 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -561,6 +561,7 @@ because these are sent as a result of certain commands (but they can be disallowed by the device manager) :: KVMI_EVENT_PAUSE_VCPU + KVMI_EVENT_TRAP The VM events (e.g. *KVMI_EVENT_UNHOOK*) are controlled with the *KVMI_VM_CONTROL_EVENTS* command. @@ -749,6 +750,45 @@ ID set. * -KVM_EINVAL - the padding is not zero * -KVM_EAGAIN - the selected vCPU can't be introspected yet +16. KVMI_VCPU_INJECT_EXCEPTION +------------------------------ + +:Architectures: all +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_vcpu_inject_exception { + __u8 nr; + __u8 padding1; + __u16 padding2; + __u32 error_code; + __u64 address; + }; + +:Returns: + +:: + + struct kvmi_error_code + +Injects a vCPU exception with or without an error code. In case of page fault +exception, the guest virtual address has to be specified. + +The *KVMI_EVENT_TRAP* event will be sent with the effective injected +exception. + +:Errors: + +* -KVM_EPERM - the *KVMI_EVENT_TRAP* event is disallowed +* -KVM_EINVAL - the selected vCPU is invalid +* -KVM_EINVAL - the padding is not zero +* -KVM_EAGAIN - the selected vCPU can't be introspected yet +* -KVM_EBUSY - another *KVMI_VCPU_INJECT_EXCEPTION*-*KVMI_EVENT_TRAP* pair + is in progress + Events ===== @@ -960,3 +1000,37 @@ register (see **KVMI_VCPU_CONTROL_EVENTS**). ``kvmi_event``, the control register number, the old value and the new value are sent to the introspection tool. The *CONTINUE* action will set the ``new_val``. + +6. KVMI_EVENT_TRAP +------------------ + +:Architectures: all +:Versions: >= 1 +:Actions: CONTINUE, CRASH +:Parameters: + +:: + + struct kvmi_event; + struct kvmi_event_trap { + __u8 nr; + __u8 padding1; + __u16 padding2; + __u32 error_code; + __u64 address; + }; + +:Returns: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_event_reply; + +This event is sent if a previous *KVMI_VCPU_INJECT_EXCEPTION* command +took place. Because it has a high priority, it will be sent before any +other vCPU introspection event. + +``kvmi_event``, exception/interrupt number, exception code +(``error_code``) and address are sent to the introspection tool, +which should check if its exception has been injected or overridden. diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index e340a2c3500f..0c6ab136084f 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -569,3 +569,106 @@ bool kvmi_cr3_intercepted(struct kvm_vcpu *vcpu) return ret; } EXPORT_SYMBOL(kvmi_cr3_intercepted); + +int kvmi_arch_cmd_vcpu_inject_exception(struct kvm_vcpu *vcpu, u8 vector, + u32 error_code, u64 address) +{ + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + bool has_error; + + if (vcpui->exception.pending || vcpui->exception.send_event) + return -KVM_EBUSY; + + vcpui->exception.pending = true; + + has_error = x86_exception_has_error_code(vector); + + vcpui->exception.nr = vector; + vcpui->exception.error_code = has_error ? error_code : 0; + vcpui->exception.error_code_valid = has_error; + vcpui->exception.address = address; + + return 0; +} + +static void kvmi_arch_queue_exception(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + struct x86_exception e = { + .vector = vcpui->exception.nr, + .error_code_valid = vcpui->exception.error_code_valid, + .error_code = vcpui->exception.error_code, + .address = vcpui->exception.address, + }; + + if (e.vector == PF_VECTOR) + kvm_inject_page_fault(vcpu, &e); + else if (e.error_code_valid) + kvm_queue_exception_e(vcpu, e.vector, e.error_code); + else + kvm_queue_exception(vcpu, e.vector); +} + +static void kvmi_arch_save_injected_event(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + + vcpui->exception.error_code = 0; + vcpui->exception.error_code_valid = false; + + vcpui->exception.address = vcpu->arch.cr2; + if (vcpu->arch.exception.injected) { + vcpui->exception.nr = vcpu->arch.exception.nr; + vcpui->exception.error_code_valid + x86_exception_has_error_code(vcpu->arch.exception.nr); + vcpui->exception.error_code = vcpu->arch.exception.error_code; + } else if (vcpu->arch.interrupt.injected) { + vcpui->exception.nr = vcpu->arch.interrupt.nr; + } +} + +void kvmi_arch_inject_exception(struct kvm_vcpu *vcpu) +{ + if (!kvm_event_needs_reinjection(vcpu)) { + kvmi_arch_queue_exception(vcpu); + kvm_inject_pending_exception(vcpu); + } + + kvmi_arch_save_injected_event(vcpu); +} + +static u32 kvmi_send_trap(struct kvm_vcpu *vcpu, u8 nr, + u32 error_code, u64 addr) +{ + struct kvmi_event_trap e; + int err, action; + + memset(&e, 0, sizeof(e)); + e.nr = nr; + e.error_code = error_code; + e.address = addr; + + err = __kvmi_send_event(vcpu, KVMI_EVENT_TRAP, &e, sizeof(e), + NULL, 0, &action); + if (err) + return KVMI_EVENT_ACTION_CONTINUE; + + return action; +} + +void kvmi_arch_send_trap_event(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + u32 action; + + action = kvmi_send_trap(vcpu, vcpui->exception.nr, + vcpui->exception.error_code, + vcpui->exception.address); + + switch (action) { + case KVMI_EVENT_ACTION_CONTINUE: + break; + default: + kvmi_handle_common_event_actions(vcpu->kvm, action); + } +} diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index a12aa8e125d3..af987ad1a174 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8566,6 +8566,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) goto cancel_injection; } + if (!kvmi_enter_guest(vcpu)) + req_immediate_exit = true; + if (req_immediate_exit) { kvm_make_request(KVM_REQ_EVENT, vcpu); kvm_x86_ops.request_immediate_exit(vcpu); diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index 01219c56d042..1fae589d9d35 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -36,6 +36,15 @@ struct kvm_vcpu_introspection { struct kvm_regs delayed_regs; bool have_delayed_regs; + + struct { + u8 nr; + u32 error_code; + bool error_code_valid; + u64 address; + bool pending; + bool send_event; + } exception; }; struct kvm_introspection { @@ -76,6 +85,7 @@ int kvmi_ioctl_preunhook(struct kvm *kvm); void kvmi_handle_requests(struct kvm_vcpu *vcpu); bool kvmi_hypercall_event(struct kvm_vcpu *vcpu); bool kvmi_breakpoint_event(struct kvm_vcpu *vcpu, u64 gva, u8 insn_len); +bool kvmi_enter_guest(struct kvm_vcpu *vcpu); #else @@ -90,6 +100,8 @@ static inline bool kvmi_hypercall_event(struct kvm_vcpu *vcpu) { return false; } static inline bool kvmi_breakpoint_event(struct kvm_vcpu *vcpu, u64 gva, u8 insn_len) { return true; } +static inline bool kvmi_enter_guest(struct kvm_vcpu *vcpu) + { return true; } #endif /* CONFIG_KVM_INTROSPECTION */ diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index e31b474e3496..faf4624d7a97 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -34,7 +34,8 @@ enum { KVMI_VM_CONTROL_CLEANUP = 14, - KVMI_VCPU_CONTROL_CR = 15, + KVMI_VCPU_CONTROL_CR = 15, + KVMI_VCPU_INJECT_EXCEPTION = 16, KVMI_NUM_MESSAGES }; @@ -45,6 +46,7 @@ enum { KVMI_EVENT_HYPERCALL = 2, KVMI_EVENT_BREAKPOINT = 3, KVMI_EVENT_CR = 4, + KVMI_EVENT_TRAP = 5, KVMI_NUM_EVENTS }; @@ -162,4 +164,20 @@ struct kvmi_event_reply { __u32 padding2; }; +struct kvmi_event_trap { + __u8 nr; + __u8 padding1; + __u16 padding2; + __u32 error_code; + __u64 address; +}; + +struct kvmi_vcpu_inject_exception { + __u8 nr; + __u8 padding1; + __u16 padding2; + __u32 error_code; + __u64 address; +}; + #endif /* _UAPI__LINUX_KVMI_H */ diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 7694fa8fef89..9abf4ec0d09a 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -45,6 +45,8 @@ struct vcpu_worker_data { int vcpu_id; int test_id; bool stop; + bool shutdown; + bool restart_on_shutdown; }; enum { @@ -687,11 +689,19 @@ static void *vcpu_worker(void *data) vcpu_run(ctx->vm, ctx->vcpu_id); - TEST_ASSERT(run->exit_reason == KVM_EXIT_IO, + TEST_ASSERT(run->exit_reason == KVM_EXIT_IO + || (run->exit_reason == KVM_EXIT_SHUTDOWN + && ctx->shutdown), "vcpu_run() failed, test_id %d, exit reason %u (%s)\n", ctx->test_id, run->exit_reason, exit_reason_str(run->exit_reason)); + if (run->exit_reason == KVM_EXIT_SHUTDOWN) { + if (ctx->restart_on_shutdown) + continue; + break; + } + TEST_ASSERT(get_ucall(ctx->vm, ctx->vcpu_id, &uc), "No guest request\n"); @@ -1308,6 +1318,108 @@ static void test_cmd_vcpu_control_cr(struct kvm_vm *vm) test_invalid_vcpu_control_cr(vm); } +static void __inject_exception(int nr) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvmi_vcpu_inject_exception cmd; + } req = {}; + int r; + + req.cmd.nr = nr; + + r = __do_vcpu0_command(KVMI_VCPU_INJECT_EXCEPTION, + &req.hdr, sizeof(req), NULL, 0); + TEST_ASSERT(r == 0, + "KVMI_VCPU_INJECT_EXCEPTION failed, error %d(%s)\n", + -r, kvm_strerror(-r)); +} + +static void receive_exception_event(int nr) +{ + struct kvmi_msg_hdr hdr; + struct { + struct kvmi_event common; + struct kvmi_event_trap trap; + } ev; + struct vcpu_reply rpl = {}; + + receive_event(&hdr, &ev.common, sizeof(ev), KVMI_EVENT_TRAP); + + pr_info("Exception event: vector %u, error_code 0x%x, address 0x%llx\n", + ev.trap.nr, ev.trap.error_code, ev.trap.address); + + TEST_ASSERT(ev.trap.nr == nr, + "Injected exception %u instead of %u\n", + ev.trap.nr, nr); + + reply_to_event(&hdr, &ev.common, KVMI_EVENT_ACTION_CONTINUE, + &rpl, sizeof(rpl)); +} + +static void test_succeded_ud_injection(void) +{ + __u8 ud_vector = 6; + + __inject_exception(ud_vector); + + receive_exception_event(ud_vector); +} + +static void test_failed_ud_injection(struct kvm_vm *vm, + struct vcpu_worker_data *data) +{ + struct kvmi_msg_hdr hdr; + struct { + struct kvmi_event common; + struct kvmi_event_breakpoint bp; + } ev; + struct vcpu_reply rpl = {}; + __u8 ud_vector = 6, bp_vector = 3; + + WRITE_ONCE(data->test_id, GUEST_TEST_BP); + + receive_event(&hdr, &ev.common, sizeof(ev), KVMI_EVENT_BREAKPOINT); + + /* skip the breakpoint instruction, next time guest_bp_test() runs */ + ev.common.arch.regs.rip += ev.bp.insn_len; + __set_registers(vm, &ev.common.arch.regs); + + __inject_exception(ud_vector); + + /* reinject the #BP exception because of the continue action */ + reply_to_event(&hdr, &ev.common, KVMI_EVENT_ACTION_CONTINUE, + &rpl, sizeof(rpl)); + + receive_exception_event(bp_vector); +} + +static void test_cmd_vcpu_inject_exception(struct kvm_vm *vm) +{ + struct vcpu_worker_data data = { + .vm = vm, + .vcpu_id = VCPU_ID, + .shutdown = true, + .restart_on_shutdown = true, + }; + pthread_t vcpu_thread; + + if (!is_intel_cpu()) { + print_skip("TODO: %s() - make it work with AMD", __func__); + return; + } + + enable_vcpu_event(vm, KVMI_EVENT_BREAKPOINT); + vcpu_thread = start_vcpu_worker(&data); + + test_succeded_ud_injection(); + test_failed_ud_injection(vm, &data); + + stop_vcpu_worker(vcpu_thread, &data); + disable_vcpu_event(vm, KVMI_EVENT_BREAKPOINT); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -1332,6 +1444,7 @@ static void test_introspection(struct kvm_vm *vm) test_event_breakpoint(vm); test_cmd_vm_control_cleanup(vm); test_cmd_vcpu_control_cr(vm); + test_cmd_vcpu_inject_exception(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index 2dd82aa5e11c..d4b39d0800ee 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -101,6 +101,7 @@ static void setup_known_events(void) set_bit(KVMI_EVENT_CR, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_HYPERCALL, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_PAUSE_VCPU, Kvmi_known_vcpu_events); + set_bit(KVMI_EVENT_TRAP, Kvmi_known_vcpu_events); bitmap_or(Kvmi_known_events, Kvmi_known_vm_events, Kvmi_known_vcpu_events, KVMI_NUM_EVENTS); @@ -855,6 +856,16 @@ static void kvmi_vcpu_pause_event(struct kvm_vcpu *vcpu) } } +void kvmi_send_pending_event(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + + if (vcpui->exception.send_event) { + vcpui->exception.send_event = false; + kvmi_arch_send_trap_event(vcpu); + } +} + void kvmi_handle_requests(struct kvm_vcpu *vcpu) { struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); @@ -864,6 +875,8 @@ void kvmi_handle_requests(struct kvm_vcpu *vcpu) if (!kvmi) goto out; + kvmi_send_pending_event(vcpu); + for (;;) { kvmi_run_jobs(vcpu); @@ -962,3 +975,35 @@ bool kvmi_breakpoint_event(struct kvm_vcpu *vcpu, u64 gva, u8 insn_len) return ret; } EXPORT_SYMBOL(kvmi_breakpoint_event); + +static void kvmi_inject_pending_exception(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu); + + kvmi_arch_inject_exception(vcpu); + + vcpui->exception.pending = false; + vcpui->exception.send_event = true; + kvm_make_request(KVM_REQ_INTROSPECTION, vcpu); +} + +bool kvmi_enter_guest(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_introspection *vcpui; + struct kvm_introspection *kvmi; + bool r = true; + + kvmi = kvmi_get(vcpu->kvm); + if (!kvmi) + return true; + + vcpui = VCPUI(vcpu); + + if (vcpui->exception.pending) { + kvmi_inject_pending_exception(vcpu); + r = false; + } + + kvmi_put(vcpu->kvm); + return r; +} diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index c206376eb0ad..51c03097a7d5 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -34,6 +34,9 @@ bool kvmi_msg_process(struct kvm_introspection *kvmi); int kvmi_send_event(struct kvm_vcpu *vcpu, u32 ev_id, void *ev, size_t ev_size, void *rpl, size_t rpl_size, int *action); +int __kvmi_send_event(struct kvm_vcpu *vcpu, u32 ev_id, + void *ev, size_t ev_size, + void *rpl, size_t rpl_size, int *action); int kvmi_msg_send_unhook(struct kvm_introspection *kvmi); u32 kvmi_msg_send_vcpu_pause(struct kvm_vcpu *vcpu); u32 kvmi_msg_send_hypercall(struct kvm_vcpu *vcpu); @@ -55,6 +58,7 @@ void kvmi_handle_common_event_actions(struct kvm *kvm, u32 action); void kvmi_cmd_vm_control_cleanup(struct kvm_introspection *kvmi, bool enable); struct kvm_introspection * __must_check kvmi_get(struct kvm *kvm); void kvmi_put(struct kvm *kvm); +void kvmi_send_pending_event(struct kvm_vcpu *vcpu); int kvmi_cmd_vm_control_events(struct kvm_introspection *kvmi, unsigned int event_id, bool enable); int kvmi_cmd_vcpu_control_events(struct kvm_vcpu *vcpu, @@ -97,5 +101,9 @@ int kvmi_arch_cmd_control_intercept(struct kvm_vcpu *vcpu, unsigned int event_id, bool enable); int kvmi_arch_cmd_vcpu_control_cr(struct kvm_vcpu *vcpu, const struct kvmi_vcpu_control_cr *req); +int kvmi_arch_cmd_vcpu_inject_exception(struct kvm_vcpu *vcpu, u8 vector, + u32 error_code, u64 address); +void kvmi_arch_send_trap_event(struct kvm_vcpu *vcpu); +void kvmi_arch_inject_exception(struct kvm_vcpu *vcpu); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 330fad27e1df..63efb85ff1ae 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -492,6 +492,25 @@ static int handle_vcpu_control_cr(const struct kvmi_vcpu_msg_job *job, return kvmi_msg_vcpu_reply(job, msg, ec, NULL, 0); } +static int handle_vcpu_inject_exception(const struct kvmi_vcpu_msg_job *job, + const struct kvmi_msg_hdr *msg, + const void *_req) +{ + const struct kvmi_vcpu_inject_exception *req = _req; + int ec; + + if (!is_event_allowed(KVMI(job->vcpu->kvm), KVMI_EVENT_TRAP)) + ec = -KVM_EPERM; + else if (req->padding1 || req->padding2) + ec = -KVM_EINVAL; + else + ec = kvmi_arch_cmd_vcpu_inject_exception(job->vcpu, req->nr, + req->error_code, + req->address); + + return kvmi_msg_vcpu_reply(job, msg, ec, NULL, 0); +} + /* * These functions are executed from the vCPU thread. The receiving thread * passes the messages using a newly allocated 'struct kvmi_vcpu_msg_job' @@ -500,13 +519,14 @@ static int handle_vcpu_control_cr(const struct kvmi_vcpu_msg_job *job, */ static int(*const msg_vcpu[])(const struct kvmi_vcpu_msg_job *, const struct kvmi_msg_hdr *, const void *) = { - [KVMI_EVENT] = handle_vcpu_event_reply, - [KVMI_VCPU_CONTROL_CR] = handle_vcpu_control_cr, - [KVMI_VCPU_CONTROL_EVENTS] = handle_vcpu_control_events, - [KVMI_VCPU_GET_CPUID] = handle_vcpu_get_cpuid, - [KVMI_VCPU_GET_INFO] = handle_vcpu_get_info, - [KVMI_VCPU_GET_REGISTERS] = handle_vcpu_get_registers, - [KVMI_VCPU_SET_REGISTERS] = handle_vcpu_set_registers, + [KVMI_EVENT] = handle_vcpu_event_reply, + [KVMI_VCPU_CONTROL_CR] = handle_vcpu_control_cr, + [KVMI_VCPU_CONTROL_EVENTS] = handle_vcpu_control_events, + [KVMI_VCPU_GET_CPUID] = handle_vcpu_get_cpuid, + [KVMI_VCPU_GET_INFO] = handle_vcpu_get_info, + [KVMI_VCPU_GET_REGISTERS] = handle_vcpu_get_registers, + [KVMI_VCPU_INJECT_EXCEPTION] = handle_vcpu_inject_exception, + [KVMI_VCPU_SET_REGISTERS] = handle_vcpu_set_registers, }; static bool is_vcpu_command(u16 id) @@ -770,9 +790,9 @@ static void kvmi_setup_vcpu_reply(struct kvm_vcpu_introspection *vcpui, vcpui->waiting_for_reply = true; } -int kvmi_send_event(struct kvm_vcpu *vcpu, u32 ev_id, - void *ev, size_t ev_size, - void *rpl, size_t rpl_size, int *action) +int __kvmi_send_event(struct kvm_vcpu *vcpu, u32 ev_id, + void *ev, size_t ev_size, + void *rpl, size_t rpl_size, int *action) { struct kvmi_msg_hdr hdr; struct kvmi_event common; @@ -812,6 +832,16 @@ int kvmi_send_event(struct kvm_vcpu *vcpu, u32 ev_id, return err; } +int kvmi_send_event(struct kvm_vcpu *vcpu, u32 ev_id, + void *ev, size_t ev_size, + void *rpl, size_t rpl_size, int *action) +{ + kvmi_send_pending_event(vcpu); + + return __kvmi_send_event(vcpu, ev_id, ev, ev_size, + rpl, rpl_size, action); +} + u32 kvmi_msg_send_vcpu_pause(struct kvm_vcpu *vcpu) { int err, action;
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 67/84] KVM: introspection: add KVMI_VM_GET_MAX_GFN
From: ?tefan ?icleru <ssicleru at bitdefender.com> The introspection tool will use this command to get the memory address range for which it can set access restrictions. Signed-off-by: ?tefan ?icleru <ssicleru at bitdefender.com> Co-developed-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 19 +++++++++++++++++++ include/uapi/linux/kvmi.h | 6 ++++++ .../testing/selftests/kvm/x86_64/kvmi_test.c | 12 ++++++++++++ virt/kvm/introspection/kvmi_msg.c | 13 +++++++++++++ 4 files changed, 50 insertions(+) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 4263a9ac90e4..7da8efd18b89 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -789,6 +789,25 @@ exception. * -KVM_EBUSY - another *KVMI_VCPU_INJECT_EXCEPTION*-*KVMI_EVENT_TRAP* pair is in progress +17. KVMI_VM_GET_MAX_GFN +----------------------- + +:Architecture: all +:Versions: >= 1 +:Parameters: none +:Returns: + +:: + + struct kvmi_error_code; + struct kvmi_vm_get_max_gfn_reply { + __u64 gfn; + }; + +Provides the maximum GFN allocated to the VM by walking through all +memory slots allocated by KVM. Stricly speaking, the returned value refers +to the first inaccessible GFN, next to the maximum accessible GFN. + Events ===== diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index faf4624d7a97..2a4cc8c41465 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -37,6 +37,8 @@ enum { KVMI_VCPU_CONTROL_CR = 15, KVMI_VCPU_INJECT_EXCEPTION = 16, + KVMI_VM_GET_MAX_GFN = 17, + KVMI_NUM_MESSAGES }; @@ -149,6 +151,10 @@ struct kvmi_vm_control_cleanup { __u32 padding3; }; +struct kvmi_vm_get_max_gfn_reply { + __u64 gfn; +}; + struct kvmi_event { __u16 size; __u16 vcpu; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 9abf4ec0d09a..105adf75a68d 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -1420,6 +1420,17 @@ static void test_cmd_vcpu_inject_exception(struct kvm_vm *vm) disable_vcpu_event(vm, KVMI_EVENT_BREAKPOINT); } +static void test_cmd_vm_get_max_gfn(void) +{ + struct kvmi_vm_get_max_gfn_reply rpl; + struct kvmi_msg_hdr req; + + test_vm_command(KVMI_VM_GET_MAX_GFN, &req, sizeof(req), + &rpl, sizeof(rpl)); + + pr_info("max_gfn: 0x%llx\n", rpl.gfn); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -1445,6 +1456,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vm_control_cleanup(vm); test_cmd_vcpu_control_cr(vm); test_cmd_vcpu_inject_exception(vm); + test_cmd_vm_get_max_gfn(); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 63efb85ff1ae..18bc1a711845 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -322,6 +322,18 @@ static int handle_vm_control_cleanup(struct kvm_introspection *kvmi, return kvmi_msg_vm_reply(kvmi, msg, ec, NULL, 0); } +static int handle_vm_get_max_gfn(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg, + const void *req) +{ + struct kvmi_vm_get_max_gfn_reply rpl; + + memset(&rpl, 0, sizeof(rpl)); + rpl.gfn = kvm_get_max_gfn(kvmi->kvm); + + return kvmi_msg_vm_reply(kvmi, msg, 0, &rpl, sizeof(rpl)); +} + /* * These commands are executed by the receiving thread. */ @@ -334,6 +346,7 @@ static int(*const msg_vm[])(struct kvm_introspection *, [KVMI_VM_CONTROL_CLEANUP] = handle_vm_control_cleanup, [KVMI_VM_CONTROL_EVENTS] = handle_vm_control_events, [KVMI_VM_GET_INFO] = handle_vm_get_info, + [KVMI_VM_GET_MAX_GFN] = handle_vm_get_max_gfn, [KVMI_VM_READ_PHYSICAL] = handle_vm_read_physical, [KVMI_VM_WRITE_PHYSICAL] = handle_vm_write_physical, };
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 68/84] KVM: introspection: add KVMI_EVENT_XSETBV
From: Mihai Don?u <mdontu at bitdefender.com> This event is sent when an extended control register XCR is going to be changed. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 33 ++++++++ arch/x86/include/asm/kvmi_host.h | 4 + arch/x86/include/uapi/asm/kvmi.h | 7 ++ arch/x86/kvm/kvmi.c | 48 +++++++++++ arch/x86/kvm/x86.c | 6 ++ include/uapi/linux/kvmi.h | 1 + .../testing/selftests/kvm/x86_64/kvmi_test.c | 84 +++++++++++++++++++ virt/kvm/introspection/kvmi.c | 1 + 8 files changed, 184 insertions(+) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 7da8efd18b89..283e9a2dfda1 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -552,6 +552,7 @@ the following events:: KVMI_EVENT_BREAKPOINT KVMI_EVENT_CR KVMI_EVENT_HYPERCALL + KVMI_EVENT_XSETBV When an event is enabled, the introspection tool is notified and must reply with: continue, retry, crash, etc. (see **Events** below). @@ -1053,3 +1054,35 @@ other vCPU introspection event. ``kvmi_event``, exception/interrupt number, exception code (``error_code``) and address are sent to the introspection tool, which should check if its exception has been injected or overridden. + +7. KVMI_EVENT_XSETBV +-------------------- + +:Architectures: x86 +:Versions: >= 1 +:Actions: CONTINUE, CRASH +:Parameters: + +:: + + struct kvmi_event; + struct kvmi_event_xsetbv { + __u8 xcr; + __u8 padding[7]; + __u64 old_value; + __u64 new_value; + }; + +:Returns: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_event_reply; + +This event is sent when the extended control register XCR is going +to be changed and the introspection has been enabled for this event +(see *KVMI_VCPU_CONTROL_EVENTS*). + +``kvmi_event``, the extended control register number, the old value and +the new value are sent to the introspection tool. diff --git a/arch/x86/include/asm/kvmi_host.h b/arch/x86/include/asm/kvmi_host.h index 44580f77e34e..aed8a4b88a68 100644 --- a/arch/x86/include/asm/kvmi_host.h +++ b/arch/x86/include/asm/kvmi_host.h @@ -33,6 +33,8 @@ bool kvmi_cr_event(struct kvm_vcpu *vcpu, unsigned int cr, unsigned long old_value, unsigned long *new_value); bool kvmi_cr3_intercepted(struct kvm_vcpu *vcpu); bool kvmi_monitor_cr3w_intercept(struct kvm_vcpu *vcpu, bool enable); +void kvmi_xsetbv_event(struct kvm_vcpu *vcpu, u8 xcr, + u64 old_value, u64 new_value); #else /* CONFIG_KVM_INTROSPECTION */ @@ -45,6 +47,8 @@ static inline bool kvmi_cr_event(struct kvm_vcpu *vcpu, unsigned int cr, static inline bool kvmi_cr3_intercepted(struct kvm_vcpu *vcpu) { return false; } static inline bool kvmi_monitor_cr3w_intercept(struct kvm_vcpu *vcpu, bool enable) { return false; } +static inline void kvmi_xsetbv_event(struct kvm_vcpu *vcpu, u8 xcr, + u64 old_value, u64 new_value) { } #endif /* CONFIG_KVM_INTROSPECTION */ diff --git a/arch/x86/include/uapi/asm/kvmi.h b/arch/x86/include/uapi/asm/kvmi.h index 4c59c9fe6b00..2f69a4f5d2e0 100644 --- a/arch/x86/include/uapi/asm/kvmi.h +++ b/arch/x86/include/uapi/asm/kvmi.h @@ -83,4 +83,11 @@ struct kvmi_event_cr_reply { __u64 new_val; }; +struct kvmi_event_xsetbv { + __u8 xcr; + __u8 padding[7]; + __u64 old_value; + __u64 new_value; +}; + #endif /* _UAPI_ASM_X86_KVMI_H */ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index 0c6ab136084f..55c5e290730c 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -672,3 +672,51 @@ void kvmi_arch_send_trap_event(struct kvm_vcpu *vcpu) kvmi_handle_common_event_actions(vcpu->kvm, action); } } + +static u32 kvmi_send_xsetbv(struct kvm_vcpu *vcpu, u8 xcr, u64 old_value, + u64 new_value) +{ + struct kvmi_event_xsetbv e; + int err, action; + + memset(&e, 0, sizeof(e)); + e.xcr = xcr; + e.old_value = old_value; + e.new_value = new_value; + + err = kvmi_send_event(vcpu, KVMI_EVENT_XSETBV, &e, sizeof(e), + NULL, 0, &action); + if (err) + return KVMI_EVENT_ACTION_CONTINUE; + + return action; +} + +static void __kvmi_xsetbv_event(struct kvm_vcpu *vcpu, u8 xcr, + u64 old_value, u64 new_value) +{ + u32 action; + + action = kvmi_send_xsetbv(vcpu, xcr, old_value, new_value); + switch (action) { + case KVMI_EVENT_ACTION_CONTINUE: + break; + default: + kvmi_handle_common_event_actions(vcpu->kvm, action); + } +} + +void kvmi_xsetbv_event(struct kvm_vcpu *vcpu, u8 xcr, + u64 old_value, u64 new_value) +{ + struct kvm_introspection *kvmi; + + kvmi = kvmi_get(vcpu->kvm); + if (!kvmi) + return; + + if (is_event_enabled(vcpu, KVMI_EVENT_XSETBV)) + __kvmi_xsetbv_event(vcpu, xcr, old_value, new_value); + + kvmi_put(vcpu->kvm); +} diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index af987ad1a174..c3557a11817f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -919,6 +919,12 @@ static int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr) } vcpu->arch.xcr0 = xcr0; +#ifdef CONFIG_KVM_INTROSPECTION + if (index == 0 && xcr0 != old_xcr0) + kvmi_xsetbv_event(vcpu, 0, old_xcr0, xcr0); +#endif /* CONFIG_KVM_INTROSPECTION */ + + if ((xcr0 ^ old_xcr0) & XFEATURE_MASK_EXTEND) kvm_update_cpuid(vcpu); return 0; diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 2a4cc8c41465..348a9a551091 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -49,6 +49,7 @@ enum { KVMI_EVENT_BREAKPOINT = 3, KVMI_EVENT_CR = 4, KVMI_EVENT_TRAP = 5, + KVMI_EVENT_XSETBV = 6, KVMI_NUM_EVENTS }; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 105adf75a68d..0569185a7064 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -22,6 +22,8 @@ #define VCPU_ID 5 +#define X86_FEATURE_XSAVE (1<<26) + static int socket_pair[2]; #define Kvm_socket socket_pair[0] #define Userspace_socket socket_pair[1] @@ -54,6 +56,7 @@ enum { GUEST_TEST_BP, GUEST_TEST_CR, GUEST_TEST_HYPERCALL, + GUEST_TEST_XSETBV, }; #define GUEST_REQUEST_TEST() GUEST_SYNC(0) @@ -89,6 +92,45 @@ static void guest_hypercall_test(void) asm volatile(".byte 0x0f,0x01,0xc1"); } +/* from fpu/internal.h */ +static u64 xgetbv(u32 index) +{ + u32 eax, edx; + + asm volatile(".byte 0x0f,0x01,0xd0" /* xgetbv */ + : "=a" (eax), "=d" (edx) + : "c" (index)); + return eax + ((u64)edx << 32); +} + +/* from fpu/internal.h */ +static void xsetbv(u32 index, u64 value) +{ + u32 eax = value; + u32 edx = value >> 32; + + asm volatile(".byte 0x0f,0x01,0xd1" /* xsetbv */ + : : "a" (eax), "d" (edx), "c" (index)); +} + +static void guest_xsetbv_test(void) +{ + const int SSE_BIT = 1 << 1; + const int AVX_BIT = 1 << 2; + u64 xcr0; + + /* avoid #UD */ + set_cr4(get_cr4() | X86_CR4_OSXSAVE); + + xcr0 = xgetbv(0); + if (xcr0 & AVX_BIT) + xcr0 &= ~AVX_BIT; + else + xcr0 |= (AVX_BIT | SSE_BIT); + + xsetbv(0, xcr0); +} + static void guest_code(void) { while (true) { @@ -104,6 +146,9 @@ static void guest_code(void) case GUEST_TEST_HYPERCALL: guest_hypercall_test(); break; + case GUEST_TEST_XSETBV: + guest_xsetbv_test(); + break; } GUEST_SIGNAL_TEST_DONE(); } @@ -1431,6 +1476,44 @@ static void test_cmd_vm_get_max_gfn(void) pr_info("max_gfn: 0x%llx\n", rpl.gfn); } +static void test_event_xsetbv(struct kvm_vm *vm) +{ + struct vcpu_worker_data data = { + .vm = vm, + .vcpu_id = VCPU_ID, + .test_id = GUEST_TEST_XSETBV, + }; + __u16 event_id = KVMI_EVENT_XSETBV; + struct kvm_cpuid_entry2 *entry; + struct vcpu_reply rpl = {}; + struct kvmi_msg_hdr hdr; + pthread_t vcpu_thread; + struct { + struct kvmi_event common; + struct kvmi_event_xsetbv xsetbv; + } ev; + + entry = kvm_get_supported_cpuid_entry(1); + if (!(entry->ecx & X86_FEATURE_XSAVE)) { + print_skip("XSAVE not supported, ecx 0x%x", entry->ecx); + return; + } + + enable_vcpu_event(vm, event_id); + vcpu_thread = start_vcpu_worker(&data); + + receive_event(&hdr, &ev.common, sizeof(ev), event_id); + + pr_debug("XSETBV event, XCR%u, old 0x%llx, new 0x%llx\n", + ev.xsetbv.xcr, ev.xsetbv.old_value, ev.xsetbv.new_value); + + reply_to_event(&hdr, &ev.common, KVMI_EVENT_ACTION_CONTINUE, + &rpl, sizeof(rpl)); + + stop_vcpu_worker(vcpu_thread, &data); + disable_vcpu_event(vm, event_id); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -1457,6 +1540,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vcpu_control_cr(vm); test_cmd_vcpu_inject_exception(vm); test_cmd_vm_get_max_gfn(); + test_event_xsetbv(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index d4b39d0800ee..761d3fccddce 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -102,6 +102,7 @@ static void setup_known_events(void) set_bit(KVMI_EVENT_HYPERCALL, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_PAUSE_VCPU, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_TRAP, Kvmi_known_vcpu_events); + set_bit(KVMI_EVENT_XSETBV, Kvmi_known_vcpu_events); bitmap_or(Kvmi_known_events, Kvmi_known_vm_events, Kvmi_known_vcpu_events, KVMI_NUM_EVENTS);
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 69/84] KVM: introspection: add KVMI_VCPU_GET_XCR
This can be used by the introspection tool to emulate SSE instructions. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 33 ++++++++++++++++ arch/x86/include/uapi/asm/kvmi.h | 9 +++++ arch/x86/kvm/kvmi.c | 17 +++++++++ include/uapi/linux/kvmi.h | 2 + .../testing/selftests/kvm/x86_64/kvmi_test.c | 38 +++++++++++++++++++ virt/kvm/introspection/kvmi_int.h | 3 ++ virt/kvm/introspection/kvmi_msg.c | 15 ++++++++ 7 files changed, 117 insertions(+) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 283e9a2dfda1..1fdb73c6a6ff 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -809,6 +809,39 @@ Provides the maximum GFN allocated to the VM by walking through all memory slots allocated by KVM. Stricly speaking, the returned value refers to the first inaccessible GFN, next to the maximum accessible GFN. +18. KVMI_VCPU_GET_XCR +--------------------- + +:Architecture: x86 +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_vcpu_get_xcr { + __u8 xcr; + __u8 padding[7]; + }; + +:Returns: + +:: + + struct kvmi_error_code; + struct kvmi_vcpu_get_xcr_reply { + u64 value; + }; + +Returns the value of an extended control register XCR. + +:Errors: + +* -KVM_EINVAL - the selected vCPU is invalid +* -KVM_EINVAL - the specified control register is not XCR0 +* -KVM_EINVAL - the padding is not zero +* -KVM_EAGAIN - the selected vCPU can't be introspected yet + Events ===== diff --git a/arch/x86/include/uapi/asm/kvmi.h b/arch/x86/include/uapi/asm/kvmi.h index 2f69a4f5d2e0..cc4cb85366c8 100644 --- a/arch/x86/include/uapi/asm/kvmi.h +++ b/arch/x86/include/uapi/asm/kvmi.h @@ -90,4 +90,13 @@ struct kvmi_event_xsetbv { __u64 new_value; }; +struct kvmi_vcpu_get_xcr { + __u8 xcr; + __u8 padding[7]; +}; + +struct kvmi_vcpu_get_xcr_reply { + u64 value; +}; + #endif /* _UAPI_ASM_X86_KVMI_H */ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index 55c5e290730c..ff70c9a33ccf 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -720,3 +720,20 @@ void kvmi_xsetbv_event(struct kvm_vcpu *vcpu, u8 xcr, kvmi_put(vcpu->kvm); } + +int kvmi_arch_cmd_vcpu_get_xcr(struct kvm_vcpu *vcpu, + const struct kvmi_vcpu_get_xcr *req, + struct kvmi_vcpu_get_xcr_reply *rpl) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(req->padding); i++) + if (req->padding[i]) + return -KVM_EINVAL; + + if (req->xcr != 0) + return -KVM_EINVAL; + + rpl->value = vcpu->arch.xcr0; + return 0; +} diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 348a9a551091..d1c564ff3d6c 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -39,6 +39,8 @@ enum { KVMI_VM_GET_MAX_GFN = 17, + KVMI_VCPU_GET_XCR = 18, + KVMI_NUM_MESSAGES }; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 0569185a7064..8b92ff7cee6b 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -1514,6 +1514,43 @@ static void test_event_xsetbv(struct kvm_vm *vm) disable_vcpu_event(vm, event_id); } +static void cmd_vcpu_get_xcr(struct kvm_vm *vm, u8 xcr, u64 *value, + u8 padding, int expected_err) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvmi_vcpu_get_xcr cmd; + } req = { 0 }; + struct kvmi_vcpu_get_xcr_reply rpl = { 0 }; + int r; + + req.cmd.xcr = xcr; + req.cmd.padding[6] = padding; + + r = do_vcpu0_command(vm, KVMI_VCPU_GET_XCR, &req.hdr, sizeof(req), + &rpl, sizeof(rpl)); + + TEST_ASSERT(r == expected_err, + "KVMI_VCPU_GET_XCR failed with error %d (%s), expected err %d\n", + -r, kvm_strerror(-r), expected_err); + + *value = r == 0 ? rpl.value : 0; +} + +static void test_cmd_vcpu_get_xcr(struct kvm_vm *vm) +{ + u8 no_padding = 0, padding = 1; + u8 xcr0 = 0, xcr1 = 1; + u64 value; + + cmd_vcpu_get_xcr(vm, xcr0, &value, no_padding, 0); + pr_info("XCR0 0x%lx\n", value); + + cmd_vcpu_get_xcr(vm, xcr0, &value, padding, -KVM_EINVAL); + cmd_vcpu_get_xcr(vm, xcr1, &value, no_padding, -KVM_EINVAL); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -1541,6 +1578,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vcpu_inject_exception(vm); test_cmd_vm_get_max_gfn(); test_event_xsetbv(vm); + test_cmd_vcpu_get_xcr(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 51c03097a7d5..0b84ac365ae7 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -105,5 +105,8 @@ int kvmi_arch_cmd_vcpu_inject_exception(struct kvm_vcpu *vcpu, u8 vector, u32 error_code, u64 address); void kvmi_arch_send_trap_event(struct kvm_vcpu *vcpu); void kvmi_arch_inject_exception(struct kvm_vcpu *vcpu); +int kvmi_arch_cmd_vcpu_get_xcr(struct kvm_vcpu *vcpu, + const struct kvmi_vcpu_get_xcr *req, + struct kvmi_vcpu_get_xcr_reply *rpl); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 18bc1a711845..aa4fb202115e 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -524,6 +524,20 @@ static int handle_vcpu_inject_exception(const struct kvmi_vcpu_msg_job *job, return kvmi_msg_vcpu_reply(job, msg, ec, NULL, 0); } +static int handle_vcpu_get_xcr(const struct kvmi_vcpu_msg_job *job, + const struct kvmi_msg_hdr *msg, + const void *req) +{ + struct kvmi_vcpu_get_xcr_reply rpl; + int ec; + + memset(&rpl, 0, sizeof(rpl)); + + ec = kvmi_arch_cmd_vcpu_get_xcr(job->vcpu, req, &rpl); + + return kvmi_msg_vcpu_reply(job, msg, ec, &rpl, sizeof(rpl)); +} + /* * These functions are executed from the vCPU thread. The receiving thread * passes the messages using a newly allocated 'struct kvmi_vcpu_msg_job' @@ -538,6 +552,7 @@ static int(*const msg_vcpu[])(const struct kvmi_vcpu_msg_job *, [KVMI_VCPU_GET_CPUID] = handle_vcpu_get_cpuid, [KVMI_VCPU_GET_INFO] = handle_vcpu_get_info, [KVMI_VCPU_GET_REGISTERS] = handle_vcpu_get_registers, + [KVMI_VCPU_GET_XCR] = handle_vcpu_get_xcr, [KVMI_VCPU_INJECT_EXCEPTION] = handle_vcpu_inject_exception, [KVMI_VCPU_SET_REGISTERS] = handle_vcpu_set_registers, };
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 70/84] KVM: introspection: add KVMI_VCPU_GET_XSAVE
From: Mihai Don?u <mdontu at bitdefender.com> This vCPU command is used to get the XSAVE area. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 31 +++++++++++++++++++ arch/x86/include/uapi/asm/kvmi.h | 4 +++ arch/x86/kvm/kvmi.c | 24 ++++++++++++++ include/uapi/linux/kvmi.h | 1 + .../testing/selftests/kvm/x86_64/kvmi_test.c | 30 ++++++++++++++++++ virt/kvm/introspection/kvmi_int.h | 3 ++ virt/kvm/introspection/kvmi_msg.c | 16 ++++++++++ 7 files changed, 109 insertions(+) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 1fdb73c6a6ff..9bf8b08eb585 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -842,6 +842,37 @@ Returns the value of an extended control register XCR. * -KVM_EINVAL - the padding is not zero * -KVM_EAGAIN - the selected vCPU can't be introspected yet +19. KVMI_VCPU_GET_XSAVE +----------------------- + +:Architecture: x86 +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vcpu_hdr; + +:Returns: + +:: + + struct kvmi_error_code; + struct kvmi_vcpu_get_xsave_reply { + __u32 region[0]; + }; + +Returns a buffer containing the XSAVE area. Currently, the size of +``kvm_xsave`` is used, but it could change. The userspace should get +the buffer size from the message size (kvmi_msg_hdr.size). + +:Errors: + +* -KVM_EINVAL - the selected vCPU is invalid +* -KVM_EINVAL - the padding is not zero +* -KVM_EAGAIN - the selected vCPU can't be introspected yet +* -KVM_ENOMEM - there is not enough memory to allocate the reply + Events ===== diff --git a/arch/x86/include/uapi/asm/kvmi.h b/arch/x86/include/uapi/asm/kvmi.h index cc4cb85366c8..a165c259ba0e 100644 --- a/arch/x86/include/uapi/asm/kvmi.h +++ b/arch/x86/include/uapi/asm/kvmi.h @@ -99,4 +99,8 @@ struct kvmi_vcpu_get_xcr_reply { u64 value; }; +struct kvmi_vcpu_get_xsave_reply { + __u32 region[0]; +}; + #endif /* _UAPI_ASM_X86_KVMI_H */ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index ff70c9a33ccf..7b7122784f0f 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -737,3 +737,27 @@ int kvmi_arch_cmd_vcpu_get_xcr(struct kvm_vcpu *vcpu, rpl->value = vcpu->arch.xcr0; return 0; } + +int kvmi_arch_cmd_vcpu_get_xsave(struct kvm_vcpu *vcpu, + struct kvmi_vcpu_get_xsave_reply **dest, + size_t *dest_size) +{ + struct kvmi_vcpu_get_xsave_reply *rpl = NULL; + size_t rpl_size = sizeof(*rpl) + sizeof(struct kvm_xsave); + struct kvm_xsave *area; + + if (!valid_reply_size(rpl_size)) + return -KVM_EINVAL; + + rpl = kvmi_msg_alloc(); + if (!rpl) + return -KVM_ENOMEM; + + area = (struct kvm_xsave *) &rpl->region[0]; + kvm_vcpu_ioctl_x86_get_xsave(vcpu, area); + + *dest = rpl; + *dest_size = rpl_size; + + return 0; +} diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index d1c564ff3d6c..6b5662f54ba1 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -40,6 +40,7 @@ enum { KVMI_VM_GET_MAX_GFN = 17, KVMI_VCPU_GET_XCR = 18, + KVMI_VCPU_GET_XSAVE = 19, KVMI_NUM_MESSAGES }; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 8b92ff7cee6b..657684ec6ff5 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -1551,6 +1551,35 @@ static void test_cmd_vcpu_get_xcr(struct kvm_vm *vm) cmd_vcpu_get_xcr(vm, xcr1, &value, no_padding, -KVM_EINVAL); } +static void cmd_vcpu_get_xsave(struct kvm_vm *vm) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + } req = {}; + struct kvm_xsave rpl; + int r; + + r = do_vcpu0_command(vm, KVMI_VCPU_GET_XSAVE, &req.hdr, sizeof(req), + &rpl, sizeof(rpl)); + TEST_ASSERT(r == 0, + "KVMI_VCPU_GET_XSAVE failed with error %d (%s)\n", + -r, kvm_strerror(-r)); +} + +static void test_cmd_vcpu_get_xsave(struct kvm_vm *vm) +{ + struct kvm_cpuid_entry2 *entry; + + entry = kvm_get_supported_cpuid_entry(1); + if (!(entry->ecx & X86_FEATURE_XSAVE)) { + print_skip("XSAVE not supported, ecx 0x%x", entry->ecx); + return; + } + + cmd_vcpu_get_xsave(vm); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -1579,6 +1608,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vm_get_max_gfn(); test_event_xsetbv(vm); test_cmd_vcpu_get_xcr(vm); + test_cmd_vcpu_get_xsave(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 0b84ac365ae7..26fc27f3fddc 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -108,5 +108,8 @@ void kvmi_arch_inject_exception(struct kvm_vcpu *vcpu); int kvmi_arch_cmd_vcpu_get_xcr(struct kvm_vcpu *vcpu, const struct kvmi_vcpu_get_xcr *req, struct kvmi_vcpu_get_xcr_reply *rpl); +int kvmi_arch_cmd_vcpu_get_xsave(struct kvm_vcpu *vcpu, + struct kvmi_vcpu_get_xsave_reply **dest, + size_t *dest_size); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index aa4fb202115e..06a8fbb5c86d 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -538,6 +538,21 @@ static int handle_vcpu_get_xcr(const struct kvmi_vcpu_msg_job *job, return kvmi_msg_vcpu_reply(job, msg, ec, &rpl, sizeof(rpl)); } +static int handle_vcpu_get_xsave(const struct kvmi_vcpu_msg_job *job, + const struct kvmi_msg_hdr *msg, + const void *req) +{ + struct kvmi_vcpu_get_xsave_reply *rpl = NULL; + size_t rpl_size = 0; + int err, ec; + + ec = kvmi_arch_cmd_vcpu_get_xsave(job->vcpu, &rpl, &rpl_size); + + err = kvmi_msg_vcpu_reply(job, msg, ec, rpl, rpl_size); + kvmi_msg_free(rpl); + return err; +} + /* * These functions are executed from the vCPU thread. The receiving thread * passes the messages using a newly allocated 'struct kvmi_vcpu_msg_job' @@ -553,6 +568,7 @@ static int(*const msg_vcpu[])(const struct kvmi_vcpu_msg_job *, [KVMI_VCPU_GET_INFO] = handle_vcpu_get_info, [KVMI_VCPU_GET_REGISTERS] = handle_vcpu_get_registers, [KVMI_VCPU_GET_XCR] = handle_vcpu_get_xcr, + [KVMI_VCPU_GET_XSAVE] = handle_vcpu_get_xsave, [KVMI_VCPU_INJECT_EXCEPTION] = handle_vcpu_inject_exception, [KVMI_VCPU_SET_REGISTERS] = handle_vcpu_set_registers, };
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 71/84] KVM: introspection: add KVMI_VCPU_SET_XSAVE
This can be used by the introspection tool to emulate SSE instructions. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 29 +++++++++++++++++ arch/x86/include/uapi/asm/kvmi.h | 4 +++ arch/x86/kvm/kvmi.c | 23 ++++++++++++++ include/uapi/linux/kvmi.h | 1 + .../testing/selftests/kvm/x86_64/kvmi_test.c | 31 +++++++++++++++---- virt/kvm/introspection/kvmi_int.h | 3 ++ virt/kvm/introspection/kvmi_msg.c | 17 ++++++++++ 7 files changed, 102 insertions(+), 6 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 9bf8b08eb585..24605aaede9a 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -873,6 +873,35 @@ the buffer size from the message size (kvmi_msg_hdr.size). * -KVM_EAGAIN - the selected vCPU can't be introspected yet * -KVM_ENOMEM - there is not enough memory to allocate the reply +20. KVMI_VCPU_SET_XSAVE +----------------------- + +:Architecture: x86 +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_vcpu_set_xsave { + __u32 region[0]; + }; + +:Returns: + +:: + + struct kvmi_error_code; + +Modifies the XSAVE area. + +:Errors: + +* -KVM_EINVAL - the buffer is larger than ``struct kvm_xsave`` +* -KVM_EINVAL - the selected vCPU is invalid +* -KVM_EINVAL - the padding is not zero +* -KVM_EAGAIN - the selected vCPU can't be introspected yet + Events ===== diff --git a/arch/x86/include/uapi/asm/kvmi.h b/arch/x86/include/uapi/asm/kvmi.h index a165c259ba0e..71206206973c 100644 --- a/arch/x86/include/uapi/asm/kvmi.h +++ b/arch/x86/include/uapi/asm/kvmi.h @@ -103,4 +103,8 @@ struct kvmi_vcpu_get_xsave_reply { __u32 region[0]; }; +struct kvmi_vcpu_set_xsave { + __u32 region[0]; +}; + #endif /* _UAPI_ASM_X86_KVMI_H */ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index 7b7122784f0f..0776be48ada1 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -761,3 +761,26 @@ int kvmi_arch_cmd_vcpu_get_xsave(struct kvm_vcpu *vcpu, return 0; } + +int kvmi_arch_cmd_vcpu_set_xsave(struct kvm_vcpu *vcpu, + const struct kvmi_vcpu_set_xsave *req, + size_t req_size) +{ + struct kvm_xsave *area; + size_t dest_size = sizeof(*area); + int err; + + if (req_size > dest_size) + return -KVM_EINVAL; + + area = kzalloc(dest_size, GFP_KERNEL); + if (!area) + return -KVM_ENOMEM; + + memcpy(area, req, min(req_size, dest_size)); + + err = kvm_vcpu_ioctl_x86_set_xsave(vcpu, area); + kfree(area); + + return err ? -KVM_EINVAL : 0; +} diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 6b5662f54ba1..441586cec238 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -41,6 +41,7 @@ enum { KVMI_VCPU_GET_XCR = 18, KVMI_VCPU_GET_XSAVE = 19, + KVMI_VCPU_SET_XSAVE = 20, KVMI_NUM_MESSAGES }; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 657684ec6ff5..30741983a3ab 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -1551,25 +1551,43 @@ static void test_cmd_vcpu_get_xcr(struct kvm_vm *vm) cmd_vcpu_get_xcr(vm, xcr1, &value, no_padding, -KVM_EINVAL); } -static void cmd_vcpu_get_xsave(struct kvm_vm *vm) +static void cmd_vcpu_get_xsave(struct kvm_vm *vm, struct kvm_xsave *rpl) { struct { struct kvmi_msg_hdr hdr; struct kvmi_vcpu_hdr vcpu_hdr; } req = {}; - struct kvm_xsave rpl; int r; r = do_vcpu0_command(vm, KVMI_VCPU_GET_XSAVE, &req.hdr, sizeof(req), - &rpl, sizeof(rpl)); + rpl, sizeof(*rpl)); TEST_ASSERT(r == 0, "KVMI_VCPU_GET_XSAVE failed with error %d (%s)\n", -r, kvm_strerror(-r)); } -static void test_cmd_vcpu_get_xsave(struct kvm_vm *vm) +static void cmd_vcpu_set_xsave(struct kvm_vm *vm, struct kvm_xsave *rpl) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvm_xsave xsave; + } req = {}; + int r; + + memcpy(&req.xsave, rpl, sizeof(*rpl)); + + r = do_vcpu0_command(vm, KVMI_VCPU_SET_XSAVE, &req.hdr, sizeof(req), + NULL, 0); + TEST_ASSERT(r == 0, + "KVMI_VCPU_SET_XSAVE failed with error %d (%s)\n", + -r, kvm_strerror(-r)); +} + +static void test_cmd_vcpu_xsave(struct kvm_vm *vm) { struct kvm_cpuid_entry2 *entry; + struct kvm_xsave xsave; entry = kvm_get_supported_cpuid_entry(1); if (!(entry->ecx & X86_FEATURE_XSAVE)) { @@ -1577,7 +1595,8 @@ static void test_cmd_vcpu_get_xsave(struct kvm_vm *vm) return; } - cmd_vcpu_get_xsave(vm); + cmd_vcpu_get_xsave(vm, &xsave); + cmd_vcpu_set_xsave(vm, &xsave); } static void test_introspection(struct kvm_vm *vm) @@ -1608,7 +1627,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vm_get_max_gfn(); test_event_xsetbv(vm); test_cmd_vcpu_get_xcr(vm); - test_cmd_vcpu_get_xsave(vm); + test_cmd_vcpu_xsave(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 26fc27f3fddc..f358ef1c0c73 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -111,5 +111,8 @@ int kvmi_arch_cmd_vcpu_get_xcr(struct kvm_vcpu *vcpu, int kvmi_arch_cmd_vcpu_get_xsave(struct kvm_vcpu *vcpu, struct kvmi_vcpu_get_xsave_reply **dest, size_t *dest_size); +int kvmi_arch_cmd_vcpu_set_xsave(struct kvm_vcpu *vcpu, + const struct kvmi_vcpu_set_xsave *req, + size_t req_size); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 06a8fbb5c86d..c8fe55c3d5c7 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -553,6 +553,22 @@ static int handle_vcpu_get_xsave(const struct kvmi_vcpu_msg_job *job, return err; } +static int handle_vcpu_set_xsave(const struct kvmi_vcpu_msg_job *job, + const struct kvmi_msg_hdr *msg, + const void *req) +{ + size_t msg_size = msg->size, xsave_size; + int ec; + + if (check_sub_overflow(msg_size, sizeof(struct kvmi_vcpu_hdr), + &xsave_size)) + return -EINVAL; + + ec = kvmi_arch_cmd_vcpu_set_xsave(job->vcpu, req, xsave_size); + + return kvmi_msg_vcpu_reply(job, msg, ec, NULL, 0); +} + /* * These functions are executed from the vCPU thread. The receiving thread * passes the messages using a newly allocated 'struct kvmi_vcpu_msg_job' @@ -571,6 +587,7 @@ static int(*const msg_vcpu[])(const struct kvmi_vcpu_msg_job *, [KVMI_VCPU_GET_XSAVE] = handle_vcpu_get_xsave, [KVMI_VCPU_INJECT_EXCEPTION] = handle_vcpu_inject_exception, [KVMI_VCPU_SET_REGISTERS] = handle_vcpu_set_registers, + [KVMI_VCPU_SET_XSAVE] = handle_vcpu_set_xsave, }; static bool is_vcpu_command(u16 id)
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 72/84] KVM: introspection: add KVMI_VCPU_GET_MTRR_TYPE
From: Mihai Don?u <mdontu at bitdefender.com> This command returns the memory type for a guest physical address. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 32 +++++++++++++++++++ arch/x86/include/uapi/asm/kvmi.h | 9 ++++++ arch/x86/kvm/kvmi.c | 7 ++++ include/uapi/linux/kvmi.h | 7 ++-- .../testing/selftests/kvm/x86_64/kvmi_test.c | 19 +++++++++++ virt/kvm/introspection/kvmi_int.h | 1 + virt/kvm/introspection/kvmi_msg.c | 16 ++++++++++ 7 files changed, 88 insertions(+), 3 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 24605aaede9a..6de260c09f3b 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -902,6 +902,38 @@ Modifies the XSAVE area. * -KVM_EINVAL - the padding is not zero * -KVM_EAGAIN - the selected vCPU can't be introspected yet +21. KVMI_VCPU_GET_MTRR_TYPE +--------------------------- + +:Architecture: x86 +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_vcpu_get_mtrr_type { + __u64 gpa; + }; + +:Returns: + +:: + + struct kvmi_error_code; + struct kvmi_vcpu_get_mtrr_type_reply { + __u8 type; + __u8 padding[7]; + }; + +Returns the guest memory type for a specific physical address. + +:Errors: + +* -KVM_EINVAL - the selected vCPU is invalid +* -KVM_EINVAL - the padding is not zero +* -KVM_EAGAIN - the selected vCPU can't be introspected yet + Events ===== diff --git a/arch/x86/include/uapi/asm/kvmi.h b/arch/x86/include/uapi/asm/kvmi.h index 71206206973c..33edbb5b32f4 100644 --- a/arch/x86/include/uapi/asm/kvmi.h +++ b/arch/x86/include/uapi/asm/kvmi.h @@ -107,4 +107,13 @@ struct kvmi_vcpu_set_xsave { __u32 region[0]; }; +struct kvmi_vcpu_get_mtrr_type { + __u64 gpa; +}; + +struct kvmi_vcpu_get_mtrr_type_reply { + __u8 type; + __u8 padding[7]; +}; + #endif /* _UAPI_ASM_X86_KVMI_H */ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index 0776be48ada1..80ad67e875c4 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -784,3 +784,10 @@ int kvmi_arch_cmd_vcpu_set_xsave(struct kvm_vcpu *vcpu, return err ? -KVM_EINVAL : 0; } + +int kvmi_arch_cmd_vcpu_get_mtrr_type(struct kvm_vcpu *vcpu, u64 gpa, u8 *type) +{ + *type = kvm_mtrr_get_guest_memory_type(vcpu, gpa_to_gfn(gpa)); + + return 0; +} diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 441586cec238..207bcaeec05a 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -39,9 +39,10 @@ enum { KVMI_VM_GET_MAX_GFN = 17, - KVMI_VCPU_GET_XCR = 18, - KVMI_VCPU_GET_XSAVE = 19, - KVMI_VCPU_SET_XSAVE = 20, + KVMI_VCPU_GET_XCR = 18, + KVMI_VCPU_GET_XSAVE = 19, + KVMI_VCPU_SET_XSAVE = 20, + KVMI_VCPU_GET_MTRR_TYPE = 21, KVMI_NUM_MESSAGES }; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 30741983a3ab..7174e464d759 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -1599,6 +1599,24 @@ static void test_cmd_vcpu_xsave(struct kvm_vm *vm) cmd_vcpu_set_xsave(vm, &xsave); } +static void test_cmd_vcpu_get_mtrr_type(struct kvm_vm *vm) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvmi_vcpu_get_mtrr_type cmd; + } req = {}; + struct kvmi_vcpu_get_mtrr_type_reply rpl; + + req.cmd.gpa = test_gpa; + + test_vcpu0_command(vm, KVMI_VCPU_GET_MTRR_TYPE, + &req.hdr, sizeof(req), + &rpl, sizeof(rpl)); + + pr_info("mtrr_type: gpa 0x%lx type 0x%x\n", test_gpa, rpl.type); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -1628,6 +1646,7 @@ static void test_introspection(struct kvm_vm *vm) test_event_xsetbv(vm); test_cmd_vcpu_get_xcr(vm); test_cmd_vcpu_xsave(vm); + test_cmd_vcpu_get_mtrr_type(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index f358ef1c0c73..a9664d255d6d 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -114,5 +114,6 @@ int kvmi_arch_cmd_vcpu_get_xsave(struct kvm_vcpu *vcpu, int kvmi_arch_cmd_vcpu_set_xsave(struct kvm_vcpu *vcpu, const struct kvmi_vcpu_set_xsave *req, size_t req_size); +int kvmi_arch_cmd_vcpu_get_mtrr_type(struct kvm_vcpu *vcpu, u64 gpa, u8 *type); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index c8fe55c3d5c7..36819b54b3a0 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -569,6 +569,21 @@ static int handle_vcpu_set_xsave(const struct kvmi_vcpu_msg_job *job, return kvmi_msg_vcpu_reply(job, msg, ec, NULL, 0); } +static int handle_vcpu_get_mtrr_type(const struct kvmi_vcpu_msg_job *job, + const struct kvmi_msg_hdr *msg, + const void *_req) +{ + const struct kvmi_vcpu_get_mtrr_type *req = _req; + struct kvmi_vcpu_get_mtrr_type_reply rpl; + int ec; + + memset(&rpl, 0, sizeof(rpl)); + + ec = kvmi_arch_cmd_vcpu_get_mtrr_type(job->vcpu, req->gpa, &rpl.type); + + return kvmi_msg_vcpu_reply(job, msg, ec, &rpl, sizeof(rpl)); +} + /* * These functions are executed from the vCPU thread. The receiving thread * passes the messages using a newly allocated 'struct kvmi_vcpu_msg_job' @@ -582,6 +597,7 @@ static int(*const msg_vcpu[])(const struct kvmi_vcpu_msg_job *, [KVMI_VCPU_CONTROL_EVENTS] = handle_vcpu_control_events, [KVMI_VCPU_GET_CPUID] = handle_vcpu_get_cpuid, [KVMI_VCPU_GET_INFO] = handle_vcpu_get_info, + [KVMI_VCPU_GET_MTRR_TYPE] = handle_vcpu_get_mtrr_type, [KVMI_VCPU_GET_REGISTERS] = handle_vcpu_get_registers, [KVMI_VCPU_GET_XCR] = handle_vcpu_get_xcr, [KVMI_VCPU_GET_XSAVE] = handle_vcpu_get_xsave,
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 73/84] KVM: introspection: add KVMI_EVENT_DESCRIPTOR
From: Nicu?or C??u <ncitu at bitdefender.com> This event is sent when IDTR, GDTR, LDTR or TR are accessed. These could be used to implement a tiny agent which runs in the context of an introspected guest and uses virtualized exceptions (#VE) and alternate EPT views (VMFUNC #0) to filter converted VMEXITS. The events of interested will be suppressed (after some appropriate guest-side handling) while the rest will be sent to the introspector via a VMCALL. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 41 ++++++++++ arch/x86/include/asm/kvmi_host.h | 3 + arch/x86/include/uapi/asm/kvmi.h | 11 +++ arch/x86/kvm/kvmi.c | 75 +++++++++++++++++++ arch/x86/kvm/svm/svm.c | 33 ++++++++ arch/x86/kvm/vmx/vmx.c | 23 ++++++ include/uapi/linux/kvmi.h | 1 + .../testing/selftests/kvm/x86_64/kvmi_test.c | 75 +++++++++++++++++++ virt/kvm/introspection/kvmi.c | 1 + 9 files changed, 263 insertions(+) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 6de260c09f3b..0294c141eb0a 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -551,6 +551,7 @@ the following events:: KVMI_EVENT_BREAKPOINT KVMI_EVENT_CR + KVMI_EVENT_DESCRIPTOR KVMI_EVENT_HYPERCALL KVMI_EVENT_XSETBV @@ -574,6 +575,8 @@ the *KVMI_VM_CONTROL_EVENTS* command. * -KVM_EINVAL - the event ID is unknown (use *KVMI_VM_CHECK_EVENT* first) * -KVM_EPERM - the access is disallowed (use *KVMI_VM_CHECK_EVENT* first) * -KVM_EAGAIN - the selected vCPU can't be introspected yet +* -KVM_EOPNOTSUPP - the event can't be intercepted in the current setup + (e.g. KVMI_EVENT_DESCRIPTOR with AMD) * -KVM_EBUSY - the event can't be intercepted right now (e.g. KVMI_EVENT_BREAKPOINT if the #BP event is already intercepted by userspace) @@ -1211,3 +1214,41 @@ to be changed and the introspection has been enabled for this event ``kvmi_event``, the extended control register number, the old value and the new value are sent to the introspection tool. + +8. KVMI_EVENT_DESCRIPTOR +------------------------ + +:Architecture: x86 +:Versions: >= 1 +:Actions: CONTINUE, RETRY, CRASH +:Parameters: + +:: + + struct kvmi_event; + struct kvmi_event_descriptor { + __u8 descriptor; + __u8 write; + __u8 padding[6]; + }; + +:Returns: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_event_reply; + +This event is sent when a descriptor table register is accessed and the +introspection has been enabled for this event (see **KVMI_VCPU_CONTROL_EVENTS**). + +``kvmi_event`` and ``kvmi_event_descriptor`` are sent to the introspection tool. + +``descriptor`` can be one of:: + + KVMI_DESC_IDTR + KVMI_DESC_GDTR + KVMI_DESC_LDTR + KVMI_DESC_TR + +``write`` is 1 if the descriptor was written, 0 otherwise. diff --git a/arch/x86/include/asm/kvmi_host.h b/arch/x86/include/asm/kvmi_host.h index aed8a4b88a68..09ebed80a8cc 100644 --- a/arch/x86/include/asm/kvmi_host.h +++ b/arch/x86/include/asm/kvmi_host.h @@ -35,6 +35,7 @@ bool kvmi_cr3_intercepted(struct kvm_vcpu *vcpu); bool kvmi_monitor_cr3w_intercept(struct kvm_vcpu *vcpu, bool enable); void kvmi_xsetbv_event(struct kvm_vcpu *vcpu, u8 xcr, u64 old_value, u64 new_value); +bool kvmi_descriptor_event(struct kvm_vcpu *vcpu, u8 descriptor, bool write); #else /* CONFIG_KVM_INTROSPECTION */ @@ -49,6 +50,8 @@ static inline bool kvmi_monitor_cr3w_intercept(struct kvm_vcpu *vcpu, bool enable) { return false; } static inline void kvmi_xsetbv_event(struct kvm_vcpu *vcpu, u8 xcr, u64 old_value, u64 new_value) { } +static inline bool kvmi_descriptor_event(struct kvm_vcpu *vcpu, u8 descriptor, + bool write) { return true; } #endif /* CONFIG_KVM_INTROSPECTION */ diff --git a/arch/x86/include/uapi/asm/kvmi.h b/arch/x86/include/uapi/asm/kvmi.h index 33edbb5b32f4..b6ff39ba0ab3 100644 --- a/arch/x86/include/uapi/asm/kvmi.h +++ b/arch/x86/include/uapi/asm/kvmi.h @@ -116,4 +116,15 @@ struct kvmi_vcpu_get_mtrr_type_reply { __u8 padding[7]; }; +#define KVMI_DESC_IDTR 1 +#define KVMI_DESC_GDTR 2 +#define KVMI_DESC_LDTR 3 +#define KVMI_DESC_TR 4 + +struct kvmi_event_descriptor { + __u8 descriptor; + __u8 write; + __u8 padding[6]; +}; + #endif /* _UAPI_ASM_X86_KVMI_H */ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index 80ad67e875c4..3ae43a4c8764 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -361,6 +361,21 @@ static void kvmi_arch_disable_cr3w_intercept(struct kvm_vcpu *vcpu) vcpu->arch.kvmi->cr3w.kvm_intercepted = false; } +static int kvmi_control_desc_intercept(struct kvm_vcpu *vcpu, bool enable) +{ + if (!kvm_x86_ops.desc_ctrl_supported()) + return -KVM_EOPNOTSUPP; + + kvm_x86_ops.control_desc_intercept(vcpu, enable); + + return 0; +} + +static void kvmi_arch_disable_desc_intercept(struct kvm_vcpu *vcpu) +{ + kvmi_control_desc_intercept(vcpu, false); +} + int kvmi_arch_cmd_control_intercept(struct kvm_vcpu *vcpu, unsigned int event_id, bool enable) { @@ -370,6 +385,9 @@ int kvmi_arch_cmd_control_intercept(struct kvm_vcpu *vcpu, case KVMI_EVENT_BREAKPOINT: err = kvmi_control_bp_intercept(vcpu, enable); break; + case KVMI_EVENT_DESCRIPTOR: + err = kvmi_control_desc_intercept(vcpu, enable); + break; default: break; } @@ -401,6 +419,7 @@ static void kvmi_arch_restore_interception(struct kvm_vcpu *vcpu) { kvmi_arch_disable_bp_intercept(vcpu); kvmi_arch_disable_cr3w_intercept(vcpu); + kvmi_arch_disable_desc_intercept(vcpu); } bool kvmi_arch_clean_up_interception(struct kvm_vcpu *vcpu) @@ -791,3 +810,59 @@ int kvmi_arch_cmd_vcpu_get_mtrr_type(struct kvm_vcpu *vcpu, u64 gpa, u8 *type) return 0; } + +static u32 kvmi_msg_send_descriptor(struct kvm_vcpu *vcpu, u8 descriptor, + bool write) +{ + struct kvmi_event_descriptor e; + int err, action; + + memset(&e, 0, sizeof(e)); + e.descriptor = descriptor; + e.write = write ? 1 : 0; + + err = kvmi_send_event(vcpu, KVMI_EVENT_DESCRIPTOR, &e, sizeof(e), + NULL, 0, &action); + if (err) + return KVMI_EVENT_ACTION_CONTINUE; + + return action; +} + +static bool __kvmi_descriptor_event(struct kvm_vcpu *vcpu, u8 descriptor, + bool write) +{ + bool ret = false; + u32 action; + + action = kvmi_msg_send_descriptor(vcpu, descriptor, write); + switch (action) { + case KVMI_EVENT_ACTION_CONTINUE: + ret = true; + break; + case KVMI_EVENT_ACTION_RETRY: + break; + default: + kvmi_handle_common_event_actions(vcpu->kvm, action); + } + + return ret; +} + +bool kvmi_descriptor_event(struct kvm_vcpu *vcpu, u8 descriptor, bool write) +{ + struct kvm_introspection *kvmi; + bool ret = true; + + kvmi = kvmi_get(vcpu->kvm); + if (!kvmi) + return true; + + if (is_event_enabled(vcpu, KVMI_EVENT_DESCRIPTOR)) + ret = __kvmi_descriptor_event(vcpu, descriptor, write); + + kvmi_put(vcpu->kvm); + + return ret; +} +EXPORT_SYMBOL(kvmi_descriptor_event); diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index bc886cedf45c..a0b91007e484 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -2326,6 +2326,39 @@ static int descriptor_access_interception(struct vcpu_svm *svm) { struct kvm_vcpu *vcpu = &svm->vcpu; +#ifdef CONFIG_KVM_INTROSPECTION + struct vmcb_control_area *c = &svm->vmcb->control; + bool cont; + + switch (c->exit_code) { + case SVM_EXIT_IDTR_READ: + case SVM_EXIT_IDTR_WRITE: + cont = kvmi_descriptor_event(vcpu, KVMI_DESC_IDTR, + c->exit_code == SVM_EXIT_IDTR_WRITE); + break; + case SVM_EXIT_GDTR_READ: + case SVM_EXIT_GDTR_WRITE: + cont = kvmi_descriptor_event(vcpu, KVMI_DESC_GDTR, + c->exit_code == SVM_EXIT_GDTR_WRITE); + break; + case SVM_EXIT_LDTR_READ: + case SVM_EXIT_LDTR_WRITE: + cont = kvmi_descriptor_event(vcpu, KVMI_DESC_LDTR, + c->exit_code == SVM_EXIT_LDTR_WRITE); + break; + case SVM_EXIT_TR_READ: + case SVM_EXIT_TR_WRITE: + cont = kvmi_descriptor_event(vcpu, KVMI_DESC_TR, + c->exit_code == SVM_EXIT_TR_WRITE); + break; + default: + cont = true; + break; + } + if (!cont) + return 1; +#endif /* CONFIG_KVM_INTROSPECTION */ + return kvm_emulate_instruction(vcpu, 0); } diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 69aa1157a0f0..74bdcd4966ca 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -5019,7 +5019,30 @@ static int handle_set_cr4(struct kvm_vcpu *vcpu, unsigned long val) static int handle_desc(struct kvm_vcpu *vcpu) { +#ifdef CONFIG_KVM_INTROSPECTION + struct vcpu_vmx *vmx = to_vmx(vcpu); + u32 exit_reason = vmx->exit_reason; + u32 vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO); + u8 store = (vmx_instruction_info >> 29) & 0x1; + u8 descriptor = 0; + + if (exit_reason == EXIT_REASON_GDTR_IDTR) { + if ((vmx_instruction_info >> 28) & 0x1) + descriptor = KVMI_DESC_IDTR; + else + descriptor = KVMI_DESC_GDTR; + } else { + if ((vmx_instruction_info >> 28) & 0x1) + descriptor = KVMI_DESC_TR; + else + descriptor = KVMI_DESC_LDTR; + } + + if (!kvmi_descriptor_event(vcpu, descriptor, store)) + return 1; +#else WARN_ON(!(vcpu->arch.cr4 & X86_CR4_UMIP)); +#endif /* CONFIG_KVM_INTROSPECTION */ return kvm_emulate_instruction(vcpu, 0); } diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 207bcaeec05a..f2e802d2b712 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -55,6 +55,7 @@ enum { KVMI_EVENT_CR = 4, KVMI_EVENT_TRAP = 5, KVMI_EVENT_XSETBV = 6, + KVMI_EVENT_DESCRIPTOR = 7, KVMI_NUM_EVENTS }; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 7174e464d759..0087b91574f0 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -55,6 +55,7 @@ enum { GUEST_TEST_NOOP = 0, GUEST_TEST_BP, GUEST_TEST_CR, + GUEST_TEST_DESCRIPTOR, GUEST_TEST_HYPERCALL, GUEST_TEST_XSETBV, }; @@ -84,6 +85,14 @@ static void guest_cr_test(void) set_cr4(get_cr4() | X86_CR4_OSXSAVE); } +static void guest_descriptor_test(void) +{ + void *ptr; + + asm volatile("sgdt %0" :: "m"(ptr)); + asm volatile("lgdt %0" :: "m"(ptr)); +} + static void guest_hypercall_test(void) { asm volatile("mov $34, %rax"); @@ -143,6 +152,9 @@ static void guest_code(void) case GUEST_TEST_CR: guest_cr_test(); break; + case GUEST_TEST_DESCRIPTOR: + guest_descriptor_test(); + break; case GUEST_TEST_HYPERCALL: guest_hypercall_test(); break; @@ -1617,6 +1629,68 @@ static void test_cmd_vcpu_get_mtrr_type(struct kvm_vm *vm) pr_info("mtrr_type: gpa 0x%lx type 0x%x\n", test_gpa, rpl.type); } +static void test_desc_read_access(__u16 event_id) +{ + struct kvmi_msg_hdr hdr; + struct { + struct kvmi_event common; + struct kvmi_event_descriptor desc; + } ev; + struct vcpu_reply rpl = {}; + + receive_event(&hdr, &ev.common, sizeof(ev), event_id); + + pr_info("Descriptor event (read), descriptor %u, write %u\n", + ev.desc.descriptor, ev.desc.write); + + TEST_ASSERT(ev.desc.write == 0, + "Received a write descriptor access\n"); + + reply_to_event(&hdr, &ev.common, KVMI_EVENT_ACTION_CONTINUE, + &rpl, sizeof(rpl)); +} + +static void test_desc_write_access(__u16 event_id) +{ + struct kvmi_msg_hdr hdr; + struct { + struct kvmi_event common; + struct kvmi_event_descriptor desc; + } ev; + struct vcpu_reply rpl = {}; + + receive_event(&hdr, &ev.common, sizeof(ev), event_id); + + pr_info("Descriptor event (write), descriptor %u, write %u\n", + ev.desc.descriptor, ev.desc.write); + + TEST_ASSERT(ev.desc.write == 1, + "Received a read descriptor access\n"); + + reply_to_event(&hdr, &ev.common, KVMI_EVENT_ACTION_CONTINUE, + &rpl, sizeof(rpl)); +} + +static void test_event_descriptor(struct kvm_vm *vm) +{ + struct vcpu_worker_data data = { + .vm = vm, + .vcpu_id = VCPU_ID, + .test_id = GUEST_TEST_DESCRIPTOR, + }; + __u16 event_id = KVMI_EVENT_DESCRIPTOR; + pthread_t vcpu_thread; + + enable_vcpu_event(vm, event_id); + vcpu_thread = start_vcpu_worker(&data); + + test_desc_read_access(event_id); + test_desc_write_access(event_id); + + stop_vcpu_worker(vcpu_thread, &data); + disable_vcpu_event(vm, event_id); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -1647,6 +1721,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vcpu_get_xcr(vm); test_cmd_vcpu_xsave(vm); test_cmd_vcpu_get_mtrr_type(vm); + test_event_descriptor(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index 761d3fccddce..a1bc9570ac1c 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -99,6 +99,7 @@ static void setup_known_events(void) bitmap_zero(Kvmi_known_vcpu_events, KVMI_NUM_EVENTS); set_bit(KVMI_EVENT_BREAKPOINT, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_CR, Kvmi_known_vcpu_events); + set_bit(KVMI_EVENT_DESCRIPTOR, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_HYPERCALL, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_PAUSE_VCPU, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_TRAP, Kvmi_known_vcpu_events);
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 74/84] KVM: introspection: restore the state of descriptor-table register interception on unhook
From: Nicu?or C??u <ncitu at bitdefender.com> This commit also ensures that the introspection tool and the userspace do not disable each other the descriptor-table access VM-exit. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvmi_host.h | 4 +++ arch/x86/kvm/kvmi.c | 45 ++++++++++++++++++++++++++++++++ arch/x86/kvm/svm/svm.c | 3 +++ arch/x86/kvm/vmx/vmx.c | 3 +++ 4 files changed, 55 insertions(+) diff --git a/arch/x86/include/asm/kvmi_host.h b/arch/x86/include/asm/kvmi_host.h index 09ebed80a8cc..0ed1879fd250 100644 --- a/arch/x86/include/asm/kvmi_host.h +++ b/arch/x86/include/asm/kvmi_host.h @@ -17,6 +17,7 @@ struct kvmi_interception { bool restore_interception; struct kvmi_monitor_interception breakpoint; struct kvmi_monitor_interception cr3w; + struct kvmi_monitor_interception descriptor; }; struct kvm_vcpu_arch_introspection { @@ -35,6 +36,7 @@ bool kvmi_cr3_intercepted(struct kvm_vcpu *vcpu); bool kvmi_monitor_cr3w_intercept(struct kvm_vcpu *vcpu, bool enable); void kvmi_xsetbv_event(struct kvm_vcpu *vcpu, u8 xcr, u64 old_value, u64 new_value); +bool kvmi_monitor_desc_intercept(struct kvm_vcpu *vcpu, bool enable); bool kvmi_descriptor_event(struct kvm_vcpu *vcpu, u8 descriptor, bool write); #else /* CONFIG_KVM_INTROSPECTION */ @@ -50,6 +52,8 @@ static inline bool kvmi_monitor_cr3w_intercept(struct kvm_vcpu *vcpu, bool enable) { return false; } static inline void kvmi_xsetbv_event(struct kvm_vcpu *vcpu, u8 xcr, u64 old_value, u64 new_value) { } +static inline bool kvmi_monitor_desc_intercept(struct kvm_vcpu *vcpu, + bool enable) { return false; } static inline bool kvmi_descriptor_event(struct kvm_vcpu *vcpu, u8 descriptor, bool write) { return true; } diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index 3ae43a4c8764..dfe1b887b4f3 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -361,12 +361,52 @@ static void kvmi_arch_disable_cr3w_intercept(struct kvm_vcpu *vcpu) vcpu->arch.kvmi->cr3w.kvm_intercepted = false; } +/* + * Returns true if one side (kvm or kvmi) tries to disable the descriptor + * interception while the other side is still tracking it. + */ +bool kvmi_monitor_desc_intercept(struct kvm_vcpu *vcpu, bool enable) +{ + struct kvmi_interception *arch_vcpui = READ_ONCE(vcpu->arch.kvmi); + + return (arch_vcpui && arch_vcpui->descriptor.monitor_fct(vcpu, enable)); +} +EXPORT_SYMBOL(kvmi_monitor_desc_intercept); + +static bool monitor_desc_fct_kvmi(struct kvm_vcpu *vcpu, bool enable) +{ + vcpu->arch.kvmi->descriptor.kvmi_intercepted = enable; + + if (enable) + vcpu->arch.kvmi->descriptor.kvm_intercepted + kvm_x86_ops.desc_intercepted(vcpu); + else if (vcpu->arch.kvmi->descriptor.kvm_intercepted) + return true; + + return false; +} + +static bool monitor_desc_fct_kvm(struct kvm_vcpu *vcpu, bool enable) +{ + if (!vcpu->arch.kvmi->descriptor.kvmi_intercepted) + return false; + + vcpu->arch.kvmi->descriptor.kvm_intercepted = enable; + + if (!enable) + return true; + + return false; +} + static int kvmi_control_desc_intercept(struct kvm_vcpu *vcpu, bool enable) { if (!kvm_x86_ops.desc_ctrl_supported()) return -KVM_EOPNOTSUPP; + vcpu->arch.kvmi->descriptor.monitor_fct = monitor_desc_fct_kvmi; kvm_x86_ops.control_desc_intercept(vcpu, enable); + vcpu->arch.kvmi->descriptor.monitor_fct = monitor_desc_fct_kvm; return 0; } @@ -374,6 +414,9 @@ static int kvmi_control_desc_intercept(struct kvm_vcpu *vcpu, bool enable) static void kvmi_arch_disable_desc_intercept(struct kvm_vcpu *vcpu) { kvmi_control_desc_intercept(vcpu, false); + + vcpu->arch.kvmi->descriptor.kvmi_intercepted = false; + vcpu->arch.kvmi->descriptor.kvm_intercepted = false; } int kvmi_arch_cmd_control_intercept(struct kvm_vcpu *vcpu, @@ -445,11 +488,13 @@ bool kvmi_arch_vcpu_alloc_interception(struct kvm_vcpu *vcpu) arch_vcpui->breakpoint.monitor_fct = monitor_bp_fct_kvm; arch_vcpui->cr3w.monitor_fct = monitor_cr3w_fct_kvm; + arch_vcpui->descriptor.monitor_fct = monitor_desc_fct_kvm; /* * paired with: * - kvmi_monitor_bp_intercept() * - kvmi_monitor_cr3w_intercept() + * - kvmi_monitor_desc_intercept() */ smp_wmb(); WRITE_ONCE(vcpu->arch.kvmi, arch_vcpui); diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index a0b91007e484..20f6905b45aa 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1555,6 +1555,9 @@ static void svm_control_desc_intercept(struct kvm_vcpu *vcpu, bool enable) { struct vcpu_svm *svm = to_svm(vcpu); + if (kvmi_monitor_desc_intercept(vcpu, enable)) + return; + if (enable) { set_intercept(svm, INTERCEPT_STORE_IDTR); set_intercept(svm, INTERCEPT_STORE_GDTR); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 74bdcd4966ca..8d396a2d2309 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3162,6 +3162,9 @@ static void vmx_control_desc_intercept(struct kvm_vcpu *vcpu, bool enable) { struct vcpu_vmx *vmx = to_vmx(vcpu); + if (kvmi_monitor_desc_intercept(vcpu, enable)) + return; + if (enable) secondary_exec_controls_setbit(vmx, SECONDARY_EXEC_DESC); else
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 75/84] KVM: introspection: add KVMI_VCPU_CONTROL_MSR and KVMI_EVENT_MSR
From: Mihai Don?u <mdontu at bitdefender.com> This command is used to enable/disable introspection for a specific MSR. The KVMI_EVENT_MSR event is sent when the tracked MSR is going to be changed. The introspection tool can respond by allowing the guest to continue with normal execution or by discarding the change. This is meant to prevent malicious changes to MSRs such as MSR_IA32_SYSENTER_EIP. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 75 +++++++++ arch/x86/include/asm/kvmi_host.h | 12 ++ arch/x86/include/uapi/asm/kvmi.h | 18 ++ arch/x86/kvm/kvmi.c | 159 ++++++++++++++++++ arch/x86/kvm/x86.c | 3 + include/uapi/linux/kvmi.h | 2 + .../testing/selftests/kvm/x86_64/kvmi_test.c | 119 +++++++++++++ virt/kvm/introspection/kvmi.c | 1 + virt/kvm/introspection/kvmi_int.h | 2 + virt/kvm/introspection/kvmi_msg.c | 12 ++ 10 files changed, 403 insertions(+) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 0294c141eb0a..536d6ecec026 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -553,6 +553,7 @@ the following events:: KVMI_EVENT_CR KVMI_EVENT_DESCRIPTOR KVMI_EVENT_HYPERCALL + KVMI_EVENT_MSR KVMI_EVENT_XSETBV When an event is enabled, the introspection tool is notified and @@ -937,6 +938,45 @@ Returns the guest memory type for a specific physical address. * -KVM_EINVAL - the padding is not zero * -KVM_EAGAIN - the selected vCPU can't be introspected yet +22. KVMI_VCPU_CONTROL_MSR +------------------------- + +:Architectures: x86 +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_vcpu_control_msr { + __u8 enable; + __u8 padding1; + __u16 padding2; + __u32 msr; + }; + +:Returns: + +:: + + struct kvmi_error_code + +Enables/disables introspection for a specific MSR and must be used +in addition to *KVMI_VCPU_CONTROL_EVENTS* with the *KVMI_EVENT_MSR* ID set. + +Currently, only MSRs within the following two ranges are supported. Trying +to control events for any other register will fail with -KVM_EINVAL:: + + 0 ... 0x00001fff + 0xc0000000 ... 0xc0001fff + +:Errors: + +* -KVM_EINVAL - the selected vCPU is invalid +* -KVM_EINVAL - the specified MSR is invalid +* -KVM_EINVAL - the padding is not zero +* -KVM_EAGAIN - the selected vCPU can't be introspected yet + Events ===== @@ -1252,3 +1292,38 @@ introspection has been enabled for this event (see **KVMI_VCPU_CONTROL_EVENTS**) KVMI_DESC_TR ``write`` is 1 if the descriptor was written, 0 otherwise. + +9. KVMI_EVENT_MSR +----------------- + +:Architectures: x86 +:Versions: >= 1 +:Actions: CONTINUE, CRASH +:Parameters: + +:: + + struct kvmi_event; + struct kvmi_event_msr { + __u32 msr; + __u32 padding; + __u64 old_value; + __u64 new_value; + }; + +:Returns: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_event_reply; + struct kvmi_event_msr_reply { + __u64 new_val; + }; + +This event is sent when a model specific register is going to be changed +and the introspection has been enabled for this event and for this specific +register (see **KVMI_VCPU_CONTROL_EVENTS**). + +``kvmi_event``, the MSR number, the old value and the new value are +sent to the introspection tool. The *CONTINUE* action will set the ``new_val``. diff --git a/arch/x86/include/asm/kvmi_host.h b/arch/x86/include/asm/kvmi_host.h index 0ed1879fd250..5f2967d86b72 100644 --- a/arch/x86/include/asm/kvmi_host.h +++ b/arch/x86/include/asm/kvmi_host.h @@ -4,7 +4,10 @@ #include <asm/kvmi.h> +struct msr_data; + #define KVMI_NUM_CR 5 +#define KVMI_NUM_MSR 0x2000 struct kvmi_monitor_interception { bool kvmi_intercepted; @@ -18,6 +21,12 @@ struct kvmi_interception { struct kvmi_monitor_interception breakpoint; struct kvmi_monitor_interception cr3w; struct kvmi_monitor_interception descriptor; + struct { + struct { + DECLARE_BITMAP(low, KVMI_NUM_MSR); + DECLARE_BITMAP(high, KVMI_NUM_MSR); + } kvmi_mask; + } msrw; }; struct kvm_vcpu_arch_introspection { @@ -38,6 +47,7 @@ void kvmi_xsetbv_event(struct kvm_vcpu *vcpu, u8 xcr, u64 old_value, u64 new_value); bool kvmi_monitor_desc_intercept(struct kvm_vcpu *vcpu, bool enable); bool kvmi_descriptor_event(struct kvm_vcpu *vcpu, u8 descriptor, bool write); +bool kvmi_msr_event(struct kvm_vcpu *vcpu, struct msr_data *msr); #else /* CONFIG_KVM_INTROSPECTION */ @@ -56,6 +66,8 @@ static inline bool kvmi_monitor_desc_intercept(struct kvm_vcpu *vcpu, bool enable) { return false; } static inline bool kvmi_descriptor_event(struct kvm_vcpu *vcpu, u8 descriptor, bool write) { return true; } +static inline bool kvmi_msr_event(struct kvm_vcpu *vcpu, struct msr_data *msr) + { return true; } #endif /* CONFIG_KVM_INTROSPECTION */ diff --git a/arch/x86/include/uapi/asm/kvmi.h b/arch/x86/include/uapi/asm/kvmi.h index b6ff39ba0ab3..1bb13da61dbf 100644 --- a/arch/x86/include/uapi/asm/kvmi.h +++ b/arch/x86/include/uapi/asm/kvmi.h @@ -127,4 +127,22 @@ struct kvmi_event_descriptor { __u8 padding[6]; }; +struct kvmi_vcpu_control_msr { + __u8 enable; + __u8 padding1; + __u16 padding2; + __u32 msr; +}; + +struct kvmi_event_msr { + __u32 msr; + __u32 padding; + __u64 old_value; + __u64 new_value; +}; + +struct kvmi_event_msr_reply { + __u64 new_val; +}; + #endif /* _UAPI_ASM_X86_KVMI_H */ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index dfe1b887b4f3..a48e72f520da 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -419,6 +419,76 @@ static void kvmi_arch_disable_desc_intercept(struct kvm_vcpu *vcpu) vcpu->arch.kvmi->descriptor.kvm_intercepted = false; } +static bool kvmi_msr_valid(unsigned int msr) +{ + return msr <= 0x1fff || (msr >= 0xc0000000 && msr <= 0xc0001fff); +} + +static unsigned long *msr_mask(struct kvm_vcpu *vcpu, unsigned int *msr) +{ + switch (*msr) { + case 0 ... 0x1fff: + return vcpu->arch.kvmi->msrw.kvmi_mask.low; + case 0xc0000000 ... 0xc0001fff: + *msr &= 0x1fff; + return vcpu->arch.kvmi->msrw.kvmi_mask.high; + } + + return NULL; +} + +static bool test_msr_mask(struct kvm_vcpu *vcpu, unsigned int msr) +{ + unsigned long *mask = msr_mask(vcpu, &msr); + + if (!mask) + return false; + + return !!test_bit(msr, mask); +} + +static bool msr_control(struct kvm_vcpu *vcpu, unsigned int msr, bool enable) +{ + unsigned long *mask = msr_mask(vcpu, &msr); + + if (!mask) + return false; + + if (enable) + set_bit(msr, mask); + else + clear_bit(msr, mask); + + return true; +} + +static unsigned int msr_mask_to_base(struct kvm_vcpu *vcpu, unsigned long *mask) +{ + if (mask == vcpu->arch.kvmi->msrw.kvmi_mask.high) + return 0xc0000000; + + return 0; +} + +static void kvmi_arch_disable_msr_intercept(struct kvm_vcpu *vcpu, + unsigned long *mask) +{ + unsigned int msr_base = msr_mask_to_base(vcpu, mask); + int offset = -1; + + for (;;) { + offset = find_next_bit(mask, KVMI_NUM_MSR, offset + 1); + + if (offset >= KVMI_NUM_MSR) + break; + + kvm_x86_ops.control_msr_intercept(vcpu, msr_base + offset, + MSR_TYPE_W, false); + } + + bitmap_zero(mask, KVMI_NUM_MSR); +} + int kvmi_arch_cmd_control_intercept(struct kvm_vcpu *vcpu, unsigned int event_id, bool enable) { @@ -460,9 +530,13 @@ void kvmi_arch_breakpoint_event(struct kvm_vcpu *vcpu, u64 gva, u8 insn_len) static void kvmi_arch_restore_interception(struct kvm_vcpu *vcpu) { + struct kvmi_interception *arch_vcpui = vcpu->arch.kvmi; + kvmi_arch_disable_bp_intercept(vcpu); kvmi_arch_disable_cr3w_intercept(vcpu); kvmi_arch_disable_desc_intercept(vcpu); + kvmi_arch_disable_msr_intercept(vcpu, arch_vcpui->msrw.kvmi_mask.low); + kvmi_arch_disable_msr_intercept(vcpu, arch_vcpui->msrw.kvmi_mask.high); } bool kvmi_arch_clean_up_interception(struct kvm_vcpu *vcpu) @@ -911,3 +985,88 @@ bool kvmi_descriptor_event(struct kvm_vcpu *vcpu, u8 descriptor, bool write) return ret; } EXPORT_SYMBOL(kvmi_descriptor_event); + +int kvmi_arch_cmd_vcpu_control_msr(struct kvm_vcpu *vcpu, + const struct kvmi_vcpu_control_msr *req) +{ + if (req->padding1 || req->padding2 || req->enable > 1) + return -KVM_EINVAL; + + if (!kvmi_msr_valid(req->msr)) + return -KVM_EINVAL; + + kvm_x86_ops.control_msr_intercept(vcpu, req->msr, MSR_TYPE_W, + req->enable == 1); + msr_control(vcpu, req->msr, req->enable); + + return 0; +} + +static u32 kvmi_send_msr(struct kvm_vcpu *vcpu, u32 msr, u64 old_value, + u64 new_value, u64 *ret_value) +{ + struct kvmi_event_msr e; + struct kvmi_event_msr_reply r; + int err, action; + + memset(&e, 0, sizeof(e)); + e.msr = msr; + e.old_value = old_value; + e.new_value = new_value; + + err = kvmi_send_event(vcpu, KVMI_EVENT_MSR, &e, sizeof(e), + &r, sizeof(r), &action); + if (err) + return KVMI_EVENT_ACTION_CONTINUE; + + *ret_value = r.new_val; + return action; +} + +static bool __kvmi_msr_event(struct kvm_vcpu *vcpu, struct msr_data *msr) +{ + struct msr_data old_msr = { + .host_initiated = true, + .index = msr->index, + }; + bool ret = false; + u64 ret_value = msr->data; + u32 action; + + if (!test_msr_mask(vcpu, msr->index)) + return true; + if (kvm_x86_ops.get_msr(vcpu, &old_msr)) + return true; + if (old_msr.data == msr->data) + return true; + + action = kvmi_send_msr(vcpu, msr->index, old_msr.data, msr->data, + &ret_value); + switch (action) { + case KVMI_EVENT_ACTION_CONTINUE: + msr->data = ret_value; + ret = true; + break; + default: + kvmi_handle_common_event_actions(vcpu->kvm, action); + } + + return ret; +} + +bool kvmi_msr_event(struct kvm_vcpu *vcpu, struct msr_data *msr) +{ + struct kvm_introspection *kvmi; + bool ret = true; + + kvmi = kvmi_get(vcpu->kvm); + if (!kvmi) + return true; + + if (is_event_enabled(vcpu, KVMI_EVENT_MSR)) + ret = __kvmi_msr_event(vcpu, msr); + + kvmi_put(vcpu->kvm); + + return ret; +} diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index c3557a11817f..0add0b0b8f2d 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1530,6 +1530,9 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data, msr.index = index; msr.host_initiated = host_initiated; + if (!host_initiated && !kvmi_msr_event(vcpu, &msr)) + return 1; + return kvm_x86_ops.set_msr(vcpu, &msr); } diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index f2e802d2b712..2872f90ff092 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -43,6 +43,7 @@ enum { KVMI_VCPU_GET_XSAVE = 19, KVMI_VCPU_SET_XSAVE = 20, KVMI_VCPU_GET_MTRR_TYPE = 21, + KVMI_VCPU_CONTROL_MSR = 22, KVMI_NUM_MESSAGES }; @@ -56,6 +57,7 @@ enum { KVMI_EVENT_TRAP = 5, KVMI_EVENT_XSETBV = 6, KVMI_EVENT_DESCRIPTOR = 7, + KVMI_EVENT_MSR = 8, KVMI_NUM_EVENTS }; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 0087b91574f0..0ddc0029e3e5 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -57,6 +57,7 @@ enum { GUEST_TEST_CR, GUEST_TEST_DESCRIPTOR, GUEST_TEST_HYPERCALL, + GUEST_TEST_MSR, GUEST_TEST_XSETBV, }; @@ -101,6 +102,15 @@ static void guest_hypercall_test(void) asm volatile(".byte 0x0f,0x01,0xc1"); } +static void guest_msr_test(void) +{ + uint64_t msr; + + msr = rdmsr(MSR_MISC_FEATURES_ENABLES); + msr |= 1; /* MSR_MISC_FEATURES_ENABLES_CPUID_FAULT */ + wrmsr(MSR_MISC_FEATURES_ENABLES, msr); +} + /* from fpu/internal.h */ static u64 xgetbv(u32 index) { @@ -158,6 +168,9 @@ static void guest_code(void) case GUEST_TEST_HYPERCALL: guest_hypercall_test(); break; + case GUEST_TEST_MSR: + guest_msr_test(); + break; case GUEST_TEST_XSETBV: guest_xsetbv_test(); break; @@ -1691,6 +1704,111 @@ static void test_event_descriptor(struct kvm_vm *vm) disable_vcpu_event(vm, event_id); } +static void cmd_control_msr(struct kvm_vm *vm, __u32 msr, __u8 enable, + __u8 padding, int expected_err) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvmi_vcpu_control_msr cmd; + } req = {}; + int r; + + req.cmd.msr = msr; + req.cmd.enable = enable; + req.cmd.padding1 = padding; + req.cmd.padding2 = padding; + + r = do_vcpu0_command(vm, KVMI_VCPU_CONTROL_MSR, &req.hdr, sizeof(req), + NULL, 0); + TEST_ASSERT(r == expected_err, + "KVMI_VCPU_CONTROL_MSR failed, error %d(%s), expected error %d\n", + -r, kvm_strerror(-r), expected_err); +} + +static void enable_msr_events(struct kvm_vm *vm, __u32 msr) +{ + enable_vcpu_event(vm, KVMI_EVENT_MSR); + cmd_control_msr(vm, msr, 1, 0, 0); +} + +static void disable_msr_events(struct kvm_vm *vm, __u32 msr) +{ + cmd_control_msr(vm, msr, 0, 0, 0); + disable_vcpu_event(vm, KVMI_EVENT_MSR); +} + +static void handle_msr_event(struct kvm_vm *vm, __u16 event_id, __u32 msr, + __u64 *old_value) +{ + struct kvmi_msg_hdr hdr; + struct { + struct kvmi_event common; + struct kvmi_event_msr msr; + } ev; + struct { + struct vcpu_reply common; + struct kvmi_event_msr_reply msr; + } rpl = {}; + + receive_event(&hdr, &ev.common, sizeof(ev), event_id); + + pr_info("MSR 0x%x, old 0x%llx, new 0x%llx\n", + ev.msr.msr, ev.msr.old_value, ev.msr.new_value); + + TEST_ASSERT(ev.msr.msr == msr, + "Unexpected MSR event, received MSR 0x%x, expected MSR 0x%x", + ev.msr.msr, msr); + + *old_value = rpl.msr.new_val = ev.msr.old_value; + + reply_to_event(&hdr, &ev.common, KVMI_EVENT_ACTION_CONTINUE, + &rpl.common, sizeof(rpl)); +} + +static void test_invalid_control_msr(struct kvm_vm *vm, __u32 msr) +{ + __u8 enable = 1, enable_inval = 2; + __u8 no_padding = 0, padding = 1; + int expected_err = -KVM_EINVAL; + __u32 msr_inval = -1; + + cmd_control_msr(vm, msr, enable_inval, no_padding, expected_err); + cmd_control_msr(vm, msr, enable, padding, expected_err); + cmd_control_msr(vm, msr_inval, enable, no_padding, expected_err); +} + +static void test_cmd_vcpu_control_msr(struct kvm_vm *vm) +{ + struct vcpu_worker_data data = { + .vm = vm, + .vcpu_id = VCPU_ID, + .test_id = GUEST_TEST_MSR, + }; + __u16 event_id = KVMI_EVENT_MSR; + __u32 msr = MSR_MISC_FEATURES_ENABLES; + pthread_t vcpu_thread; + uint64_t msr_data; + __u64 old_value; + + enable_msr_events(vm, msr); + + vcpu_thread = start_vcpu_worker(&data); + + handle_msr_event(vm, event_id, msr, &old_value); + + stop_vcpu_worker(vcpu_thread, &data); + + disable_msr_events(vm, msr); + + msr_data = vcpu_get_msr(vm, VCPU_ID, msr); + TEST_ASSERT(msr_data == old_value, + "Failed to block MSR 0x%x update, value 0x%lx, expected 0x%llx", + msr, msr_data, old_value); + + test_invalid_control_msr(vm, msr); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -1722,6 +1840,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vcpu_xsave(vm); test_cmd_vcpu_get_mtrr_type(vm); test_event_descriptor(vm); + test_cmd_vcpu_control_msr(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index a1bc9570ac1c..aa8a0708cc1c 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -101,6 +101,7 @@ static void setup_known_events(void) set_bit(KVMI_EVENT_CR, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_DESCRIPTOR, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_HYPERCALL, Kvmi_known_vcpu_events); + set_bit(KVMI_EVENT_MSR, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_PAUSE_VCPU, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_TRAP, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_XSETBV, Kvmi_known_vcpu_events); diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index a9664d255d6d..8f8557bd8ab0 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -115,5 +115,7 @@ int kvmi_arch_cmd_vcpu_set_xsave(struct kvm_vcpu *vcpu, const struct kvmi_vcpu_set_xsave *req, size_t req_size); int kvmi_arch_cmd_vcpu_get_mtrr_type(struct kvm_vcpu *vcpu, u64 gpa, u8 *type); +int kvmi_arch_cmd_vcpu_control_msr(struct kvm_vcpu *vcpu, + const struct kvmi_vcpu_control_msr *req); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 36819b54b3a0..d9cbbefb9993 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -584,6 +584,17 @@ static int handle_vcpu_get_mtrr_type(const struct kvmi_vcpu_msg_job *job, return kvmi_msg_vcpu_reply(job, msg, ec, &rpl, sizeof(rpl)); } +static int handle_vcpu_control_msr(const struct kvmi_vcpu_msg_job *job, + const struct kvmi_msg_hdr *msg, + const void *req) +{ + int ec; + + ec = kvmi_arch_cmd_vcpu_control_msr(job->vcpu, req); + + return kvmi_msg_vcpu_reply(job, msg, ec, NULL, 0); +} + /* * These functions are executed from the vCPU thread. The receiving thread * passes the messages using a newly allocated 'struct kvmi_vcpu_msg_job' @@ -595,6 +606,7 @@ static int(*const msg_vcpu[])(const struct kvmi_vcpu_msg_job *, [KVMI_EVENT] = handle_vcpu_event_reply, [KVMI_VCPU_CONTROL_CR] = handle_vcpu_control_cr, [KVMI_VCPU_CONTROL_EVENTS] = handle_vcpu_control_events, + [KVMI_VCPU_CONTROL_MSR] = handle_vcpu_control_msr, [KVMI_VCPU_GET_CPUID] = handle_vcpu_get_cpuid, [KVMI_VCPU_GET_INFO] = handle_vcpu_get_info, [KVMI_VCPU_GET_MTRR_TYPE] = handle_vcpu_get_mtrr_type,
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 76/84] KVM: introspection: restore the state of MSR interception on unhook
From: Nicu?or C??u <ncitu at bitdefender.com> This commit also ensures that the introspection tool and the userspace do not disable each other the MSR access VM-exit. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvmi_host.h | 12 +++ arch/x86/kvm/kvmi.c | 133 +++++++++++++++++++++++++++---- arch/x86/kvm/svm/svm.c | 11 +++ arch/x86/kvm/vmx/vmx.c | 11 +++ 4 files changed, 150 insertions(+), 17 deletions(-) diff --git a/arch/x86/include/asm/kvmi_host.h b/arch/x86/include/asm/kvmi_host.h index 5f2967d86b72..acc003403c95 100644 --- a/arch/x86/include/asm/kvmi_host.h +++ b/arch/x86/include/asm/kvmi_host.h @@ -26,6 +26,12 @@ struct kvmi_interception { DECLARE_BITMAP(low, KVMI_NUM_MSR); DECLARE_BITMAP(high, KVMI_NUM_MSR); } kvmi_mask; + struct { + DECLARE_BITMAP(low, KVMI_NUM_MSR); + DECLARE_BITMAP(high, KVMI_NUM_MSR); + } kvm_mask; + bool (*monitor_fct)(struct kvm_vcpu *vcpu, u32 msr, + bool enable); } msrw; }; @@ -48,6 +54,8 @@ void kvmi_xsetbv_event(struct kvm_vcpu *vcpu, u8 xcr, bool kvmi_monitor_desc_intercept(struct kvm_vcpu *vcpu, bool enable); bool kvmi_descriptor_event(struct kvm_vcpu *vcpu, u8 descriptor, bool write); bool kvmi_msr_event(struct kvm_vcpu *vcpu, struct msr_data *msr); +bool kvmi_monitor_msrw_intercept(struct kvm_vcpu *vcpu, u32 msr, bool enable); +bool kvmi_msrw_intercept_originator(struct kvm_vcpu *vcpu); #else /* CONFIG_KVM_INTROSPECTION */ @@ -68,6 +76,10 @@ static inline bool kvmi_descriptor_event(struct kvm_vcpu *vcpu, u8 descriptor, bool write) { return true; } static inline bool kvmi_msr_event(struct kvm_vcpu *vcpu, struct msr_data *msr) { return true; } +static inline bool kvmi_monitor_msrw_intercept(struct kvm_vcpu *vcpu, u32 msr, + bool enable) { return false; } +static inline bool kvmi_msrw_intercept_originator(struct kvm_vcpu *vcpu) + { return false; } #endif /* CONFIG_KVM_INTROSPECTION */ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index a48e72f520da..0b1301ebafba 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -424,22 +424,25 @@ static bool kvmi_msr_valid(unsigned int msr) return msr <= 0x1fff || (msr >= 0xc0000000 && msr <= 0xc0001fff); } -static unsigned long *msr_mask(struct kvm_vcpu *vcpu, unsigned int *msr) +static unsigned long *msr_mask(struct kvm_vcpu *vcpu, unsigned int *msr, + bool kvmi) { switch (*msr) { case 0 ... 0x1fff: - return vcpu->arch.kvmi->msrw.kvmi_mask.low; + return kvmi ? vcpu->arch.kvmi->msrw.kvmi_mask.low : + vcpu->arch.kvmi->msrw.kvm_mask.low; case 0xc0000000 ... 0xc0001fff: *msr &= 0x1fff; - return vcpu->arch.kvmi->msrw.kvmi_mask.high; + return kvmi ? vcpu->arch.kvmi->msrw.kvmi_mask.high : + vcpu->arch.kvmi->msrw.kvm_mask.high; } return NULL; } -static bool test_msr_mask(struct kvm_vcpu *vcpu, unsigned int msr) +static bool test_msr_mask(struct kvm_vcpu *vcpu, unsigned int msr, bool kvmi) { - unsigned long *mask = msr_mask(vcpu, &msr); + unsigned long *mask = msr_mask(vcpu, &msr, kvmi); if (!mask) return false; @@ -447,9 +450,27 @@ static bool test_msr_mask(struct kvm_vcpu *vcpu, unsigned int msr) return !!test_bit(msr, mask); } -static bool msr_control(struct kvm_vcpu *vcpu, unsigned int msr, bool enable) +/* + * Returns true if one side (kvm or kvmi) tries to disable the MSR write + * interception while the other side is still tracking it. + */ +bool kvmi_monitor_msrw_intercept(struct kvm_vcpu *vcpu, u32 msr, bool enable) { - unsigned long *mask = msr_mask(vcpu, &msr); + struct kvmi_interception *arch_vcpui; + + if (!vcpu) + return false; + + arch_vcpui = READ_ONCE(vcpu->arch.kvmi); + + return (arch_vcpui && arch_vcpui->msrw.monitor_fct(vcpu, msr, enable)); +} +EXPORT_SYMBOL(kvmi_monitor_msrw_intercept); + +static bool msr_control(struct kvm_vcpu *vcpu, unsigned int msr, bool enable, + bool kvmi) +{ + unsigned long *mask = msr_mask(vcpu, &msr, kvmi); if (!mask) return false; @@ -462,6 +483,63 @@ static bool msr_control(struct kvm_vcpu *vcpu, unsigned int msr, bool enable) return true; } +static bool msr_intercepted_by_kvmi(struct kvm_vcpu *vcpu, u32 msr) +{ + return test_msr_mask(vcpu, msr, true); +} + +static bool msr_intercepted_by_kvm(struct kvm_vcpu *vcpu, u32 msr) +{ + return test_msr_mask(vcpu, msr, false); +} + +static void record_msr_intercept_status_for_kvmi(struct kvm_vcpu *vcpu, u32 msr, + bool enable) +{ + msr_control(vcpu, msr, enable, true); +} + +static void record_msr_intercept_status_for_kvm(struct kvm_vcpu *vcpu, u32 msr, + bool enable) +{ + msr_control(vcpu, msr, enable, false); +} + +static bool monitor_msrw_fct_kvmi(struct kvm_vcpu *vcpu, u32 msr, bool enable) +{ + bool ret = false; + + if (enable) { + if (kvm_x86_ops.msr_write_intercepted(vcpu, msr)) + record_msr_intercept_status_for_kvm(vcpu, msr, true); + } else { + if (unlikely(!msr_intercepted_by_kvmi(vcpu, msr))) + ret = true; + + if (msr_intercepted_by_kvm(vcpu, msr)) + ret = true; + } + + record_msr_intercept_status_for_kvmi(vcpu, msr, enable); + + return ret; +} + +static bool monitor_msrw_fct_kvm(struct kvm_vcpu *vcpu, u32 msr, bool enable) +{ + bool ret = false; + + if (!(msr_intercepted_by_kvmi(vcpu, msr))) + return false; + + if (!enable) + ret = true; + + record_msr_intercept_status_for_kvm(vcpu, msr, enable); + + return ret; +} + static unsigned int msr_mask_to_base(struct kvm_vcpu *vcpu, unsigned long *mask) { if (mask == vcpu->arch.kvmi->msrw.kvmi_mask.high) @@ -470,8 +548,16 @@ static unsigned int msr_mask_to_base(struct kvm_vcpu *vcpu, unsigned long *mask) return 0; } -static void kvmi_arch_disable_msr_intercept(struct kvm_vcpu *vcpu, - unsigned long *mask) +static void kvmi_control_msrw_intercept(struct kvm_vcpu *vcpu, u32 msr, + bool enable) +{ + vcpu->arch.kvmi->msrw.monitor_fct = monitor_msrw_fct_kvmi; + kvm_x86_ops.control_msr_intercept(vcpu, msr, MSR_TYPE_W, enable); + vcpu->arch.kvmi->msrw.monitor_fct = monitor_msrw_fct_kvm; +} + +static void kvmi_arch_disable_msrw_intercept(struct kvm_vcpu *vcpu, + unsigned long *mask) { unsigned int msr_base = msr_mask_to_base(vcpu, mask); int offset = -1; @@ -482,8 +568,7 @@ static void kvmi_arch_disable_msr_intercept(struct kvm_vcpu *vcpu, if (offset >= KVMI_NUM_MSR) break; - kvm_x86_ops.control_msr_intercept(vcpu, msr_base + offset, - MSR_TYPE_W, false); + kvmi_control_msrw_intercept(vcpu, msr_base + offset, false); } bitmap_zero(mask, KVMI_NUM_MSR); @@ -535,8 +620,8 @@ static void kvmi_arch_restore_interception(struct kvm_vcpu *vcpu) kvmi_arch_disable_bp_intercept(vcpu); kvmi_arch_disable_cr3w_intercept(vcpu); kvmi_arch_disable_desc_intercept(vcpu); - kvmi_arch_disable_msr_intercept(vcpu, arch_vcpui->msrw.kvmi_mask.low); - kvmi_arch_disable_msr_intercept(vcpu, arch_vcpui->msrw.kvmi_mask.high); + kvmi_arch_disable_msrw_intercept(vcpu, arch_vcpui->msrw.kvmi_mask.low); + kvmi_arch_disable_msrw_intercept(vcpu, arch_vcpui->msrw.kvmi_mask.high); } bool kvmi_arch_clean_up_interception(struct kvm_vcpu *vcpu) @@ -563,12 +648,14 @@ bool kvmi_arch_vcpu_alloc_interception(struct kvm_vcpu *vcpu) arch_vcpui->breakpoint.monitor_fct = monitor_bp_fct_kvm; arch_vcpui->cr3w.monitor_fct = monitor_cr3w_fct_kvm; arch_vcpui->descriptor.monitor_fct = monitor_desc_fct_kvm; + arch_vcpui->msrw.monitor_fct = monitor_msrw_fct_kvm; /* * paired with: * - kvmi_monitor_bp_intercept() * - kvmi_monitor_cr3w_intercept() * - kvmi_monitor_desc_intercept() + * - kvmi_monitor_msrw_intercept() */ smp_wmb(); WRITE_ONCE(vcpu->arch.kvmi, arch_vcpui); @@ -986,6 +1073,20 @@ bool kvmi_descriptor_event(struct kvm_vcpu *vcpu, u8 descriptor, bool write) } EXPORT_SYMBOL(kvmi_descriptor_event); +bool kvmi_msrw_intercept_originator(struct kvm_vcpu *vcpu) +{ + struct kvmi_interception *arch_vcpui; + + if (!vcpu) + return false; + + arch_vcpui = READ_ONCE(vcpu->arch.kvmi); + + return (arch_vcpui && + arch_vcpui->msrw.monitor_fct == monitor_msrw_fct_kvmi); +} +EXPORT_SYMBOL(kvmi_msrw_intercept_originator); + int kvmi_arch_cmd_vcpu_control_msr(struct kvm_vcpu *vcpu, const struct kvmi_vcpu_control_msr *req) { @@ -995,9 +1096,7 @@ int kvmi_arch_cmd_vcpu_control_msr(struct kvm_vcpu *vcpu, if (!kvmi_msr_valid(req->msr)) return -KVM_EINVAL; - kvm_x86_ops.control_msr_intercept(vcpu, req->msr, MSR_TYPE_W, - req->enable == 1); - msr_control(vcpu, req->msr, req->enable); + kvmi_control_msrw_intercept(vcpu, req->msr, req->enable); return 0; } @@ -1033,7 +1132,7 @@ static bool __kvmi_msr_event(struct kvm_vcpu *vcpu, struct msr_data *msr) u64 ret_value = msr->data; u32 action; - if (!test_msr_mask(vcpu, msr->index)) + if (!test_msr_mask(vcpu, msr->index, true)) return true; if (kvm_x86_ops.get_msr(vcpu, &old_msr)) return true; diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 20f6905b45aa..5c2d4a0c3d31 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -592,6 +592,17 @@ static void set_msr_interception(struct kvm_vcpu *vcpu, unsigned long tmp; u32 offset; +#ifdef CONFIG_KVM_INTROSPECTION + if ((type & MSR_TYPE_W) && + kvmi_monitor_msrw_intercept(vcpu, msr, !value)) + type &= ~MSR_TYPE_W; + + /* + * Avoid the below warning for kvmi intercepted msrs. + */ + if (!kvmi_msrw_intercept_originator(vcpu)) +#endif /* CONFIG_KVM_INTROSPECTION */ + /* * If this warning triggers extend the direct_access_msrs list at the * beginning of the file diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 8d396a2d2309..f57587ddb3be 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3752,6 +3752,12 @@ static __always_inline void vmx_disable_intercept_for_msr(struct kvm_vcpu *vcpu, if (!cpu_has_vmx_msr_bitmap()) return; +#ifdef CONFIG_KVM_INTROSPECTION + if ((type & MSR_TYPE_W) && + kvmi_monitor_msrw_intercept(vcpu, msr, false)) + type &= ~MSR_TYPE_W; +#endif /* CONFIG_KVM_INTROSPECTION */ + if (static_branch_unlikely(&enable_evmcs)) evmcs_touch_msr_bitmap(); @@ -3791,6 +3797,11 @@ static __always_inline void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, if (!cpu_has_vmx_msr_bitmap()) return; +#ifdef CONFIG_KVM_INTROSPECTION + if (type & MSR_TYPE_W) + kvmi_monitor_msrw_intercept(vcpu, msr, true); +#endif /* CONFIG_KVM_INTROSPECTION */ + if (static_branch_unlikely(&enable_evmcs)) evmcs_touch_msr_bitmap();
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 77/84] KVM: introspection: add KVMI_VM_SET_PAGE_ACCESS
From: Mihai Don?u <mdontu at bitdefender.com> This command sets the spte access bits (rwx) for an array of guest physical addresses (through the page tracking subsystem). These GPAs, with the requested access bits, are also kept in a radix tree in order to filter out the #PF events which are of no interest to the introspection tool. The access restrictions for pages that are not visible to the guest are silently ignored by default (the tool might set restrictions for the whole memory, based on KVMI_VM_GET_MAX_GFN). Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 60 ++++++ arch/x86/include/asm/kvm_host.h | 2 + arch/x86/include/asm/kvmi_host.h | 7 + arch/x86/kvm/kvmi.c | 40 ++++ include/linux/kvmi_host.h | 3 + include/uapi/linux/kvmi.h | 23 +++ .../testing/selftests/kvm/x86_64/kvmi_test.c | 52 ++++++ virt/kvm/introspection/kvmi.c | 174 +++++++++++++++++- virt/kvm/introspection/kvmi_int.h | 12 ++ virt/kvm/introspection/kvmi_msg.c | 12 ++ 10 files changed, 384 insertions(+), 1 deletion(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 536d6ecec026..123b2360d2e0 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -977,6 +977,66 @@ to control events for any other register will fail with -KVM_EINVAL:: * -KVM_EINVAL - the padding is not zero * -KVM_EAGAIN - the selected vCPU can't be introspected yet +23. KVMI_VM_SET_PAGE_ACCESS +--------------------------- + +:Architectures: x86 +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vm_set_page_access { + __u16 count; + __u16 padding1; + __u32 padding2; + struct kvmi_page_access_entry entries[0]; + }; + +where:: + + struct kvmi_page_access_entry { + __u64 gpa; + __u8 access; + __u8 visible; + __u16 padding1; + __u32 padding2; + }; + + +:Returns: + +:: + + struct kvmi_error_code + +Sets the access bits (rwx) for an array of ``count`` guest physical +addresses. + +The valid access bits are:: + + KVMI_PAGE_ACCESS_R + KVMI_PAGE_ACCESS_W + KVMI_PAGE_ACCESS_X + + +The command will fail with -KVM_EINVAL if any of the specified combination +of access bits is not supported. Also, if ``visible`` is set to 1 but +the page is not visible. + +The command will try to apply all changes and return the first error if +some failed. The introspection tool should handle the rollback. + +In order to 'forget' an address, all three bits ('rwx') must be set. + +:Errors: + +* -KVM_EINVAL - the specified access bits combination is invalid +* -KVM_EINVAL - the padding is not zero +* -KVM_EINVAL - the message size is invalid +* -KVM_EAGAIN - the selected vCPU can't be introspected yet +* -KVM_ENOMEM - there is not enough memory to add the page tracking structures + Events ===== diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index acfcebce51dd..d96bf0e15ea2 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -45,6 +45,8 @@ #define KVM_PRIVATE_MEM_SLOTS 3 #define KVM_MEM_SLOTS_NUM (KVM_USER_MEM_SLOTS + KVM_PRIVATE_MEM_SLOTS) +#include <asm/kvmi_host.h> + #define KVM_HALT_POLL_NS_DEFAULT 200000 #define KVM_IRQCHIP_NUM_PINS KVM_IOAPIC_NUM_PINS diff --git a/arch/x86/include/asm/kvmi_host.h b/arch/x86/include/asm/kvmi_host.h index acc003403c95..98ea548c0b15 100644 --- a/arch/x86/include/asm/kvmi_host.h +++ b/arch/x86/include/asm/kvmi_host.h @@ -2,6 +2,7 @@ #ifndef _ASM_X86_KVMI_HOST_H #define _ASM_X86_KVMI_HOST_H +#include <asm/kvm_page_track.h> #include <asm/kvmi.h> struct msr_data; @@ -42,6 +43,12 @@ struct kvm_vcpu_arch_introspection { struct kvm_arch_introspection { }; +#define SLOTS_SIZE BITS_TO_LONGS(KVM_MEM_SLOTS_NUM) + +struct kvmi_arch_mem_access { + unsigned long active[KVM_PAGE_TRACK_MAX][SLOTS_SIZE]; +}; + #ifdef CONFIG_KVM_INTROSPECTION bool kvmi_monitor_bp_intercept(struct kvm_vcpu *vcpu, u32 dbg); diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index 0b1301ebafba..b233a3c5becb 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -1169,3 +1169,43 @@ bool kvmi_msr_event(struct kvm_vcpu *vcpu, struct msr_data *msr) return ret; } + +static const struct { + unsigned int allow_bit; + enum kvm_page_track_mode track_mode; +} track_modes[] = { + { KVMI_PAGE_ACCESS_R, KVM_PAGE_TRACK_PREREAD }, + { KVMI_PAGE_ACCESS_W, KVM_PAGE_TRACK_PREWRITE }, + { KVMI_PAGE_ACCESS_X, KVM_PAGE_TRACK_PREEXEC }, +}; + +void kvmi_arch_update_page_tracking(struct kvm *kvm, + struct kvm_memory_slot *slot, + struct kvmi_mem_access *m) +{ + struct kvmi_arch_mem_access *arch = &m->arch; + int i; + + if (!slot) { + slot = gfn_to_memslot(kvm, m->gfn); + if (!slot) + return; + } + + for (i = 0; i < ARRAY_SIZE(track_modes); i++) { + unsigned int allow_bit = track_modes[i].allow_bit; + enum kvm_page_track_mode mode = track_modes[i].track_mode; + bool slot_tracked = test_bit(slot->id, arch->active[mode]); + + if (m->access & allow_bit) { + if (slot_tracked) { + kvm_slot_page_track_remove_page(kvm, slot, + m->gfn, mode); + clear_bit(slot->id, arch->active[mode]); + } + } else if (!slot_tracked) { + kvm_slot_page_track_add_page(kvm, slot, m->gfn, mode); + set_bit(slot->id, arch->active[mode]); + } + } +} diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index 1fae589d9d35..11eb9b1c3c5e 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -64,6 +64,9 @@ struct kvm_introspection { atomic_t ev_seq; bool cleanup_on_unhook; + + struct radix_tree_root access_tree; + rwlock_t access_tree_lock; }; int kvmi_version(void); diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 2872f90ff092..dc82f192534c 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -45,6 +45,8 @@ enum { KVMI_VCPU_GET_MTRR_TYPE = 21, KVMI_VCPU_CONTROL_MSR = 22, + KVMI_VM_SET_PAGE_ACCESS = 23, + KVMI_NUM_MESSAGES }; @@ -68,6 +70,12 @@ enum { KVMI_EVENT_ACTION_CRASH = 2, }; +enum { + KVMI_PAGE_ACCESS_R = 1 << 0, + KVMI_PAGE_ACCESS_W = 1 << 1, + KVMI_PAGE_ACCESS_X = 1 << 2, +}; + struct kvmi_msg_hdr { __u16 id; __u16 size; @@ -164,6 +172,21 @@ struct kvmi_vm_get_max_gfn_reply { __u64 gfn; }; +struct kvmi_page_access_entry { + __u64 gpa; + __u8 access; + __u8 visible; + __u16 padding1; + __u32 padding2; +}; + +struct kvmi_vm_set_page_access { + __u16 count; + __u16 padding1; + __u32 padding2; + struct kvmi_page_access_entry entries[0]; +}; + struct kvmi_event { __u16 size; __u16 vcpu; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 0ddc0029e3e5..4fb109cec1b4 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -1809,6 +1809,57 @@ static void test_cmd_vcpu_control_msr(struct kvm_vm *vm) test_invalid_control_msr(vm, msr); } +static int cmd_set_page_access(__u16 count, __u64 *gpa, __u8 *access) +{ + struct kvmi_page_access_entry *entry, *end; + struct kvmi_vm_set_page_access *cmd; + struct kvmi_msg_hdr *req; + size_t req_size; + int r; + + req_size = sizeof(*req) + sizeof(*cmd) + count * sizeof(*entry); + req = calloc(1, req_size); + + TEST_ASSERT(req, "Insufficient Memory\n"); + + cmd = (struct kvmi_vm_set_page_access *)(req + 1); + cmd->count = count; + + entry = cmd->entries; + end = cmd->entries + count; + for (; entry < end; entry++) { + entry->gpa = *gpa++; + entry->access = *access++; + } + + r = do_command(KVMI_VM_SET_PAGE_ACCESS, req, req_size, NULL, 0); + + free(req); + return r; +} + +static void set_page_access(__u64 gpa, __u8 access) +{ + int r; + + r = cmd_set_page_access(1, &gpa, &access); + TEST_ASSERT(r == 0, + "KVMI_VM_SET_PAGE_ACCESS failed, gpa 0x%llx, access 0x%x, error %d (%s)\n", + gpa, access, -r, kvm_strerror(-r)); +} + +static void test_cmd_vm_set_page_access(struct kvm_vm *vm) +{ + __u8 full_access = KVMI_PAGE_ACCESS_R | KVMI_PAGE_ACCESS_W + | KVMI_PAGE_ACCESS_X; + __u8 no_access = 0; + __u64 gpa = 0; + + set_page_access(gpa, no_access); + + set_page_access(gpa, full_access); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -1841,6 +1892,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vcpu_get_mtrr_type(vm); test_event_descriptor(vm); test_cmd_vcpu_control_msr(vm); + test_cmd_vm_set_page_access(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index aa8a0708cc1c..b57ad490dd06 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -21,6 +21,11 @@ static DECLARE_BITMAP(Kvmi_known_vcpu_events, KVMI_NUM_EVENTS); static struct kmem_cache *msg_cache; static struct kmem_cache *job_cache; +static struct kmem_cache *radix_cache; + +static const u8 full_access = KVMI_PAGE_ACCESS_R | + KVMI_PAGE_ACCESS_W | + KVMI_PAGE_ACCESS_X; void *kvmi_msg_alloc(void) { @@ -39,6 +44,8 @@ static void kvmi_cache_destroy(void) msg_cache = NULL; kmem_cache_destroy(job_cache); job_cache = NULL; + kmem_cache_destroy(radix_cache); + radix_cache = NULL; } static int kvmi_cache_create(void) @@ -48,8 +55,11 @@ static int kvmi_cache_create(void) job_cache = kmem_cache_create("kvmi_job", sizeof(struct kvmi_job), 0, SLAB_ACCOUNT, NULL); + radix_cache = kmem_cache_create("kvmi_radix_tree", + sizeof(struct kvmi_mem_access), + 0, SLAB_ACCOUNT, NULL); - if (!msg_cache || !job_cache) { + if (!msg_cache || !job_cache || !radix_cache) { kvmi_cache_destroy(); return -1; @@ -241,12 +251,38 @@ static void free_vcpui(struct kvm_vcpu *vcpu, bool restore_interception) kvmi_make_request(vcpu, false); } +static void kvmi_clear_mem_access(struct kvm *kvm) +{ + struct kvm_introspection *kvmi = KVMI(kvm); + struct radix_tree_iter iter; + void **slot; + int idx; + + idx = srcu_read_lock(&kvm->srcu); + spin_lock(&kvm->mmu_lock); + + radix_tree_for_each_slot(slot, &kvmi->access_tree, &iter, 0) { + struct kvmi_mem_access *m = *slot; + + m->access = full_access; + kvmi_arch_update_page_tracking(kvm, NULL, m); + + radix_tree_iter_delete(&kvmi->access_tree, &iter, slot); + kmem_cache_free(radix_cache, m); + } + + spin_unlock(&kvm->mmu_lock); + srcu_read_unlock(&kvm->srcu, idx); +} + static void free_kvmi(struct kvm *kvm) { bool restore_interception = KVMI(kvm)->cleanup_on_unhook; struct kvm_vcpu *vcpu; int i; + kvmi_clear_mem_access(kvm); + kvm_for_each_vcpu(i, vcpu, kvm) free_vcpui(vcpu, restore_interception); @@ -297,6 +333,10 @@ alloc_kvmi(struct kvm *kvm, const struct kvm_introspection_hook *hook) atomic_set(&kvmi->ev_seq, 0); + INIT_RADIX_TREE(&kvmi->access_tree, + GFP_KERNEL & ~__GFP_DIRECT_RECLAIM); + rwlock_init(&kvmi->access_tree_lock); + kvm_for_each_vcpu(i, vcpu, kvm) { int err = create_vcpui(vcpu); @@ -1010,3 +1050,135 @@ bool kvmi_enter_guest(struct kvm_vcpu *vcpu) kvmi_put(vcpu->kvm); return r; } + +static struct kvmi_mem_access * +__kvmi_get_gfn_access(struct kvm_introspection *kvmi, const gfn_t gfn) +{ + return radix_tree_lookup(&kvmi->access_tree, gfn); +} + +static void kvmi_update_mem_access(struct kvm *kvm, struct kvmi_mem_access *m) +{ + struct kvm_introspection *kvmi = KVMI(kvm); + + kvmi_arch_update_page_tracking(kvm, NULL, m); + + if (m->access == full_access) { + radix_tree_delete(&kvmi->access_tree, m->gfn); + kmem_cache_free(radix_cache, m); + } +} + +static void kvmi_insert_mem_access(struct kvm *kvm, struct kvmi_mem_access *m) +{ + struct kvm_introspection *kvmi = KVMI(kvm); + + radix_tree_insert(&kvmi->access_tree, m->gfn, m); + kvmi_arch_update_page_tracking(kvm, NULL, m); +} + +static void kvmi_set_mem_access(struct kvm *kvm, struct kvmi_mem_access *m, + bool *used) +{ + struct kvm_introspection *kvmi = KVMI(kvm); + struct kvmi_mem_access *found; + int idx; + + idx = srcu_read_lock(&kvm->srcu); + spin_lock(&kvm->mmu_lock); + write_lock(&kvmi->access_tree_lock); + + found = __kvmi_get_gfn_access(kvmi, m->gfn); + if (found) { + found->access = m->access; + kvmi_update_mem_access(kvm, found); + } else if (m->access != full_access) { + kvmi_insert_mem_access(kvm, m); + *used = true; + } + + write_unlock(&kvmi->access_tree_lock); + spin_unlock(&kvm->mmu_lock); + srcu_read_unlock(&kvm->srcu, idx); +} + +static int kvmi_set_gfn_access(struct kvm *kvm, gfn_t gfn, u8 access) +{ + struct kvmi_mem_access *m; + bool used = false; + int err = 0; + + m = kmem_cache_zalloc(radix_cache, GFP_KERNEL); + if (!m) + return -KVM_ENOMEM; + + m->gfn = gfn; + m->access = access; + + if (radix_tree_preload(GFP_KERNEL)) + err = -KVM_ENOMEM; + else + kvmi_set_mem_access(kvm, m, &used); + + radix_tree_preload_end(); + + if (!used) + kmem_cache_free(radix_cache, m); + + return err; +} + +static bool kvmi_is_visible_gfn(struct kvm *kvm, gfn_t gfn) +{ + bool visible; + int idx; + + idx = srcu_read_lock(&kvm->srcu); + visible = kvm_is_visible_gfn(kvm, gfn); + srcu_read_unlock(&kvm->srcu, idx); + + return visible; +} + +static int set_page_access_entry(struct kvm_introspection *kvmi, + const struct kvmi_page_access_entry *entry) +{ + u8 unknown_bits = ~(KVMI_PAGE_ACCESS_R | KVMI_PAGE_ACCESS_W + | KVMI_PAGE_ACCESS_X); + gfn_t gfn = gpa_to_gfn(entry->gpa); + struct kvm *kvm = kvmi->kvm; + + if ((entry->access & unknown_bits) + || entry->padding1 || entry->padding2 + || entry->visible > 1) + return -KVM_EINVAL; + + if (!kvmi_is_visible_gfn(kvm, gfn)) + return entry->visible ? -KVM_EINVAL : 0; + + return kvmi_set_gfn_access(kvm, gfn, entry->access); +} + +int kvmi_cmd_set_page_access(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg, + const struct kvmi_vm_set_page_access *req) +{ + const struct kvmi_page_access_entry *entry = req->entries; + const struct kvmi_page_access_entry *end = req->entries + req->count; + int ec = 0; + + if (req->padding1 || req->padding2) + return -KVM_EINVAL; + + if (msg->size < struct_size(req, entries, req->count)) + return -KVM_EINVAL; + + for (; entry < end; entry++) { + int r = set_page_access_entry(kvmi, entry); + + if (r && !ec) + ec = r; + } + + return ec; +} diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 8f8557bd8ab0..024e7acf0dce 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -21,6 +21,12 @@ #define KVMI(kvm) ((kvm)->kvmi) #define VCPUI(vcpu) ((vcpu)->kvmi) +struct kvmi_mem_access { + gfn_t gfn; + u8 access; + struct kvmi_arch_mem_access arch; +}; + static inline bool is_event_enabled(struct kvm_vcpu *vcpu, int event) { return test_bit(event, VCPUI(vcpu)->ev_enable_mask); @@ -73,6 +79,9 @@ int kvmi_cmd_write_physical(struct kvm *kvm, u64 gpa, size_t size, int kvmi_cmd_vcpu_pause(struct kvm_vcpu *vcpu, bool wait); int kvmi_cmd_vcpu_set_registers(struct kvm_vcpu *vcpu, const struct kvm_regs *regs); +int kvmi_cmd_set_page_access(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg, + const struct kvmi_vm_set_page_access *req); /* arch */ bool kvmi_arch_vcpu_alloc_interception(struct kvm_vcpu *vcpu); @@ -117,5 +126,8 @@ int kvmi_arch_cmd_vcpu_set_xsave(struct kvm_vcpu *vcpu, int kvmi_arch_cmd_vcpu_get_mtrr_type(struct kvm_vcpu *vcpu, u64 gpa, u8 *type); int kvmi_arch_cmd_vcpu_control_msr(struct kvm_vcpu *vcpu, const struct kvmi_vcpu_control_msr *req); +void kvmi_arch_update_page_tracking(struct kvm *kvm, + struct kvm_memory_slot *slot, + struct kvmi_mem_access *m); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index d9cbbefb9993..f7ffb971f1dc 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -334,6 +334,17 @@ static int handle_vm_get_max_gfn(struct kvm_introspection *kvmi, return kvmi_msg_vm_reply(kvmi, msg, 0, &rpl, sizeof(rpl)); } +static int handle_vm_set_page_access(struct kvm_introspection *kvmi, + const struct kvmi_msg_hdr *msg, + const void *req) +{ + int ec; + + ec = kvmi_cmd_set_page_access(kvmi, msg, req); + + return kvmi_msg_vm_reply(kvmi, msg, ec, NULL, 0); +} + /* * These commands are executed by the receiving thread. */ @@ -348,6 +359,7 @@ static int(*const msg_vm[])(struct kvm_introspection *, [KVMI_VM_GET_INFO] = handle_vm_get_info, [KVMI_VM_GET_MAX_GFN] = handle_vm_get_max_gfn, [KVMI_VM_READ_PHYSICAL] = handle_vm_read_physical, + [KVMI_VM_SET_PAGE_ACCESS] = handle_vm_set_page_access, [KVMI_VM_WRITE_PHYSICAL] = handle_vm_write_physical, };
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 78/84] KVM: introspection: add KVMI_EVENT_PF
From: Mihai Don?u <mdontu at bitdefender.com> This event is sent when a #PF occurs due to a failed permission check in the shadow page tables, for a page in which the introspection tool has shown interest. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 51 +++++++ arch/x86/include/asm/kvmi_host.h | 1 + arch/x86/kvm/kvmi.c | 141 ++++++++++++++++++ include/uapi/linux/kvmi.h | 10 ++ .../testing/selftests/kvm/x86_64/kvmi_test.c | 76 ++++++++++ virt/kvm/introspection/kvmi.c | 115 ++++++++++++++ virt/kvm/introspection/kvmi_int.h | 9 ++ virt/kvm/introspection/kvmi_msg.c | 18 +++ 8 files changed, 421 insertions(+) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 123b2360d2e0..b2e2a9edda77 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -554,6 +554,7 @@ the following events:: KVMI_EVENT_DESCRIPTOR KVMI_EVENT_HYPERCALL KVMI_EVENT_MSR + KVMI_EVENT_PF KVMI_EVENT_XSETBV When an event is enabled, the introspection tool is notified and @@ -1387,3 +1388,53 @@ register (see **KVMI_VCPU_CONTROL_EVENTS**). ``kvmi_event``, the MSR number, the old value and the new value are sent to the introspection tool. The *CONTINUE* action will set the ``new_val``. + +10. KVMI_EVENT_PF +----------------- + +:Architectures: x86 +:Versions: >= 1 +:Actions: CONTINUE, CRASH, RETRY +:Parameters: + +:: + + struct kvmi_event; + struct kvmi_event_pf { + __u64 gva; + __u64 gpa; + __u8 access; + __u8 padding1; + __u16 padding2; + __u32 padding3; + }; + +:Returns: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_event_reply; + +This event is sent when a hypervisor page fault occurs due to a failed +permission check in the shadow page tables, the introspection has been +enabled for this event (see *KVMI_VCPU_CONTROL_EVENTS*) and the event was +generated for a page in which the introspection tool has shown interest +(ie. has previously touched it by adjusting the spte permissions). + +The shadow page tables can be used by the introspection tool to guarantee +the purpose of code areas inside the guest (code, rodata, stack, heap +etc.) Each attempt at an operation unfitting for a certain memory +range (eg. execute code in heap) triggers a page fault and gives the +introspection tool the chance to audit the code attempting the operation. + +``kvmi_event``, guest virtual address (or 0xffffffff/UNMAPPED_GVA), +guest physical address and the access flags (eg. KVMI_PAGE_ACCESS_R) +are sent to the introspection tool. + +The *CONTINUE* action will continue the page fault handling (e.g. via +emulation). + +The *RETRY* action is used by the introspection tool to retry the +execution of the current instruction, usually because it changed the +instruction pointer or the page restrictions. diff --git a/arch/x86/include/asm/kvmi_host.h b/arch/x86/include/asm/kvmi_host.h index 98ea548c0b15..25c7bb8a9082 100644 --- a/arch/x86/include/asm/kvmi_host.h +++ b/arch/x86/include/asm/kvmi_host.h @@ -41,6 +41,7 @@ struct kvm_vcpu_arch_introspection { }; struct kvm_arch_introspection { + struct kvm_page_track_notifier_node kptn_node; }; #define SLOTS_SIZE BITS_TO_LONGS(KVM_MEM_SLOTS_NUM) diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index b233a3c5becb..8fbf1720749b 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -10,6 +10,21 @@ #include "cpuid.h" #include "../../../virt/kvm/introspection/kvmi_int.h" +static bool kvmi_track_preread(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + int bytes, + struct kvm_page_track_notifier_node *node); +static bool kvmi_track_prewrite(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + const u8 *new, int bytes, + struct kvm_page_track_notifier_node *node); +static bool kvmi_track_preexec(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + struct kvm_page_track_notifier_node *node); +static void kvmi_track_create_slot(struct kvm *kvm, + struct kvm_memory_slot *slot, + unsigned long npages, + struct kvm_page_track_notifier_node *node); +static void kvmi_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot, + struct kvm_page_track_notifier_node *node); + static unsigned int kvmi_vcpu_mode(const struct kvm_vcpu *vcpu, const struct kvm_sregs *sregs) { @@ -1209,3 +1224,129 @@ void kvmi_arch_update_page_tracking(struct kvm *kvm, } } } + +void kvmi_arch_hook(struct kvm *kvm) +{ + struct kvm_introspection *kvmi = KVMI(kvm); + + kvmi->arch.kptn_node.track_preread = kvmi_track_preread; + kvmi->arch.kptn_node.track_prewrite = kvmi_track_prewrite; + kvmi->arch.kptn_node.track_preexec = kvmi_track_preexec; + kvmi->arch.kptn_node.track_create_slot = kvmi_track_create_slot; + kvmi->arch.kptn_node.track_flush_slot = kvmi_track_flush_slot; + + kvm_page_track_register_notifier(kvm, &kvmi->arch.kptn_node); +} + +void kvmi_arch_unhook(struct kvm *kvm) +{ + struct kvm_introspection *kvmi = KVMI(kvm); + + kvm_page_track_unregister_notifier(kvm, &kvmi->arch.kptn_node); +} + +static bool is_pf_of_interest(struct kvm_vcpu *vcpu, gpa_t gpa, u8 access) +{ + if (!kvm_x86_ops.spt_fault(vcpu)) + return false; + + if (kvm_x86_ops.gpt_translation_fault(vcpu)) + return false; + + return kvmi_restricted_page_access(KVMI(vcpu->kvm), gpa, access); +} + +static bool handle_pf_event(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + int access) +{ + if (!is_pf_of_interest(vcpu, gpa, access)) + return true; + + return kvmi_pf_event(vcpu, gpa, gva, access); +} + +static bool kvmi_track_preread(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + int bytes, + struct kvm_page_track_notifier_node *node) +{ + struct kvm_introspection *kvmi; + bool ret = true; + + kvmi = kvmi_get(vcpu->kvm); + if (!kvmi) + return true; + + if (is_event_enabled(vcpu, KVMI_EVENT_PF)) + ret = handle_pf_event(vcpu, gpa, gva, KVMI_PAGE_ACCESS_R); + + kvmi_put(vcpu->kvm); + + return ret; +} + +static bool kvmi_track_prewrite(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + const u8 *new, int bytes, + struct kvm_page_track_notifier_node *node) +{ + struct kvm_introspection *kvmi; + bool ret = true; + + kvmi = kvmi_get(vcpu->kvm); + if (!kvmi) + return true; + + if (is_event_enabled(vcpu, KVMI_EVENT_PF)) + ret = handle_pf_event(vcpu, gpa, gva, KVMI_PAGE_ACCESS_W); + + kvmi_put(vcpu->kvm); + + return ret; +} + +static bool kvmi_track_preexec(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, + struct kvm_page_track_notifier_node *node) +{ + struct kvm_introspection *kvmi; + bool ret = true; + + kvmi = kvmi_get(vcpu->kvm); + if (!kvmi) + return true; + + if (is_event_enabled(vcpu, KVMI_EVENT_PF)) + ret = handle_pf_event(vcpu, gpa, gva, KVMI_PAGE_ACCESS_X); + + kvmi_put(vcpu->kvm); + + return ret; +} + +static void kvmi_track_create_slot(struct kvm *kvm, + struct kvm_memory_slot *slot, + unsigned long npages, + struct kvm_page_track_notifier_node *node) +{ + struct kvm_introspection *kvmi; + + kvmi = kvmi_get(kvm); + if (!kvmi) + return; + + kvmi_add_memslot(kvm, slot, npages); + + kvmi_put(kvm); +} + +static void kvmi_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot, + struct kvm_page_track_notifier_node *node) +{ + struct kvm_introspection *kvmi; + + kvmi = kvmi_get(kvm); + if (!kvmi) + return; + + kvmi_remove_memslot(kvm, slot); + + kvmi_put(kvm); +} diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index dc82f192534c..dc7ba12498b7 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -60,6 +60,7 @@ enum { KVMI_EVENT_XSETBV = 6, KVMI_EVENT_DESCRIPTOR = 7, KVMI_EVENT_MSR = 8, + KVMI_EVENT_PF = 9, KVMI_NUM_EVENTS }; @@ -218,4 +219,13 @@ struct kvmi_vcpu_inject_exception { __u64 address; }; +struct kvmi_event_pf { + __u64 gva; + __u64 gpa; + __u8 access; + __u8 padding1; + __u16 padding2; + __u32 padding3; +}; + #endif /* _UAPI__LINUX_KVMI_H */ diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 4fb109cec1b4..21b3f7a459c8 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -42,6 +42,11 @@ struct vcpu_reply { struct kvmi_event_reply reply; }; +struct pf_ev { + struct kvmi_event common; + struct kvmi_event_pf pf; +}; + struct vcpu_worker_data { struct kvm_vm *vm; int vcpu_id; @@ -51,6 +56,10 @@ struct vcpu_worker_data { bool restart_on_shutdown; }; +typedef void (*fct_pf_event)(struct kvm_vm *vm, struct kvmi_msg_hdr *hdr, + struct pf_ev *ev, + struct vcpu_reply *rpl); + enum { GUEST_TEST_NOOP = 0, GUEST_TEST_BP, @@ -58,6 +67,7 @@ enum { GUEST_TEST_DESCRIPTOR, GUEST_TEST_HYPERCALL, GUEST_TEST_MSR, + GUEST_TEST_PF, GUEST_TEST_XSETBV, }; @@ -111,6 +121,11 @@ static void guest_msr_test(void) wrmsr(MSR_MISC_FEATURES_ENABLES, msr); } +static void guest_pf_test(void) +{ + *((uint8_t *)test_gva) = READ_ONCE(test_write_pattern); +} + /* from fpu/internal.h */ static u64 xgetbv(u32 index) { @@ -171,6 +186,9 @@ static void guest_code(void) case GUEST_TEST_MSR: guest_msr_test(); break; + case GUEST_TEST_PF: + guest_pf_test(); + break; case GUEST_TEST_XSETBV: guest_xsetbv_test(); break; @@ -1860,6 +1878,63 @@ static void test_cmd_vm_set_page_access(struct kvm_vm *vm) set_page_access(gpa, full_access); } +static void test_pf(struct kvm_vm *vm, fct_pf_event cbk) +{ + __u16 event_id = KVMI_EVENT_PF; + struct vcpu_worker_data data = { + .vm = vm, + .vcpu_id = VCPU_ID, + .test_id = GUEST_TEST_PF, + }; + struct kvmi_msg_hdr hdr; + struct vcpu_reply rpl = {}; + pthread_t vcpu_thread; + struct pf_ev ev; + + set_page_access(test_gpa, KVMI_PAGE_ACCESS_R); + enable_vcpu_event(vm, event_id); + + new_test_write_pattern(vm); + + vcpu_thread = start_vcpu_worker(&data); + + receive_event(&hdr, &ev.common, sizeof(ev), event_id); + + pr_info("PF event, gpa 0x%llx, gva 0x%llx, access 0x%x\n", + ev.pf.gpa, ev.pf.gva, ev.pf.access); + + TEST_ASSERT(ev.pf.gpa == test_gpa && ev.pf.gva == test_gva, + "Unexpected #PF event, gpa 0x%llx (expended 0x%lx), gva 0x%llx (expected 0x%lx)\n", + ev.pf.gpa, test_gpa, ev.pf.gva, test_gva); + + cbk(vm, &hdr, &ev, &rpl); + + stop_vcpu_worker(vcpu_thread, &data); + + TEST_ASSERT(*((uint8_t *)test_hva) == test_write_pattern, + "Write failed, expected 0x%x, result 0x%x\n", + test_write_pattern, *((uint8_t *)test_hva)); + + disable_vcpu_event(vm, event_id); + set_page_access(test_gpa, KVMI_PAGE_ACCESS_R + | KVMI_PAGE_ACCESS_W + | KVMI_PAGE_ACCESS_X); +} + +static void cbk_test_event_pf(struct kvm_vm *vm, struct kvmi_msg_hdr *hdr, + struct pf_ev *ev, struct vcpu_reply *rpl) +{ + set_page_access(test_gpa, KVMI_PAGE_ACCESS_R | KVMI_PAGE_ACCESS_W); + + reply_to_event(hdr, &ev->common, KVMI_EVENT_ACTION_RETRY, + rpl, sizeof(*rpl)); +} + +static void test_event_pf(struct kvm_vm *vm) +{ + test_pf(vm, cbk_test_event_pf); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -1893,6 +1968,7 @@ static void test_introspection(struct kvm_vm *vm) test_event_descriptor(vm); test_cmd_vcpu_control_msr(vm); test_cmd_vm_set_page_access(vm); + test_event_pf(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index b57ad490dd06..99c88e182587 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -113,6 +113,7 @@ static void setup_known_events(void) set_bit(KVMI_EVENT_HYPERCALL, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_MSR, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_PAUSE_VCPU, Kvmi_known_vcpu_events); + set_bit(KVMI_EVENT_PF, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_TRAP, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_XSETBV, Kvmi_known_vcpu_events); @@ -368,6 +369,8 @@ static void __kvmi_unhook(struct kvm *kvm) struct kvm_introspection *kvmi = KVMI(kvm); wait_for_completion_killable(&kvm->kvmi_complete); + + kvmi_arch_unhook(kvm); kvmi_sock_put(kvmi); } @@ -415,6 +418,8 @@ static int __kvmi_hook(struct kvm *kvm, if (!kvmi_sock_get(kvmi, hook->fd)) return -EINVAL; + kvmi_arch_hook(kvm); + return 0; } @@ -1182,3 +1187,113 @@ int kvmi_cmd_set_page_access(struct kvm_introspection *kvmi, return ec; } + +static int kvmi_get_gfn_access(struct kvm_introspection *kvmi, const gfn_t gfn, + u8 *access) +{ + struct kvmi_mem_access *m; + + read_lock(&kvmi->access_tree_lock); + m = __kvmi_get_gfn_access(kvmi, gfn); + if (m) + *access = m->access; + read_unlock(&kvmi->access_tree_lock); + + return m ? 0 : -1; +} + +bool kvmi_restricted_page_access(struct kvm_introspection *kvmi, gpa_t gpa, + u8 access) +{ + u8 allowed_access; + int err; + + err = kvmi_get_gfn_access(kvmi, gpa_to_gfn(gpa), &allowed_access); + if (err) + return false; + + /* + * We want to be notified only for violations involving access + * bits that we've specifically cleared + */ + if (access & (~allowed_access)) + return true; + + return false; +} + +bool kvmi_pf_event(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, u8 access) +{ + bool ret = false; + u32 action; + + action = kvmi_msg_send_pf(vcpu, gpa, gva, access); + + switch (action) { + case KVMI_EVENT_ACTION_CONTINUE: + ret = true; + break; + case KVMI_EVENT_ACTION_RETRY: + break; + default: + kvmi_handle_common_event_actions(vcpu->kvm, action); + } + + return ret; +} + +void kvmi_add_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, + unsigned long npages) +{ + struct kvm_introspection *kvmi = KVMI(kvm); + gfn_t start = slot->base_gfn; + gfn_t end = start + npages; + int idx; + + idx = srcu_read_lock(&kvm->srcu); + spin_lock(&kvm->mmu_lock); + read_lock(&kvmi->access_tree_lock); + + while (start < end) { + struct kvmi_mem_access *m; + + m = __kvmi_get_gfn_access(kvmi, start); + if (m) + kvmi_arch_update_page_tracking(kvm, slot, m); + start++; + } + + read_unlock(&kvmi->access_tree_lock); + spin_unlock(&kvm->mmu_lock); + srcu_read_unlock(&kvm->srcu, idx); +} + +void kvmi_remove_memslot(struct kvm *kvm, struct kvm_memory_slot *slot) +{ + struct kvm_introspection *kvmi = KVMI(kvm); + gfn_t start = slot->base_gfn; + gfn_t end = start + slot->npages; + int idx; + + idx = srcu_read_lock(&kvm->srcu); + spin_lock(&kvm->mmu_lock); + write_lock(&kvmi->access_tree_lock); + + while (start < end) { + struct kvmi_mem_access *m; + + m = __kvmi_get_gfn_access(kvmi, start); + if (m) { + u8 prev_access = m->access; + + m->access = full_access; + kvmi_arch_update_page_tracking(kvm, slot, m); + m->access = prev_access; + } + start++; + } + + write_unlock(&kvmi->access_tree_lock); + spin_unlock(&kvm->mmu_lock); + srcu_read_unlock(&kvm->srcu, idx); +} diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 024e7acf0dce..9f2341fe21d5 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -47,6 +47,7 @@ int kvmi_msg_send_unhook(struct kvm_introspection *kvmi); u32 kvmi_msg_send_vcpu_pause(struct kvm_vcpu *vcpu); u32 kvmi_msg_send_hypercall(struct kvm_vcpu *vcpu); u32 kvmi_msg_send_bp(struct kvm_vcpu *vcpu, u64 gpa, u8 insn_len); +u32 kvmi_msg_send_pf(struct kvm_vcpu *vcpu, u64 gpa, u64 gva, u8 access); /* kvmi.c */ void *kvmi_msg_alloc(void); @@ -82,6 +83,12 @@ int kvmi_cmd_vcpu_set_registers(struct kvm_vcpu *vcpu, int kvmi_cmd_set_page_access(struct kvm_introspection *kvmi, const struct kvmi_msg_hdr *msg, const struct kvmi_vm_set_page_access *req); +bool kvmi_restricted_page_access(struct kvm_introspection *kvmi, gpa_t gpa, + u8 access); +bool kvmi_pf_event(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva, u8 access); +void kvmi_add_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, + unsigned long npages); +void kvmi_remove_memslot(struct kvm *kvm, struct kvm_memory_slot *slot); /* arch */ bool kvmi_arch_vcpu_alloc_interception(struct kvm_vcpu *vcpu); @@ -129,5 +136,7 @@ int kvmi_arch_cmd_vcpu_control_msr(struct kvm_vcpu *vcpu, void kvmi_arch_update_page_tracking(struct kvm *kvm, struct kvm_memory_slot *slot, struct kvmi_mem_access *m); +void kvmi_arch_hook(struct kvm *kvm); +void kvmi_arch_unhook(struct kvm *kvm); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index f7ffb971f1dc..0a0d10b43f2d 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -983,3 +983,21 @@ u32 kvmi_msg_send_bp(struct kvm_vcpu *vcpu, u64 gpa, u8 insn_len) return action; } + +u32 kvmi_msg_send_pf(struct kvm_vcpu *vcpu, u64 gpa, u64 gva, u8 access) +{ + struct kvmi_event_pf e; + int err, action; + + memset(&e, 0, sizeof(e)); + e.gpa = gpa; + e.gva = gva; + e.access = access; + + err = kvmi_send_event(vcpu, KVMI_EVENT_PF, &e, sizeof(e), + NULL, 0, &action); + if (err) + return KVMI_EVENT_ACTION_CONTINUE; + + return action; +}
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 79/84] KVM: introspection: extend KVMI_GET_VERSION with struct kvmi_features
This is used by the introspection tool to check the hardware support for the single step feature. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 13 ++++++++++++- arch/x86/include/uapi/asm/kvmi.h | 5 +++++ arch/x86/kvm/kvmi.c | 5 +++++ include/uapi/linux/kvmi.h | 1 + tools/testing/selftests/kvm/x86_64/kvmi_test.c | 5 +++++ virt/kvm/introspection/kvmi_int.h | 1 + virt/kvm/introspection/kvmi_msg.c | 2 ++ 7 files changed, 31 insertions(+), 1 deletion(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index b2e2a9edda77..47387f297029 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -254,9 +254,20 @@ The vCPU commands start with:: struct kvmi_get_version_reply { __u32 version; __u32 padding; + struct kvmi_features features; }; -Returns the introspection API version. +For x86 + +:: + + struct kvmi_features { + __u8 singlestep; + __u8 padding[7]; + }; + +Returns the introspection API version and some of the features supported +by the hardware. This command is always allowed and successful. diff --git a/arch/x86/include/uapi/asm/kvmi.h b/arch/x86/include/uapi/asm/kvmi.h index 1bb13da61dbf..32af803f1d70 100644 --- a/arch/x86/include/uapi/asm/kvmi.h +++ b/arch/x86/include/uapi/asm/kvmi.h @@ -145,4 +145,9 @@ struct kvmi_event_msr_reply { __u64 new_val; }; +struct kvmi_features { + __u8 singlestep; + __u8 padding[7]; +}; + #endif /* _UAPI_ASM_X86_KVMI_H */ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index 8fbf1720749b..672a113b3bf4 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -1350,3 +1350,8 @@ static void kvmi_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot, kvmi_put(kvm); } + +void kvmi_arch_features(struct kvmi_features *feat) +{ + feat->singlestep = !!kvm_x86_ops.control_singlestep; +} diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index dc7ba12498b7..a84affbafa67 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -101,6 +101,7 @@ struct kvmi_error_code { struct kvmi_get_version_reply { __u32 version; __u32 padding; + struct kvmi_features features; }; struct kvmi_vm_check_command { diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 21b3f7a459c8..eabe7dae149e 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -56,6 +56,8 @@ struct vcpu_worker_data { bool restart_on_shutdown; }; +static struct kvmi_features features; + typedef void (*fct_pf_event)(struct kvm_vm *vm, struct kvmi_msg_hdr *hdr, struct pf_ev *ev, struct vcpu_reply *rpl); @@ -437,7 +439,10 @@ static void test_cmd_get_version(void) "Unexpected KVMI version %d, expecting %d\n", rpl.version, KVMI_VERSION); + features = rpl.features; + pr_info("KVMI version: %u\n", rpl.version); + pr_info("\tsinglestep: %u\n", features.singlestep); } static void cmd_vm_check_command(__u16 id, __u16 padding, int expected_err) diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 9f2341fe21d5..68b8d60a7fac 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -138,5 +138,6 @@ void kvmi_arch_update_page_tracking(struct kvm *kvm, struct kvmi_mem_access *m); void kvmi_arch_hook(struct kvm *kvm); void kvmi_arch_unhook(struct kvm *kvm); +void kvmi_arch_features(struct kvmi_features *feat); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 0a0d10b43f2d..e754cee48912 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -148,6 +148,8 @@ static int handle_get_version(struct kvm_introspection *kvmi, memset(&rpl, 0, sizeof(rpl)); rpl.version = kvmi_version(); + kvmi_arch_features(&rpl.features); + return kvmi_msg_vm_reply(kvmi, msg, 0, &rpl, sizeof(rpl)); }
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 80/84] KVM: introspection: add KVMI_VCPU_CONTROL_SINGLESTEP
From: Nicu?or C??u <ncitu at bitdefender.com> The next commit that adds the KVMI_EVENT_SINGLESTEP event will make this command more useful. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 32 ++++++++++ arch/x86/kvm/kvmi.c | 18 ++++++ arch/x86/kvm/x86.c | 12 +++- include/linux/kvmi_host.h | 7 +++ include/uapi/linux/kvmi.h | 7 +++ .../testing/selftests/kvm/x86_64/kvmi_test.c | 46 ++++++++++++++ virt/kvm/introspection/kvmi.c | 26 +++++++- virt/kvm/introspection/kvmi_int.h | 2 + virt/kvm/introspection/kvmi_msg.c | 60 +++++++++++++++---- 9 files changed, 193 insertions(+), 17 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 47387f297029..0a07ef101302 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -1049,6 +1049,38 @@ In order to 'forget' an address, all three bits ('rwx') must be set. * -KVM_EAGAIN - the selected vCPU can't be introspected yet * -KVM_ENOMEM - there is not enough memory to add the page tracking structures +24. KVMI_VCPU_CONTROL_SINGLESTEP +-------------------------------- + +:Architectures: x86 (vmx) +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_vcpu_control_singlestep { + __u8 enable; + __u8 padding[7]; + }; + +:Returns: + +:: + + struct kvmi_error_code; + +Enables/disables singlestep for the selected vCPU. + +The introspection tool should use *KVMI_GET_VERSION*, to check +if the hardware supports singlestep (see **KVMI_GET_VERSION**). + +:Errors: + +* -KVM_EOPNOTSUPP - the hardware doesn't support singlestep +* -KVM_EINVAL - the padding is not zero +* -KVM_EAGAIN - the selected vCPU can't be introspected yet + Events ===== diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index 672a113b3bf4..18713004152d 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -1355,3 +1355,21 @@ void kvmi_arch_features(struct kvmi_features *feat) { feat->singlestep = !!kvm_x86_ops.control_singlestep; } + +bool kvmi_arch_start_singlestep(struct kvm_vcpu *vcpu) +{ + if (!kvm_x86_ops.control_singlestep) + return false; + + kvm_x86_ops.control_singlestep(vcpu, true); + return true; +} + +bool kvmi_arch_stop_singlestep(struct kvm_vcpu *vcpu) +{ + if (!kvm_x86_ops.control_singlestep) + return false; + + kvm_x86_ops.control_singlestep(vcpu, false); + return true; +} diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 0add0b0b8f2d..02b74a57ca01 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8515,9 +8515,15 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) goto out; } - inject_pending_event(vcpu, &req_immediate_exit); - if (req_int_win) - kvm_x86_ops.enable_irq_window(vcpu); + if (!kvmi_vcpu_running_singlestep(vcpu)) { + /* + * We cannot inject events during single-stepping. + * Try again later. + */ + inject_pending_event(vcpu, &req_immediate_exit); + if (req_int_win) + kvm_x86_ops.enable_irq_window(vcpu); + } if (kvm_lapic_enabled(vcpu)) { update_cr8_intercept(vcpu); diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index 11eb9b1c3c5e..a641768027cc 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -45,6 +45,10 @@ struct kvm_vcpu_introspection { bool pending; bool send_event; } exception; + + struct { + bool loop; + } singlestep; }; struct kvm_introspection { @@ -89,6 +93,7 @@ void kvmi_handle_requests(struct kvm_vcpu *vcpu); bool kvmi_hypercall_event(struct kvm_vcpu *vcpu); bool kvmi_breakpoint_event(struct kvm_vcpu *vcpu, u64 gva, u8 insn_len); bool kvmi_enter_guest(struct kvm_vcpu *vcpu); +bool kvmi_vcpu_running_singlestep(struct kvm_vcpu *vcpu); #else @@ -105,6 +110,8 @@ static inline bool kvmi_breakpoint_event(struct kvm_vcpu *vcpu, u64 gva, { return true; } static inline bool kvmi_enter_guest(struct kvm_vcpu *vcpu) { return true; } +static inline bool kvmi_vcpu_running_singlestep(struct kvm_vcpu *vcpu) + { return false; } #endif /* CONFIG_KVM_INTROSPECTION */ diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index a84affbafa67..bc515237612a 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -47,6 +47,8 @@ enum { KVMI_VM_SET_PAGE_ACCESS = 23, + KVMI_VCPU_CONTROL_SINGLESTEP = 24, + KVMI_NUM_MESSAGES }; @@ -189,6 +191,11 @@ struct kvmi_vm_set_page_access { struct kvmi_page_access_entry entries[0]; }; +struct kvmi_vcpu_control_singlestep { + __u8 enable; + __u8 padding[7]; +}; + struct kvmi_event { __u16 size; __u16 vcpu; diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index eabe7dae149e..0803d7e5af1e 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -1940,6 +1940,51 @@ static void test_event_pf(struct kvm_vm *vm) test_pf(vm, cbk_test_event_pf); } +static void cmd_vcpu_singlestep(struct kvm_vm *vm, __u8 enable, __u8 padding, + int expected_err) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvmi_vcpu_control_singlestep cmd; + } req = {}; + int r; + + req.cmd.enable = enable; + req.cmd.padding[6] = padding; + + r = do_vcpu0_command(vm, KVMI_VCPU_CONTROL_SINGLESTEP, + &req.hdr, sizeof(req), NULL, 0); + TEST_ASSERT(r == expected_err, + "KVMI_VCPU_CONTROL_SINGLESTEP failed, error %d(%s), expected error %d\n", + -r, kvm_strerror(-r), expected_err); +} + +static void test_supported_singlestep(struct kvm_vm *vm) +{ + __u8 disable = 0, enable = 1, enable_inval = 2; + __u8 padding = 1, no_padding = 0; + + cmd_vcpu_singlestep(vm, enable, no_padding, 0); + cmd_vcpu_singlestep(vm, disable, no_padding, 0); + + cmd_vcpu_singlestep(vm, enable, padding, -KVM_EINVAL); + cmd_vcpu_singlestep(vm, enable_inval, no_padding, -KVM_EINVAL); +} + +static void test_unsupported_singlestep(struct kvm_vm *vm) +{ + cmd_vcpu_singlestep(vm, 1, 0, -KVM_EOPNOTSUPP); +} + +static void test_cmd_vcpu_control_singlestep(struct kvm_vm *vm) +{ + if (features.singlestep) + test_supported_singlestep(vm); + else + test_unsupported_singlestep(vm); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -1974,6 +2019,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vcpu_control_msr(vm); test_cmd_vm_set_page_access(vm); test_event_pf(vm); + test_cmd_vcpu_control_singlestep(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index 99c88e182587..2c7533a966f9 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -429,6 +429,11 @@ static void kvmi_job_release_vcpu(struct kvm_vcpu *vcpu, void *ctx) atomic_set(&vcpui->pause_requests, 0); vcpui->waiting_for_reply = false; + + if (vcpui->singlestep.loop) { + kvmi_arch_stop_singlestep(vcpu); + vcpui->singlestep.loop = false; + } } static void kvmi_release_vcpus(struct kvm *kvm) @@ -1047,7 +1052,9 @@ bool kvmi_enter_guest(struct kvm_vcpu *vcpu) vcpui = VCPUI(vcpu); - if (vcpui->exception.pending) { + if (vcpui->singlestep.loop) { + kvmi_arch_start_singlestep(vcpu); + } else if (vcpui->exception.pending) { kvmi_inject_pending_exception(vcpu); r = false; } @@ -1297,3 +1304,20 @@ void kvmi_remove_memslot(struct kvm *kvm, struct kvm_memory_slot *slot) spin_unlock(&kvm->mmu_lock); srcu_read_unlock(&kvm->srcu, idx); } + +bool kvmi_vcpu_running_singlestep(struct kvm_vcpu *vcpu) +{ + struct kvm_introspection *kvmi; + bool ret; + + kvmi = kvmi_get(vcpu->kvm); + if (!kvmi) + return false; + + ret = VCPUI(vcpu)->singlestep.loop; + + kvmi_put(vcpu->kvm); + + return ret; +} +EXPORT_SYMBOL(kvmi_vcpu_running_singlestep); diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index 68b8d60a7fac..e5fca3502bab 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -139,5 +139,7 @@ void kvmi_arch_update_page_tracking(struct kvm *kvm, void kvmi_arch_hook(struct kvm *kvm); void kvmi_arch_unhook(struct kvm *kvm); void kvmi_arch_features(struct kvmi_features *feat); +bool kvmi_arch_start_singlestep(struct kvm_vcpu *vcpu); +bool kvmi_arch_stop_singlestep(struct kvm_vcpu *vcpu); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index e754cee48912..04e7511a9777 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -609,6 +609,39 @@ static int handle_vcpu_control_msr(const struct kvmi_vcpu_msg_job *job, return kvmi_msg_vcpu_reply(job, msg, ec, NULL, 0); } +static int handle_vcpu_control_singlestep(const struct kvmi_vcpu_msg_job *job, + const struct kvmi_msg_hdr *msg, + const void *_req) +{ + const struct kvmi_vcpu_control_singlestep *req = _req; + struct kvm_vcpu *vcpu = job->vcpu; + int ec = -KVM_EINVAL; + bool done; + int i; + + if (req->enable > 1) + goto reply; + + for (i = 0; i < sizeof(req->padding); i++) + if (req->padding[i]) + goto reply; + + if (req->enable) + done = kvmi_arch_start_singlestep(vcpu); + else + done = kvmi_arch_stop_singlestep(vcpu); + + if (done) { + ec = 0; + VCPUI(vcpu)->singlestep.loop = !!req->enable; + } else { + ec = -KVM_EOPNOTSUPP; + } + +reply: + return kvmi_msg_vcpu_reply(job, msg, ec, NULL, 0); +} + /* * These functions are executed from the vCPU thread. The receiving thread * passes the messages using a newly allocated 'struct kvmi_vcpu_msg_job' @@ -617,19 +650,20 @@ static int handle_vcpu_control_msr(const struct kvmi_vcpu_msg_job *job, */ static int(*const msg_vcpu[])(const struct kvmi_vcpu_msg_job *, const struct kvmi_msg_hdr *, const void *) = { - [KVMI_EVENT] = handle_vcpu_event_reply, - [KVMI_VCPU_CONTROL_CR] = handle_vcpu_control_cr, - [KVMI_VCPU_CONTROL_EVENTS] = handle_vcpu_control_events, - [KVMI_VCPU_CONTROL_MSR] = handle_vcpu_control_msr, - [KVMI_VCPU_GET_CPUID] = handle_vcpu_get_cpuid, - [KVMI_VCPU_GET_INFO] = handle_vcpu_get_info, - [KVMI_VCPU_GET_MTRR_TYPE] = handle_vcpu_get_mtrr_type, - [KVMI_VCPU_GET_REGISTERS] = handle_vcpu_get_registers, - [KVMI_VCPU_GET_XCR] = handle_vcpu_get_xcr, - [KVMI_VCPU_GET_XSAVE] = handle_vcpu_get_xsave, - [KVMI_VCPU_INJECT_EXCEPTION] = handle_vcpu_inject_exception, - [KVMI_VCPU_SET_REGISTERS] = handle_vcpu_set_registers, - [KVMI_VCPU_SET_XSAVE] = handle_vcpu_set_xsave, + [KVMI_EVENT] = handle_vcpu_event_reply, + [KVMI_VCPU_CONTROL_CR] = handle_vcpu_control_cr, + [KVMI_VCPU_CONTROL_EVENTS] = handle_vcpu_control_events, + [KVMI_VCPU_CONTROL_MSR] = handle_vcpu_control_msr, + [KVMI_VCPU_CONTROL_SINGLESTEP] = handle_vcpu_control_singlestep, + [KVMI_VCPU_GET_CPUID] = handle_vcpu_get_cpuid, + [KVMI_VCPU_GET_INFO] = handle_vcpu_get_info, + [KVMI_VCPU_GET_MTRR_TYPE] = handle_vcpu_get_mtrr_type, + [KVMI_VCPU_GET_REGISTERS] = handle_vcpu_get_registers, + [KVMI_VCPU_GET_XCR] = handle_vcpu_get_xcr, + [KVMI_VCPU_GET_XSAVE] = handle_vcpu_get_xsave, + [KVMI_VCPU_INJECT_EXCEPTION] = handle_vcpu_inject_exception, + [KVMI_VCPU_SET_REGISTERS] = handle_vcpu_set_registers, + [KVMI_VCPU_SET_XSAVE] = handle_vcpu_set_xsave, }; static bool is_vcpu_command(u16 id)
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 81/84] KVM: introspection: add KVMI_EVENT_SINGLESTEP
From: Nicu?or C??u <ncitu at bitdefender.com> This event is sent after each instruction when the singlestep has been enabled for a vCPU. Signed-off-by: Nicu?or C??u <ncitu at bitdefender.com> Co-developed-by: Adalbert Laz?r <alazar at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 31 +++++++++ arch/x86/kvm/vmx/vmx.c | 6 ++ include/linux/kvmi_host.h | 4 ++ include/uapi/linux/kvmi.h | 6 ++ .../testing/selftests/kvm/x86_64/kvmi_test.c | 65 +++++++++++++++++-- virt/kvm/introspection/kvmi.c | 60 +++++++++++++++++ virt/kvm/introspection/kvmi_msg.c | 5 ++ 7 files changed, 172 insertions(+), 5 deletions(-) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 0a07ef101302..3c481c1b2186 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -576,6 +576,7 @@ because these are sent as a result of certain commands (but they can be disallowed by the device manager) :: KVMI_EVENT_PAUSE_VCPU + KVMI_EVENT_SINGLESTEP KVMI_EVENT_TRAP The VM events (e.g. *KVMI_EVENT_UNHOOK*) are controlled with @@ -1075,8 +1076,12 @@ Enables/disables singlestep for the selected vCPU. The introspection tool should use *KVMI_GET_VERSION*, to check if the hardware supports singlestep (see **KVMI_GET_VERSION**). +After every instruction, a *KVMI_EVENT_SINGLESTEP* event is sent +to the introspection tool. + :Errors: +* -KVM_EPERM - the *KVMI_EVENT_SINGLESTEP* event is disallowed * -KVM_EOPNOTSUPP - the hardware doesn't support singlestep * -KVM_EINVAL - the padding is not zero * -KVM_EAGAIN - the selected vCPU can't be introspected yet @@ -1481,3 +1486,29 @@ emulation). The *RETRY* action is used by the introspection tool to retry the execution of the current instruction, usually because it changed the instruction pointer or the page restrictions. + +11. KVMI_EVENT_SINGLESTEP +------------------------- + +:Architectures: x86 +:Versions: >= 1 +:Actions: CONTINUE, CRASH +:Parameters: + +:: + + struct kvmi_event; + +:Returns: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_event_reply; + struct kvmi_event_singlestep { + __u8 failed; + __u8 padding[7]; + }; + +This event is sent after each instruction, as long as the singlestep is +enabled for the current vCPU (see **KVMI_VCPU_CONTROL_SINGLESTEP**). diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index f57587ddb3be..8c9ccd1ba0f0 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -5598,6 +5598,7 @@ static int handle_invalid_op(struct kvm_vcpu *vcpu) static int handle_monitor_trap(struct kvm_vcpu *vcpu) { + kvmi_singlestep_done(vcpu); return 1; } @@ -6173,6 +6174,11 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath) } } + if (kvmi_vcpu_running_singlestep(vcpu) && + exit_reason != EXIT_REASON_EPT_VIOLATION && + exit_reason != EXIT_REASON_MONITOR_TRAP_FLAG) + kvmi_singlestep_failed(vcpu); + if (exit_fastpath != EXIT_FASTPATH_NONE) return 1; diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index a641768027cc..b01e8505f493 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -94,6 +94,8 @@ bool kvmi_hypercall_event(struct kvm_vcpu *vcpu); bool kvmi_breakpoint_event(struct kvm_vcpu *vcpu, u64 gva, u8 insn_len); bool kvmi_enter_guest(struct kvm_vcpu *vcpu); bool kvmi_vcpu_running_singlestep(struct kvm_vcpu *vcpu); +void kvmi_singlestep_done(struct kvm_vcpu *vcpu); +void kvmi_singlestep_failed(struct kvm_vcpu *vcpu); #else @@ -112,6 +114,8 @@ static inline bool kvmi_enter_guest(struct kvm_vcpu *vcpu) { return true; } static inline bool kvmi_vcpu_running_singlestep(struct kvm_vcpu *vcpu) { return false; } +static inline void kvmi_singlestep_done(struct kvm_vcpu *vcpu) { } +static inline void kvmi_singlestep_failed(struct kvm_vcpu *vcpu) { } #endif /* CONFIG_KVM_INTROSPECTION */ diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index bc515237612a..040049abd450 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -63,6 +63,7 @@ enum { KVMI_EVENT_DESCRIPTOR = 7, KVMI_EVENT_MSR = 8, KVMI_EVENT_PF = 9, + KVMI_EVENT_SINGLESTEP = 10, KVMI_NUM_EVENTS }; @@ -236,4 +237,9 @@ struct kvmi_event_pf { __u32 padding3; }; +struct kvmi_event_singlestep { + __u8 failed; + __u8 padding[7]; +}; + #endif /* _UAPI__LINUX_KVMI_H */ diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 0803d7e5af1e..967ea568d93c 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -829,6 +829,14 @@ static void stop_vcpu_worker(pthread_t vcpu_thread, wait_vcpu_worker(vcpu_thread); } +static int __do_vcpu_command(struct kvm_vm *vm, int cmd_id, + struct kvmi_msg_hdr *req, size_t req_size, + void *rpl, size_t rpl_size) +{ + send_message(cmd_id, req, req_size); + return receive_cmd_reply(req, rpl, rpl_size); +} + static int do_vcpu_command(struct kvm_vm *vm, int cmd_id, struct kvmi_msg_hdr *req, size_t req_size, void *rpl, size_t rpl_size) @@ -839,8 +847,7 @@ static int do_vcpu_command(struct kvm_vm *vm, int cmd_id, vcpu_thread = start_vcpu_worker(&data); - send_message(cmd_id, req, req_size); - r = receive_cmd_reply(req, rpl, rpl_size); + r = __do_vcpu_command(vm, cmd_id, req, req_size, rpl, rpl_size); stop_vcpu_worker(vcpu_thread, &data); return r; @@ -1960,13 +1967,61 @@ static void cmd_vcpu_singlestep(struct kvm_vm *vm, __u8 enable, __u8 padding, -r, kvm_strerror(-r), expected_err); } +static void __control_singlestep(bool enable) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvmi_vcpu_control_singlestep cmd; + } req = {}; + int r; + + req.cmd.enable = enable ? 1 : 0; + + r = __do_vcpu0_command(KVMI_VCPU_CONTROL_SINGLESTEP, + &req.hdr, sizeof(req), NULL, 0); + TEST_ASSERT(r == 0, + "KVMI_VCPU_CONTROL_SINGLESTEP failed, error %d(%s)\n", + -r, kvm_strerror(-r)); +} + +static void test_singlestep_event(__u16 event_id) +{ + struct { + struct kvmi_event common; + struct kvmi_event_singlestep singlestep; + } ev; + bool enable = true, disable = false; + struct vcpu_reply rpl = { }; + struct kvmi_msg_hdr hdr; + + __control_singlestep(enable); + + receive_event(&hdr, &ev.common, sizeof(ev), event_id); + + pr_info("SINGLESTEP event, rip 0x%llx success %d\n", + ev.common.arch.regs.rip, !ev.singlestep.failed); + TEST_ASSERT(!ev.singlestep.failed, "Singlestep failed"); + + __control_singlestep(disable); + + reply_to_event(&hdr, &ev.common, KVMI_EVENT_ACTION_CONTINUE, + &rpl, sizeof(rpl)); +} + static void test_supported_singlestep(struct kvm_vm *vm) { - __u8 disable = 0, enable = 1, enable_inval = 2; + struct vcpu_worker_data data = {.vm = vm, .vcpu_id = VCPU_ID }; + __u16 event_id = KVMI_EVENT_SINGLESTEP; + __u8 enable = 1, enable_inval = 2; __u8 padding = 1, no_padding = 0; + pthread_t vcpu_thread; - cmd_vcpu_singlestep(vm, enable, no_padding, 0); - cmd_vcpu_singlestep(vm, disable, no_padding, 0); + enable_vcpu_event(vm, event_id); + vcpu_thread = start_vcpu_worker(&data); + test_singlestep_event(event_id); + stop_vcpu_worker(vcpu_thread, &data); + disable_vcpu_event(vm, event_id); cmd_vcpu_singlestep(vm, enable, padding, -KVM_EINVAL); cmd_vcpu_singlestep(vm, enable_inval, no_padding, -KVM_EINVAL); diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index 2c7533a966f9..5382569b190b 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -114,6 +114,7 @@ static void setup_known_events(void) set_bit(KVMI_EVENT_MSR, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_PAUSE_VCPU, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_PF, Kvmi_known_vcpu_events); + set_bit(KVMI_EVENT_SINGLESTEP, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_TRAP, Kvmi_known_vcpu_events); set_bit(KVMI_EVENT_XSETBV, Kvmi_known_vcpu_events); @@ -1321,3 +1322,62 @@ bool kvmi_vcpu_running_singlestep(struct kvm_vcpu *vcpu) return ret; } EXPORT_SYMBOL(kvmi_vcpu_running_singlestep); + +static u32 kvmi_send_singlestep(struct kvm_vcpu *vcpu, bool success) +{ + struct kvmi_event_singlestep e; + int err, action; + + memset(&e, 0, sizeof(e)); + e.failed = success ? 0 : 1; + + err = kvmi_send_event(vcpu, KVMI_EVENT_SINGLESTEP, &e, sizeof(e), + NULL, 0, &action); + if (err) + return KVMI_EVENT_ACTION_CONTINUE; + + return action; +} + +static void kvmi_singlestep_event(struct kvm_vcpu *vcpu, bool success) +{ + u32 action; + + action = kvmi_send_singlestep(vcpu, success); + switch (action) { + case KVMI_EVENT_ACTION_CONTINUE: + break; + default: + kvmi_handle_common_event_actions(vcpu->kvm, action); + } +} + +static void kvmi_handle_singlestep_exit(struct kvm_vcpu *vcpu, bool success) +{ + struct kvm_vcpu_introspection *vcpui; + struct kvm_introspection *kvmi; + struct kvm *kvm = vcpu->kvm; + + kvmi = kvmi_get(kvm); + if (!kvmi) + return; + + vcpui = VCPUI(vcpu); + + if (vcpui->singlestep.loop) + kvmi_singlestep_event(vcpu, success); + + kvmi_put(kvm); +} + +void kvmi_singlestep_done(struct kvm_vcpu *vcpu) +{ + kvmi_handle_singlestep_exit(vcpu, true); +} +EXPORT_SYMBOL(kvmi_singlestep_done); + +void kvmi_singlestep_failed(struct kvm_vcpu *vcpu) +{ + kvmi_handle_singlestep_exit(vcpu, false); +} +EXPORT_SYMBOL(kvmi_singlestep_failed); diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 04e7511a9777..645debc47f13 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -626,6 +626,11 @@ static int handle_vcpu_control_singlestep(const struct kvmi_vcpu_msg_job *job, if (req->padding[i]) goto reply; + if (!is_event_allowed(KVMI(vcpu->kvm), KVMI_EVENT_SINGLESTEP)) { + ec = -KVM_EPERM; + goto reply; + } + if (req->enable) done = kvmi_arch_start_singlestep(vcpu); else
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 82/84] KVM: introspection: add KVMI_VCPU_TRANSLATE_GVA
This helps the introspection tool with the GVA to GPA translations without the need to read or monitor the guest page tables. Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- Documentation/virt/kvm/kvmi.rst | 32 +++++++++++++++++++ arch/x86/kvm/kvmi.c | 5 +++ include/uapi/linux/kvmi.h | 9 ++++++ .../testing/selftests/kvm/x86_64/kvmi_test.c | 30 +++++++++++++++++ virt/kvm/introspection/kvmi_int.h | 1 + virt/kvm/introspection/kvmi_msg.c | 15 +++++++++ 6 files changed, 92 insertions(+) diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst index 3c481c1b2186..62138fa4b65c 100644 --- a/Documentation/virt/kvm/kvmi.rst +++ b/Documentation/virt/kvm/kvmi.rst @@ -1086,6 +1086,38 @@ to the introspection tool. * -KVM_EINVAL - the padding is not zero * -KVM_EAGAIN - the selected vCPU can't be introspected yet +25. KVMI_VCPU_TRANSLATE_GVA +--------------------------- + +:Architecture: x86 +:Versions: >= 1 +:Parameters: + +:: + + struct kvmi_vcpu_hdr; + struct kvmi_vcpu_translate_gva { + __u64 gva; + }; + +:Returns: + +:: + + struct kvmi_error_code; + struct kvmi_vcpu_translate_gva_reply { + __u64 gpa; + }; + +Translates a guest virtual address to a guest physical address or ~0 if +the address cannot be translated. + +:Errors: + +* -KVM_EINVAL - the selected vCPU is invalid +* -KVM_EINVAL - the padding is not zero +* -KVM_EAGAIN - the selected vCPU can't be introspected yet + Events ===== diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index 18713004152d..8051d06064ab 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -1373,3 +1373,8 @@ bool kvmi_arch_stop_singlestep(struct kvm_vcpu *vcpu) kvm_x86_ops.control_singlestep(vcpu, false); return true; } + +gpa_t kvmi_arch_cmd_translate_gva(struct kvm_vcpu *vcpu, gva_t gva) +{ + return kvm_mmu_gva_to_gpa_system(vcpu, gva, 0, NULL); +} diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h index 040049abd450..3c15c17d28e3 100644 --- a/include/uapi/linux/kvmi.h +++ b/include/uapi/linux/kvmi.h @@ -48,6 +48,7 @@ enum { KVMI_VM_SET_PAGE_ACCESS = 23, KVMI_VCPU_CONTROL_SINGLESTEP = 24, + KVMI_VCPU_TRANSLATE_GVA = 25, KVMI_NUM_MESSAGES }; @@ -242,4 +243,12 @@ struct kvmi_event_singlestep { __u8 padding[7]; }; +struct kvmi_vcpu_translate_gva { + __u64 gva; +}; + +struct kvmi_vcpu_translate_gva_reply { + __u64 gpa; +}; + #endif /* _UAPI__LINUX_KVMI_H */ diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c index 967ea568d93c..e968b1a6f969 100644 --- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c +++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c @@ -2040,6 +2040,35 @@ static void test_cmd_vcpu_control_singlestep(struct kvm_vm *vm) test_unsupported_singlestep(vm); } +static void cmd_translate_gva(struct kvm_vm *vm, vm_vaddr_t gva, + vm_paddr_t expected_gpa) +{ + struct { + struct kvmi_msg_hdr hdr; + struct kvmi_vcpu_hdr vcpu_hdr; + struct kvmi_vcpu_translate_gva cmd; + } req = { 0 }; + struct kvmi_vcpu_translate_gva_reply rpl; + + req.cmd.gva = gva; + + test_vcpu0_command(vm, KVMI_VCPU_TRANSLATE_GVA, &req.hdr, sizeof(req), + &rpl, sizeof(rpl)); + TEST_ASSERT(rpl.gpa == expected_gpa, + "Translation failed for gva 0x%lx -> gpa 0x%llx instead of 0x%lx\n", + gva, rpl.gpa, expected_gpa); +} + +static void test_cmd_translate_gva(struct kvm_vm *vm) +{ + cmd_translate_gva(vm, test_gva, test_gpa); + pr_info("Tested gva 0x%lx to gpa 0x%lx\n", test_gva, test_gpa); + + cmd_translate_gva(vm, -1, ~0); + pr_info("Tested gva 0x%lx to gpa 0x%lx\n", + (vm_vaddr_t)-1, (vm_paddr_t)-1); +} + static void test_introspection(struct kvm_vm *vm) { srandom(time(0)); @@ -2075,6 +2104,7 @@ static void test_introspection(struct kvm_vm *vm) test_cmd_vm_set_page_access(vm); test_event_pf(vm); test_cmd_vcpu_control_singlestep(vm); + test_cmd_translate_gva(vm); unhook_introspection(vm); } diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h index e5fca3502bab..cb8453f0fb87 100644 --- a/virt/kvm/introspection/kvmi_int.h +++ b/virt/kvm/introspection/kvmi_int.h @@ -141,5 +141,6 @@ void kvmi_arch_unhook(struct kvm *kvm); void kvmi_arch_features(struct kvmi_features *feat); bool kvmi_arch_start_singlestep(struct kvm_vcpu *vcpu); bool kvmi_arch_stop_singlestep(struct kvm_vcpu *vcpu); +gpa_t kvmi_arch_cmd_translate_gva(struct kvm_vcpu *vcpu, gva_t gva); #endif diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c index 645debc47f13..d8874bd7a8b7 100644 --- a/virt/kvm/introspection/kvmi_msg.c +++ b/virt/kvm/introspection/kvmi_msg.c @@ -647,6 +647,20 @@ static int handle_vcpu_control_singlestep(const struct kvmi_vcpu_msg_job *job, return kvmi_msg_vcpu_reply(job, msg, ec, NULL, 0); } +static int handle_vcpu_translate_gva(const struct kvmi_vcpu_msg_job *job, + const struct kvmi_msg_hdr *msg, + const void *_req) +{ + const struct kvmi_vcpu_translate_gva *req = _req; + struct kvmi_vcpu_translate_gva_reply rpl; + + memset(&rpl, 0, sizeof(rpl)); + + rpl.gpa = kvmi_arch_cmd_translate_gva(job->vcpu, req->gva); + + return kvmi_msg_vcpu_reply(job, msg, 0, &rpl, sizeof(rpl)); +} + /* * These functions are executed from the vCPU thread. The receiving thread * passes the messages using a newly allocated 'struct kvmi_vcpu_msg_job' @@ -669,6 +683,7 @@ static int(*const msg_vcpu[])(const struct kvmi_vcpu_msg_job *, [KVMI_VCPU_INJECT_EXCEPTION] = handle_vcpu_inject_exception, [KVMI_VCPU_SET_REGISTERS] = handle_vcpu_set_registers, [KVMI_VCPU_SET_XSAVE] = handle_vcpu_set_xsave, + [KVMI_VCPU_TRANSLATE_GVA] = handle_vcpu_translate_gva, }; static bool is_vcpu_command(u16 id)
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 83/84] KVM: introspection: emulate a guest page table walk on SPT violations due to A/D bit updates
From: Mihai Don?u <mdontu at bitdefender.com> On SPT page faults caused by guest page table walks, use the existing guest page table walk code to make the necessary adjustments to the A/D bits and return to guest. This effectively bypasses the x86 emulator who was making the wrong modifications leading one OS (Windows 8.1 x64) to triple-fault very early in the boot process with the introspection enabled. With introspection disabled, these faults are handled by simply removing the protection from the affected guest page and returning to guest. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/include/asm/kvmi_host.h | 2 ++ arch/x86/kvm/kvmi.c | 33 ++++++++++++++++++++++++++++++++ arch/x86/kvm/mmu/mmu.c | 12 ++++++++++-- include/linux/kvmi_host.h | 3 +++ virt/kvm/introspection/kvmi.c | 26 +++++++++++++++++++++++++ 5 files changed, 74 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/kvmi_host.h b/arch/x86/include/asm/kvmi_host.h index 25c7bb8a9082..509fa3fff5e7 100644 --- a/arch/x86/include/asm/kvmi_host.h +++ b/arch/x86/include/asm/kvmi_host.h @@ -64,6 +64,7 @@ bool kvmi_descriptor_event(struct kvm_vcpu *vcpu, u8 descriptor, bool write); bool kvmi_msr_event(struct kvm_vcpu *vcpu, struct msr_data *msr); bool kvmi_monitor_msrw_intercept(struct kvm_vcpu *vcpu, u32 msr, bool enable); bool kvmi_msrw_intercept_originator(struct kvm_vcpu *vcpu); +bool kvmi_update_ad_flags(struct kvm_vcpu *vcpu); #else /* CONFIG_KVM_INTROSPECTION */ @@ -88,6 +89,7 @@ static inline bool kvmi_monitor_msrw_intercept(struct kvm_vcpu *vcpu, u32 msr, bool enable) { return false; } static inline bool kvmi_msrw_intercept_originator(struct kvm_vcpu *vcpu) { return false; } +static inline bool kvmi_update_ad_flags(struct kvm_vcpu *vcpu) { return false; } #endif /* CONFIG_KVM_INTROSPECTION */ diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c index 8051d06064ab..4e75858c03b4 100644 --- a/arch/x86/kvm/kvmi.c +++ b/arch/x86/kvm/kvmi.c @@ -1378,3 +1378,36 @@ gpa_t kvmi_arch_cmd_translate_gva(struct kvm_vcpu *vcpu, gva_t gva) { return kvm_mmu_gva_to_gpa_system(vcpu, gva, 0, NULL); } + +bool kvmi_update_ad_flags(struct kvm_vcpu *vcpu) +{ + struct kvm_introspection *kvmi; + bool ret = false; + gva_t gva; + gpa_t gpa; + + kvmi = kvmi_get(vcpu->kvm); + if (!kvmi) + return false; + + gva = kvm_x86_ops.fault_gla(vcpu); + if (gva == ~0ull) { + kvmi_warn_once(kvmi, "%s: cannot perform translation\n", + __func__); + goto out; + } + + gpa = kvm_mmu_gva_to_gpa_system(vcpu, gva, PFERR_WRITE_MASK, NULL); + if (gpa == UNMAPPED_GVA) { + struct x86_exception exception = { }; + + gpa = kvm_mmu_gva_to_gpa_system(vcpu, gva, 0, &exception); + } + + ret = (gpa != UNMAPPED_GVA); + +out: + kvmi_put(vcpu->kvm); + + return ret; +} diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 4df5b729e2c5..97766f34910d 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -40,6 +40,7 @@ #include <linux/hash.h> #include <linux/kern_levels.h> #include <linux/kthread.h> +#include <linux/kvmi_host.h> #include <asm/page.h> #include <asm/memtype.h> @@ -5549,8 +5550,15 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code, */ if (vcpu->arch.mmu->direct_map && (error_code & PFERR_NESTED_GUEST_PAGE) == PFERR_NESTED_GUEST_PAGE) { - kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(cr2_or_gpa)); - return 1; + gfn_t gfn = gpa_to_gfn(cr2_or_gpa); + + if (kvmi_tracked_gfn(vcpu, gfn)) { + if (kvmi_update_ad_flags(vcpu)) + return 1; + } else { + kvm_mmu_unprotect_page(vcpu->kvm, gfn); + return 1; + } } /* diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h index b01e8505f493..5baef68d8cbe 100644 --- a/include/linux/kvmi_host.h +++ b/include/linux/kvmi_host.h @@ -96,6 +96,7 @@ bool kvmi_enter_guest(struct kvm_vcpu *vcpu); bool kvmi_vcpu_running_singlestep(struct kvm_vcpu *vcpu); void kvmi_singlestep_done(struct kvm_vcpu *vcpu); void kvmi_singlestep_failed(struct kvm_vcpu *vcpu); +bool kvmi_tracked_gfn(struct kvm_vcpu *vcpu, gfn_t gfn); #else @@ -116,6 +117,8 @@ static inline bool kvmi_vcpu_running_singlestep(struct kvm_vcpu *vcpu) { return false; } static inline void kvmi_singlestep_done(struct kvm_vcpu *vcpu) { } static inline void kvmi_singlestep_failed(struct kvm_vcpu *vcpu) { } +static inline bool kvmi_tracked_gfn(struct kvm_vcpu *vcpu, gfn_t gfn) + { return false; } #endif /* CONFIG_KVM_INTROSPECTION */ diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c index 5382569b190b..2a96b80bddb2 100644 --- a/virt/kvm/introspection/kvmi.c +++ b/virt/kvm/introspection/kvmi.c @@ -1381,3 +1381,29 @@ void kvmi_singlestep_failed(struct kvm_vcpu *vcpu) kvmi_handle_singlestep_exit(vcpu, false); } EXPORT_SYMBOL(kvmi_singlestep_failed); + +static bool __kvmi_tracked_gfn(struct kvm_introspection *kvmi, gfn_t gfn) +{ + u8 ignored_access; + + if (kvmi_get_gfn_access(kvmi, gfn, &ignored_access)) + return false; + + return true; +} + +bool kvmi_tracked_gfn(struct kvm_vcpu *vcpu, gfn_t gfn) +{ + struct kvm_introspection *kvmi; + bool ret; + + kvmi = kvmi_get(vcpu->kvm); + if (!kvmi) + return false; + + ret = __kvmi_tracked_gfn(kvmi, gfn); + + kvmi_put(vcpu->kvm); + + return ret; +}
Adalbert Lazăr
2020-Jul-21 21:09 UTC
[PATCH v9 84/84] KVM: x86: call the page tracking code on emulation failure
From: Mihai Don?u <mdontu at bitdefender.com> The information we can provide this way is incomplete, but current users of the page tracking code can work with it. Signed-off-by: Mihai Don?u <mdontu at bitdefender.com> Signed-off-by: Adalbert Laz?r <alazar at bitdefender.com> --- arch/x86/kvm/x86.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 02b74a57ca01..feb20b29bb92 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -6911,6 +6911,51 @@ static bool is_vmware_backdoor_opcode(struct x86_emulate_ctxt *ctxt) return false; } +/* + * With introspection enabled, emulation failures translate in events being + * missed because the read/write callbacks are not invoked. All we have is + * the fetch event (kvm_page_track_preexec). Below we use the EPT/NPT VMEXIT + * information to generate the events, but without providing accurate + * data and size (the emulator would have computed those). If an instruction + * would happen to read and write in the same page, the second event will + * initially be missed and we rely on the page tracking mechanism to bring + * us back here to send it. + */ +static bool kvm_page_track_emulation_failure(struct kvm_vcpu *vcpu, gpa_t gpa) +{ + u64 error_code = vcpu->arch.error_code; + u8 data = 0; + gva_t gva; + bool ret; + + /* MMIO emulation failures should be treated the normal way */ + if (unlikely(error_code & PFERR_RSVD_MASK)) + return true; + + /* EPT/NTP must be enabled */ + if (unlikely(!vcpu->arch.mmu->direct_map)) + return true; + + /* + * The A/D bit emulation should make this test unneeded, but just + * in case + */ + if (unlikely((error_code & PFERR_NESTED_GUEST_PAGE) =+ PFERR_NESTED_GUEST_PAGE)) + return true; + + gva = kvm_x86_ops.fault_gla(vcpu); + + if (error_code & PFERR_WRITE_MASK) + ret = kvm_page_track_prewrite(vcpu, gpa, gva, &data, 0); + else if (error_code & PFERR_USER_MASK) + ret = kvm_page_track_preread(vcpu, gpa, gva, 0); + else + ret = true; + + return ret; +} + int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, int emulation_type, void *insn, int insn_len) { @@ -6960,6 +7005,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, kvm_queue_exception(vcpu, UD_VECTOR); return 1; } + if (!kvm_page_track_emulation_failure(vcpu, cr2_or_gpa)) + return 1; if (reexecute_instruction(vcpu, cr2_or_gpa, write_fault_to_spt, emulation_type)) @@ -7029,6 +7076,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, return 1; if (r == EMULATION_FAILED) { + if (!kvm_page_track_emulation_failure(vcpu, cr2_or_gpa)) + return 1; if (reexecute_instruction(vcpu, cr2_or_gpa, write_fault_to_spt, emulation_type)) return 1;
kernel test robot
2020-Jul-22 01:31 UTC
[PATCH v9 30/84] KVM: x86: export kvm_vcpu_ioctl_x86_get_xsave()
Hi "Adalbert, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on 3d9fdc252b52023260de1d12399cb3157ed28c07] url: https://github.com/0day-ci/linux/commits/Adalbert-Laz-r/VM-introspection/20200722-052036 base: 3d9fdc252b52023260de1d12399cb3157ed28c07 config: mips-allmodconfig (attached as .config) compiler: mips-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=mips If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp at intel.com> All warnings (new ones prefixed by >>): In file included from arch/mips/kernel/asm-offsets.c:24:>> include/linux/kvm_host.h:887:14: warning: 'struct kvm_xsave' declared inside parameter list will not be visible outside of this definition or declaration887 | struct kvm_xsave *guest_xsave); | ^~~~~~~~~ arch/mips/kernel/asm-offsets.c:26:6: warning: no previous prototype for 'output_ptreg_defines' [-Wmissing-prototypes] 26 | void output_ptreg_defines(void) | ^~~~~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:78:6: warning: no previous prototype for 'output_task_defines' [-Wmissing-prototypes] 78 | void output_task_defines(void) | ^~~~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:93:6: warning: no previous prototype for 'output_thread_info_defines' [-Wmissing-prototypes] 93 | void output_thread_info_defines(void) | ^~~~~~~~~~~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:110:6: warning: no previous prototype for 'output_thread_defines' [-Wmissing-prototypes] 110 | void output_thread_defines(void) | ^~~~~~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:138:6: warning: no previous prototype for 'output_thread_fpu_defines' [-Wmissing-prototypes] 138 | void output_thread_fpu_defines(void) | ^~~~~~~~~~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:181:6: warning: no previous prototype for 'output_mm_defines' [-Wmissing-prototypes] 181 | void output_mm_defines(void) | ^~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:220:6: warning: no previous prototype for 'output_sc_defines' [-Wmissing-prototypes] 220 | void output_sc_defines(void) | ^~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:255:6: warning: no previous prototype for 'output_signal_defined' [-Wmissing-prototypes] 255 | void output_signal_defined(void) | ^~~~~~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:322:6: warning: no previous prototype for 'output_pbe_defines' [-Wmissing-prototypes] 322 | void output_pbe_defines(void) | ^~~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:334:6: warning: no previous prototype for 'output_pm_defines' [-Wmissing-prototypes] 334 | void output_pm_defines(void) | ^~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:348:6: warning: no previous prototype for 'output_kvm_defines' [-Wmissing-prototypes] 348 | void output_kvm_defines(void) | ^~~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:392:6: warning: no previous prototype for 'output_cps_defines' [-Wmissing-prototypes] 392 | void output_cps_defines(void) | ^~~~~~~~~~~~~~~~~~ -- In file included from arch/mips/kernel/asm-offsets.c:24:>> include/linux/kvm_host.h:887:14: warning: 'struct kvm_xsave' declared inside parameter list will not be visible outside of this definition or declaration887 | struct kvm_xsave *guest_xsave); | ^~~~~~~~~ arch/mips/kernel/asm-offsets.c:26:6: warning: no previous prototype for 'output_ptreg_defines' [-Wmissing-prototypes] 26 | void output_ptreg_defines(void) | ^~~~~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:78:6: warning: no previous prototype for 'output_task_defines' [-Wmissing-prototypes] 78 | void output_task_defines(void) | ^~~~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:93:6: warning: no previous prototype for 'output_thread_info_defines' [-Wmissing-prototypes] 93 | void output_thread_info_defines(void) | ^~~~~~~~~~~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:110:6: warning: no previous prototype for 'output_thread_defines' [-Wmissing-prototypes] 110 | void output_thread_defines(void) | ^~~~~~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:138:6: warning: no previous prototype for 'output_thread_fpu_defines' [-Wmissing-prototypes] 138 | void output_thread_fpu_defines(void) | ^~~~~~~~~~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:181:6: warning: no previous prototype for 'output_mm_defines' [-Wmissing-prototypes] 181 | void output_mm_defines(void) | ^~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:220:6: warning: no previous prototype for 'output_sc_defines' [-Wmissing-prototypes] 220 | void output_sc_defines(void) | ^~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:255:6: warning: no previous prototype for 'output_signal_defined' [-Wmissing-prototypes] 255 | void output_signal_defined(void) | ^~~~~~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:322:6: warning: no previous prototype for 'output_pbe_defines' [-Wmissing-prototypes] 322 | void output_pbe_defines(void) | ^~~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:334:6: warning: no previous prototype for 'output_pm_defines' [-Wmissing-prototypes] 334 | void output_pm_defines(void) | ^~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:348:6: warning: no previous prototype for 'output_kvm_defines' [-Wmissing-prototypes] 348 | void output_kvm_defines(void) | ^~~~~~~~~~~~~~~~~~ arch/mips/kernel/asm-offsets.c:392:6: warning: no previous prototype for 'output_cps_defines' [-Wmissing-prototypes] 392 | void output_cps_defines(void) | ^~~~~~~~~~~~~~~~~~ vim +887 include/linux/kvm_host.h 862 863 int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu, 864 struct kvm_translation *tr); 865 866 int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs); 867 void kvm_arch_vcpu_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs); 868 int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs); 869 void kvm_arch_vcpu_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs, 870 bool clear_exception); 871 int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu, 872 struct kvm_sregs *sregs); 873 void kvm_arch_vcpu_get_sregs(struct kvm_vcpu *vcpu, 874 struct kvm_sregs *sregs); 875 int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu, 876 struct kvm_sregs *sregs); 877 int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu, 878 struct kvm_mp_state *mp_state); 879 int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, 880 struct kvm_mp_state *mp_state); 881 int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, 882 struct kvm_guest_debug *dbg); 883 int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu); 884 int kvm_arch_vcpu_set_guest_debug(struct kvm_vcpu *vcpu, 885 struct kvm_guest_debug *dbg); 886 void kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu, > 887 struct kvm_xsave *guest_xsave); 888 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all at lists.01.org -------------- next part -------------- A non-text attachment was scrubbed... Name: .config.gz Type: application/gzip Size: 66837 bytes Desc: not available URL: <http://lists.linuxfoundation.org/pipermail/virtualization/attachments/20200722/3ba41a06/attachment-0001.gz>
On Wed, Jul 22, 2020 at 12:07:59AM +0300, Adalbert Laz??r wrote:> From: Mathieu Tarral <mathieu.tarral at protonmail.com> > > This function is used by VM introspection code to ungracefully shutdown > a guest at the request of the introspection tool. > > A security application will use this as the last resort to stop the > spread of a malware from a guest.I don't think your module has any business doing this. If at all it would be an EXPORT_SYMBOL_GPL, but the export is very questionable and needs a much better justification.
kernel test robot
2020-Jul-22 08:25 UTC
[PATCH v9 69/84] KVM: introspection: add KVMI_VCPU_GET_XCR
Hi "Adalbert, Thank you for the patch! Yet something to improve: [auto build test ERROR on 3d9fdc252b52023260de1d12399cb3157ed28c07] url: https://github.com/0day-ci/linux/commits/Adalbert-Laz-r/VM-introspection/20200722-052036 base: 3d9fdc252b52023260de1d12399cb3157ed28c07 config: x86_64-randconfig-s022-20200719 (attached as .config) compiler: gcc-9 (Debian 9.3.0-14) 9.3.0 reproduce: # apt-get install sparse # sparse version: v0.6.2-49-g707c5017-dirty # save the attached .config to linux build tree make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=x86_64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp at intel.com> All errors (new ones prefixed by >>): In file included from <command-line>:32:>> ./usr/include/asm/kvmi.h:99:2: error: unknown type name 'u64'99 | u64 value; | ^~~ --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all at lists.01.org -------------- next part -------------- A non-text attachment was scrubbed... Name: .config.gz Type: application/gzip Size: 33326 bytes Desc: not available URL: <http://lists.linuxfoundation.org/pipermail/virtualization/attachments/20200722/ada7f03a/attachment-0001.gz>
Apparently Analagous Threads
- [RFC PATCH v7 69/78] KVM: introspection: add KVMI_VCPU_CONTROL_MSR and KVMI_EVENT_MSR
- [RFC PATCH v7 70/78] KVM: introspection: restore the state of MSR interception on unhook
- [RFC PATCH v6 55/92] kvm: introspection: add KVMI_CONTROL_MSR and KVMI_EVENT_MSR
- [PATCH v9 81/84] KVM: introspection: add KVMI_EVENT_SINGLESTEP
- [PATCH v9 00/84] VM introspection