Laurent Vivier
2007-Aug-16 08:58 UTC
[PATCH/RFC 3/4]Introduce "account modifiers" mechanism
[PATCH 3/3] introduce "account modifiers" mechanism in the kernel allowing a module to modify the collected accounting for a given task. This implementation is based on the "preempt_notifier". "account_system_time()" and "account_user_time()" can call functions registered by a module to modify the cputime value. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> -- ------------- Laurent.Vivier@bull.net -------------- "Software is hard" - Donald Knuth -------------- next part -------------- Index: kvm/include/linux/sched.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- kvm.orig/include/linux/sched.h 2007-08-16 15:23:27.000000000 +0200 +++ kvm/include/linux/sched.h 2007-08-16 15:23:41.000000000 +0200 @@ -955,6 +955,10 @@ /* list of struct preempt_notifier: */ struct hlist_head preempt_notifiers; #endif +#ifdef CONFIG_ACCOUNT_MODIFIERS + /* list of struct account_modifiers: */ + struct hlist_head account_modifiers; +#endif unsigned short ioprio; #ifdef CONFIG_BLK_DEV_IO_TRACE @@ -1873,6 +1877,31 @@ } #endif +#ifdef CONFIG_ACCOUNT_MODIFIERS +struct account_modifier; +struct account_ops { + cputime_t (*user_time)(struct account_modifier *, + struct task_struct *, cputime_t ); + cputime_t (*system_time)(struct account_modifier *, + struct task_struct *, int, cputime_t); +}; + +struct account_modifier { + struct hlist_node link; + struct account_ops *ops; +}; + +void account_modifier_register(struct account_modifier *modifier); +void account_modifier_unregister(struct account_modifier *modifier); + +static inline void account_modifier_init(struct account_modifier *modifier, + struct account_ops *ops) +{ + INIT_HLIST_NODE(&modifier->link); + modifier->ops =3D ops; +} +#endif + #endif /* __KERNEL__ */ #endif Index: kvm/kernel/sched.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- kvm.orig/kernel/sched.c 2007-08-16 15:15:33.000000000 +0200 +++ kvm/kernel/sched.c 2007-08-16 15:23:28.000000000 +0200 @@ -1593,6 +1593,9 @@ #ifdef CONFIG_PREEMPT_NOTIFIERS INIT_HLIST_HEAD(&p->preempt_notifiers); #endif +#ifdef CONFIG_ACCOUNT_MODIFIERS + INIT_HLIST_HEAD(&p->account_modifiers); +#endif /* * We mark the process as running here, but have not actually @@ -1731,6 +1734,62 @@ } #endif +#ifdef CONFIG_ACCOUNT_MODIFIERS + +void account_modifier_register(struct account_modifier *modifier) +{ + hlist_add_head(&modifier->link, ¤t->account_modifiers); +} +EXPORT_SYMBOL_GPL(account_modifier_register); + +void account_modifier_unregister(struct account_modifier *modifier) +{ + hlist_del(&modifier->link); +} +EXPORT_SYMBOL_GPL(account_modifier_unregister); + +static cputime_t fire_user_time_account_modifiers(struct task_struct *curr, + cputime_t cputime) +{ + struct account_modifier *modifier; + struct hlist_node *node; + + hlist_for_each_entry(modifier, node, &curr->account_modifiers, link) + if (modifier->ops->user_time) + cputime =3D modifier->ops->user_time(modifier, + curr, cputime); + return cputime; +} + +static cputime_t fire_system_time_account_modifiers(struct task_struct *curr, + int hardirq_offset, + cputime_t cputime) +{ + struct account_modifier *modifier; + struct hlist_node *node; + + hlist_for_each_entry(modifier, node, &curr->account_modifiers, link) + if (modifier->ops->system_time) + cputime =3D modifier->ops->system_time(modifier, + curr, hardirq_offset, cputime); + return cputime; +} + +#else + +static inline cputime_t fire_user_time_account_modifiers(struct task_struct *curr, cputime_t cputime) +{ + return cputime; +} + +static inline cputime_t fire_system_time_account_modifiers(struct task_struct *curr, + int hardirq_offset, + cputime_t cputime) +{ + return cputime; +} + +#endif /** * prepare_task_switch - prepare to switch tasks @@ -3223,6 +3282,8 @@ struct cpu_usage_stat *cpustat =3D &kstat_this_cpu.cpustat; cputime64_t tmp; + cputime =3D fire_user_time_account_modifiers(p, cputime); + p->utime =3D cputime_add(p->utime, cputime); /* Add user time to cpustat. */ @@ -3246,6 +3307,8 @@ struct rq *rq =3D this_rq(); cputime64_t tmp; + cputime =3D fire_system_time_account_modifiers(p,hardirq_offset, cputime); + p->stime =3D cputime_add(p->stime, cputime); /* Add system time to cpustat. */ @@ -6522,6 +6585,9 @@ #ifdef CONFIG_PREEMPT_NOTIFIERS INIT_HLIST_HEAD(&init_task.preempt_notifiers); #endif +#ifdef CONFIG_ACCOUNT_MODIFIERS + INIT_HLIST_HEAD(&init_task.account_modifiers); +#endif #ifdef CONFIG_SMP nr_cpu_ids =3D highest_cpu + 1;
Rusty Russell
2007-Aug-16 15:40 UTC
[PATCH/RFC 3/4]Introduce "account modifiers" mechanism
On Thu, 2007-08-16 at 17:58 +0200, Laurent Vivier wrote:> [PATCH 3/3] introduce "account modifiers" mechanism in the kernel allowing a > module to modify the collected accounting for a given task. This implementation > is based on the "preempt_notifier". "account_system_time()" and > "account_user_time()" can call functions registered by a module to modify the > cputime value. > > Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>Hi Laurent, This seems a little like overkill. Why not just add an "account_guest_time" which subtracts the given amount of time from system time (if available) and adds it to guest time? Then kvm (and lguest) should just need to call this at the right times. Am I missing something? Rusty.
Laurent Vivier
2007-Aug-17 04:54 UTC
[PATCH/RFC 4/4, second shot]KVM uses "account_guest_time()"
KVM updates vtime in task_struct to allow account_guest_time() to modify user, system and guest time in cpustat accordingly. -------------- next part -------------- Index: kvm/drivers/kvm/Kconfig =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- kvm.orig/drivers/kvm/Kconfig 2007-08-17 10:24:46.000000000 +0200 +++ kvm/drivers/kvm/Kconfig 2007-08-17 10:25:25.000000000 +0200 @@ -41,4 +41,10 @@ Provides support for KVM on AMD processors equipped with the AMD-V (SVM) extensions. +config GUEST_ACCOUNTING + bool "Virtual Machine accounting support" + depends on KVM + ---help--- + Allows to account CPU time used by the Virtual Machines. + endif # VIRTUALIZATION Index: kvm/drivers/kvm/kvm.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- kvm.orig/drivers/kvm/kvm.h 2007-08-17 10:19:30.000000000 +0200 +++ kvm/drivers/kvm/kvm.h 2007-08-17 10:21:56.000000000 +0200 @@ -589,6 +589,19 @@ int kvm_hypercall(struct kvm_vcpu *vcpu, struct kvm_run *run); +#ifdef CONFIG_GUEST_ACCOUNTING +static inline ktime_t kvm_guest_enter(void) +{ + return ktime_get(); +} + +static inline void kvm_guest_exit(ktime_t enter) +{ + ktime_t delta =3D ktime_sub(ktime_get(), enter); + current->vtime =3D ktime_add(current->vtime, delta); +} +#endif + static inline int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva, u32 error_code) { Index: kvm/drivers/kvm/svm.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- kvm.orig/drivers/kvm/svm.c 2007-08-17 10:22:07.000000000 +0200 +++ kvm/drivers/kvm/svm.c 2007-08-17 10:23:32.000000000 +0200 @@ -1392,6 +1392,9 @@ u16 gs_selector; u16 ldt_selector; int r; +#ifdef CONFIG_GUEST_ACCOUNTING + ktime_t now; +#endif again: r =3D kvm_mmu_reload(vcpu); @@ -1404,6 +1407,9 @@ clgi(); vcpu->guest_mode =3D 1; +#ifdef CONFIG_GUEST_ACCOUNTING + now =3D kvm_guest_enter(); +#endif if (vcpu->requests) if (test_and_clear_bit(KVM_TLB_FLUSH, &vcpu->requests)) svm_flush_tlb(vcpu); @@ -1536,6 +1542,9 @@ #endif : "cc", "memory" ); +#ifdef CONFIG_GUEST_ACCOUNTING + kvm_guest_exit(now); +#endif vcpu->guest_mode =3D 0; if (vcpu->fpu_active) { Index: kvm/drivers/kvm/vmx.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- kvm.orig/drivers/kvm/vmx.c 2007-08-17 10:23:36.000000000 +0200 +++ kvm/drivers/kvm/vmx.c 2007-08-17 10:24:37.000000000 +0200 @@ -2052,6 +2052,9 @@ struct vcpu_vmx *vmx =3D to_vmx(vcpu); u8 fail; int r; +#ifdef CONFIG_GUEST_ACCOUNTING + ktime_t now; +#endif preempted: if (vcpu->guest_debug.enabled) @@ -2078,6 +2081,9 @@ local_irq_disable(); vcpu->guest_mode =3D 1; +#ifdef CONFIG_GUEST_ACCOUNTING + now =3D kvm_guest_enter(); +#endif if (vcpu->requests) if (test_and_clear_bit(KVM_TLB_FLUSH, &vcpu->requests)) vmx_flush_tlb(vcpu); @@ -2198,6 +2204,9 @@ [cr2]"i"(offsetof(struct kvm_vcpu, cr2)) : "cc", "memory" ); +#ifdef CONFIG_GUEST_ACCOUNTING + kvm_guest_exit(now); +#endif vcpu->guest_mode =3D 0; local_irq_enable(); =20
Avi Kivity
2007-Aug-17 06:00 UTC
[kvm-devel] [PATCH/RFC 3/4, second shot]Introduce "account_guest_time"
Laurent Vivier wrote:> This is another way to compute guest time... I remove the "account modifiers" > mechanism and call directly account_guest_time() from account_system_time(). > account_system_time() computes user, system and guest times according value > accumulated in vtime (a ktime_t) in task_struct by the virtual machine. >> @@ -3246,6 +3277,10 @@ > struct rq *rq = this_rq(); > cputime64_t tmp; > > +#ifdef CONFIG_GUEST_ACCOUNTING > + cputime = account_guest_time(p, cputime); > +#endif > + > p->stime = cputime_add(p->stime, cputime); > > /* Add system time to cpustat. */In order to reduce the impact on whatever function this is in (use diff -p please), you can always have a definition of account_guest_time: #else static cputime_t account_guest_time(struct task_struct *p, cputime_t cputime) { return cputime; } #endif This way the #ifdef/#endif is not necessary when calling it. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Laurent Vivier
2007-Aug-20 00:30 UTC
[kvm-devel] [PATCH/RFC 3/4]Introduce "account modifiers" mechanism
Avi Kivity wrote:> Laurent Vivier wrote: >> Avi Kivity wrote: >> [...] >> >>> The normal user/system accounting has the same issue, no? Whereever we >>> happen to land (kernel or user) gets the whole tick. >>> >>> So I think it is okay to have the same limitation for guest time. >>> >>> >> >> So this is how it looks like. >> PATCH 1 and 2 are always a prerequisite. >> >> > >> + tmp =3D cputime_to_cputime64(cputime); >> + if (p->flags & PF_VCPU) { >> + p->utime =3D cputime_add(p->utime, cputime); >> + p->gtime =3D cputime_add(p->gtime, cputime); >> + >> + cpustat->guest =3D cputime64_add(cpustat->guest, tmp); >> + cpustat->user =3D cputime64_add(cpustat->user, tmp); >> + >> + p->flags &=3D ~PF_VCPU; >> + >> + return; >> + } >> + > > Where did CONFIG_GUEST_ACCOUNTING go? >Lost in the sea ... Actually, I thought this modification is not enough expensive (in time and space) to justify a CONFIG_*. But if you think so I can add this in init/Kconfig. Laurent -- ------------- Laurent.Vivier@bull.net -------------- "Software is hard" - Donald Knuth -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: OpenPGP digital signature Url : http://lists.linux-foundation.org/pipermail/virtualization/attachments/20070820/8194834b/signature.pgp