Hello.

I'm trying to debug a problem with linux hvms (3.11.7 dom0 kernel, xen 4.3.1, guest kernels tried were 3.2, 3.5, 3.8) that use tsc as clocksource and hang some time after host S3 has been performed. I'm using the default tsc_mode (also tried never emulate to make sure), vtsc is off.

It seems the linux tsc clocksource code expects the counter to be reset to 0 after S3 resume, except on some Atom chips which guarantee constant tsc through S3 (I'm not testing on these though). That doesn't seem to be the case. I dumped results of rdtsc() both in dom0 and in the linux guest. In dom0 the tsc restarts at 0 after S3 as expected; in the linux hvm, however, it goes into negative values. As soon as it wraps back to 0, timekeeping in linux gets broken.

The negative values seem to come from some offsetting being done to the tsc in the hvm guest. I sort of expected that, since vtsc on that hvm is off, the values returned by rdtsc() would match. That is however not the case: when the hvm guest boots, the tsc in there seems to start from 0, not from the current rdtsc value in dom0. So, after host S3, my linux hvm has about the same time to live as the delta between xen boot and guest boot (i.e. if I booted the hvm 2 mins after host boot, it will hang ~2 mins after host S3, since that's when the tsc will wrap).

Any ideas on cause/fix? Why is the tsc in the hvm guest offset from the dom0 one even in TSC_MODE_NEVER_EMULATE?

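To make the arithmetic described above concrete, here is a minimal standalone sketch in C. It is not Xen code: the helper names are hypothetical, and it assumes the guest simply sees the host TSC plus a fixed offset, computed with wrapping 64-bit arithmetic, and that the offset is left unchanged across host S3.

```c
#include <stdint.h>

/* Hypothetical model, not Xen code: with TSC offsetting, the guest
 * reads host_tsc + offset using wrapping 64-bit arithmetic. */
static inline uint64_t guest_tsc(uint64_t host_tsc, uint64_t offset)
{
    return host_tsc + offset;
}

/* Offset chosen at guest boot so that the guest counter starts at 0. */
static inline uint64_t offset_at_guest_boot(uint64_t host_tsc_at_boot)
{
    return (uint64_t)0 - host_tsc_at_boot;
}

/* Host ticks after resume until the guest counter wraps back to 0,
 * assuming the host TSC restarted at 0 and the offset was not updated. */
static inline uint64_t ticks_until_guest_wrap(uint64_t offset)
{
    return (uint64_t)0 - offset;
}
```

Under these assumptions, a guest booted 1000 host ticks after the host reads 2^64 - 1000 (a negative value when viewed as signed) right after resume, and wraps to 0 exactly 1000 host ticks later, which matches the "time to live" observation.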
On 11/20/2013 11:57 AM, Tomasz Wroblewski wrote:
> Hello. I'm trying to debug a problem of linux hvms (3.11.7 dom0 kernel, xen 4.3.1, guest kernels tried were 3.2, 3.5, 3.8), using tsc as
> clocksource, hanging some time after host S3 has been performed. I'm using the default tsc_mode (also tried never emulate to make sure),
> vtsc is off.
>
> It seems linux tsc clocksource code expects the counter to be reset to 0 after S3 resume, except on some Atom chips who guarantee constant
> tsc thru S3 (I'm not testing on these though). That doesn't seem to be case. I dumped results of rdtsc() both in dom0 and the linux guest.
> It seems that in dom0 the tsc after S3 restarts to 0 as expected, in the linux hvm however it goes into negative values. As soon as it wraps
> back to 0, timekeeping in linux gets broken.
>
> The negativeness seem to be because there is some offsetting being done to the tsc in hvm guest -> I sort of expected that since vtsc on
> that hvm is off, the values returned by rdtsc() would match, that is however not the case, when the hvm guest boots the tsc in there seems
> to start from 0, not from the current rdtsc value in dom0. So, after host S3, my linux hvm has about equivalent time to live as the delta
> between xen boot and guest boot (i.e. if i booted the hvm 2mins after host boot, it will hang ~2mins after host s3 since thats when the tsc
> will wrap)
>
> Any ideas on cause/fix? Why the tsc in hvm guest is offset from dom0 one even in TSC_MODE_NEVER_EMULATE?
>

I've since found out that VMCS seems to provide tsc offsetting capabilities; adding something like

    for_each_vcpu ( d, v )
    {
        if (v->vcpu_id == 0)
            hvm_set_guest_tsc(v, 0);
    }

inside hvm_s3_resume (xen/arch/x86/hvm/hvm.c) fixed this for me; any comment on such a solution?

>>> On 20.11.13 at 12:41, Tomasz Wroblewski <tomasz.wroblewski@citrix.com> wrote:
> I've since found out that VMCS seems to provide tsc offsetting capabilities;
> adding something like
>
>     for_each_vcpu ( d, v )
>     {
>         if (v->vcpu_id == 0)
>             hvm_set_guest_tsc(v, 0);
>     }
>
> inside hvm_s3_resume (xen/arch/x86/hvm/hvm.c) fixed this for me; any comment
> on such a solution?

This sounds plausible, but I'd prefer it to be done alongside the other
state resetting done for S3 (which all happen in hvm_s3_suspend()).
Unless that doesn't work, of course.

Jan

On 11/20/2013 12:51 PM, Jan Beulich wrote:
>>>> On 20.11.13 at 12:41, Tomasz Wroblewski <tomasz.wroblewski@citrix.com> wrote:
>> I've since found out that VMCS seems to provide tsc offsetting capabilities;
>> adding something like
>>
>>     for_each_vcpu ( d, v )
>>     {
>>         if (v->vcpu_id == 0)
>>             hvm_set_guest_tsc(v, 0);
>>     }
>>
>> inside hvm_s3_resume (xen/arch/x86/hvm/hvm.c) fixed this for me; any comment
>> on such a solution?
>
> This sounds plausible, but I'd prefer it to be done alongside the other
> state resetting done for S3 (which all happen in hvm_s3_suspend()).
> Unless that doesn't work, of course.
>

All right, thanks! I'll try it out and post a patch; I think it should work. I've only reset the tsc on vcpu 0 since I noticed the same is done in hvm_vcpu_initialize(). Is that enough?

> Jan
>

> This sounds plausible, but I'd prefer it to be done alongside the other
> state resetting done for S3 (which all happen in hvm_s3_suspend()).
> Unless that doesn't work, of course.
>

Looks like it has to be done after resume, since set_guest_tsc reads the current timestamp to determine the necessary offset; for the logic to be correct, it needs to do that after the tsc has already been reset to 0 on the host.

> Jan
>

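The ordering argument above can be illustrated with a small standalone sketch in C. This is only a model of the reasoning, not the Xen implementation: `tsc_offset_for` and `read_guest_tsc` are hypothetical helpers, and the assumption is that the offset is computed as the desired guest value minus the host TSC read at the time of the call.

```c
#include <stdint.h>

/* Hypothetical stand-in for the offset calculation (assumption: the
 * offset is the desired guest value minus the host TSC read right now). */
static inline uint64_t tsc_offset_for(uint64_t desired_guest_tsc,
                                      uint64_t host_tsc_now)
{
    return desired_guest_tsc - host_tsc_now;
}

/* What the guest reads for a given host TSC and offset (wrapping). */
static inline uint64_t read_guest_tsc(uint64_t host_tsc, uint64_t offset)
{
    return host_tsc + offset;
}
```

In this model, an offset computed before suspend (host TSC at, say, 5000) still leaves the guest reading a huge wrapped value once the host counter has restarted near 0, whereas an offset computed after resume makes the guest counter start from 0 and count up normally.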
>>> On 20.11.13 at 12:55, Tomasz Wroblewski <tomasz.wroblewski@citrix.com> wrote:
> On 11/20/2013 12:51 PM, Jan Beulich wrote:
>>>>> On 20.11.13 at 12:41, Tomasz Wroblewski <tomasz.wroblewski@citrix.com> wrote:
>>> I've since found out that VMCS seems to provide tsc offsetting capabilities;
>>> adding something like
>>>
>>>     for_each_vcpu ( d, v )
>>>     {
>>>         if (v->vcpu_id == 0)
>>>             hvm_set_guest_tsc(v, 0);
>>>     }
>>>
>>> inside hvm_s3_resume (xen/arch/x86/hvm/hvm.c) fixed this for me; any comment
>>> on such a solution?
>>
>> This sounds plausible, but I'd prefer it to be done alongside the other
>> state resetting done for S3 (which all happen in hvm_s3_suspend()).
>> Unless that doesn't work, of course.
>>
> All right, thanks! I'll try it out and post a patch, I think it should work.
> I've only reset the tsc on vcpu 0 since I've noticed same is
> done in hvm_vcpu_initialize(), is that enough?

Honestly I don't immediately see why it's being done there for vCPU 0
only, and I don't think that'd be sufficient for the resume case.

Jan