Marcelo Tosatti
2018-Oct-08 19:36 UTC
[patch 00/11] x86/vdso: Cleanups, simmplifications and CLOCK_TAI support
On Mon, Oct 08, 2018 at 10:38:22AM -0700, Andy Lutomirski wrote:
> On Mon, Oct 8, 2018 at 8:27 AM Marcelo Tosatti <mtosatti at redhat.com> wrote:
> >
> > On Sat, Oct 06, 2018 at 03:28:05PM -0700, Andy Lutomirski wrote:
> > > On Sat, Oct 6, 2018 at 1:29 PM Marcelo Tosatti <mtosatti at redhat.com> wrote:
> > > >
> > > > On Thu, Oct 04, 2018 at 03:15:32PM -0700, Andy Lutomirski wrote:
> > > > > For better or for worse, I'm trying to understand this code. So far,
> > > > > I've come up with this patch:
> > > > >
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/vdso-tglx&id=14fd71e12b1c4492a06f368f75041f263e6862bf
> > > > >
> > > > > Is it correct, or am I missing some subtlety?
> > > >
> > > > The master clock, when initialized, has a pair
> > > >
> > > > masterclockvalues=(TSC value, time-of-day data).
> > > >
> > > > When updating the guest clock, we only update relative to (TSC value)
> > > > that was read on masterclock initialization.
> > >
> > > I don't see the problem. The masterclock data is updated here:
> > >
> > >     host_tsc_clocksource = kvm_get_time_and_clockread(
> > >         &ka->master_kernel_ns,
> > >         &ka->master_cycle_now);
> > >
> > > kvm_get_time_and_clockread() gets those values from
> > > do_monotonic_boot(), which, barring bugs, should cause
> > > get_kvmclock_ns() to return exactly the same thing as
> > > ktime_get_boot_ns() + ka->kvmclock_offset, albeit in a rather
> > > roundabout manner.
> > >
> > > So what am I missing? Is there actually something wrong with my patch?
> >
> > For the bug mentioned in the comment not to happen, you must only read
> > TSC and add it as offset to (TSC value, time-of-day data).
> >
> > It's more than "a roundabout manner".
> >
> > Read the comment again.
> >
>
> I read the comment three more times and even dug through the git
> history. It seems like what you're saying is that, under certain
> conditions (which arguably would be bugs in the core Linux timing
> code),

I don't see that as a bug. It's just a side effect of reading two
different clocks (one is CLOCK_MONOTONIC and the other is TSC),
and using those two clocks as a "base + offset".

As the comment explains, if you do that, you can't guarantee monotonicity.

> actually calling ktime_get_boot_ns() could be non-monotonic
> with respect to the kvmclock timing. But get_kvmclock_ns() isn't used
> for VM timing as such -- it's used for the IOCTL interfaces for
> updating the time offset. So can you explain how my patch is
> incorrect?

ktime_get_boot_ns() has frequency correction applied, while
reading masterclock + TSC offset does not.

So the clock reads differ.
Andy Lutomirski
2018-Oct-09 20:09 UTC
[patch 00/11] x86/vdso: Cleanups, simmplifications and CLOCK_TAI support
On Tue, Oct 9, 2018 at 8:28 AM Marcelo Tosatti <mtosatti at redhat.com> wrote:
>
> On Mon, Oct 08, 2018 at 10:38:22AM -0700, Andy Lutomirski wrote:
> > On Mon, Oct 8, 2018 at 8:27 AM Marcelo Tosatti <mtosatti at redhat.com> wrote:
>
> > I read the comment three more times and even dug through the git
> > history. It seems like what you're saying is that, under certain
> > conditions (which arguably would be bugs in the core Linux timing
> > code),
>
> I don't see that as a bug. It's just a side effect of reading two
> different clocks (one is CLOCK_MONOTONIC and the other is TSC),
> and using those two clocks as a "base + offset".
>
> As the comment explains, if you do that, you can't guarantee monotonicity.
>
> > actually calling ktime_get_boot_ns() could be non-monotonic
> > with respect to the kvmclock timing. But get_kvmclock_ns() isn't used
> > for VM timing as such -- it's used for the IOCTL interfaces for
> > updating the time offset. So can you explain how my patch is
> > incorrect?
>
> ktime_get_boot_ns() has frequency correction applied, while
> reading masterclock + TSC offset does not.
>
> So the clock reads differ.
>

Ah, okay, I finally think I see what's going on. In the kvmclock data
exposed to the guest, tsc_shift and tsc_to_system_mul come from
tgt_tsc_khz, whereas master_kernel_ns and master_cycle_now come from
CLOCK_BOOTTIME. So the kvmclock and kernel clock drift apart at a
rate given by the frequency shift and then suddenly agree again every
time the pvclock data is updated.

Is there a reason to do it this way?
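
[Editor's sketch] The drift-and-resync behaviour described above can be illustrated with a toy model. This is not kernel code: the struct, the function names, and the rates (a nominal 1 ns per cycle for the guest, 999 ns per 1000 cycles for the frequency-corrected host clock) are all invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot taken at a masterclock update: (TSC value, time in ns). */
struct snapshot {
    uint64_t tsc;
    uint64_t ns;
};

/* Host clock with frequency correction applied.
 * Assumed rate: 999 ns per 1000 TSC cycles. */
static uint64_t host_corrected_ns(uint64_t tsc)
{
    return tsc * 999 / 1000;
}

/* Guest-style read: the snapshot's base time plus an uncorrected TSC
 * offset at the assumed nominal rate of 1 ns per cycle. */
static uint64_t guest_read_ns(struct snapshot s, uint64_t tsc_now)
{
    assert(tsc_now >= s.tsc);
    return s.ns + (tsc_now - s.tsc);
}

/* Take a fresh snapshot from the corrected host clock. */
static struct snapshot take_snapshot(uint64_t tsc_now)
{
    struct snapshot s = { tsc_now, host_corrected_ns(tsc_now) };
    return s;
}
```

In this model, a guest read at TSC 1000000 against a snapshot taken at TSC 0 yields 1000000 ns while the corrected host clock reads 999000 ns. A fresh snapshot taken 100 cycles later makes the two agree again, but the guest's next read is then lower than its previous one. That is the kind of non-monotonic step that mixing a corrected base with an uncorrected offset can produce.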
Marcelo Tosatti
2018-Oct-11 22:27 UTC
[patch 00/11] x86/vdso: Cleanups, simmplifications and CLOCK_TAI support
On Tue, Oct 09, 2018 at 01:09:42PM -0700, Andy Lutomirski wrote:
> On Tue, Oct 9, 2018 at 8:28 AM Marcelo Tosatti <mtosatti at redhat.com> wrote:
> >
> > On Mon, Oct 08, 2018 at 10:38:22AM -0700, Andy Lutomirski wrote:
> > > On Mon, Oct 8, 2018 at 8:27 AM Marcelo Tosatti <mtosatti at redhat.com> wrote:
> >
> > > I read the comment three more times and even dug through the git
> > > history. It seems like what you're saying is that, under certain
> > > conditions (which arguably would be bugs in the core Linux timing
> > > code),
> >
> > I don't see that as a bug. It's just a side effect of reading two
> > different clocks (one is CLOCK_MONOTONIC and the other is TSC),
> > and using those two clocks as a "base + offset".
> >
> > As the comment explains, if you do that, you can't guarantee monotonicity.
> >
> > > actually calling ktime_get_boot_ns() could be non-monotonic
> > > with respect to the kvmclock timing. But get_kvmclock_ns() isn't used
> > > for VM timing as such -- it's used for the IOCTL interfaces for
> > > updating the time offset. So can you explain how my patch is
> > > incorrect?
> >
> > ktime_get_boot_ns() has frequency correction applied, while
> > reading masterclock + TSC offset does not.
> >
> > So the clock reads differ.
> >
>
> Ah, okay, I finally think I see what's going on. In the kvmclock data
> exposed to the guest, tsc_shift and tsc_to_system_mul come from
> tgt_tsc_khz, whereas master_kernel_ns and master_cycle_now come from
> CLOCK_BOOTTIME. So the kvmclock and kernel clock drift apart at a
> rate given by the frequency shift and then suddenly agree again every
> time the pvclock data is updated.

Yes.

> Is there a reason to do it this way?

Since pvclock updates which update system_timestamp are expensive
(must stop all vcpus), they should be avoided.

So only the HW TSC counts, and it is used as an offset against the vcpu's
tsc_timestamp.
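
[Editor's sketch] To make "TSC as an offset against tsc_timestamp" concrete, here is a minimal user-space sketch of a pvclock-style read. The struct name `pvclock_sketch` and both function names are hypothetical. The field names mirror those mentioned in the thread (with `system_time` standing in for what the message calls system_timestamp), and the shift-then-multiply scaling is the commonly documented pvclock formula as best understood here. Treat it as an illustration, not as the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the per-vcpu time data discussed in the thread. */
struct pvclock_sketch {
    uint64_t tsc_timestamp;     /* TSC value when the data was last updated */
    uint64_t system_time;       /* time in ns at that same instant */
    uint32_t tsc_to_system_mul; /* 32.32 fixed-point multiplier */
    int8_t   tsc_shift;         /* pre-shift applied to the TSC delta */
};

/* Convert a TSC delta to nanoseconds: shift, then multiply by mul / 2^32.
 * The product is split into two halves so it fits in 64-bit arithmetic. */
static uint64_t scale_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
    assert(shift > -64 && shift < 64);
    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;

    uint64_t hi = delta >> 32;
    uint64_t lo = delta & 0xffffffffu;
    return hi * mul + ((lo * mul) >> 32);
}

/* Read the clock: base time plus the scaled TSC offset since the update. */
static uint64_t pvclock_read_ns(const struct pvclock_sketch *p, uint64_t tsc_now)
{
    return p->system_time +
           scale_delta(tsc_now - p->tsc_timestamp,
                       p->tsc_to_system_mul, p->tsc_shift);
}
```

Between updates, only `tsc_now` changes, so a read costs one TSC sample and a multiply. Refreshing `system_time` and `tsc_timestamp` is the expensive step that the message says should be avoided.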