search for: rdtsc_ordered

Displaying 20 results from an estimated 23 matches for "rdtsc_ordered".

2018 Sep 18
3
[patch 09/11] x86/vdso: Simplify the invalid vclock case
...l.
> > d8bb6f4c1670 ("x86: tsc prevent time going backwards")
>
> I still have one of the machines which is affected by this.

Are we sure this isn't a load vs rdtsc reorder? Because if I look at the
current code:

notrace static u64 vread_tsc(void)
{
	u64 ret = (u64)rdtsc_ordered();
	u64 last = gtod->cycle_last;

	if (likely(ret >= last))
		return ret;

	/*
	 * GCC likes to generate cmov here, but this branch is extremely
	 * predictable (it's just a function of time and the likely is
	 * very likely) and there's a data dependence, so force GCC
	 * to generate...
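The snippet cuts vread_tsc() off mid-comment. For context, a reconstruction of the full function as it read in the x86 vDSO of that era (a sketch; the comment wording is paraphrased):

notrace static u64 vread_tsc(void)
{
	u64 ret = (u64)rdtsc_ordered();
	u64 last = gtod->cycle_last;

	if (likely(ret >= last))
		return ret;

	/*
	 * GCC likes to generate cmov here, but this branch is extremely
	 * predictable and there is a data dependence, so force GCC to
	 * generate a branch instead. The empty asm breaks the cmov
	 * pattern without emitting any instructions.
	 */
	asm volatile ("");
	return last;
}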
2017 Mar 03
1
[PATCH v3 2/3] x86/hyperv: move TSC reading method to asm/mshyperv.h
...available, probably should be unlikely.

> +	/*
> +	 * Make sure we read sequence before we read other values from
> +	 * TSC page.
> +	 */
> +	smp_rmb();
> +
> +	scale = READ_ONCE(tsc_pg->tsc_scale);
> +	offset = READ_ONCE(tsc_pg->tsc_offset);
> +	cur_tsc = rdtsc_ordered();

Since you already have smp_ barriers and rdtsc_ordered is a barrier, the
compiler barriers (READ_ONCE()) shouldn't be necessary.

> +
> +	current_tick = mul_u64_u64_shr(cur_tsc, scale, 64) + offset;
> +
> +	/*
> +	 * Make sure we read sequence after we read all other value...
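Following that suggestion would yield roughly the loop body below; this is a hypothetical variant for illustration, not the code that was merged (which kept READ_ONCE()). On x86, smp_rmb() is at least a compiler barrier, so the plain loads cannot move across it, and rdtsc_ordered() is itself an ordering point:

	while (1) {
		sequence = tsc_pg->tsc_sequence;
		if (!sequence)
			break;			/* TSC page disabled */
		smp_rmb();			/* seq before data */

		scale = tsc_pg->tsc_scale;
		offset = tsc_pg->tsc_offset;
		cur_tsc = rdtsc_ordered();	/* ordered TSC read */

		current_tick = mul_u64_u64_shr(cur_tsc, scale, 64) + offset;

		smp_rmb();			/* data before re-check */
		if (tsc_pg->tsc_sequence == sequence)
			return current_tick;
	}
	return U64_MAX;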
2018 Sep 14
0
[patch 10/11] x86/vdso: Move cycle_last handling into the caller
...(void)
 {
 	const struct pvclock_vcpu_time_info *pvti = &get_pvti0()->pvti;
-	u64 ret;
-	u64 last;
 	u32 version;
+	u64 ret;

 	/*
 	 * Note: The kernel and hypervisor must guarantee that cpu ID
@@ -111,13 +110,7 @@ static notrace u64 vread_pvclock(void)
 		ret = __pvclock_read_cycles(pvti, rdtsc_ordered());
 	} while (pvclock_read_retry(pvti, version));

-	/* refer to vread_tsc() comment for rationale */
-	last = gtod->cycle_last;
-
-	if (likely(ret >= last))
-		return ret;
-
-	return last;
+	return ret;
 }
 #endif
 #ifdef CONFIG_HYPERV_TSCPAGE
@@ -130,30 +123,10 @@ static notrace u64 vread...
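The removed clamp reappears exactly once, in the common caller. A simplified sketch of the caller-side handling (names approximate the series, e.g. vgetcyc(); not an exact quote) shows why per-reader cycle_last checks become redundant:

	cycles = vgetcyc(gtod->vclock_mode);
	ns = base->nsec;
	last = gtod->cycle_last;
	if (unlikely((s64)cycles < 0))		/* U64_MAX: invalid vclock */
		return vdso_fallback_gettime(clk, ts);
	if (cycles > last)			/* TSC behind last => delta 0 */
		ns += (cycles - last) * gtod->mult;
	ns >>= gtod->shift;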
2017 Feb 14
2
[PATCH v2 0/3] x86/vdso: Add Hyper-V TSC page clocksource support
Thomas Gleixner <tglx at linutronix.de> writes:
> On Tue, 14 Feb 2017, Vitaly Kuznetsov wrote:
>
>> Hi,
>>
>> while we're still waiting for a definitive ACK from Microsoft that the
>> algorithm is good for SMP case (as we can't prevent the code in vdso from
>> migrating between CPUs) I'd like to send v2 with some modifications to keep
2018 Sep 18
0
[patch 09/11] x86/vdso: Simplify the invalid vclock case
...1:57PM +0200, Thomas Gleixner wrote:
> > I still have one of the machines which is affected by this.
>
> Are we sure this isn't a load vs rdtsc reorder? Because if I look at the
> current code:

The load order of last vs. rdtsc does not matter at all.

CPU0					CPU1

....
now0 = rdtsc_ordered();
...
tk->cycle_last = now0;

gtod->seq++;
gtod->cycle_last = tk->cycle_last;
...
gtod->seq++;
					seq_begin(gtod->seq);
					now1 = rdtsc_ordered();

So if the TSC on CPU1 is slightly behind the TSC on CPU0 then now1 can be
smaller than cycle_last. The TSC sync stuff does not c...
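To make the skew concrete, here is a toy user-space illustration (assumed numbers, not kernel code) of a reader on a slightly-behind CPU1 observing a value below cycle_last:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* Assumption: CPU1's TSC lags CPU0's by 20 cycles, small enough
	 * that the boot-time TSC sync check tolerates it. */
	uint64_t skew = 20;
	uint64_t now0 = 1000000;		/* CPU0: rdtsc_ordered() */
	uint64_t cycle_last = now0;		/* timekeeper update on CPU0 */
	uint64_t now1 = now0 + 5 - skew;	/* CPU1 reads 5 cycles later */

	/* now1 < cycle_last: without clamping, now1 - cycle_last
	 * underflows and time appears to jump backwards. */
	printf("cycle_last=%llu now1=%llu delta=%lld\n",
	       (unsigned long long)cycle_last,
	       (unsigned long long)now1,
	       (long long)(now1 - cycle_last));
	return 0;
}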
2017 Mar 03
4
[PATCH v3 0/3] x86/vdso: Add Hyper-V TSC page clocksource support
Hi,

merge window is about to close so I hope it's OK to make another try here.

Changes since v2:
- Add explicit READ_ONCE() to not rely on 'volatile' [Andy Lutomirski]
- rdtsc() -> rdtsc_ordered() [Andy Lutomirski]
- virt_rmb() -> smp_rmb() [Thomas Gleixner, Andy Lutomirski]

Thomas, Andy, it seems the only blocker for the series was the ambiguity with
TSC page read algorithm. I contacted Microsoft (through K. Y.) and asked what
we should do when we see 'seq=0'. The answer is:...
2017 Feb 14
0
[PATCH v2 0/3] x86/vdso: Add Hyper-V TSC page clocksource support
...> (PATCH 2)? As members of struct ms_hyperv_tsc_page are volatile we don't
> need READ_ONCE(), compilers are not allowed to merge accesses. The
> resulting code looks good to me:

No, on multiple counts, unfortunately.

1. LFENCE is basically useless except for IO and for (Intel only)
rdtsc_ordered(). AFAIK there is literally no scenario under which LFENCE is
useful for access to normal memory.

2. The problem here has little to do with barriers. You're doing:

read seq;
read var1;
read var2;
read tsc;
read seq again;

If the hypervisor updates things between reading var1 and var2 or be...
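The consistency therefore comes from the sequence re-check, not from barriers alone. A minimal sketch of that retry pattern (all names invented for illustration; this is not the Hyper-V code):

static u64 read_consistent(const struct tsc_page_like *pg)
{
	u32 seq;
	u64 var1, var2, tsc;

	do {
		seq = READ_ONCE(pg->seq);
		smp_rmb();			/* seq before data */
		var1 = READ_ONCE(pg->var1);
		var2 = READ_ONCE(pg->var2);
		tsc  = rdtsc_ordered();
		smp_rmb();			/* data before re-check */
		/* if the publisher bumped seq meanwhile, retry */
	} while (READ_ONCE(pg->seq) != seq);

	return mul_u64_u64_shr(tsc, var1, 64) + var2;
}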
2017 Feb 15
2
[PATCH v2 0/3] x86/vdso: Add Hyper-V TSC page clocksource support
...struct ms_hyperv_tsc_page are volatile we don't
>> need READ_ONCE(), compilers are not allowed to merge accesses. The
>> resulting code looks good to me:
>
> No, on multiple counts, unfortunately.
>
> 1. LFENCE is basically useless except for IO and for (Intel only)
> rdtsc_ordered(). AFAIK there is literally no scenario under which
> LFENCE is useful for access to normal memory.
>

Interesting, (For some reason I was under the impression that when I do

READ var1 -> reg1
READ var2 -> reg2

from normal memory reads can actually happen in any order and LFENCE in...
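The distinction that resolves this is compiler reordering versus CPU reordering. A hypothetical illustration (struct and function names invented for the example):

struct shared { u32 var1, var2; };	/* invented for illustration */

u64 load_plain(struct shared *p)
{
	/* Plain loads: the compiler may reorder, merge or re-issue
	 * these accesses. */
	u32 a = p->var1;
	u32 b = p->var2;
	return ((u64)a << 32) | b;
}

u64 load_once(struct shared *p)
{
	/* READ_ONCE(): exactly one load per access, and the compiler
	 * cannot swap them. On x86 the CPU does not reorder loads from
	 * normal write-back memory anyway, so an LFENCE between them
	 * would add nothing. */
	u32 a = READ_ONCE(p->var1);
	u32 b = READ_ONCE(p->var2);
	return ((u64)a << 32) | b;
}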
2018 Sep 17
11
[patch V2 00/11] x86/vdso: Cleanups, simplifications and CLOCK_TAI support
Matt attempted to add CLOCK_TAI support to the VDSO clock_gettime() implementation, which extended the clockid switch case and added yet another slightly different copy of the same code. Especially the extended switch case is problematic as the compiler tends to generate a jump table, which then requires the use of retpolines. If jump tables are disabled it adds yet another conditional to the existing
2018 Sep 19
0
[patch 09/11] x86/vdso: Simplify the invalid vclock case
...; (cycles-last)* ...". That should just be a "sub ; js ; ". It's an extra
> load of ->cycle_last, but only on the path where we're heading for the
> fallback anyway. The value of 1 can be adjusted so that in the "js"
> path, we could detect and accept an rdtsc_ordered() call that's just a
> few 10s of cycles behind last and treat that as 0 and continue back on
> the normal path. But maybe it's hard to get gcc to generate the expected
> code.

I played around with a lot of variants and GCC generates all kinds of
interesting ASM. And at some point...
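The variant under discussion would read roughly as below; a hypothetical sketch, since, as the thread notes, GCC does not reliably emit the hoped-for "sub; js" sequence:

notrace static u64 vread_tsc(void)
{
	u64 last = gtod->cycle_last;
	s64 delta = rdtsc_ordered() - last;	/* the "sub" */

	/* The "js" path: a TSC a few cycles behind last is treated as
	 * a zero delta instead of taking the fallback. */
	if (unlikely(delta < 0))
		delta = 0;

	return last + delta;
}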
2017 Mar 03
0
[PATCH v3 2/3] x86/hyperv: move TSC reading method to asm/mshyperv.h
...(1) {
+		sequence = READ_ONCE(tsc_pg->tsc_sequence);
+		if (!sequence)
+			break;
+		/*
+		 * Make sure we read sequence before we read other values from
+		 * TSC page.
+		 */
+		smp_rmb();
+
+		scale = READ_ONCE(tsc_pg->tsc_scale);
+		offset = READ_ONCE(tsc_pg->tsc_offset);
+		cur_tsc = rdtsc_ordered();
+
+		current_tick = mul_u64_u64_shr(cur_tsc, scale, 64) + offset;
+
+		/*
+		 * Make sure we read sequence after we read all other values
+		 * from TSC page.
+		 */
+		smp_rmb();
+
+		if (READ_ONCE(tsc_pg->tsc_sequence) == sequence)
+			return current_tick;
+	}
+
+	return U64_MAX;
+}
+
 #els...
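For completeness, a hedged sketch of how a clocksource read function might consume this helper, falling back to the reference counter MSR when the TSC page is disabled; the details are assumptions, not a quote from the patch:

static u64 read_hv_clock_tsc(struct clocksource *cs)
{
	u64 current_tick = hv_read_tsc_page(tsc_pg);

	/* hv_read_tsc_page() returns U64_MAX when tsc_sequence is 0,
	 * i.e. the TSC page is invalid. */
	if (current_tick == U64_MAX)
		rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick);

	return current_tick;
}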
2018 Sep 18
2
[patch 09/11] x86/vdso: Simplify the invalid vclock case
On Tue, 18 Sep 2018, Thomas Gleixner wrote:
> On Tue, 18 Sep 2018, Peter Zijlstra wrote:
> > > Your memory serves you right. That's indeed observable on CPUs which
> > > lack TSC_ADJUST.
> >
> > But, if the gtod code can observe this, then why doesn't the code that
> > checks the sync?
>
> Because it depends where the involved CPUs are in the
2018 Sep 14
0
[patch 09/11] x86/vdso: Simplify the invalid vclock case
..._pvclock(int *mo
 	do {
 		version = pvclock_read_begin(pvti);
-		if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
-			*mode = VCLOCK_NONE;
-			return 0;
-		}
+		if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT)))
+			return U64_MAX;

 		ret = __pvclock_read_cycles(pvti, rdtsc_ordered());
 	} while (pvclock_read_retry(pvti, version));
@@ -148,17 +121,12 @@ static notrace u64 vread_pvclock(int *mo
 }
 #endif
 #ifdef CONFIG_HYPERV_TSCPAGE
-static notrace u64 vread_hvclock(int *mode)
+static notrace u64 vread_hvclock(void)
 {
 	const struct ms_hyperv_tsc_page *tsc_pg = (const st...
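The convention change is compact because U64_MAX is a value no valid cycle count takes. A condensed, hypothetical before/after of the hvclock reader (simplified from the hunk above):

/* Old: failure reported through an out-parameter. */
static notrace u64 vread_hvclock_old(int *mode)
{
	u64 cycles = hv_read_tsc_page(tsc_pg);

	if (cycles == U64_MAX) {
		*mode = VCLOCK_NONE;
		return 0;
	}
	return cycles;
}

/* New: U64_MAX itself is the error value; callers test the sign bit
 * ((s64)cycles < 0), so no out-parameter is needed. */
static notrace u64 vread_hvclock(void)
{
	return hv_read_tsc_page(tsc_pg);
}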
2018 Sep 18
3
[patch 09/11] x86/vdso: Simplify the invalid vclock case
> On Sep 18, 2018, at 12:52 AM, Thomas Gleixner <tglx at linutronix.de> wrote:
>
>> On Mon, 17 Sep 2018, John Stultz wrote:
>>> On Mon, Sep 17, 2018 at 12:25 PM, Andy Lutomirski <luto at kernel.org> wrote:
>>> Also, I'm not entirely convinced that this "last" thing is needed at
>>> all. John, what's the scenario under which we