similar to: _mm_lfence in both pathes of an if/else are hoisted by SimplfyCFG potentially breaking use as a speculation barrier


2020 Aug 09
2
_mm_lfence in both pathes of an if/else are hoisted by SimplfyCFG potentially breaking use as a speculation barrier
Hi Craig, The review for the similar GPU problem is now up here: https://reviews.llvm.org/D85603 (+ some other patches on the Phabricator stack). From a pragmatic perspective, the constraints added to program transforms there are sufficient for what you need. You'd produce IR such as: %token = call token @llvm.experimental.convergence.anchor() br i1 %c, label %then, label %else
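For readers who have not seen the pattern named in the thread title, here is a minimal C sketch of it. It is only an illustration: `guarded_read` and `SPEC_BARRIER` are hypothetical names that do not come from the thread, and the non-SSE2 fallback exists only so the sketch builds elsewhere. Both arms of the branch start with the same fence call, which is the kind of common code a hoisting transform may merge into a single call above the branch, where it would no longer sit after the bounds check.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>              /* declares _mm_lfence() */
#define SPEC_BARRIER() _mm_lfence()
#else
/* Assumption: placeholder so the sketch still builds on targets without SSE2. */
#define SPEC_BARRIER() ((void)0)
#endif

/* Hypothetical helper: bounds-checked table read with a fence on each path. */
int guarded_read(const int *table, size_t len, size_t idx)
{
    if (idx < len) {
        SPEC_BARRIER();   /* intended to sit after the bounds check */
        return table[idx];
    } else {
        SPEC_BARRIER();   /* identical call on the other path */
        return -1;
    }
}
```

Whether a given compiler version actually hoists the two calls depends on its optimization pipeline; inspecting the generated assembly is the only reliable check.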
2016 Jan 12
3
[PATCH 3/4] x86,asm: Re-work smp_store_mb()
On Mon, Nov 02, 2015 at 04:06:46PM -0800, Linus Torvalds wrote: > On Mon, Nov 2, 2015 at 12:15 PM, Davidlohr Bueso <dave at stgolabs.net> wrote: > > > > So I ran some experiments on an IvyBridge (2.8GHz) and the cost of XCHG is > > constantly cheaper (by at least half the latency) than MFENCE. While there > > was a decent amount of variation, this difference
2017 Feb 14
2
[PATCH v2 0/3] x86/vdso: Add Hyper-V TSC page clocksource support
Thomas Gleixner <tglx at linutronix.de> writes: > On Tue, 14 Feb 2017, Vitaly Kuznetsov wrote: > >> Hi, >> >> while we're still waiting for a definitive ACK from Microsoft that the >> algorithm is good for SMP case (as we can't prevent the code in vdso from >> migrating between CPUs) I'd like to send v2 with some modifications to keep
2018 Mar 23
5
RFC: Speculative Load Hardening (a Spectre variant #1 mitigation)
Hello all, I've been working for the last month or so on a comprehensive mitigation approach to variant #1 of Spectre. There are a bunch of reasons why this is desirable: - Critical software that is unlikely to be easily hand-mitigated (or where the performance tradeoff isn't worth it) will have a compelling option. - It gives us a baseline on performance for hand-mitigation. - Combined
2018 Feb 03
0
retpoline mitigation and 6.0
On Sat, 2018-02-03 at 00:23 +0000, Chandler Carruth wrote: > > Two aspects to this... > > One, we're somewhat reluctant to guarantee an ABI here. At least I > am. While we don't *expect* rampant divergence here, I don't want > this to become something we cannot change if there are good reasons > to do so. We've already changed the thunks once based on
2017 Oct 27
1
[PATCH v6] x86: use lock+addl for smp_mb()
mfence appears to be way slower than a locked instruction - let's use lock+add unconditionally, as we always did on old 32-bit. Results: perf stat -r 10 -- ./virtio_ring_0_9 --sleep --host-affinity 0 --guest-affinity 0 Before: 0.922565990 seconds time elapsed ( +- 1.15% ) After: 0.578667024 seconds time elapsed
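The two barrier flavours being compared can be sketched in user-space C roughly as follows. This is a hedged illustration, not the kernel's code: the function names are hypothetical, the `-4(%rsp)` operand is an assumption about which stack slot gets touched, and the non-x86-64 branch falls back to the GCC/Clang builtin `__atomic_thread_fence` so the sketch builds on other targets. Adding zero leaves the touched bytes unchanged, so the locked instruction acts purely as an ordering point.

```c
#if defined(__x86_64__)
/* Full barrier using MFENCE. */
static inline void mb_mfence(void)
{
    __asm__ __volatile__("mfence" ::: "memory");
}

/* Full barrier using a locked add of zero to a stack slot.
 * Assumption: the -4(%rsp) offset is chosen here for illustration. */
static inline void mb_lock_addl(void)
{
    __asm__ __volatile__("lock; addl $0,-4(%%rsp)" ::: "memory", "cc");
}
#else
/* Assumption: portable stand-ins for non-x86-64 targets. */
static inline void mb_mfence(void)    { __atomic_thread_fence(__ATOMIC_SEQ_CST); }
static inline void mb_lock_addl(void) { __atomic_thread_fence(__ATOMIC_SEQ_CST); }
#endif

/* Hypothetical usage: store a value, issue the barrier, then read it back. */
int store_then_fence(int *slot, int value)
{
    *slot = value;
    mb_lock_addl();
    return *slot;
}
```

Timing the two variants in a tight loop on the target machine is the way to confirm the relative cost for a given CPU; the figures quoted in the message apply to that author's setup.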
2016 Jan 12
1
[PATCH 3/4] x86,asm: Re-work smp_store_mb()
On Tue, Jan 12, 2016 at 09:20:06AM -0800, Linus Torvalds wrote: > On Tue, Jan 12, 2016 at 5:57 AM, Michael S. Tsirkin <mst at redhat.com> wrote: > > #ifdef xchgrz > > /* same as xchg but poking at gcc red zone */ > > #define barrier() do { int ret; asm volatile ("xchgl %0, -4(%%" SP ");": "=r"(ret) :: "memory", "cc"); }
2007 Oct 16
1
LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)
Nick Piggin <npiggin@suse.de> wrote: > > Also, for non-wb memory. I don't think the Intel document referenced > says anything about this, but the AMD document says that loads can pass > loads (page 8, rule b). > > This is why our rmb() is still an lfence. BTW, Xen (in particular, the code in drivers/xen) uses mb/rmb/wmb instead of smp_mb/smp_rmb/smp_wmb when it
2018 Feb 03
4
retpoline mitigation and 6.0
On Fri, Feb 2, 2018 at 4:03 PM David Woodhouse <dwmw2 at infradead.org> wrote: > On Thu, 2018-02-01 at 10:10 +0100, Hans Wennborg via llvm-dev wrote: > > > > I saw the retpoline mitigation landed in r323155. Are we ready to > > merge this to 6.0, or are there any open issues that we're waiting > > for? Also, were there any followups I should know about? Also,
2008 Oct 17
0
[LLVMdev] MFENCE encoding
Hmm. mfence and lfence needs special handling. I'll take a look. Evan On Oct 16, 2008, at 10:46 PM, Mon Ping Wang wrote: > Hi, > > I have a problem with creating a MFENCE on X86 with SSE > > In X86InstrSSE.td, a MFENCE is > def MFENCE : I<0xAE, MRM6m, (outs), (ins), > "mfence", [(int_x86_sse2_mfence)]>, TB, Requires< > [HasSSE2]>;
2020 Aug 09
2
[RFC] Introducing convergence control bundles and intrinsics
Hi all, please see https://reviews.llvm.org/D85603 and its related changes for our most recent and hopefully final attempt at putting the `convergent` attribute on a solid theoretical foundation in a way that is useful for modern GPU compiler use cases. We have clear line of sight to enabling a new control flow implementation in the AMDGPU backend which is built on this foundation. I have
2008 Oct 17
1
[LLVMdev] MFENCE encoding
I've fixed this (untested though). http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20081013/068611.html Evan On Oct 17, 2008, at 9:51 AM, Evan Cheng wrote: > Hmm. mfence and lfence needs special handling. I'll take a look. > > Evan > > On Oct 16, 2008, at 10:46 PM, Mon Ping Wang wrote: > >> Hi, >> >> I have a problem with creating a MFENCE
2020 Mar 10
2
[RFC] Speculative Execution Side Effect Suppression for Mitigating Load Value Injection
Hi everyone, Some Intel processors have a newly disclosed vulnerability named Load Value Injection. One pager on Load Value Injection: https://software.intel.com/security-software-guidance/software-guidance/load-value-injection Deep dive on Load Value Injection: https://software.intel.com/security-software-guidance/insights/deep-dive-load-value-injection I wrote this compiler pass that can
2020 Aug 17
2
[RFC] Introducing convergence control bundles and intrinsics
Hi Hal, On Mon, Aug 17, 2020 at 2:13 AM Hal Finkel <hfinkel at anl.gov> wrote: > Thanks for sending this. What do you think that we should do with the > existing convergent attribute? My preference, which is implicitly expressed in the review, is to use `convergent` both for the new and the old thing. They are implicitly distinguished via the "convergencectrl" operand
2020 Aug 17
2
[RFC] Introducing convergence control bundles and intrinsics
On Mon, Aug 17, 2020 at 7:14 PM Hal Finkel <hfinkel at anl.gov> wrote: > On 8/17/20 11:51 AM, Nicolai Hähnle wrote: > > Hi Hal, > > > > On Mon, Aug 17, 2020 at 2:13 AM Hal Finkel <hfinkel at anl.gov> wrote: > >> Thanks for sending this. What do you think that we should do with the > >> existing convergent attribute? > > My preference, which