similar to: AtomicExpandPass and branch weighting

Displaying 20 results from an estimated 9000 matches similar to: "AtomicExpandPass and branch weighting"

2016 Dec 14
0
AtomicExpandPass and branch weighting
Seems reasonable. I'd note additionally that on some architectures, the success block *must* be the fallthrough case (that is to say: you must not have any taken branches between the load-linked and store-conditional) in order to have an architectural guarantee that two such loops on different CPUs won't livelock against each other. On Dec 12, 2016, at 12:30 PM, Kyle Butt via
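
For orientation, a minimal IR sketch of the shape under discussion (the function name and weight values below are invented for illustration, not taken from the thread): a compare-and-swap retry loop whose back edge carries !prof branch weights marking failure as unlikely, so that block placement keeps the success path as the fallthrough.

  define i32 @fetch_add_sketch(i32* %p, i32 %inc) {
  entry:
    %init = load atomic i32, i32* %p monotonic, align 4
    br label %retry

  retry:
    ; Retry loop: reload on failure, fall through to %done on success.
    %old = phi i32 [ %init, %entry ], [ %seen, %retry ]
    %sum = add i32 %old, %inc
    %pair = cmpxchg weak i32* %p, i32 %old, i32 %sum seq_cst monotonic
    %seen = extractvalue { i32, i1 } %pair, 0
    %ok = extractvalue { i32, i1 } %pair, 1
    ; Success is marked heavily likely; the retry edge is the cold one.
    br i1 %ok, label %done, label %retry, !prof !0

  done:
    ret i32 %old
  }

  !0 = !{!"branch_weights", i32 2000, i32 1}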
2016 Dec 14
1
AtomicExpandPass and branch weighting
On Wed, Dec 14, 2016 at 1:34 PM, James Knight <jyknight at google.com> wrote: > Seems reasonable. > > I'd note additionally that on some architectures, the success block > *must* be the fallthrough case (that is to say: you must not have any taken > branches between the load-linked and store-conditional) in order to have an > architectural guarantee that two such
2018 Jun 13
12
RFC: Atomic LL/SC loops in LLVM revisited
# RFC: Atomic LL/SC loops in LLVM revisited ## Summary This proposal gives a brief overview of the challenges of lowering to LL/SC loops and details the approach I am taking for RISC-V. Beyond getting feedback on that work, my intention is to find consensus on moving other backends towards a similar approach and sharing common code where feasible. Scroll down to 'Questions' for a summary
2017 Jan 10
2
[PATCHish] IfConversion; lost edges for some diamonds
On Tue, Jan 10, 2017 at 2:31 AM, Peter A Jonsson <pj at sics.se> wrote: > Hi Kyle, > > my apologies for mailing you directly but it seems new user creation is > disabled on the llvm bugzilla. > > We sometimes lose edges during IfConversion of diamonds and it’s not > obvious how to reproduce on an upstream target. The documentation for > HasFallThrough says *may*
2017 May 30
3
[atomics][AArch64] Possible bug in cmpxchg lowering
Currently the AtomicExpandPass will lower the following IR:

  define i1 @foo(i32* %obj, i32 %old, i32 %new) {
  entry:
    %v0 = cmpxchg weak volatile i32* %obj, i32 %old, i32 %new release acquire
    %v1 = extractvalue { i32, i1 } %v0, 1
    ret i1 %v1
  }

to the equivalent of the following on AArch64:

  ldxr w8, [x0]
  cmp w8, w1
  b.ne .LBB0_3
  // BB#1:
2016 Jan 27
7
Adding sanity to the Atomics implementation
Right now, the atomics implementation in clang is a bit of a mess. It has three basically completely distinct code-paths: There's the legacy __sync_* builtins, which clang lowers directly to atomic IR instructions. Then, the llvm atomic IR instructions themselves can sometimes emit libcalls to __sync_* library functions (which are basically undocumented, and users are often responsible for
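
As a rough sketch of the first code path the message describes (an assumed example, not code from the thread): a legacy __sync_* builtin maps onto an ordinary atomic IR instruction, which on targets without suitable inline atomics may in turn be emitted as a call to one of the undocumented __sync_* runtime functions.

  define i32 @sync_fetch_and_add_sketch(i32* %p) {
  entry:
    ; What clang emits for the legacy builtin: a plain atomicrmw.
    %old = atomicrmw add i32* %p, i32 1 seq_cst
    ret i32 %old
  }

  ; On targets without inline atomics of this width, the instruction above
  ; can instead become a libcall such as __sync_fetch_and_add_4; that name
  ; illustrates the pattern described in the message, it is not output
  ; copied from clang.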
2017 Sep 15
4
RFC: Trace-based layout.
I plan on rewriting the block placement algorithm to proceed by traces. A trace is a chain of blocks where each block in the chain may fall through to the successor in the chain. The overall algorithm would be to first produce traces for a function, and then order those traces to try and get cache locality. Currently block placement uses a greedy single step approach to layout. It produces
2015 Dec 11
7
RFC: Extending atomic loads and stores to floating point and vector types
Currently, we limit atomic loads and stores to either pointer or integer types. I would like to propose that we extend this to allow both floating point and vector types which meet the other requirements. (i.e. power-of-two multiple of 8 bits, and aligned) This will enable a couple of follow on changes: 1) Teaching the vectorizer how to vectorize unordered atomic loads and stores 2)
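
A minimal sketch, assuming the extension proposed above, of the kind of IR it would permit (the function name is invented): atomic loads and stores of a floating point type that already meets the existing size and alignment rules.

  define float @load_store_atomic_float_sketch(float* %p, float %v) {
  entry:
    ; Atomic operations on float, otherwise following the usual rules.
    %old = load atomic float, float* %p unordered, align 4
    store atomic float %v, float* %p unordered, align 4
    ret float %old
  }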
2016 Jan 28
0
Adding sanity to the Atomics implementation
I don't have much of substance to add, but I will say that I like the proposal (I too prefer Alternative A). When adding C11 atomic support for hexagon, I found it surprising that support for the __sync_* builtins was implemented completely differently from the C11 atomics. On 1/27/2016 1:55 PM, James Knight via llvm-dev wrote: > Right now, the atomics implementation in clang is a bit of a
2017 Sep 19
2
RFC: Trace-based layout.
On Mon, Sep 18, 2017 at 1:16 PM, Andrew Trick <atrick at apple.com> wrote: > > On Sep 14, 2017, at 6:53 PM, Kyle Butt via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > I plan on rewriting the block placement algorithm to proceed by traces. > > A trace is a chain of blocks where each block in the chain may fall > through to > the successor in the chain.
2015 Dec 11
2
RFC: Extending atomic loads and stores to floating point and vector types
On 12/11/2015 01:29 PM, James Y Knight wrote: > > On Fri, Dec 11, 2015 at 3:05 PM, Philip Reames via llvm-dev > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > >> One open question I don't know the answer to: Are there any >> special semantics required from floating point stores which >> aren't met
2017 Sep 19
0
RFC: Trace-based layout.
> On Sep 18, 2017, at 5:17 PM, Kyle Butt <iteratee at google.com> wrote: > > > > On Mon, Sep 18, 2017 at 1:16 PM, Andrew Trick <atrick at apple.com <mailto:atrick at apple.com>> wrote: > >> On Sep 14, 2017, at 6:53 PM, Kyle Butt via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> >> I plan on
2017 Sep 18
0
RFC: Trace-based layout.
> On Sep 14, 2017, at 6:53 PM, Kyle Butt via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > I plan on rewriting the block placement algorithm to proceed by traces. > > A trace is a chain of blocks where each block in the chain may fall through to > the successor in the chain. > > The overall algorithm would be to first produce traces for a function, and then >
2014 Aug 15
2
[LLVMdev] Plan to optimize atomics in LLVM
> From my reading of Atomics.rst, it would be sound to reorder (It does not > say much about load-linked, so I am treating it as a normal load here) > >> store seq_cst >> fence release >> load-linked monotonic > > into > >> load-linked monotonic >> store seq_cst >> fence release > Which would make an execution ending in %old_x = %old_y = 0
2016 Jan 31
2
Adding sanity to the Atomics implementation
----- Original Message ----- > From: "Ben Craig via llvm-dev" <llvm-dev at lists.llvm.org> > To: llvm-dev at lists.llvm.org > Sent: Thursday, January 28, 2016 9:41:06 AM > Subject: Re: [llvm-dev] Adding sanity to the Atomics implementation > > > I don't have much of substance to add, but I will say that I like the > proposal (I too prefer Alternative
2015 Dec 11
2
RFC: Extending atomic loads and stores to floating point and vector types
On 12/11/2015 12:05 AM, JF Bastien wrote: > On Fri, Dec 11, 2015 at 3:22 AM, Philip Reames via llvm-dev > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > Currently, we limit atomic loads and stores to either pointer or > integer types. I would like to propose that we extend this to > allow both floating point and vector types
2016 Jan 13
4
RFC: non-temporal fencing in LLVM IR
Hello, fencing enthusiasts! *TL;DR:* We'd like to propose an addition to the LLVM memory model requiring non-temporal accesses be surrounded by non-temporal load barriers and non-temporal store barriers, and we'd like to add such orderings to the fence IR opcode. We are open to different approaches, hence this email instead of a patch. *Who's "we"?* Philip Reames brought
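
For context, a sketch of how a non-temporal access is expressed in IR today, next to an ordinary fence (the function name is invented, and the proposed non-temporal fence orderings are not shown, since this thread is only proposing them):

  define void @nt_store_sketch(i32* %p, i32 %v) {
  entry:
    ; Non-temporal hint via metadata on a regular store, followed by a
    ; conventional fence; the RFC argues for dedicated fence orderings.
    store i32 %v, i32* %p, align 4, !nontemporal !0
    fence seq_cst
    ret void
  }

  !0 = !{i32 1}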
2013 Feb 18
0
[LLVMdev] splitting a branch within a pseudo
This is the old MIPS I code that sort of does what I need to do. This seems really involved to do such a simple thing. Maybe there are now helper classes for this or some better example I can look at. I suppose I can mimic this if people say this is just the correct way to do this in LLVM.

  static MachineBasicBlock* ExpandCondMov(MachineInstr *MI, MachineBasicBlock *BB,
2013 Feb 18
1
[LLVMdev] splitting a branch within a pseudo
Some stuff did not get pasted in properly.

  static MachineBasicBlock* ExpandCondMov(MachineInstr *MI, MachineBasicBlock *BB,
                                          DebugLoc dl,
                                          const MipsSubtarget *Subtarget,
                                          const TargetInstrInfo *TII,
                                          bool isFPCmp, unsigned Opc) {
    //
2016 Jan 14
2
RFC: non-temporal fencing in LLVM IR
Hi JF, Philip, Clang currently has __builtin_nontemporal_store and __builtin_nontemporal_load. How will the usage model for those change? Thanks again, Hal ----- Original Message ----- > From: "Philip Reames via llvm-dev" <llvm-dev at lists.llvm.org> > To: "JF Bastien" <jfb at google.com>, "llvm-dev" > <llvm-dev at lists.llvm.org> >