thr3ads.net - similar to: "Heads-up: Handling target-specific intrinsics in InstCombine"

Displaying 20 results from an estimated 10000 matches similar to: "Heads-up: Handling target-specific intrinsics in InstCombine"

Eliminate some two entry PHI nodes - SimplifyCFG

2020 Feb 05

Conditional on the target supporting cmov? Though that's probably not optimal.

On Wed, Feb 5, 2020, 7:47 AM Nicolai Hähnle <nhaehnle at gmail.com> wrote:
> Hi Ryan,
>
> On Mon, Feb 3, 2020 at 7:08 PM Ryan Taylor via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > SimplifyCFG FoldTwoEntryPhiNode looks to simplify all 2 entry phi nodes in a block, if it

Question on fast-math optimizations

2018 Nov 22

On 11/21/18 12:41 PM, Nicolai Hähnle wrote:
> On 20.11.18 16:38, Stephen Canon via llvm-dev wrote:
>> Distribution doesn’t seem to be used by many transforms at present. My vague recollection is that the fast math flags didn’t do a great job of characterizing when it would be allowed, and using it aggressively broke a lot of code in practice (code which was

[InstCombine] addrspacecast assumed associative with gep

2019 Jun 17

> What do you mean exactly by "behave differently on the other side of the cast"? Do you have a concrete example?

I was hesitant to say only in that it is probably an "abuse of mechanics" and definitely playing with fire, _however_ the target I'm working on has extensive bit operations for a subset of memory, including atomic test-and-set, etc. It's convenient to be

Eliminate some two entry PHI nodes - SimplifyCFG

2020 Feb 03

SimplifyCFG FoldTwoEntryPhiNode looks to simplify all 2 entry phi nodes in a block; if it can't do them all, then it won't do any and returns. There is a lot of code directly in this function geared toward this requirement. Is it currently possible to get this function (or pass) to simply fold "some" of the phis, without having to fold them all? I understand that

RFC: (Co-)Convergent functions and uniform function parameters

2016 Oct 24

> On Oct 24, 2016, at 4:15 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>
> On 25.10.2016 01:11, Nicolai Hähnle wrote:
>> On 24.10.2016 21:54, Mehdi Amini wrote:
>>>> On Oct 24, 2016, at 12:38 PM, Nicolai Hähnle via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>> Some brain-storming on an issue with SPMD/SIMT backend

[RFC] IRBuilder polymorphism: Templates/virtual

2020 Feb 05

Hi,

The IRBuilder is currently templated over a constant folder, and an instruction inserter. https://reviews.llvm.org/D73835 proposes to move this towards using virtual dispatch instead. As this is a larger design change, I would like to get some feedback on this.

The current templated design of IRBuilder has a couple of problems:

1. It's not possible to share code between use-sites that

RFC: (Co-)Convergent functions and uniform function parameters

2016 Oct 26

On 25.10.2016 16:28, Nicolai Hähnle wrote:
> But I fear that this path leads to eternal fuzziness. Let me try a completely different approach to define what we need by augmenting the semantics of IR with "divergence tokens". In addition to its usual value, every IR value carries a "divergence set" of divergence tokens.
>
> The basic rule is: the

atomic ops are optimized with incorrect semantics.

2020 Feb 10

Hi All,

In the "https://gcc.godbolt.org/z/yBYTrd" case, the atomic operation is converted to a non-atomic one for x86, from

xchg dword ptr [100], eax

to

mov dword ptr [100], 1

The pass responsible for this transformation is InstCombine, i.e. InstCombiner::visitAtomicRMWInst, which converts IR like

%0 = atomicrmw xchg i32* inttoptr (i64 100 to i32*), i32 1 monotonic

to store

cmpxchg on floats

2020 Aug 14

We've relaxed `atomicrmw xchg` to support floating point types but not cmpxchg -- the cmpxchg comparison behavior is not a floating point comparison, so that would be potentially misleading. I'd say adding the assertion is a good idea.

Cheers,
Nicolai

On Thu, Aug 13, 2020 at 10:59 PM Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Does the code generator

[RFC] Introducing the maynotprogress IR attribute

2020 Sep 07

On 9/7/20 10:56 AM, Nicolai Hähnle wrote:
> Hi Johannes and Atmn,
>
> On Sat, Sep 5, 2020 at 7:07 AM Johannes Doerfert via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> > In any case, please explain the intended behavior of the attribute and the metadata upon inlining.
>>
>> The attribute will be attached to the caller upon

Why MachineBasicBlock doesn't have transferPredecessors()?

2020 Mar 13

For example, I want to insert a new machine basic block “before” a specific machine basic block, or split an MBB and keep the later one as the original one (to keep the label/Blackadder's correct t) (or to keep other properties of the MBB), so I need to transfer the original MBB's predecessors to the new MBB.

Nicolai Hähnle <nhaehnle at gmail.com> wrote on Fri, Mar 13, 2020, 23:57:
> On Fri, Mar 13, 2020 at

[RFC] Tablegen-erated GlobalISel Combine Rules

2018 Nov 09

Hi Daniel,

Disclaimer: Haven't read the proposal yet.

> TL;DR: We're planning to define GlobalISel Combine Rules using MIR syntax with a few bits glued on to interface with the algorithm and escape into C++ when we need to. Eventually, ISel rules may follow suit.

I would rather avoid adding a dependency on yet another tablegen backend to the project unless we are confident it is

llvm-config with shared libraries in cmake builds broken (since r257003?)

2016 Jan 07

Hi Andrew,

since today, I get:

$ llvm-config --link-shared --libs engine
llvm-config: error: libLLVM-3.8svn.so is missing

Looking at the log, this is most likely caused by your recent change. cmake shared library builds generate separate .so files analogous to the static library builds, e.g. libLLVMCodeGen.so (no version suffix, curiously enough). It would be nice if that wasn't broken

Removing pointers from MCInstrDesc for less relocations

2016 May 09

On 09.05.2016 05:19, Benjamin Kramer wrote:
> On Mon, May 9, 2016 at 5:35 AM, Nicolai Hähnle <llvm-dev at lists.llvm.org> wrote:
>> Hi everybody,
>>
>> I noticed today that my libLLVM-3.9svn.so has a ~1.7MB .data.rel.ro segment - i.e. data that needs to be touched by the dynamic linker even though it's ultimately read-only, and data that cannot be

[RFC] Expressing preserved-relations between passes from different modules (was: Re: Linker issue)

2019 Jun 06

Any comments at all on this? Chandler perhaps? I've since dug a bit further, and it seems like the template-based solution wouldn't work anyway because DLL loading on Windows can't do the required commoning. So the general approach taken in https://reviews.llvm.org/D62802 seems to be the only technically viable path forward, though it would still be good to get an outside look at the

Condition code in DAGCombiner::visitFADDForFMACombine?

2018 Aug 22

On 22.08.2018 13:29, Ryan Taylor wrote:
> The example starts as SPIR-V with the NoContraction decoration flag on the fmul.
>
> I think what you are saying seems valid in that if the user had put the flag on the fadd instead of the fmul it would not contract and so in this example the user needs to put the NoContraction on the fadd though I'm not sure

RFC: (Co-)Convergent functions and uniform function parameters

2016 Oct 24

On 24.10.2016 21:54, Mehdi Amini wrote:
>> On Oct 24, 2016, at 12:38 PM, Nicolai Hähnle via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> Some brain-storming on an issue with SPMD/SIMT backend support where I think some additional IR attributes would be useful. Sorry for the somewhat long mail; the short version of my current thinking is that I would like to have the following:

Question on fast-math optimizations

2018 Nov 30

On 30.11.18 11:49, Heiko Becker via llvm-dev wrote:
> --Resending my last mail, as it might have gotten lost --
>
> Thanks Nicolai and Steve for the initial replies.
>
> So if I understand correctly there are 2 places you can pinpoint at where distributivity is used:
>
> - simplification of infinity/NaN expressions
>
> - combination with FMA introduction

Well

Condition code in DAGCombiner::visitFADDForFMACombine?

2018 Aug 23

I don't think the global fast math flag should override the NoContraction decoration as that's mostly the point of that decoration to begin with, to have fine granular control while still having a broad sweeping optimization. Did I miss your point? I feel like I did.

On Thu, Aug 23, 2018, 3:42 PM Michael Berg <michael_c_berg at apple.com> wrote:
> Ryan,
>
> Given that the

Discourse category for the AMDGPU target

2020 Aug 04

On Mon, Aug 3, 2020 at 7:00 PM David Blaikie <dblaikie at gmail.com> wrote:
> I don't have much personal interest here - but my understanding was that there was/is a fair bit of pushback to fragmenting the communications channels to discord before there's a more general buy-in to switch over across the project? (perhaps I'm misremembering the previous