thr3ads.net - search: "intrnoduplicate"

Displaying 6 results from an estimated 6 matches for "intrnoduplicate".

2014 Sep 30

[LLVMdev] Behaviour of NVPTX intrinsic

Is there any guarantee that the NVPTX intrinsic "llvm.nvvm.barrier0" will not be moved around by opt? In other words, can I expect all the instructions above "llvm.nvvm.barrier0" to remain above it, and those below it to remain below, after all the opt passes are run? If that is not the case, is there a way to define such an intrinsic? Thanks.

2014 Sep 30

[LLVMdev] Behaviour of NVPTX intrinsic

>> ..."llvm.nvvm.barrier0" to remain above it and those below it to remain below,
>> after all the opt passes are run ?
>>
>
> AFAIU, yes. Here's the definition:
>
> def int_nvvm_barrier0 : GCCBuiltin<"__nvvm_bar0">,
>       Intrinsic<[], [], [IntrNoDuplicate]>;
>
> Note that IntrNoDuplicate is the only intrinsic attribute. It has no other
> attributes (like IntrNoMem) that would make it permissible for LLVM
> optimizations to reorder things around it. By default, the optimizers would
> not do this for function calls; only if these fun...

2017 Jan 27

Preserving Call to Intrinsic function

Hello everyone, Consider the following code: int foo() { int a, b; a = __builtin_XX(0x11); b = __builtin_XX(0x11); return a + b; } The problem is that LLVM eliminates the second call and copies the result of the first call into a new set of registers. Is there a way to force LLVM to generate two explicit calls to a builtin function? The builtin takes in an integer
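As a rough C-level illustration of the question above (a sketch, not the thread's answer): a compiler may merge two identical calls only when it can prove the callee has no observable side effects, which for an LLVM intrinsic corresponds to attributes such as IntrNoMem. `fake_builtin` and `call_count` below are hypothetical stand-ins for `__builtin_XX`. Because the function updates global state, both calls have to take effect.

```c
#include <stdio.h>

/* Hypothetical global state; it makes each call observable. */
static int call_count = 0;

/* Hypothetical stand-in for __builtin_XX. It is deliberately NOT marked
 * __attribute__((const)) or __attribute__((pure)), so the compiler must
 * assume it has side effects and cannot fold two calls into one. */
static int fake_builtin(int x)
{
    call_count++;      /* observable side effect */
    return x + 1;
}

int foo(void)
{
    int a, b;
    a = fake_builtin(0x11);   /* first call */
    b = fake_builtin(0x11);   /* second call, same argument, still executed */
    return a + b;
}
```

If the callee were instead declared side-effect free, the optimizer would be entitled to reuse the first result, which matches the behaviour described in the post.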

2020 May 04

LV: predication

> The harm comes if the intrinsic ends up with the wrong value, or attached to the wrong loop. The intrinsic is marked as IntrNoDuplicate, so I wasn't worried about it ending up somewhere else. Also, it is a property of a specific loop, a tail-folded vector loop, that holds even after it is transformed, I think. I.e., unrolling a vector loop is probably not what you want, but even if you do, the element count would remain the same....

2019 May 20

[RFC] Intrinsics for Hardware Loops

Hi, Arm have recently announced the v8.1-M architecture specification for our next generation microcontrollers. The architecture includes vector extensions (MVE) and support for low-overhead branches (LoB), which can be thought of as a style of hardware loop. Hardware loops aren't new to LLVM; other backends (at least Hexagon and PPC that I know of) also include support. These implementations

2020 May 01

LV: predication

Hi Eli, > The problem with your proposal, as written, is that the vectorizer is producing the intrinsic. Because we don’t impose any ordering on optimizations before codegen, every optimization pass in LLVM would have to be taught to preserve any @llvm.set.loop.elements.i32 whenever it makes any change. This is completely impractical because the intrinsic isn’t related to anything
