thr3ads.net - similar to: "[LLVMdev] Is alloca instruction allowed within the cycle?"

Displaying 20 results from an estimated 1200 matches similar to: "[LLVMdev] Is alloca instruction allowed within the cycle?"

[LLVMdev] Folding vector instructions

2008 Dec 30

Hello. Sorry, I am not sure whether this question should go to the llvm or the mesa3d-dev mailing list, so I am posting it to both. I am writing an LLVM backend for a modern graphics processor which has an ISA very similar to that of Direct3D. I am reading the code in Gallium-3D driver in a mesa3d branch, which converts the shader programs (TGSI tokens) to LLVM IR. For the shader instruction also found in LLVM IR,

[LLVMdev] Proposal: Loads/stores with deterministic trap/unwind behavior

2014 Apr 01

Hi, I wanted to propose an IR extension that would allow us to support zero-cost exception handling for non-call operations that may trap. I wanted to start with loads and stores through a null pointer, and later we might extend this to div/rem/mod zero. This feature is obviously useful for implementing languages such as Java and Go which deterministically translate such operations into

[LLVMdev] Why clang inlines with -O3 flag and opt doesn't?

2010 Sep 03

When I compile my C Fibonacci example fib.c with 'clang -O3 -c -emit-llvm -o fib-clang.bc fib.c && llvm-dis fib-clang.bc', I get a fib-clang.ll that has some degree of inlining in it. But when I take fib.ll, a file equivalent to fib.c, and run it through opt with the command 'llvm-as fib.ll && opt -O3 fib.bc -o fib-opt.bc && llvm-dis fib-opt.bc', the resulting

[LLVMdev] Proposal: Loads/stores with deterministic trap/unwind behavior

2014 Apr 07

On Sat, Apr 05, 2014 at 12:21:17AM -0700, Andrew Trick wrote: > > On Mar 31, 2014, at 6:58 PM, Peter Collingbourne <peter at pcc.me.uk> wrote: > > > Hi, > > > > I wanted to propose an IR extension that would allow us to support zero-cost > > exception handling for non-call operations that may trap. I wanted to start > > with loads and stores through

[LLVMdev] Possible miscompilation?

2008 Jun 12

Gordon Henriksen wrote: > On 2008-06-11, at 13:16, Gary Benson wrote: > > Duncan Sands wrote: > > > Can you please attach IR which can be compiled to an executable > > > (and shows the problem). > > > > I've been generating functions using a builder and then compiling > > them with ExecutionEngine::getPointerToFunction(). Is there some > >

[LLVMdev] Possible miscompilation?

2008 Jun 11

Hi all, I'm trying to figure out a weird bug I'm seeing. I'm hoping it's something simple in my IR but I can't see anything wrong so I'm hoping someone here can see something. I'm using LLVM to compile Java bytecode into native functions. My code keeps track of the Java local variables in an array of llvm::Value pointers which get phi'd up at various points. The

[LLVMdev] Possible miscompilation?

2008 Jun 11

On 2008-06-11, at 13:16, Gary Benson wrote:
> Duncan Sands wrote:
>
>> Can you please attach IR which can be compiled to an executable
>> (and shows the problem).
>
> I've been generating functions using a builder and then compiling
> them with ExecutionEngine::getPointerToFunction(). Is there some way
> I can get compilable IR from that?

[LLVMdev] Possible miscompilation?

2008 Jun 11

Duncan Sands wrote:
> Can you please attach IR which can be compiled
> to an executable (and shows the problem).

I've been generating functions using a builder and then compiling them with ExecutionEngine::getPointerToFunction(). Is there some way I can get compilable IR from that?

Cheers,
Gary

--
http://gbenson.net/

[LLVMdev] [Mesa3d-dev] Folding vector instructions

2008 Dec 30

Alex wrote: > Hello. > > Sorry I am not sure this question should go to llvm or mesa3d-dev mailing > list, so I post it to both. > > I am writing a llvm backend for a modern graphics processor which has a ISA > very similar to that of Direct 3D. > > I am reading the code in Gallium-3D driver in a mesa3d branch, which > converts the shader programs (TGSI tokens) to

[LLVMdev] Proposal: add intrinsics for safe division

2014 Apr 26

On Apr 25, 2014, at 2:21 PM, Eric Christopher <echristo at gmail.com> wrote: >> In short, I agree with your observations that these intrinsics are not an >> obvious slam-dunk compared to making the explicit control flow, but I think >> that the intrinsics do give enough flexibility on the LLVM side that it >> would be great if front-ends used them rather than rolling

optimisation issue in an llvm IR pass

2019 Jul 03

Hello, I have an optimisation issue in an llvm IR pass - the issue being that unnecessary instructions are generated in the final assembly (with -O3). I want to create the following assembly snippet:

  mov dl,BYTE PTR [rsi+rdi*1]
  add dl,0x1
  adc dl,0x0
  mov BYTE PTR [rsi+rdi*1],dl

however what is created is (variant #1):

  mov dl,BYTE PTR [rsi+rdx*1]
  add dl,0x1
  cmp

optimisation issue in an llvm IR pass

2019 Jul 03

Hi Craig,

On 03.07.19 17:33, Craig Topper wrote:
> Don't the CreateICmp calls return a Value* with an i1 type? But then
> they are added to an i8 type? Not sure that works.

I had that initially:

  auto cf = IRB.CreateICmpULT(Incr, ConstantInt::get(Int8Ty, 1));
  auto carry = IRB.CreateZExt(cf, Int8Ty);
  Incr = IRB.CreateAdd(Incr, carry);

it makes no difference to the generated assembly

[LLVMdev] ScalarEvolution: Suboptimal handling of globals

2014 Nov 28

Hi, For the program below, where "incr" and "Arr" are globals

=================================
int incr;
float Arr[1000];
int foo () {
  float x = 0;
  int newInc = incr+1;
  for (int i = 0; i < 1000; i++) {
    for (int j = 0; j < 1000; j += incr) {
      x += (Arr[i] + Arr[j]);
    }
  }
  return x;
}
=================================

The SCEV expression computed

[LLVMdev] Proposal: add intrinsics for safe division

2014 Apr 25

On April 25, 2014 at 1:44:37 PM, Reid Kleckner (rnk at google.com) wrote: Thanks for the writeup! It's very helpful. On Fri, Apr 25, 2014 at 11:49 AM, Filip Pizlo <fpizlo at apple.com> wrote: On April 25, 2014 at 10:48:18 AM, Reid Kleckner (rnk at google.com) wrote: On Fri, Apr 25, 2014 at 10:19 AM, Filip Pizlo <fpizlo at apple.com> wrote: The sdiv operation in LLVM IR only

qnbinom with small size is slow

2020 Aug 21

Hi Martin, thanks for verifying. I agree that the Cornish-Fisher seems to struggle with the small size parameters, but I also don't have a good idea how to replace it. But I think fixing do_search() is possible: I think the problem is that when searching to the left y is decremented only if `pnbinom(y - incr, n, pr, /*l._t.*/TRUE, /*log_p*/FALSE)) < p` is FALSE. I think the solution is

[PATCH envytools] nvamemtiming: Handle target < initial case when iterating values

2014 Aug 31

Otherwise some values are not tested at all. --- nva/set_timings.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/nva/set_timings.c b/nva/set_timings.c index 6cd831c..7a8f845 100644 --- a/nva/set_timings.c +++ b/nva/set_timings.c @@ -408,7 +408,7 @@ static void iterate_values(struct nvamemtiming_conf *conf, FILE *outf, uint8_t index, enum color color) { uint8_t

[LLVMdev] Strange error in generated assembly

2009 Aug 30

I've spent the better part of a day trying to figure out why my generated assembly code isn't correct. Here is the IR code for my loop:

loopbody2:                ; preds = %test1
  %i14 = load i32* %i5    ; <i32> [#uses=1]
  %get15 = call %tart.core.String*

Multiple --compare-dest args again

2004 Jun 22

Hi all. A while ago (April 15th or so) I posted a patch that allows rsync to take multiple --compare-dest or --link-dest arguments, allowing fetching of files not present in multiple trees. I never got any feedback on it, though, so I'm picking it up again. :) Is there any interest in such a patch at all? Below is the usage example I outlined back then; --start-- [...] Its primary usage is

[PATCH v4 09/10] dma-buf-map: Add memcpy and pointer-increment interfaces

2020 Oct 15

To do framebuffer updates, one needs memcpy from system memory and a pointer-increment function. Add both interfaces with documentation. Signed-off-by: Thomas Zimmermann <tzimmermann at suse.de> --- include/linux/dma-buf-map.h | 72 +++++++++++++++++++++++++++++++------ 1 file changed, 62 insertions(+), 10 deletions(-) diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h

[LLVMdev] Proposal: add intrinsics for safe division

2014 Apr 26

I am very much in favor of having a div instruction with well defined div-by-zero and overflow behavior. The undefined behavior on certain values for LLVM intrinsics has been a major pain point for us in Julia, because adding the extra branches just kills performance and we know that there is an X86 instruction that just does what we want. Anyway, this was brought up briefly above, but want to
