thr3ads.net - similar to: "[LLVMdev] Best way to replace LLVM IR operation with code containing control flow?"

Displaying 20 results from an estimated 4000 matches similar to: "[LLVMdev] Best way to replace LLVM IR operation with code containing control flow?"

[LLVMdev] Best way to replace LLVM IR operation with code containing control flow?

2012 Jun 19

[LLVMdev] Best way to replace LLVM IR operation with code containing control flow?

Hi Tyler, > -Does anyone know where a backend-specific optimization can be added to replace > an instruction with code containing control flow? I think the backend lowering of atomic intrinsics generates control flow (loops), so that may give you a clue. Ciao, Duncan.

[LLVMdev] Packed instructions generaetd by LoopVectorize?

2013 Apr 04

[LLVMdev] Packed instructions generaetd by LoopVectorize?

Thanks, that did it! Are there any plans to enable the loop vectorizer by default? From: Nadav Rotem [mailto:nrotem at apple.com] Sent: Wednesday, April 03, 2013 13:33 PM To: Nowicki, Tyler Cc: LLVM Developers Mailing List Subject: Re: Packed instructions generaetd by LoopVectorize? Hi Tyler, Try adding -ffast-math. We can only vectorize reduction variables if it is safe to reorder floating

[LLVMdev] Generate scalar SSE instructions instead of packed instructions

2013 Feb 21

[LLVMdev] Generate scalar SSE instructions instead of packed instructions

On Thu, Feb 21, 2013 at 12:14 PM, Nadav Rotem <nrotem at apple.com> wrote: > You can change the input LLVM-IR. > > On Feb 21, 2013, at 7:16 AM, "Nowicki, Tyler" <tyler.nowicki at intel.com> > wrote: > > Hi,**** > > ** ** > > I am interested in evaluating the performance of packed vs scalar > double-precision floating point instructions on

[LLVMdev] Packed instructions generaetd by LoopVectorize?

2013 Apr 03

[LLVMdev] Packed instructions generaetd by LoopVectorize?

Hi Tyler, Try adding -ffast-math. We can only vectorize reduction variables if it is safe to reorder floating point operations. Thanks, Nadav On Apr 3, 2013, at 10:29 AM, "Nowicki, Tyler" <tyler.nowicki at intel.com> wrote: > Hi, > > I have a question about LoopVectorize. I wrote a simple test case, a dot product loop and found that packed instructions are

[LLVMdev] 8-bit DIV IR irregularities

2012 Jun 28

[LLVMdev] 8-bit DIV IR irregularities

I understand, but this sounds like legalization. Does every architecture trigger an overflow exception, as opposed to setting a bit? Perhaps it makes more sense to do this in the backends that trigger an overflow exception? I'm working on a modification for DIV right now in the x86 backend for Intel Atom that will improve performance, however because the *actual* operation has been replaced

[LLVMdev] 8-bit DIV IR irregularities

2012 Jun 27

[LLVMdev] 8-bit DIV IR irregularities

Hi, I noticed that when dividing with signed 8-bit values the IR uses a 32-bit signed divide, however, when unsigned 8-bit values are used the IR uses an 8-bit unsigned divide. Why not use a 8-bit signed divide when using 8-bit signed values? Here is the C code and IR: char idiv8(char a, char b) { char c = a / b; return c; } define signext i8 @idiv8(i8 signext %a, i8 signext %b) nounwind

[LLVMdev] 8-bit DIV IR irregularities

2012 Jun 27

[LLVMdev] 8-bit DIV IR irregularities

On Wed, Jun 27, 2012 at 4:02 PM, Nowicki, Tyler <tyler.nowicki at intel.com> wrote: > Hi, > > > > I noticed that when dividing with signed 8-bit values the IR uses a 32-bit > signed divide, however, when unsigned 8-bit values are used the IR uses an > 8-bit unsigned divide. Why not use a 8-bit signed divide when using 8-bit > signed values? "sdiv i8 -128,

[LLVMdev] [PROPOSAL] Improve uses of LEA on Atom

2013 Sep 30

[LLVMdev] [PROPOSAL] Improve uses of LEA on Atom

Was there any development on this? I noticed that clang still produces a lea for the testcase in llvm.org/pr13320. On 28 September 2012 11:36, Nowicki, Tyler <tyler.nowicki at intel.com> wrote: > Hi, > > > > Here is an update on our proposal to improve the uses of LEA on Atom > processors. > > > > 1. Disable current generation of LEAs > > > > Due to

[LLVMdev] [PROPOSAL] Improve uses of LEA on Atom

2012 Sep 28

[LLVMdev] [PROPOSAL] Improve uses of LEA on Atom

Hi, Here is an update on our proposal to improve the uses of LEA on Atom processors. 1. Disable current generation of LEAs Due to a 3 cycle stall between the ALU and the AGU any address generation done using math instruction will cause a stall on loads and stores which are within 3 cycles of the address generation. Consequently, the heuristics for using LEAs efficiently must know how many

[LLVMdev] Generate scalar SSE instructions instead of packed instructions

2013 Feb 26

[LLVMdev] Generate scalar SSE instructions instead of packed instructions

Thanks for the reply, they were very helpful. Is it enough to prevent BBVectorize from packing together double precision instructions? If a non-clang frontend is used, such as ISPC, is it possible that the IR may contain packed double instruction? Tyler From: Cameron McInally [mailto:cameron.mcinally at nyu.edu] Sent: Thursday, February 21, 2013 6:39 PM To: Nowicki, Tyler Cc: Nadav Rotem; LLVM

[LLVMdev] Generate scalar SSE instructions instead of packed instructions

2013 Feb 21

[LLVMdev] Generate scalar SSE instructions instead of packed instructions

You can change the input LLVM-IR. On Feb 21, 2013, at 7:16 AM, "Nowicki, Tyler" <tyler.nowicki at intel.com> wrote: > Hi, > > I am interested in evaluating the performance of packed vs scalar double-precision floating point instructions on x86-atom and I was wondering if anyone knows more precisely where to modify llvm to use one or the other. I know I probably need

[LLVMdev] Packed instructions generaetd by LoopVectorize?

2013 Apr 03

[LLVMdev] Packed instructions generaetd by LoopVectorize?

Hi, I have a question about LoopVectorize. I wrote a simple test case, a dot product loop and found that packed instructions are generated when input arrays are integer, but not when they are float or double. If I modify the float example in http://llvm.org/docs/Vectorizers.html by adding restrict to the input arrays packed instructions are generated. Although it should not be required I tried

[LLVMdev] 8-bit DIV IR irregularities

2012 Jun 28

[LLVMdev] 8-bit DIV IR irregularities

On Wed, Jun 27, 2012 at 5:22 PM, Nowicki, Tyler <tyler.nowicki at intel.com> wrote: > I understand, but this sounds like legalization. Does every architecture trigger an overflow exception, as opposed to setting a bit? Perhaps it makes more sense to do this in the backends that trigger an overflow exception? The IR instruction has undefined behavior on overflow. This has nothing to do

[LLVMdev] x86 backend assembly - mov esp->reg

2011 Nov 24

[LLVMdev] x86 backend assembly - mov esp->reg

Hi, I've noticed an inconsistency with the x86 backend assembly output in how it treats arguments of a function. Here is a simple test to illustrate the inconsistency: <from test.c> void test() { char ac, bc, cc, dc, fc; ac = (char)Rand(); bc = (char)Rand(); cc = (char)Rand(); dc = (char)Rand(); fc = PartialRegisterOperationsTestChar(ac, bc, cc, dc); } <from

[LLVMdev] Long-Term ISel Design

2011 Mar 16

[LLVMdev] Long-Term ISel Design

All, As I've done more integrating of AVX work upstream and more tuning here, I've run across several things which are clunky in the current isel design. A couple examples I can remember offhand: 1. We have special target-specific operators for certain shuffles in X86, such as X86unpckl. I don't completely understand why but Bruno indicated it was to address inefficiecies.

[LLVMdev] Generate scalar SSE instructions instead of packed instructions

2013 Feb 21

[LLVMdev] Generate scalar SSE instructions instead of packed instructions

Hi, I am interested in evaluating the performance of packed vs scalar double-precision floating point instructions on x86-atom and I was wondering if anyone knows more precisely where to modify llvm to use one or the other. I know I probably need to change something in the type legalizer. Could anyone provide more details than that? Thanks, Tyler -------------- next part -------------- An HTML

[LLVMdev] x86 backend assembly - mov esp->reg

2011 Nov 24

[LLVMdev] x86 backend assembly - mov esp->reg

On Thu, Nov 24, 2011 at 11:39:32AM -0700, Nowicki, Tyler wrote: > When compiled for atom with clang in 32-bit mode the 8-bit variables > in test use 32-bit registers: That's fine since it can avoid partial stales and the value of the padding is undefined. > However, the 8-bit variables in PartialRegisterOperationsTestChar use > 8-bit registers: Same argument. It wants to use the

[LLVMdev] cygwin build broken (X86ISelDAGToDAG.cpp: ‘LOCK_OR8mi’ is not a member of ‘llvm::X86’)

2011 May 14

[LLVMdev] cygwin build broken (X86ISelDAGToDAG.cpp: ‘LOCK_OR8mi’ is not a member of ‘llvm::X86’)

Just a heads up that the llvm build appears to be broken on cygwin. I haven't investigated, but here's the failures: llvm[3]: Compiling X86ISelDAGToDAG.cpp for Release+Asserts build /home/Eric/boost/consulting/svn/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:1487: error: ‘LOCK_OR8mi’ is not a member of ‘llvm::X86’ /home/Eric/boost/consulting/svn/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:1488:

[LLVMdev] Long-Term ISel Design

2011 Mar 17

[LLVMdev] Long-Term ISel Design

On Mar 16, 2011, at 1:44 PM, David Greene wrote: > All, > > As I've done more integrating of AVX work upstream and more tuning here, > I've run across several things which are clunky in the current isel > design. A couple examples I can remember offhand: > > 1. We have special target-specific operators for certain shuffles in X86, > such as X86unpckl. I

[LLVMdev] cygwin build broken (X86ISelDAGToDAG.cpp: ‘LOCK_OR8mi’ is not a member of ‘llvm::X86’)

2011 May 16

[LLVMdev] cygwin build broken (X86ISelDAGToDAG.cpp: ‘LOCK_OR8mi’ is not a member of ‘llvm::X86’)

On May 14, 2011, at 3:08 AM, Eric Niebler wrote: > Just a heads up that the llvm build appears to be broken on cygwin. I > haven't investigated, but here's the failures: > > llvm[3]: Compiling X86ISelDAGToDAG.cpp for Release+Asserts build > /home/Eric/boost/consulting/svn/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:1487: > error: ‘LOCK_OR8mi’ is not a member of ‘llvm::X86’

similar to: [LLVMdev] Best way to replace LLVM IR operation with code containing control flow?