similar to: [LLVMdev] x86 backend assembly - mov esp->reg

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] x86 backend assembly - mov esp->reg"

2011 Nov 24
0
[LLVMdev] x86 backend assembly - mov esp->reg
On Thu, Nov 24, 2011 at 11:39:32AM -0700, Nowicki, Tyler wrote: > When compiled for atom with clang in 32-bit mode the 8-bit variables > in test use 32-bit registers: That's fine since it can avoid partial stales and the value of the padding is undefined. > However, the 8-bit variables in PartialRegisterOperationsTestChar use > 8-bit registers: Same argument. It wants to use the
2019 Sep 10
2
tablegen exponential behavior
Hi, I implemented a pattern matching of the dot product for arm64 and it seemed to work well for the basic case, i.e., class mulB<SDPatternOperator ldop> : PatFrag<(ops node:$Rn, node:$Rm, node:$offset), (mul (ldop (add node:$Rn, node:$offset)), (ldop (add node:$Rm, node:$offset)))>; class mulBz<SDPatternOperator ldop> : PatFrag<(ops node:$Rn,
2013 Apr 04
1
[LLVMdev] Packed instructions generaetd by LoopVectorize?
Thanks, that did it! Are there any plans to enable the loop vectorizer by default? From: Nadav Rotem [mailto:nrotem at apple.com] Sent: Wednesday, April 03, 2013 13:33 PM To: Nowicki, Tyler Cc: LLVM Developers Mailing List Subject: Re: Packed instructions generaetd by LoopVectorize? Hi Tyler, Try adding -ffast-math. We can only vectorize reduction variables if it is safe to reorder floating
2013 Feb 21
2
[LLVMdev] Generate scalar SSE instructions instead of packed instructions
On Thu, Feb 21, 2013 at 12:14 PM, Nadav Rotem <nrotem at apple.com> wrote: > You can change the input LLVM-IR. > > On Feb 21, 2013, at 7:16 AM, "Nowicki, Tyler" <tyler.nowicki at intel.com> > wrote: > > Hi,**** > > ** ** > > I am interested in evaluating the performance of packed vs scalar > double-precision floating point instructions on
2012 Jun 27
2
[LLVMdev] 8-bit DIV IR irregularities
Hi, I noticed that when dividing with signed 8-bit values the IR uses a 32-bit signed divide, however, when unsigned 8-bit values are used the IR uses an 8-bit unsigned divide. Why not use a 8-bit signed divide when using 8-bit signed values? Here is the C code and IR: char idiv8(char a, char b) { char c = a / b; return c; } define signext i8 @idiv8(i8 signext %a, i8 signext %b) nounwind
2012 Sep 28
2
[LLVMdev] [PROPOSAL] Improve uses of LEA on Atom
Hi, Here is an update on our proposal to improve the uses of LEA on Atom processors. 1. Disable current generation of LEAs Due to a 3 cycle stall between the ALU and the AGU any address generation done using math instruction will cause a stall on loads and stores which are within 3 cycles of the address generation. Consequently, the heuristics for using LEAs efficiently must know how many
2013 Apr 03
2
[LLVMdev] Packed instructions generaetd by LoopVectorize?
Hi, I have a question about LoopVectorize. I wrote a simple test case, a dot product loop and found that packed instructions are generated when input arrays are integer, but not when they are float or double. If I modify the float example in http://llvm.org/docs/Vectorizers.html by adding restrict to the input arrays packed instructions are generated. Although it should not be required I tried
2013 Apr 03
0
[LLVMdev] Packed instructions generaetd by LoopVectorize?
Hi Tyler, Try adding -ffast-math. We can only vectorize reduction variables if it is safe to reorder floating point operations. Thanks, Nadav On Apr 3, 2013, at 10:29 AM, "Nowicki, Tyler" <tyler.nowicki at intel.com> wrote: > Hi, > > I have a question about LoopVectorize. I wrote a simple test case, a dot product loop and found that packed instructions are
2012 Jun 28
2
[LLVMdev] 8-bit DIV IR irregularities
I understand, but this sounds like legalization. Does every architecture trigger an overflow exception, as opposed to setting a bit? Perhaps it makes more sense to do this in the backends that trigger an overflow exception? I'm working on a modification for DIV right now in the x86 backend for Intel Atom that will improve performance, however because the *actual* operation has been replaced
2013 Sep 30
0
[LLVMdev] [PROPOSAL] Improve uses of LEA on Atom
Was there any development on this? I noticed that clang still produces a lea for the testcase in llvm.org/pr13320. On 28 September 2012 11:36, Nowicki, Tyler <tyler.nowicki at intel.com> wrote: > Hi, > > > > Here is an update on our proposal to improve the uses of LEA on Atom > processors. > > > > 1. Disable current generation of LEAs > > > > Due to
2012 Jun 27
0
[LLVMdev] 8-bit DIV IR irregularities
On Wed, Jun 27, 2012 at 4:02 PM, Nowicki, Tyler <tyler.nowicki at intel.com> wrote: > Hi, > > > > I noticed that when dividing with signed 8-bit values the IR uses a 32-bit > signed divide, however, when unsigned 8-bit values are used the IR uses an > 8-bit unsigned divide. Why not use a 8-bit signed divide when using 8-bit > signed values? "sdiv i8 -128,
2013 Feb 21
2
[LLVMdev] Generate scalar SSE instructions instead of packed instructions
Hi, I am interested in evaluating the performance of packed vs scalar double-precision floating point instructions on x86-atom and I was wondering if anyone knows more precisely where to modify llvm to use one or the other. I know I probably need to change something in the type legalizer. Could anyone provide more details than that? Thanks, Tyler -------------- next part -------------- An HTML
2013 Feb 26
0
[LLVMdev] Generate scalar SSE instructions instead of packed instructions
Thanks for the reply, they were very helpful. Is it enough to prevent BBVectorize from packing together double precision instructions? If a non-clang frontend is used, such as ISPC, is it possible that the IR may contain packed double instruction? Tyler From: Cameron McInally [mailto:cameron.mcinally at nyu.edu] Sent: Thursday, February 21, 2013 6:39 PM To: Nowicki, Tyler Cc: Nadav Rotem; LLVM
2013 Feb 21
0
[LLVMdev] Generate scalar SSE instructions instead of packed instructions
You can change the input LLVM-IR. On Feb 21, 2013, at 7:16 AM, "Nowicki, Tyler" <tyler.nowicki at intel.com> wrote: > Hi, > > I am interested in evaluating the performance of packed vs scalar double-precision floating point instructions on x86-atom and I was wondering if anyone knows more precisely where to modify llvm to use one or the other. I know I probably need
2012 Jun 28
0
[LLVMdev] 8-bit DIV IR irregularities
On Wed, Jun 27, 2012 at 5:22 PM, Nowicki, Tyler <tyler.nowicki at intel.com> wrote: > I understand, but this sounds like legalization. Does every architecture trigger an overflow exception, as opposed to setting a bit? Perhaps it makes more sense to do this in the backends that trigger an overflow exception? The IR instruction has undefined behavior on overflow. This has nothing to do
2013 Apr 15
1
[LLVMdev] State of Loop Unrolling and Vectorization in LLVM
Hi , I have a test case (and a micro benchmark made out of the test case) to check if loop unrolling and loop vectorization is efficiently done on LLVM. Here is the test case (credits: Tyler Nowicki) {code} extern float * array; extern int array_size; float g() { int i; float total = 0; for(i = 0; i < array_size; i++) { total += array[i]; } return total; } {code} When
2012 Jun 18
2
[LLVMdev] Best way to replace LLVM IR operation with code containing control flow?
Hi, -Does anyone know where a backend-specific optimization can be added to replace an instruction with code containing control flow? I'm interested in adding an optimization for the DIV instruction (x86-atom) which replace the IDIV/DIV with code containing control flow to select between the intended IDIV/DIV and an 8-bit DIV with movzx, as described in the Intel Atom Optimization Guide. My
2016 May 17
2
Working on FP SCEV Analysis
> On May 16, 2016, at 5:35 PM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > ----- Original Message ----- >> From: "Sanjoy Das via llvm-dev" <llvm-dev at lists.llvm.org> >> To: escha at apple.com >> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Michael V Zolotukhin" <michael.v.zolotukhin at
2012 Jun 19
0
[LLVMdev] Best way to replace LLVM IR operation with code containing control flow?
Hi Tyler, > -Does anyone know where a backend-specific optimization can be added to replace > an instruction with code containing control flow? I think the backend lowering of atomic intrinsics generates control flow (loops), so that may give you a clue. Ciao, Duncan.
2009 Mar 30
0
[PATCH 1/1] Add Diagnostic MBR for trouble-shooting
--- mbr/Makefile | 6 +- mbr/mbr-diag.S | 371 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 376 insertions(+), 1 deletions(-) create mode 100644 mbr/mbr-diag.S diff --git a/mbr/Makefile b/mbr/Makefile index 0bdf7e3..b9d743d 100644 --- a/mbr/Makefile +++ b/mbr/Makefile @@ -17,7 +17,7 @@ topdir = .. include $(topdir)/MCONFIG.embedded -all: mbr.bin gptmbr.bin