similar to: Where's the optimiser gone? (part 5.c): missed tail calls, and more...


2018 Dec 01
2
Where's the optimiser gone? (part 5.b): missed tail calls, and more...
Compile the following functions with "-O3 -target i386" (see <https://godbolt.org/z/VmKlXL>): long long div(long long foo, long long bar) { return foo / bar; } On the left the generated code; on the right the expected, properly optimised code: div: # @div push ebp | mov ebp, esp | push dword ptr [ebp + 20] | push
2019 Mar 04
2
Where's the optimiser gone (part 11): use the proper instruction for sign extension
Compile with -O3 -m32 (see <https://godbolt.org/z/yCpBpM>): long lsign(long x) { return (x > 0) - (x < 0); } long long llsign(long long x) { return (x > 0) - (x < 0); } While the code generated for the "long" version of this function is quite OK, the code for the "long long" version misses an obvious optimisation: lsign: # @lsign mov
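For readers who want to try the two functions from this post, here is a minimal sketch in C. The expressions are the ones quoted above; the helper `sign_selfcheck` is a hypothetical addition, not part of the post, and only spot-checks that both variants return -1, 0 or +1.

```c
#include <assert.h>
#include <limits.h>

/* Branch-free sign, same expression as in the post: each comparison
   yields 0 or 1, so the difference is -1, 0 or +1. */
long lsign(long x) { return (x > 0) - (x < 0); }

long long llsign(long long x) { return (x > 0) - (x < 0); }

/* Hypothetical helper (not from the post): spot-checks both variants,
   including the extreme 64-bit values. */
void sign_selfcheck(void)
{
    assert(lsign(-5) == -1 && lsign(0) == 0 && lsign(7) == 1);
    assert(llsign(LLONG_MIN) == -1);
    assert(llsign(0) == 0);
    assert(llsign(LLONG_MAX) == 1);
}
```

Building this with -O3 -m32, as the post suggests, should make it possible to compare the code emitted for the two widths.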
2005 Feb 22
0
[LLVMdev] Area for improvement
On Mon, 21 Feb 2005, Jeff Cohen wrote: > I noticed that fourinarow is one of the programs in which LLVM is much slower > than GCC, so I decided to take a look and see why that is so. The program > has many loops that look like this: > > #define ROWS 6 > #define COLS 7 > > void init_board(char b[COLS][ROWS+1]) > { > int i,j; > > for
2005 Feb 22
2
[LLVMdev] Area for improvement
Sorry, I thought I was running selection dag isel but I screwed up when trying out the really big array. You're right, it does clean it up except for the multiplication. So LoopStrengthReduce is not ready for prime time and doesn't actually get used? I might consider whipping it into shape. Does it still have to handle getelementptr in its full generality? Chris Lattner wrote:
2018 Dec 01
2
Where's the optimiser gone? (part 5.a): missed tail calls, and more...
Compile the following functions with "-O3 -target amd64" (see <https://godbolt.org/z/5xqYhH>): __int128 div(__int128 foo, __int128 bar) { return foo / bar; } On the left the generated code; on the right the expected, properly optimised code: div: # @div push rbp | mov rbp, rsp | call __divti3 | jmp __divti3 pop rbp | ret
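The point of this post can be reproduced with a small sketch in C. It assumes a compiler that supports the `__int128` extension (GCC or Clang on a 64-bit target). The name `div128` is hypothetical, chosen here to avoid clashing with the standard `div()` declared in `<stdlib.h>`; the post itself calls the function `div`. The helper `div128_selfcheck` is likewise an illustrative addition.

```c
#include <assert.h>

/* The division is the last thing the function does, so the call to the
   runtime helper (__divti3 in the listing above) is in tail position:
   a plain jump to the helper would be enough. */
__int128 div128(__int128 foo, __int128 bar)
{
    return foo / bar;
}

/* Hypothetical helper (not from the post): checks a few quotients,
   including one that does not fit in 64 bits before dividing. */
void div128_selfcheck(void)
{
    assert(div128(100, 7) == 14);
    assert(div128(-100, 7) == -14);   /* C division truncates toward zero */
    assert(div128((__int128)1 << 100, (__int128)1 << 40) == (__int128)1 << 60);
}
```

Whether the compiler emits a call followed by a return, or a single jump, can be checked by inspecting the assembly with the flags given in the post.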
2017 Jul 01
2
KNL Assembly Code for Matrix Multiplication
Thank you. It means that vmovdqa64 zmm22, zmmword ptr [rip + .LCPI0_0] # zmm22 = [8,9,10,11,12,13,14,15] loads zmm22 with 64-bit constant values that serve as indexes here (zmm22 = 8, 9, 10, 11, 12, 13, 14, 15), not the values loaded from those locations. zmm2 contains the constant 4000, so vpmuludq zmm14, zmm10, zmm2 multiplies the index values by 4000, because the stride of array b is 4000. zmm14=
2005 Feb 22
5
[LLVMdev] Area for improvement
I noticed that fourinarow is one of the programs in which LLVM is much slower than GCC, so I decided to take a look and see why that is so. The program has many loops that look like this: #define ROWS 6 #define COLS 7 void init_board(char b[COLS][ROWS+1]) { int i,j; for (i=0;i<COLS;i++) for (j=0;j<ROWS;j++) b[i][j]='.';
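The loop shape discussed in this thread can be illustrated with a short sketch in C. The indexed version repeats the part of the loop visible in the preview above. The pointer version is a hand-written assumption of what a strength-reduced form might look like, not the output of LLVM or GCC. The names `init_board_indexed`, `init_board_pointer` and `check_boards` are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define ROWS 6
#define COLS 7

/* Indexed form, as in the quoted loop: the address of b[i][j] is
   derived from i and j on every iteration. */
void init_board_indexed(char b[COLS][ROWS + 1])
{
    int i, j;

    for (i = 0; i < COLS; i++)
        for (j = 0; j < ROWS; j++)
            b[i][j] = '.';
}

/* Hand-written equivalent (an assumption, for illustration): the index
   arithmetic is replaced by a pointer that advances one byte per store. */
void init_board_pointer(char b[COLS][ROWS + 1])
{
    int i;

    for (i = 0; i < COLS; i++) {
        char *p = b[i];
        char *end = p + ROWS;   /* the last byte of each column is left alone */

        while (p < end)
            *p++ = '.';
    }
}

/* Hypothetical helper: both versions must fill the board identically. */
void check_boards(void)
{
    char a[COLS][ROWS + 1];
    char c[COLS][ROWS + 1];

    memset(a, 0, sizeof a);
    memset(c, 0, sizeof c);
    init_board_indexed(a);
    init_board_pointer(c);

    assert(memcmp(a, c, sizeof a) == 0);
    assert(a[0][0] == '.' && a[COLS - 1][ROWS - 1] == '.');
    assert(a[0][ROWS] == 0);    /* byte past the filled rows is untouched */
}
```

Comparing the assembly generated for the two functions is one way to see how much of this rewrite a given compiler performs on its own.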
2009 Jan 06
2
[LLVMdev] LLVM Optmizer
The following C code : #include <stdio.h> #include <stdlib.h> int TESTE2( int parami , int paraml ,double paramd ) { int varx=0,vary; int nI =0; //varx= parami; if( parami > 0 ) { varx = parami; vary = varx + 1; } else { varx = vary + 1; vary = paraml; } varx = varx + parami + paraml; for( nI = 1 ; nI <= paraml; nI++) { varx =
2005 Feb 22
0
[LLVMdev] Area for improvement
On Mon, 21 Feb 2005, Jeff Cohen wrote: > Sorry, I thought I was running selection dag isel but I screwed up when > trying out the really big array. You're right, it does clean it up except > for the multiplication. > > So LoopStrengthReduce is not ready for prime time and doesn't actually get > used? I don't know what the status of it is. You could try it out,
2009 Jan 07
3
[LLVMdev] LLVM optmization
The following C test program was compiled using LLVM with the -O3 option and MSVC with /O2. The MSVC build is about 600 times faster than the LLVM build. The MSVC assembly shows that its optimization pass resolves the for loop more efficiently than LLVM does. Is there a way to get an optimization result in LLVM like that of MSVC? Manoel Teixeira #include
2011 Mar 06
1
[PATCH] core: Fix 'trackbuf' descriptor list byte length
(Tested using a Linux bzImage, with and without an initrd.) Per the shuffle_and_boot documentation, %ecx must contain the descriptor list's byte length, but it is set to the list's end address instead. Fix this. Signed-off-by: Ahmed S. Darwish <darwish.07 at gmail.com> -- core/bcopy32.inc | 2 ++ core/bcopyxx.inc | 2 ++ core/bootsect.inc | 8 +++++--- core/runkernel.inc |
2013 Jul 19
3
[LLVMdev] fptoui calling a function that modifies ECX
Try adding ECX to the Defs of this part of lib/Target/X86/X86InstrCompiler.td like I've done below. I don't have a Windows machine to test myself. let Defs = [EAX, EDX, ECX, EFLAGS], FPForm = SpecialFP in { def WIN_FTOL_32 : I<0, Pseudo, (outs), (ins RFP32:$src), "# win32 fptoui", [(X86WinFTOL RFP32:$src)]>,
2013 Jul 19
0
[LLVMdev] fptoui calling a function that modifies ECX
Oh, excellent point, I agree. My bad. Now that I'm not assuming those are the sqrt, I see the sqrtpd's in the output. Also there are three fptoui's and there are 3 call instances. (Changing subject line again.) Now it looks like it's bug #13862 On 19/07/2013 4:51 PM, Craig Topper wrote: > I think those calls correspond to this > > %110 = fptoui double %109 to i32
2013 Jul 19
2
[LLVMdev] fptoui calling a function that modifies ECX
I don't think that's going to work. On Fri, Jul 19, 2013 at 12:24 AM, Peter Newman <peter at uformia.com> wrote: > Thank you, I'm trying this now. > > > On 19/07/2013 5:23 PM, Craig Topper wrote: > > Try adding ECX to the Defs of this part of > lib/Target/X86/X86InstrCompiler.td like I've done below. I don't have a > Windows machine to test
2013 Jul 19
0
[LLVMdev] fptoui calling a function that modifies ECX
Thank you, I'm trying this now. On 19/07/2013 5:23 PM, Craig Topper wrote: > Try adding ECX to the Defs of this part of > lib/Target/X86/X86InstrCompiler.td like I've done below. I don't have > a Windows machine to test myself. > > let Defs = [EAX, EDX, ECX, EFLAGS], FPForm = SpecialFP in { > def WIN_FTOL_32 : I<0, Pseudo, (outs), (ins RFP32:$src), >
2013 Jul 19
2
[LLVMdev] fptoui calling a function that modifies ECX
Here's my attempt at a fix. Adding Jakob to make sure I did this right. On Fri, Jul 19, 2013 at 2:34 AM, Peter Newman <peter at uformia.com> wrote: > That does appear to have worked. All my tests are passing now. > > I'll hand this out to our other devs & testers and make sure it's working > for them as well (not just on my machine). > > Thank you,
2013 Jul 19
0
[LLVMdev] fptoui calling a function that modifies ECX
That does appear to have worked. All my tests are passing now. I'll hand this out to our other devs & testers and make sure it's working for them as well (not just on my machine). Thank you, again. -- Peter N On 19/07/2013 5:45 PM, Craig Topper wrote: > I don't think that's going to work. > > > On Fri, Jul 19, 2013 at 12:24 AM, Peter Newman <peter at
2013 Jul 20
0
[LLVMdev] fptoui calling a function that modifies ECX
I've applied this and the test cases I have here continue to work, so it looks good to me. I've run into another (seemingly unrelated) issue which I'll describe in a separate email to the dev list. -- Peter N On 20/07/2013 5:30 AM, Craig Topper wrote: > Here's my attempt at a fix. Adding Jakob to make sure I did this right. > > > On Fri, Jul 19, 2013 at 2:34 AM,
2018 Dec 05
4
Where's the optimiser gone? (part 5.a): missed tail calls, and more...
On Tue, Dec 4, 2018 at 3:58 PM Daniel Sanders via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On Dec 4, 2018, at 15:11, Stefan Kanthak <stefan.kanthak at nexgo.de> wrote: > No, I understand his intent. It just doesn't align with my intent, > including the hoops he/LLVM wants me to jump through. > > He's not saying they're your bugs, he's just saying
2018 Dec 03
4
Where's the optimiser gone? (part 5.a): missed tail calls, and more...
"Tim Northover" <t.p.northover at gmail.com> wrote: > Hi, > > On Sat, 1 Dec 2018 at 17:37, Stefan Kanthak via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> Compile the following functions with "-O3 -target amd64" > > You've been advised before, but you really need to start reporting > these as bugs[*] if you actually care about