similar to: Where's the optimiser gone? (part 5.a): missed tail calls, and more...

2018 Dec 03
4
Where's the optimiser gone? (part 5.a): missed tail calls, and more...
"Tim Northover" <t.p.northover at gmail.com> wrote: > Hi, > > On Sat, 1 Dec 2018 at 17:37, Stefan Kanthak via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> Compile the following functions with "-O3 -target amd64" > > You've been advised before, but you really need to start reporting > these as bugs[*] if you actually care about
2018 Dec 01
2
Where's the optimiser gone? (part 5.c): missed tail calls, and more...
Compile the following functions with "-O3 -target i386-win32" (see <https://godbolt.org/z/exmjWY>): __int64 __fastcall div(__int64 foo, __int64 bar) { return foo / bar; } On the left the generated code; on the right the expected, properly optimised code: push dword ptr [esp + 16] | push dword ptr [esp + 16] | push dword ptr [esp + 16] |
2018 Dec 01
2
Where's the optimiser gone? (part 5.b): missed tail calls, and more...
Compile the following functions with "-O3 -target i386" (see <https://godbolt.org/z/VmKlXL>): long long div(long long foo, long long bar) { return foo / bar; } On the left the generated code; on the right the expected, properly optimised code: div: # @div push ebp | mov ebp, esp | push dword ptr [ebp + 20] | push
2018 Dec 05
4
Where's the optimiser gone? (part 5.a): missed tail calls, and more...
On Tue, Dec 4, 2018 at 3:58 PM Daniel Sanders via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On Dec 4, 2018, at 15:11, Stefan Kanthak <stefan.kanthak at nexgo.de> wrote: > No, I understand his intent. It just doesn't align with my intent, > including the hoops he/LLVM wants me to jump through. > > He's not saying they're your bugs, he's just saying
2019 Mar 04
2
Where's the optimiser gone (part 11): use the proper instruction for sign extension
Compile with -O3 -m32 (see <https://godbolt.org/z/yCpBpM>): long lsign(long x) { return (x > 0) - (x < 0); } long long llsign(long long x) { return (x > 0) - (x < 0); } While the code generated for the "long" version of this function is quite OK, the code for the "long long" version misses an obvious optimisation: lsign: # @lsign mov
2018 Nov 27
2
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
"Sanjay Patel" <spatel at rotateright.com> wrote: > IIUC, you want to use x86-specific bit-hacks (sbb masking) in cases like > this: > unsigned int foo(unsigned int crc) { > if (crc & 0x80000000) > crc <<= 1, crc ^= 0xEDB88320; > else > crc <<= 1; > return crc; > } To document this for x86 too: rewrite the function
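The rewrite referred to in this snippet is cut off, so the following is only a sketch of one commonly used branch-free formulation, not necessarily the one the author posted. The function names `crc_step_branch` and `crc_step_mask` are hypothetical. The idea is to turn the top bit into an all-zeros or all-ones mask and AND it with the polynomial:

```c
#include <stdint.h>

/* The branching form quoted in the thread. */
uint32_t crc_step_branch(uint32_t crc) {
    if (crc & 0x80000000u)
        crc <<= 1, crc ^= 0xEDB88320u;
    else
        crc <<= 1;
    return crc;
}

/* Assumed branch-free equivalent: (crc >> 31) is 0 or 1, and negating it
   as a 32-bit unsigned value gives 0 or 0xFFFFFFFF, which selects the
   polynomial or nothing. */
uint32_t crc_step_mask(uint32_t crc) {
    uint32_t mask = (uint32_t)0 - (crc >> 31);
    return (crc << 1) ^ (0xEDB88320u & mask);
}
```

Whether a compiler lowers the masked form to an sbb-style sequence depends on the compiler and target; the sketch only shows that the two forms compute the same value.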
2014 Jul 13
2
[LLVMdev] IMUL x86 instruction
Hi, The x86 CPU IMUL instruction has forms such as: IMUL reg EDX:EAX ← EAX ∗ reg reg, EAX and EDX are 32bit registers. How can I represent this sort of instruction in LLVM IR ? It is really a 32bit * 32 bit = 64 bit, but no LLVM IR exists to do that. Or, a similar question: What LLVM IR would produce this IMUL instruction form? For context, I am writing a x86 to LLVM IR decompiler, so wish to
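As a hedged illustration of the 32 bit * 32 bit = 64 bit multiply asked about here: in C the usual way to express it is to widen both operands before multiplying, which corresponds in LLVM IR to sign-extending both operands to i64 and multiplying there. The helper names below are hypothetical, and whether a backend actually selects the one-operand IMUL for this pattern is up to the code generator.

```c
#include <stdint.h>

/* Hypothetical helper: signed 32 x 32 -> 64-bit multiply.
   Widening BEFORE the multiply is what keeps the upper 32 bits. */
int64_t mul32x32(int32_t a, int32_t b) {
    return (int64_t)a * (int64_t)b;
}

/* Hypothetical helpers: split the product the way EDX:EAX would hold it. */
uint32_t product_low(int64_t p)  { return (uint32_t)p; }                   /* EAX */
uint32_t product_high(int64_t p) { return (uint32_t)((uint64_t)p >> 32); } /* EDX */
```

Casting only the result, as in `(int64_t)(a * b)`, would perform the multiply in 32 bits and lose the upper half.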
2006 Mar 25
3
Rails Plugins: Why to register your own functionality with send()?
Hi there, I have seen in the file column plugin ( http://www.kanthak.net/opensource/file_column/) from Sebastian Kanthak or David's acts_as_taggable plugin that to register my functionality I need to do something like this: ApplicationHelper.send(:include, InPlaceEditAssociations) I am wondering why not: (a) module ApplicationHelper include InPlaceEditAssociatons end or: (b)
2007 Apr 30
0
[LLVMdev] Boostrap Failure -- Expected Differences?
On Apr 27, 2007, at 3:50 PM, David Greene wrote: > The saga continues. > > I've been tracking the interface changes and merging them with > the refactoring work I'm doing. I got as far as building stage3 > of llvm-gcc but the object files from stage2 and stage3 differ: > > > warning: ./cc1-checksum.o differs > warning: ./cc1plus-checksum.o differs > >
2018 Nov 28
2
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
On Wed, Nov 28, 2018 at 7:11 AM Sanjay Patel via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Thanks for reporting this and other perf opportunities. As I mentioned > before, if you could file bug reports for these, that's probably the only > way they're ever going to get fixed (unless you're planning to fix them > yourself). It's not an ideal situation, but
2018 Nov 06
4
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
Hi @ll, while clang/LLVM recognizes common bit-twiddling idioms/expressions like unsigned int rotate(unsigned int x, unsigned int n) { return (x << n) | (x >> (32 - n)); } and typically generates "rotate" machine instructions for this expression, it fails to recognize other also common bit-twiddling idioms/expressions. The standard IEEE CRC-32 for "big
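One detail about the rotate idiom quoted above: as written, `x >> (32 - n)` shifts by 32 when `n` is 0, which is undefined behaviour for a 32-bit operand in C. A minimal sketch of a variant that masks both shift counts follows; the name `rotl32` is hypothetical, and whether a given compiler turns it into a single rotate instruction is not guaranteed.

```c
#include <stdint.h>

/* Rotate left by n bits. Masking keeps both shift counts in 0..31,
   so n == 0 and n >= 32 are well defined. (-n & 31) equals (32 - n) mod 32
   because n is unsigned and negation wraps. */
uint32_t rotl32(uint32_t x, unsigned n) {
    return (x << (n & 31)) | (x >> (-n & 31));
}
```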
2005 Feb 22
0
[LLVMdev] Area for improvement
On Mon, 21 Feb 2005, Jeff Cohen wrote: > Sorry, I thought I was running selection dag isel but I screwed up when > trying out the really big array. You're right, it does clean it up except > for the multiplication. > > So LoopStrengthReduce is not ready for prime time and doesn't actually get > used? I don't know what the status of it is. You could try it out,
2019 Mar 01
2
Condition removed? Difference between LLVM and GCC on a small testcase
Hello Dev, I have a very simple testcase, which shows strange difference between LLVM and GCC. Does anyone know which optimization pass removes the condition? Thanks! C code: extern void bar(int, int); void foo(int a) { int b, d; if (a > 114) { b = a * 58; } else { d = a * 51; } bar(b, d); } clang.7.0.1 -O2, LLVM generated assembly: 0: 6b c7 3a
2012 Nov 28
1
Build error of NSD4 on Debian Squeeze
Hello World, I am trying to build NSD4 on Debian Squeeze and I get the following errors when running `make`. ``` $ pwd /home/wiz/src/nsd/tags/NSD_4_0_0_imp_5 $ make [... output omitted ...] gcc -g -O2 -o nsd-checkconf answer.o axfr.o buffer.o configlexer.o configparse acket.o query.o rbtree.o radtree.o rdata.o region-allocator.o tsig.o tsig-opens 4_pton.o b64_ntop.o -lcrypto configparser.o: In
2020 Aug 23
2
Apropos "shouting": PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT
Hi Stefan, You can find the contribution guidelines here : https://llvm.org/docs/Contributing.html LLVM also has a code of conduct : https://llvm.org/docs/CodeOfConduct.html On Sun, 23 Aug 2020 at 23:28, David Blaikie via llvm-dev < llvm-dev at lists.llvm.org> wrote: > > > On Sun, Aug 23, 2020 at 10:54 AM Stefan Kanthak <stefan.kanthak at nexgo.de> > wrote: > >>
2005 Feb 22
0
[LLVMdev] Area for improvement
On Mon, 21 Feb 2005, Jeff Cohen wrote: > I noticed that fourinarow is one of the programs in which LLVM is much slower > than GCC, so I decided to take a look and see why that is so. The program > has many loops that look like this: > > #define ROWS 6 > #define COLS 7 > > void init_board(char b[COLS][ROWS+1]) > { > int i,j; > > for
2019 Aug 20
1
Slow XCHG in arch/i386/libgcc/__ashrdi3.S and arch/i386/libgcc/__lshrdi3.S
"H. Peter Anvin" <hpa at zytor.com> wrote August 20, 2019 12:51 AM: > On 8/14/19 9:42 PM, Stefan Kanthak wrote: >> Hi, >> >> both >> https://git.kernel.org/pub/scm/libs/klibc/klibc.git/plain/usr/klibc/arch/i386/libgcc/__ashldi3.S >> and >> https://git.kernel.org/pub/scm/libs/klibc/klibc.git/plain/usr/klibc/arch/i386/libgcc/__lshrdi3.S
2018 Nov 25
3
BUGS in code generated for target i386 compiling __bswapdi3, and for target x86-64 compiling __bswapsi2()
bswapdi2 for i386 is correct. Bits 31:0 of the source are loaded into edx. Bits 63:32 are loaded into eax. Those are each bswapped. The ABI for the return is edx contains bits [63:32] and eax contains [31:0]. This is opposite of how the registers were loaded. ~Craig On Sun, Nov 25, 2018 at 10:36 AM Craig Topper <craig.topper at gmail.com> wrote: > bswapsi2 on the x86-64 isn't using
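The reasoning in this reply can be sketched in C: a 64-bit byte swap is two 32-bit byte swaps with the halves exchanged, which is why loading the low half into the register that returns the high half is correct. The names `bswap32` and `bswap64` are hypothetical stand-ins for the library routines discussed in the thread.

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value. */
uint32_t bswap32(uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Reverse the eight bytes of a 64-bit value: swap each half,
   then let the old low half become the new high half. */
uint64_t bswap64(uint64_t v) {
    uint32_t lo = (uint32_t)v;          /* bits 31:0  */
    uint32_t hi = (uint32_t)(v >> 32);  /* bits 63:32 */
    return ((uint64_t)bswap32(lo) << 32) | bswap32(hi);
}
```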
2018 Nov 25
3
BUGS in code generated for target i386 compiling __bswapdi3, and for target x86-64 compiling __bswapsi2()
Hi @ll, targeting i386, LLVM/clang generates wrong code for the following functions: unsigned long __bswapsi2 (unsigned long ul) { return (((ul) & 0xff000000ul) >> 3 * 8) | (((ul) & 0x00ff0000ul) >> 8) | (((ul) & 0x0000ff00ul) << 8) | (((ul) & 0x000000fful) << 3 * 8); } unsigned long long __bswapdi2(unsigned long
2020 Aug 20
5
Clang is a resource hog, the installers for Windows miss quite some files, and are defect!
Hi @ll, BUGS #1 & #2: ~~~~~~~~~~~~~ The installer LLVM-10.0.0-win64.exe dumps the following DUPLICATE files in "C:\Program Files\LLVM\bin", WASTING about 500MB disk space, which is nearly a third of the disk space occupied by the whole package: | DIR "C:\Program Files\LLVM\bin" /O:-S ... | 25.03.2020 12:15 83.258.880 clang-cl.exe | 25.03.2020 12:03