Displaying 20 results from an estimated 3000 matches similar to: "Where's the optimiser gone? (part 5.a): missed tail calls, and more..."
2018 Dec 03
4
Where's the optimiser gone? (part 5.a): missed tail calls, and more...
"Tim Northover" <t.p.northover at gmail.com> wrote:
> Hi,
>
> On Sat, 1 Dec 2018 at 17:37, Stefan Kanthak via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> Compile the following functions with "-O3 -target amd64"
>
> You've been advised before, but you really need to start reporting
> these as bugs[*] if you actually care about
2018 Dec 01
2
Where's the optimiser gone? (part 5.c): missed tail calls, and more...
Compile the following functions with "-O3 -target i386-win32"
(see <https://godbolt.org/z/exmjWY>):
__int64 __fastcall div(__int64 foo, __int64 bar)
{
    return foo / bar;
}
On the left the generated code; on the right the expected,
properly optimised code:
push dword ptr [esp + 16] |
push dword ptr [esp + 16] |
push dword ptr [esp + 16] |
2018 Dec 01
2
Where's the optimiser gone? (part 5.b): missed tail calls, and more...
Compile the following functions with "-O3 -target i386"
(see <https://godbolt.org/z/VmKlXL>):
long long div(long long foo, long long bar)
{
    return foo / bar;
}
On the left the generated code; on the right the expected,
properly optimised code:
div: # @div
push ebp |
mov ebp, esp |
push dword ptr [ebp + 20] |
push
2018 Dec 05
4
Where's the optimiser gone? (part 5.a): missed tail calls, and more...
On Tue, Dec 4, 2018 at 3:58 PM Daniel Sanders via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> On Dec 4, 2018, at 15:11, Stefan Kanthak <stefan.kanthak at nexgo.de> wrote:
> No, I understand his intent. It just doesn't align with my intent,
> including the hoops he/LLVM wants me to jump through.
>
> He's not saying they're your bugs, he's just saying
2019 Mar 04
2
Where's the optimiser gone (part 11): use the proper instruction for sign extension
Compile with -O3 -m32 (see <https://godbolt.org/z/yCpBpM>):
long lsign(long x)
{
    return (x > 0) - (x < 0);
}
long long llsign(long long x)
{
    return (x > 0) - (x < 0);
}
While the code generated for the "long" version of this function is quite
OK, the code for the "long long" version misses an obvious optimisation:
lsign: # @lsign
mov
2018 Nov 27
2
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
"Sanjay Patel" <spatel at rotateright.com> wrote:
> IIUC, you want to use x86-specific bit-hacks (sbb masking) in cases like
> this:
> unsigned int foo(unsigned int crc) {
>     if (crc & 0x80000000)
>         crc <<= 1, crc ^= 0xEDB88320;
>     else
>         crc <<= 1;
>     return crc;
> }
To document this for x86 too: rewrite the function
2014 Jul 13
2
[LLVMdev] IMUL x86 instruction
Hi,
The x86 CPU IMUL instruction has forms such as:
IMUL reg
EDX:EAX ← EAX ∗ reg
reg, EAX and EDX are 32bit registers.
How can I represent this sort of instruction in LLVM IR ?
It is really a 32bit * 32 bit = 64 bit, but no LLVM IR exists to do that.
Or, a similar question:
What LLVM IR would produce this IMUL instruction form?
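One common way to express a 32 bit * 32 bit = 64 bit signed multiply, sketched here in C rather than in IR, is to sign-extend both operands to 64 bits and multiply the widened values. In LLVM IR this corresponds to two `sext i32 ... to i64` instructions followed by a `mul i64`. Whether a backend then selects the one-operand IMUL form depends on the target and optimisation level, so treat this as an illustration under those assumptions. The function name `mul32x32` is hypothetical.

```c
#include <stdint.h>

/* Widening signed multiply: both 32-bit operands are sign-extended
   to 64 bits before the multiplication, so the full 64-bit product
   is kept instead of being truncated to 32 bits. */
int64_t mul32x32(int32_t a, int32_t b)
{
    return (int64_t)a * (int64_t)b;
}
```

Going the other way, a decompiler that sees the one-operand IMUL could emit this sext/sext/mul pattern and then split the 64-bit result into its EDX (high) and EAX (low) halves.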
For context, I am writing an x86 to LLVM IR decompiler, so wish to
2006 Mar 25
3
Rails Plugins: Why to register your own functionality with send()?
Hi there,
I have seen in the file column plugin (
http://www.kanthak.net/opensource/file_column/) from Sebastian Kanthak or
David's acts_as_taggable plugin that to register my functionality I need to
do something like this:
ApplicationHelper.send(:include, InPlaceEditAssociations)
I am wondering why not:
(a)
module ApplicationHelper
  include InPlaceEditAssociations
end
or:
(b)
2007 Apr 30
0
[LLVMdev] Boostrap Failure -- Expected Differences?
On Apr 27, 2007, at 3:50 PM, David Greene wrote:
> The saga continues.
>
> I've been tracking the interface changes and merging them with
> the refactoring work I'm doing. I got as far as building stage3
> of llvm-gcc but the object files from stage2 and stage3 differ:
>
>
> warning: ./cc1-checksum.o differs
> warning: ./cc1plus-checksum.o differs
>
>
2018 Nov 28
2
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
On Wed, Nov 28, 2018 at 7:11 AM Sanjay Patel via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Thanks for reporting this and other perf opportunities. As I mentioned
> before, if you could file bug reports for these, that's probably the only
> way they're ever going to get fixed (unless you're planning to fix them
> yourself). It's not an ideal situation, but
2018 Nov 06
4
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
Hi @ll,
while clang/LLVM recognizes common bit-twiddling idioms/expressions
like
unsigned int rotate(unsigned int x, unsigned int n)
{
    return (x << n) | (x >> (32 - n));
}
and typically generates "rotate" machine instructions for this
expression, it fails to recognize other, equally common bit-twiddling
idioms/expressions.
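A side note on the rotate idiom quoted above: for n == 0 it shifts a 32-bit value by 32, which C leaves undefined. A minimal sketch of a variant that avoids this, assuming a 32-bit operand, masks both shift counts. The name `rotl32` is hypothetical, and current clang and GCC typically still recognise the masked form as a rotate, though that is not guaranteed for every version or target.

```c
#include <stdint.h>

/* Rotate x left by n bits. Masking both shift counts keeps them
   in the range 0..31, so n == 0 (or any multiple of 32) is well
   defined and returns x unchanged. */
uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}
```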
The standard IEEE CRC-32 for "big
2005 Feb 22
0
[LLVMdev] Area for improvement
On Mon, 21 Feb 2005, Jeff Cohen wrote:
> Sorry, I thought I was running selection dag isel but I screwed up when
> trying out the really big array. You're right, it does clean it up except
> for the multiplication.
>
> So LoopStrengthReduce is not ready for prime time and doesn't actually get
> used?
I don't know what the status of it is. You could try it out,
2019 Mar 01
2
Condition removed? Difference between LLVM and GCC on a small testcase
Hello Dev,
I have a very simple testcase which shows a strange difference between LLVM and GCC. Does anyone know which optimization pass removes the condition? Thanks!
C code:
extern void bar(int, int);
void foo(int a) {
    int b, d;
    if (a > 114) {
        b = a * 58;
    } else {
        d = a * 51;
    }
    bar(b, d);
}
clang.7.0.1 -O2, LLVM generated assembly:
0: 6b c7 3a
2012 Nov 28
1
Build error of NSD4 on Debian Squeeze
Hello World,
I am trying to build NSD4 on Debian Squeeze and I get the following
errors when running `make`.
```
$ pwd
/home/wiz/src/nsd/tags/NSD_4_0_0_imp_5
$ make
[... output omitted ...]
gcc -g -O2 -o nsd-checkconf answer.o axfr.o buffer.o configlexer.o
configparse
acket.o query.o rbtree.o radtree.o rdata.o region-allocator.o tsig.o
tsig-opens
4_pton.o b64_ntop.o -lcrypto
configparser.o: In
2020 Aug 23
2
Apropos "shouting": PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT
Hi Stefan,
You can find the contribution guidelines here:
https://llvm.org/docs/Contributing.html
LLVM also has a code of conduct: https://llvm.org/docs/CodeOfConduct.html
On Sun, 23 Aug 2020 at 23:28, David Blaikie via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
>
>
> On Sun, Aug 23, 2020 at 10:54 AM Stefan Kanthak <stefan.kanthak at nexgo.de>
> wrote:
>
>>
2005 Feb 22
0
[LLVMdev] Area for improvement
On Mon, 21 Feb 2005, Jeff Cohen wrote:
> I noticed that fourinarow is one of the programs in which LLVM is much slower
> than GCC, so I decided to take a look and see why that is so. The program
> has many loops that look like this:
>
> #define ROWS 6
> #define COLS 7
>
> void init_board(char b[COLS][ROWS+1])
> {
> int i,j;
>
> for
2019 Aug 20
1
Slow XCHG in arch/i386/libgcc/__ashrdi3.S and arch/i386/libgcc/__lshrdi3.S
"H. Peter Anvin" <hpa at zytor.com> wrote August 20, 2019 12:51 AM:
> On 8/14/19 9:42 PM, Stefan Kanthak wrote:
>> Hi,
>>
>> both
>> https://git.kernel.org/pub/scm/libs/klibc/klibc.git/plain/usr/klibc/arch/i386/libgcc/__ashldi3.S
>> and
>> https://git.kernel.org/pub/scm/libs/klibc/klibc.git/plain/usr/klibc/arch/i386/libgcc/__lshrdi3.S
2018 Nov 25
3
BUGS in code generated for target i386 compiling __bswapdi3, and for target x86-64 compiling __bswapsi2()
bswapdi2 for i386 is correct
Bits 31:0 of the source are loaded into edx. Bits 63:32 are loaded into
eax. Those are each bswapped. The ABI returns bits [63:32] in edx and
bits [31:0] in eax. This is the opposite of how the registers were
loaded.
~Craig
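The reasoning above can be sketched in portable C: a 64-bit byte swap is a 32-bit byte swap of each half, with the two halves exchanged. This is only an illustration of that argument, not the code from the thread. The names `bswap32` and `bswap64` are hypothetical.

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value. */
uint32_t bswap32(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u)
         | (x << 24);
}

/* Reverse the eight bytes of a 64-bit value: swap the bytes within
   each 32-bit half, then exchange the halves. The swapped low half
   becomes bits 63:32 of the result (edx in the i386 return
   convention), the swapped high half becomes bits 31:0 (eax). */
uint64_t bswap64(uint64_t x)
{
    uint32_t lo = (uint32_t)x;         /* bits 31:0 of the source */
    uint32_t hi = (uint32_t)(x >> 32); /* bits 63:32 of the source */
    return ((uint64_t)bswap32(lo) << 32) | bswap32(hi);
}
```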
On Sun, Nov 25, 2018 at 10:36 AM Craig Topper <craig.topper at gmail.com>
wrote:
> bswapsi2 on the x86-64 isn't using
2018 Nov 25
3
BUGS in code generated for target i386 compiling __bswapdi3, and for target x86-64 compiling __bswapsi2()
Hi @ll,
targeting i386, LLVM/clang generates wrong code for the following
functions:
unsigned long __bswapsi2 (unsigned long ul)
{
    return (((ul) & 0xff000000ul) >> 3 * 8)
         | (((ul) & 0x00ff0000ul) >> 8)
         | (((ul) & 0x0000ff00ul) << 8)
         | (((ul) & 0x000000fful) << 3 * 8);
}
unsigned long long __bswapdi2(unsigned long
2020 Aug 20
5
Clang is a resource hog, the installers for Windows miss quite some files, and are defect!
Hi @ll,
BUGS #1 & #2:
~~~~~~~~~~~~~
The installer LLVM-10.0.0-win64.exe dumps the following DUPLICATE files
in "C:\Program Files\LLVM\bin", WASTING about 500MB disk space, which is
nearly a third of the disk space occupied by the whole package:
| DIR "C:\Program Files\LLVM\bin" /O:-S
...
| 25.03.2020 12:15 83.258.880 clang-cl.exe
| 25.03.2020 12:03