similar to: [LLVMdev] LLVM IR and Naked functions in C/C++

Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] LLVM IR and Naked functions in C/C++"

2016 Jun 30
0
Help required regarding IPRA and Local Function optimization
One more interesting thing I have noticed is the following: in the sqlite3 code, consider three functions, namely sqlite3Update, sqlite3Select and sqlite3WhereBegin. sqlite3WhereBegin is called by both sqlite3Update and sqlite3Select, but according to CallGraphSCC sqlite3Update is codegen'd first; in that case, during the RegMask propagation phase, the default regmask is used for the call site of
2016 Nov 13
2
llc generating code that writes below the stack pointer on darwin/x86-64
Hi, Is there something wrong with my inline assembly below? *** target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128" target triple = "x86_64-apple-macosx10.5" define void @"\01_SYSTEM_$$_SETMXCSR$LONGWORD"(i32 %p.w) nobuiltin { ; [71] procedure
2016 Jun 30
4
Help required regarding IPRA and Local Function optimization
Hello Mentors, I am currently tracking down a bug in the Local Function related optimization that causes runtime failures in some test cases. Because those test cases contain very large functions with recursion and object-oriented code, I have not been able to find a pattern that causes the failure. So I tried the following simple case to understand the expected behavior of this optimization. Consider
2009 Jan 09
1
[LLVMdev] naked assembler / function written entirely in asm
Hi everybody. I'm having (yet) another look at trying to get naked functions from D (1) working in our LLVM D Compiler - LDC (2). I have this test case: /// D CODE /// extern(C) int printf(char*, ...); ulong retval() { asm { naked; mov EAX, 0xff; mov EDX, 0xaa; ret; } } ulong retval2() { return (cast(ulong)0xaa << 32) | 0xff; } void main() {
2013 Apr 21
0
[LLVMdev] Naked functions
In gcc, naked functions do not have a return instruction. It seems that in llvm they always have a return, although the prologue and epilogue are suppressed.
2013 Jul 10
4
[LLVMdev] unaligned AVX store gets split into two instructions
I'm seeing a difference in how LLVM 3.3 and 3.2 emit unaligned vector loads on AVX. 3.3 is splitting up an unaligned vector load but in 3.2, it was emitted as a single instruction (details below). In a matrix-matrix inner-kernel, I see a ~25% decrease in performance, which seems to be due to this. Any ideas why this changed? Thanks! Zach LLVM Code: define <4 x double> @vstore(<4 x
2010 Sep 01
2
[LLVMdev] equivalent IR, different asm
I attached preprocessed files. $ llvm-g++ gcc-RenderBoxModelObject.ii -fno-exceptions -arch x86_64 -O2 -c -o part.o vs $ clang++ clang-RenderBoxModelObject.ii -fno-exceptions -arch x86_64 -O2 -c -o part.o If I compile with clang, it causes a crash to webkit. -Argiris -------------- next part -------------- A non-text attachment was scrubbed... Name: prepro.zip Type: application/zip Size:
2012 Jul 29
0
[LLVMdev] rotate
*NOTE* IIRC compiling this with -O0 on x86-64 can yield the wrong result since clang will emit shifts and on intel shifts are mod the register size: ===== .section __TEXT,__text,regular,pure_instructions .globl _ror .align 4, 0x90 _ror: ## @ror .cfi_startproc ## BB#0: pushq %rbp Ltmp2: .cfi_def_cfa_offset 16 Ltmp3: .cfi_offset %rbp, -16 movq %rsp, %rbp
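The rotate code that this note refers to is not shown in the excerpt. As a minimal sketch, assuming a 32-bit rotate-right (the name `ror32` is hypothetical), the usual way to avoid the problem is to mask both shift counts so that no shift by the full register width ever happens. Shifting a 32-bit value by 32 is undefined in C, and x86 masks the count of a 32-bit shift to its low 5 bits, which is why the naive form can go wrong.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate x right by n bits. Both shift counts are masked into 0..31,
   so a shift by the full width (undefined in C) never occurs. */
uint32_t ror32(uint32_t x, unsigned n) {
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}

/* Self-check covering the edge cases n == 0 and n == 32. */
void ror32_selfcheck(void) {
    assert(ror32(0x12345678u, 8) == 0x78123456u);
    assert(ror32(0x12345678u, 0) == 0x12345678u);
    assert(ror32(0x12345678u, 32) == 0x12345678u);
    assert(ror32(1u, 1) == 0x80000000u);
}
```

Because the masked form is well defined for every count, it does not depend on the optimization level.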
2006 Jun 27
2
Bare naked server
Thanks, Zed, for mongrel :!) I've been on TextDrive for some time running my sites behind apache w/lighttpd. Once it got running it was cool enough. When it ran. And there was still the webmin dance and all. So I decided to set things up on my little dedicated redhat server - since I'd had so much success locally w/mongrel. Fast. Simple. It has been great and I managed to move
2013 Dec 08
1
OGG loads as a naked file, but not if embedded in an IFF.
I can now load and play AIFF, WAV, and OGG files using libsndfile and libao. Now I'm moving on to loading and playing files embedded in an IFF file. This is presenting a problem. Playing an AIFF embedded in an IFF works fine. Playing an OGG embedded in the same IFF fails. Specifically, it appears that sf_open_fd() can't figure out what it's being told to load. Except for
2010 Sep 01
0
[LLVMdev] equivalent IR, different asm
On Sep 1, 2010, at 6:25 AM, Argyrios Kyrtzidis wrote: > The attached .ll files seem equivalent, but the resulting asm from 'opt-fail.ll' causes a crash to webkit. > I suspect the usage of registers is wrong, can someone take a look ? The difference is that there is a shift right after the multiply, before the divide. In IR, the difference is: %5 = mul nsw i32 %4, %tmp1
2010 Aug 31
0
[LLVMdev] "equivalent" .ll files diverge after optimizations are applied
On Aug 31, 2010, at 1:21 PM PDT, Argyrios Kyrtzidis wrote: > > Just to be clear, are you saying that the fact that, after using llc > on the second IR, the produced asm is using MM registers, indicates > a bug ? Yes. It's not immediately obvious whether it's in the opt or llc, though. Chris was doing work involving <2 x float> and may know about this. >
2015 May 04
2
[LLVMdev] Incorrect code generated for arm64
Thanks Bruce. > On 4 May 2015, at 13:18, Bruce Hoult <bruce at hoult.org> wrote: > > I can confirm that, with Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn) > > Very strange! Yes, that’s what I thought. I’ve also checked the binary downloads for OS X from llvm.org <http://llvm.org/> and get the same broken code from both the 3.5.2 and 3.6.0 releases.
2010 Sep 01
5
[LLVMdev] equivalent IR, different asm
The attached .ll files seem equivalent, but the resulting asm from 'opt-fail.ll' causes a crash to webkit. I suspect the usage of registers is wrong, can someone take a look ? $ llc opt-pass.ll -o - .section __TEXT,__text,regular,pure_instructions .globl __ZN7WebCore6kolos1ERiS0_PKNS_20RenderBoxModelObjectEPNS_10StyleImageE .align 4, 0x90
2012 Aug 13
2
[LLVMdev] x86 REP-prefixed instructions seem to be dropped by instruction decoder?
I think there's a bug somewhere in TableGen for the X86 disassembler emitter. The following test: $ echo "0xF3 0xA5" | ./bin/llvm-mc -disassemble .section __TEXT,__text,regular,pure_instructions movsd (from llvm trunk) 0xF3 is the REP prefix, so the printed instruction should be 'rep movsd', however all that is printed is 'movsd'. It seems that there
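To make the expected behavior concrete, here is a toy sketch of prefix handling for just the two byte sequences in this report. It is not LLVM's disassembler and does not use its API; `decode_movs`, `decoded_name` and the `decoded_t` enum are hypothetical names introduced only for illustration. The mnemonics follow the spelling used in the post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { DEC_UNKNOWN, DEC_MOVSD, DEC_REP_MOVSD } decoded_t;

/* Toy decoder: recognizes an optional REP prefix (0xF3) followed by
   the MOVS opcode 0xA5. Everything else is reported as unknown. */
decoded_t decode_movs(const uint8_t *bytes, size_t len) {
    size_t i = 0;
    int has_rep = 0;
    if (i < len && bytes[i] == 0xF3) {  /* remember the prefix, do not drop it */
        has_rep = 1;
        i++;
    }
    if (i < len && bytes[i] == 0xA5)
        return has_rep ? DEC_REP_MOVSD : DEC_MOVSD;
    return DEC_UNKNOWN;
}

/* Map the decoded result to the text a disassembler would print. */
const char *decoded_name(decoded_t d) {
    switch (d) {
    case DEC_REP_MOVSD: return "rep movsd";
    case DEC_MOVSD:     return "movsd";
    default:            return "(unknown)";
    }
}

/* Self-check: the prefixed and unprefixed forms must decode differently. */
void decode_movs_selfcheck(void) {
    const uint8_t with_rep[] = {0xF3, 0xA5};
    const uint8_t plain[] = {0xA5};
    assert(decode_movs(with_rep, sizeof with_rep) == DEC_REP_MOVSD);
    assert(decode_movs(plain, sizeof plain) == DEC_MOVSD);
}
```

The point of the sketch is only that the prefix must be carried through to the printed mnemonic; the behavior described in the report corresponds to returning `DEC_MOVSD` for both inputs.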
2010 Mar 16
0
[LLVMdev] LLVM-GCC generating too much code from inline assembly
You may find it helpful to reference http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html. In particular, the information regarding clobbers and constraints. Generally speaking, it's best not to use inline assembly at all. What are you trying to do that you find it necessary? On Mar 16, 2010, at 12:30 AM, Fred / Kettch wrote: > Hi, > > I recently switched to LLVM-GCC
2010 Mar 16
3
[LLVMdev] LLVM-GCC generating too much code from inline assembly
Hi, I recently switched to LLVM-GCC 4.2 on OS X, to go around a bug caused by gcc with optimized code. Unfortunately, I ran into another weird problem on LLVM-GCC. In my code, there's a file with a bunch of inline assembly blocks, that worked fine with GCC 4.2. Now, when compiling with LLVM-GCC 4.2, weird things happen. Here's an example: (the blocks are larger than that, but a single
2013 Oct 22
1
[LLVMdev] System call miscompilation using the fast register allocator
Hi, Apologies this is a bit lengthy. TLDR: I'm using Dragonegg + LLVM 3.2 and uClibc, and am finding that using the Fast register allocator (i.e. -optimize-regalloc=0) causes miscompilation of setsockopt calls (5-arg system calls). The problem doesn't happen with the default register allocation path selected. It can be worked around by manually simplifying the system call setup
2010 Mar 28
0
[LLVMdev] Which floating-point comparison?
On Sun, Mar 28, 2010 at 7:45 AM, Russell Wallace <russell.wallace at gmail.com> wrote: > I notice llvm provides both ordered and unordered variants of > floating-point comparison. Which of these is the right one to use by > default? I suppose the two criteria would be, in order of importance: > > 1. Which is more efficient (more directly maps to typical hardware)? You can
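The difference between the two families only shows up when an operand is NaN: an ordered comparison is false if either operand is NaN, while an unordered comparison is true in that case. The sketch below expresses four of the predicates in plain C; the function names are hypothetical and simply mirror the IR predicate names, and it assumes default floating-point semantics (no `-ffast-math`).

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Ordered equal: false whenever either operand is NaN. */
bool cmp_oeq(double a, double b) { return a == b; }

/* Unordered or equal: true if either operand is NaN, or if a == b. */
bool cmp_ueq(double a, double b) { return isunordered(a, b) || a == b; }

/* Unordered or not equal: C's != is true when either operand is NaN. */
bool cmp_une(double a, double b) { return a != b; }

/* Ordered and not equal: false whenever either operand is NaN. */
bool cmp_one(double a, double b) { return islessgreater(a, b); }

/* Self-check showing how NaN separates the two families. */
void fcmp_selfcheck(void) {
    assert(!cmp_oeq(NAN, 1.0));
    assert(cmp_ueq(NAN, 1.0));
    assert(cmp_une(NAN, NAN));
    assert(!cmp_one(NAN, 1.0));
}
```

For operands that are never NaN, each ordered predicate and its unordered counterpart give the same answer, so the choice only matters for how NaN should be treated.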
2012 Aug 14
0
[LLVMdev] x86 REP-prefixed instructions seem to be dropped by instruction decoder?
On 13 August 2012 12:02, Andrew Ruef <awruef at umd.edu> wrote: > I think there's a bug somewhere in TableGen for the X86 disassembler > emitter. The following test: > > $ echo "0xF3 0xA5" | ./bin/llvm-mc -disassemble > .section __TEXT,__text,regular,pure_instructions > movsd > > (from llvm trunk) > > 0xF3 is the REP prefix, so the