similar to: [LLVMdev] XMM in X86 Backend

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] XMM in X86 Backend"

2017 Mar 01
2
[Codegen bug in LLVM 3.8?] br following `fcmp une` is present in ll, absent in asm
Hi, We seem to have found a bug in the LLVM 3.8 code generator. We are using MCJIT and have isolated working.ll and broken.ll after middle-end optimizations -- in the block merge128, notice that broken.ll has a fcmp une comparison to zero and a jump based on that branch: merge128: ; preds = %true71, %false72 %_rtB_724 = load %B_repro_T*, %B_repro_T**
2012 Apr 05
0
[LLVMdev] Suboptimal code due to excessive spilling
I don't know much about this, but maybe -mllvm -unroll-count=1 can be used as a workaround? /Patrik Hägglund -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Brent Walker Sent: 28 March 2012 03:18 To: llvmdev Subject: [LLVMdev] Suboptimal code due to excessive spilling Hi, I have run into the following strange behavior
2012 Mar 28
2
[LLVMdev] Suboptimal code due to excessive spilling
Hi, I have run into the following strange behavior and wanted to ask for some advice. For the C program below, function sum() gets inlined in foo() but the code generated looks very suboptimal (the code is an extract from a larger program). Below I show the 32-bit x86 assembly as produced by the demo page on the llvm home page ("Output A"). As you can see from the assembly, after
2013 Jul 19
0
[LLVMdev] llvm.x86.sse2.sqrt.pd not using sqrtpd, calling a function that modifies ECX
(Changing subject line as diagnosis has changed) I'm attaching the compiled code that I've been getting, both with CodeGenOpt::Default and CodeGenOpt::None. The crash isn't occurring with CodeGenOpt::None, but that seems to be because ECX isn't being used - it still gets set to 0x7fffffff by one of the calls to 76719BA1 I notice that X86::SQRTPD[m|r] appear in
2016 Jun 27
3
Finding caller-saved registers at a function call site
Hi Sanjoy, I'm having trouble finding caller-saved registers using the RegMask operand you've mentioned. As an example, I've got a C function that looks like this: double recurse(int depth, double val) { if(depth < max_depth) return recurse(depth + 1, val * 1.2) + val; else return outer_func(val); } As a quick refresher, all "xmm" registers are considered
2016 Jun 27
0
Finding caller-saved registers at a function call site
Ah, I see -- the registers left out of the mask are considered clobbered. Got it! At a high level, I'm interested in finding the locations of all values that are live at a given call site. You can think of it like a debugger, e.g. gdb -- I'd like to be able to unwind the stack, frame by frame, and locate all the live values for each function invocation (i.e., where they are in a
2019 Aug 09
0
[RFC PATCH v6 79/92] kvm: x86: emulate movsd xmm, m64
From: Mihai Donțu <mdontu at bitdefender.com> This is needed in order to be able to support guest code that uses movsd to write into pages that are marked for write tracking. Signed-off-by: Mihai Donțu <mdontu at bitdefender.com> Signed-off-by: Adalbert Lazăr <alazar at bitdefender.com> --- arch/x86/kvm/emulate.c | 32 +++++++++++++++++++++++++++----- 1 file changed, 27
2018 Nov 15
2
[RFC][llvm-mca] Adding binary support to llvm-mca.
Introduction ----------------- Currently llvm-mca only accepts assembly code as input. We would like to extend llvm-mca to support object files, allowing users to analyze the performance of binaries. The proposed changes (which involve both clang and llvm) optionally introduce an object file section, but this can be stripped-out if desired. For the llvm-mca binary support feature to be useful, a
2006 Apr 19
0
[LLVMdev] floating point exception and SSE2 instructions
Hi Simon, The x86 backend does generate scalar SSE2 instructions. For your example, it should emit something like: .text .align 4 .globl _sum_d _sum_d: subl $12, %esp movl 20(%esp), %eax movl 16(%esp), %ecx cmpl $0, %eax jne LBB_sum_d_2 # cond_true.preheader LBB_sum_d_1: # entry.bb9_crit_edge pxor %xmm0,
2012 Jun 30
2
[LLVMdev] llc -O# / opt -O# differences
Hey everyone, I'm running stock LLVM 3.1 release. Both llc and opt programs have the -O# arguments, however it looks like the results are somewhat different. Here's a silly unoptimized bit of code which I'm generating from my LLVM-backed program ; ModuleID = 'foo' %Coord = type { double, double, double } define double @foo(%Coord*, %Coord*) nounwind uwtable ssp { entry:
2012 Jan 04
1
[LLVMdev] How can I compile a c source file to use SSE2 Data Movement Instructions?
I wrote a small function and tested it under clang and gcc. File test.c: double X[100]; double Y[100]; double DA = 0.3; int f() { int i; for (i = 0; i < 100; i++) Y[i] = Y[i] - DA * X[i]; return 0; } clang -S -O3 -o test.s test.c -march=native -ccc-echo result: "D:/work/trunk/bin/Release/clang.exe" -cc1 -triple i686-pc-win32 -S -disable-free -disable-llvm-verifier
2018 Nov 21
2
[RFC][llvm-mca] Adding binary support to llvm-mca.
Hi Andrea, Thanks for your input. On Wed, Nov 21, 2018 at 12:43:52PM +0000, Andrea Di Biagio wrote: [... snip ...] > About the suggested design: > I like the idea of being able to identify code regions using a numeric > identifier. > However, what happens if a code region spans through multiple basic blocks? The current patch does not take into consideration cases where the region
2009 Dec 07
4
[LLVMdev] 2.6 JIT using wrong address for external functions
I have an external function that was added with ExecutionEngine::addGlobalMapping... and the JIT is burning the wrong address into the emitted function. All of the addresses have 0xffffff8d00000000 added to them. Does this look familiar to anyone? I'm using 2.6 on Solaris10/x64 if it matters. This has been working for about two months and I can't readily figure out what I changed to break
2007 Dec 20
1
[LLVMdev] Code Generation Problem llvm 1.9
I sent a long message yesterday describing a problem I thought had to do with the JIT stubs. After further investigation, the problem seems to be in the code generation. The following basic block seems to have an error in its code generation: __exp.exit: ; preds = %codeRepl258, %__exp_bb_bb.exit phi double [ 1.000000e+00, %codeRepl258 ], [ %.reload.reload.i,
2009 Dec 07
0
[LLVMdev] 2.6 JIT using wrong address for external functions
I had that problem too: http://llvm.org/bugs/show_bug.cgi?id=5116. To work around the problem, you can: * Switch to the thread-unsafe lazy jit. * Allocate your JIT code within 2GB of your text segment. * Find a way to look up the external function with dlsym or maybe the ExecutionEngine's LazyFunctionCreator instead of addGlobalMapping. * Upgrade to the top of the svn tree. On Mon, Dec 7,
2018 Nov 27
2
[RFC][llvm-mca] Adding binary support to llvm-mca.
Thanks for clarifying it, Matt. In general, I quite like your suggested design. My only concern is about the semantics of the two new intrinsics. Your design doesn't allow mca ranges to span multiple basic blocks. That constraint is acceptable for now, since llvm-mca doesn't know how to deal with control flow. However, I am a bit concerned about what might happen in the future if we
2013 Oct 15
0
[LLVMdev] [llvm-commits] r192750 - Enable MI Sched for x86.
I should mention a couple of useful self-explanatory LLVM flags for triage: -enable-misched=false -verify-misched -Andy On Oct 15, 2013, at 4:43 PM, Eric Christopher <echristo at gmail.com> wrote: > Grats on the work, a long time coming! > > Beware the incoming register allocation bugs ;) > > -eric > > On Tue, Oct 15, 2013 at 4:33 PM, Andrew Trick <atrick at
2016 Oct 12
4
[test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On Wed, Oct 12, 2016 at 10:53 AM, Hal Finkel <hfinkel at anl.gov> wrote: > I don't think that Clang/LLVM uses it by default on x86_64. If you're using -Ofast, however, that would explain it. I recommend looking at -O3 vs -O0 and make sure those are the same. -Ofast enables -ffast-math, which can legitimately cause differences. > The following tests pass at "-O3" and
2011 Nov 30
0
[PATCH 2/4] x86/emulator: add emulation of SIMD FP moves
Clone the existing movq emulation to also support the most fundamental SIMD FP moves. Extend the testing code to also exercise these instructions. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/tools/tests/x86_emulator/test_x86_emulator.c +++ b/tools/tests/x86_emulator/test_x86_emulator.c @@ -629,6 +629,60 @@ int main(int argc, char **argv) else
2018 Dec 03
2
[RFC][llvm-mca] Adding binary support to llvm-mca.
Hi Andrea, On Mon, Dec 03, 2018 at 01:21:33PM +0000, Andrea Di Biagio wrote: > So, I have been thinking a bit more about this whole design. > > The more I think about your suggested design, the more I am convinced that > we should do something more to support ranges in binary object files too. > My understanding is that the reason why we don't support object files in >