Displaying 20 results from an estimated 10000 matches similar to: "how to find out lea instruction causes skype crash when starting"
2013 Sep 30
0
[LLVMdev] [PROPOSAL] Improve uses of LEA on Atom
Was there any development on this? I noticed that clang still produces
a lea for the testcase in llvm.org/pr13320.
On 28 September 2012 11:36, Nowicki, Tyler <tyler.nowicki at intel.com> wrote:
> Hi,
>
> Here is an update on our proposal to improve the uses of LEA on Atom
> processors.
>
> 1. Disable current generation of LEAs
>
> Due to
2012 Sep 28
2
[LLVMdev] [PROPOSAL] Improve uses of LEA on Atom
Hi,
Here is an update on our proposal to improve the uses of LEA on Atom processors.
1. Disable current generation of LEAs
Due to a 3-cycle stall between the ALU and the AGU, any address generation done using math instructions will cause a stall on loads and stores which are within 3 cycles of the address generation. Consequently, the heuristics for using LEAs efficiently must know how many
2013 Oct 02
0
[LLVMdev] Codegen performance issue: LEA vs. INC.
This sounds like llvm.org/pr13320.
On 17 September 2013 18:20, Bader, Aleksey A <aleksey.a.bader at intel.com> wrote:
> Hi all.
>
> I’m looking for advice on how to deal with inefficient code generation
> for Intel Nehalem/Westmere architecture on 64-bit platform for the attached
> test.cpp (LLVM IR is in test.cpp.ll).
>
> The inner loop has 11 iterations
2013 Sep 17
2
[LLVMdev] Codegen performance issue: LEA vs. INC.
Hi all.
I'm looking for advice on how to deal with inefficient code generation for the Intel Nehalem/Westmere architecture on a 64-bit platform for the attached test.cpp (LLVM IR is in test.cpp.ll).
The inner loop has 11 iterations and is eventually unrolled.
Test.lea.s is the assembly code of the outer loop. It simply has 11 loads, 11 FP adds, 11 FP muls, 1 FP store and lea+mov for index
2013 Oct 03
2
[LLVMdev] Codegen performance issue: LEA vs. INC.
The two-address pass is only concerned about register pressure. It sounds like it should be taught about profitability. In cases where profitability can only be determined with something like MachineTraceMetrics, it should probably leave it to a more sophisticated pass like regalloc.
In this case, we probably need a profitability target hook which knows about lea. We should also consider disabling
2013 Oct 05
0
[LLVMdev] Codegen performance issue: LEA vs. INC.
On Oct 2, 2013, at 11:48 PM, Evan Cheng <evan.cheng at apple.com> wrote:
> The two-address pass is only concerned about register pressure. It sounds like it should be taught about profitability. In cases where profitability can only be determined with something like MachineTraceMetrics, it should probably leave it to a more sophisticated pass like regalloc.
>
> In this case, we
2013 Oct 05
1
[LLVMdev] Codegen performance issue: LEA vs. INC.
> The lea->cmp problem is fixed by switching to the MI scheduler. Please run with -mllvm -misched-bench to confirm.
I get the same output in the testcase in pr13320. The leaq is in
between the cmp and the jmp, preventing macro-fusion.
Cheers,
Rafael
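The adjacency requirement mentioned above can be illustrated with a toy check. This is only a sketch under simplifying assumptions: it looks at bare, suffix-less mnemonics, treats `cmp` and `test` as the only fusible flag producers, and treats any `j*` mnemonic other than `jmp` as a conditional branch. The helper names are hypothetical and are not part of LLVM; real macro-fusion rules depend on the microarchitecture and on operand forms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model: only "cmp" and "test" count as fusible flag producers. */
static bool is_flag_producer(const char *m) {
    return strcmp(m, "cmp") == 0 || strcmp(m, "test") == 0;
}

/* Toy model: any mnemonic starting with 'j', except "jmp", is conditional. */
static bool is_cond_branch(const char *m) {
    return m[0] == 'j' && strcmp(m, "jmp") != 0;
}

/* Returns the index of the first instruction that sits between a compare
   and the conditional branch that follows it, or -1 if there is none. */
static int first_fusion_blocker(const char *const *insns, size_t n) {
    assert(insns != NULL || n == 0);
    for (size_t i = 0; i + 2 < n; ++i) {
        if (is_flag_producer(insns[i]) &&
            !is_cond_branch(insns[i + 1]) &&
            is_cond_branch(insns[i + 2])) {
            return (int)(i + 1);
        }
    }
    return -1;
}
```

For the sequence `cmp`, `lea`, `jne` the function reports index 1 (the `lea`); for `lea`, `cmp`, `jne` it reports -1, since the compare and branch are adjacent.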
2012 Aug 10
0
[LLVMdev] RFC: Adding pass in X86PassConfig::addPreEmitPass for LEA optimization on Atom
Hi,
We are getting ready to implement several heuristics for correctly using LEAs to avoid stalls in the address generator on Atom. Our plan is to:
1. Disable LEA generation on Atom in X86ISelDAGToDAG::SelectLEAAddr() for all but a few pseudo-instructions
2. Identify loads and stores in an X86PassConfig::addPreEmitPass() pass and examine several preceding instructions to
2011 Sep 21
1
[LLVMdev] Instruction Selection
I've got a question about instruction selection for a backend I'm writing.
The target has two register classes, RC1 and RC2. The instruction set is far from orthogonal.
The ADD instruction is two address with both register/immediate and register/memory forms. The register operand is in the RC1 class.
The LEA instruction is three address with the destination register in the RC2 class.
2013 Jan 04
3
[LLVMdev] instruction scheduling issue
Hi all,
I'm trying to insert a function call "llvm_memory_profiling" right before each memory access. The function uses the effective address of the memory access as its single parameter.
An example is as follows: the function call at 402a99 has a parameter passed to %rdi at 402a91. One can see that the function call is exactly before the
memory access I want to monitor because
2017 Dec 27
1
Convert MachineInstr to MCInst in AsmPrinter.cpp
Hello everyone,
In the file *lib/CodeGen/AsmPrinter/AsmPrinter.cpp*, I would like to obtain
an MCInst corresponding to its MachineInstr. Can anyone tell me a way to do
that?
If that is not possible, then, I would like to know if a given MachineInstr
is an *lea* instruction and I would like to know if the symbol involved
with this lea instruction is a jump-table.
For instance, given a
2015 Feb 13
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
I submitted the problem report to clang's bugzilla but no one seems to
care so I have to send it to the mailing list.
clang 3.7 svn (trunk 229055 at the time I reported this problem)
generates slower code than 3.5 (Apple LLVM version 6.0
(clang-600.0.56) (based on LLVM 3.5svn)) for the following code.
It is an "8 queens puzzle" solver written as an educational example. As
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
The regressions in the performance of generated code, introduced
by the llvm 3.6 release, don't seem to be limited to this "8 queens
puzzle" solver test case. See...
http://www.phoronix.com/scan.php?page=article&item=llvm-clang-3.5-3.6-rc1&num=1
where a big hit in the performance of the Sparse Matrix Multiply test
of the SciMark v2.0 benchmark was observed as well as others.
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
Using the SciMark 2.0 code from
http://math.nist.gov/scimark2/scimark2_1c.zip compiled with the
same...
make CFLAGS="-O3 -march=native"
I am able to reproduce the 22% performance regression in the run time
of the Sparse matmult benchmark.
For 10 runs of the scimark2 benchmark, I get 998.439+/-0.4828 with
the release llvm clang 3.5.1 compiler
and 1217.363+/-1.1004 for the current
2013 Sep 12
2
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
> Anyway, thanks very much for the information. Hopefully that'll let me
> track things down.
Let me know if you need some more information or dumps.
> Would you mind me taking a day or so to investigate what's going on
> here properly? Introducing a volatile to work around a bug in Clang
> itself just seems perverse to me. (And we shouldn't let a CodeGen bug
>
2013 Jan 07
0
[LLVMdev] instruction scheduling issue
Liu,
I do not think there is a trivial way to do it. Do you really _have_ to
have those instructions together, or is mere order enough?
Also, how much performance are you willing to sacrifice to do what you do?
Maybe turning off scheduling altogether is an acceptable solution?
Sergei
---
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by
The Linux
2013 Sep 13
0
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
Pretty sure you need to check EAX>=7 from cpuid leaf 0 before calling leaf
7, and you need to pass ECX=0 to leaf 7. See
lib/Target/X86/X86Subtarget.cpp which uses a GetX86CpuIDAndInfoEx function
to pass EAX and ECX to cpuid.
I don't think it explains your compiler bug though.
On Thu, Sep 12, 2013 at 2:12 PM, Adam Strzelecki <ono at java.pl> wrote:
> > Anyway, thanks
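The leaf-0 check described in the reply above can be sketched as follows. This is a minimal illustration, not the code from X86Subtarget.cpp: it assumes GCC or Clang on x86 and uses the `__cpuid` and `__cpuid_count` macros from their `<cpuid.h>`. The helper names are hypothetical, and testing the AVX2 bit (leaf 7, subleaf 0, EBX bit 5) is chosen here only as an example feature. The OS-support check (XGETBV) that full AVX2 detection also needs is omitted.

```c
#include <assert.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Leaf 7 may only be queried if leaf 0 reports a maximum basic leaf >= 7. */
static bool leaf7_is_valid(unsigned max_basic_leaf) {
    return max_basic_leaf >= 7;
}

/* AVX2 is reported in bit 5 of EBX for leaf 7, subleaf 0. */
static bool avx2_bit_set(unsigned leaf7_ebx) {
    return ((leaf7_ebx >> 5) & 1u) != 0;
}

/* Hypothetical helper: returns false on non-x86 targets. */
static bool cpu_has_avx2(void) {
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    __cpuid(0, eax, ebx, ecx, edx);          /* leaf 0: EAX = max basic leaf */
    if (!leaf7_is_valid(eax))
        return false;
    __cpuid_count(7, 0, eax, ebx, ecx, edx); /* leaf 7 with ECX = 0 */
    return avx2_bit_set(ebx);
#else
    return false;
#endif
}

/* Exercises the pure decision logic without touching the hardware. */
static void self_check(void) {
    assert(!leaf7_is_valid(6));
    assert(leaf7_is_valid(7));
    assert(avx2_bit_set(1u << 5));
    assert(!avx2_bit_set(0u));
}
```

Keeping the decision logic in small pure functions means it can be checked on any host, while `cpu_has_avx2()` is the only part that actually executes `cpuid`.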
2012 Feb 17
1
[PATCH] x86/mm: Make sure the event channel is released accurately
# HG changeset patch
# User h00166998@h00166998.china.huawei.com
# Date 1329462865 -28800
# Node ID 9fd12f919ddbd15927117eff42149664dba698ca
# Parent b75664e5390583c5d2075c82a14245bc941b3aaf
x86/mm: Make sure the event channel is released accurately
In xenpaging source code, there is an interdomain communication between dom0
and domU. In mem_event_enable(), the function
2008 Feb 11
2
[LLVMdev] "make check" failures: leaq in fold-mul-lohi.ll, stride-nine-with-base-reg.ll, stride-reuse.ll
I'm seeing the following failures with "make check" (x86-32 linux):
FAIL: test/CodeGen/X86/fold-mul-lohi.ll
Failed with exit(1) at line 2
while running: llvm-as < test/CodeGen/X86/fold-mul-lohi.ll | llc -march=x86-64 | not grep lea
leaq B, %rsi
leaq A, %r8
leaq P, %rsi
child process exited abnormally
FAIL:
2004 Nov 01
3
Getting station ident played on connecting to stream
Our listeners click on a link which ends in .m3u. So can this be done...if
so how?? I really am new to this sorry!!
--On 01 November 2004 17:48 -0500 Jason <Jason@Weatherserver.net> wrote:
> I do this using .m3u playlists. But I think you mean a way all streams
> get the intro even if someone connects directly to the mount point
> without using the .m3u files.