similar to: [LLVMdev] Another missed optimization opportunity?

Displaying 20 results from an estimated 10000 matches similar to: "[LLVMdev] Another missed optimization opportunity?"

2013 Apr 24
0
[LLVMdev] Another missed optimization opportunity?
Hi Scott, On 24/04/13 19:40, Scott Pakin wrote: > I was surprised to find that some bitcode I'm generating isn't getting > optimized. Here, I'm doing the equivalent of "myarray[5]++" (on an > "extern int *myarray"), repeated three times: does your bitcode contain data layout information? Ciao, Duncan. > > @myarray = external global i32* >
2013 Apr 24
0
[LLVMdev] Another missed optimization opportunity?
The semantic reason is that the optimizer is required to assume that the i32 stores could be storing to the storage of myarray. LLVM IR does not permit optimizers to optimize based on the nominal types of memory objects or memory accesses. This gets optimized in C, because the C compiler adds special TBAA metadata annotations to the loads and stores which say that the stores of "int" do
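To make the aliasing concern concrete, here is a minimal sketch in Python. It is a toy model and not LLVM's actual memory model: memory is assumed to be a flat list of integers, an "address" is just an index into that list, and `PTR_CELL`, `fresh_memory`, `increment_three_times`, and `increment_once_by_three` are hypothetical names introduced only for illustration. It shows that when the store to `myarray[5]` can land on the storage of `myarray` itself, three separate increments and one combined increment by 3 leave memory in different states.

```python
# Toy model (an assumption for illustration, not LLVM's memory model):
# memory is a flat list of ints and an "address" is an index into it.

PTR_CELL = 5  # hypothetical cell holding the value of the pointer `myarray`


def fresh_memory(pointer_value):
    """Return 16 zeroed cells with the pointer cell set to pointer_value."""
    mem = [0] * 16
    mem[PTR_CELL] = pointer_value
    return mem


def increment_three_times(mem):
    """Do `myarray[5]++` three times, reloading the pointer each time."""
    for _ in range(3):
        base = mem[PTR_CELL]  # load the pointer `myarray`
        mem[base + 5] += 1    # load, add 1, and store `myarray[5]`
    return mem


def increment_once_by_three(mem):
    """The collapsed form: load the pointer once and add 3."""
    base = mem[PTR_CELL]
    mem[base + 5] += 3
    return mem


if __name__ == "__main__":
    # No aliasing: the pointer refers to cell 8, so myarray[5] is cell 13.
    print(increment_three_times(fresh_memory(8)) == increment_once_by_three(fresh_memory(8)))
    # Aliasing: the pointer refers to cell 0, so myarray[5] is the pointer cell.
    print(increment_three_times(fresh_memory(0)) == increment_once_by_three(fresh_memory(0)))
```

In the aliased case the first store changes the pointer, so the later increments hit a different cell. An optimizer that cannot rule this out has to keep the reloads.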
2013 Apr 24
2
[LLVMdev] Another missed optimization opportunity?
On 04/24/2013 01:29 PM, Caldarale, Charles R wrote: > Is this a potential aliasing effect? Since myarray is defined as a pointer, not an array, it's theoretically possible that the address therein refers to the same memory location as the pointer itself. I was thinking along those lines, but I haven't been able to come up with a specific instance of what could possibly be aliased.
2013 Apr 24
0
[LLVMdev] Another missed optimization opportunity?
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] > On Behalf Of Scott Pakin > Subject: [LLVMdev] Another missed optimization opportunity? > I'm doing the equivalent of "myarray[5]++" (on an > "extern int *myarray"), repeated three times: > I had expected the three increments by 1 to > be collapsed into a single increment
2013 Apr 24
0
[LLVMdev] Another missed optimization opportunity?
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] > On Behalf Of Scott Pakin > Subject: Re: [LLVMdev] Another missed optimization opportunity? > > Is this a potential aliasing effect? Since myarray is defined as a > > pointer, not an array, it's theoretically possible that the address > > therein refers to the same memory location as
2013 Apr 24
0
[LLVMdev] Another missed optimization opportunity?
Hey Scott, On Wed, Apr 24, 2013 at 1:40 PM, Scott Pakin <pakin at lanl.gov> wrote: ... > > Is there some semantic reason that the increments aren't allowed to be > combined, or is this a missed optimization opportunity in LLVM? > > I believe that the wildcard is the extern keyword. Since the external symbol isn't resolved until link time, I suspect that it would be
2016 Nov 17
4
RFC: Insertion of nops for performance stability
Hi all, These days I am working on a feature designed to insert nops for IA code generation that will provide performance improvements and performance stability. This feature will not affect other architectures. It will, however, set up an infrastructure for other architectures to do the same, if ever needed. Here are some examples for cases in which nops can improve performance: 1. DSB
2016 Nov 20
3
RFC: Insertion of nops for performance stability
Hi Hal, A pre-emit pass will indeed be preferable. I originally thought of it too; however, I could not figure out how such a pass could have access to information on instruction sizes and block alignments. I know that for X86, at least, branch relaxation happens during the layout phase in the Assembler, where I plan to integrate the nop insertion such that the new MCPerfNopFragment
2016 Nov 21
2
RFC: Insertion of nops for performance stability
Hi Hal, Thanks for the reference. I’ve looked at PPCBranchSelector and the PowerPC backend. It is very different from the X86 architecture and unfortunately the way branch relaxation and alignment related issues are handled in PPC cannot be copied to X86. This is because: 1. PPC instructions are of fixed length while X86 instructions are of variable length, and their length can change
2009 Jun 19
1
Shell Script: Simple array usage = bad substitution?
Hey Guys n Gals; I have some arrays that I can't seem to expand correctly (if that's the correct word?), imagine the following example: #!/bin/bash myArray=("First" "Second" "Third") First=("Monday" "Tuesdays" "Wednesday") Second=("One" "Two" "Three") Third=("A" "B"
2017 Mar 01
2
[Codegen bug in LLVM 3.8?] br following `fcmp une` is present in ll, absent in asm
Hi, We seem to have found a bug in the LLVM 3.8 code generator. We are using MCJIT and have isolated working.ll and broken.ll after middle-end optimizations -- in the block merge128, notice that broken.ll has a fcmp une comparison to zero and a jump based on that branch: merge128: ; preds = %true71, %false72 %_rtB_724 = load %B_repro_T*, %B_repro_T**
2016 Feb 11
3
Expected constant simplification not happening
Hi the appended IR code does not optimize to my liking :) this is the interesting part in x86_64, that got produced via clang -Os: --- movq -16(%r12), %rax movl -4(%rax), %ecx andl $2298949, %ecx ## imm = 0x231445 cmpq $2298949, (%rax,%rcx) ## imm = 0x231445 leaq 8(%rax,%rcx), %rax cmovneq %r15, %rax movl $2298949, %esi ## imm = 0x231445 movq %r12, %rdi movq %r14,
2015 Mar 03
2
[LLVMdev] Need a clue to improve the optimization of some C code
Hi, I have some inline-function C code that LLVM could be optimizing better. Since I am new to this, I wonder if someone could give me a few pointers on how to approach this in LLVM. Should I try to change the IR code somehow to get the code generator to generate better code, or should I rather go to the code generator and try to add an optimization pass? Thanks for any feedback. Ciao
2017 Mar 07
4
[BUG Report] -dead_strip, strips prefix data unconditionally on macOS
Firstly, do you need "main.dsp" defined as an external symbol, or can all external references go via "main"? If the answer is the latter, that will make the solution simpler. If only the latter, you will need to make a change to LLVM here: http://llvm-cs.pcc.me.uk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp#650 Basically you would need to add a hook to the TargetLoweringObjectFile
2016 Dec 07
1
Expected constant simplification not happening
Hello, Has there been any progress on this topic? The 3.9 optimizer output is still the same, as I just looked. https://llvm.org/bugs/show_bug.cgi?id=24448 Ciao Nat! Sanjay Patel wrote: > [cc'ing Zia] > > We have this transform with -Os for some cases after: > http://reviews.llvm.org/rL244601 > http://reviews.llvm.org/D11363 > > but something in this example is
2015 Jul 24
2
[LLVMdev] SIMD for sdiv <2 x i64>
On 07/24/2015 03:42 AM, Benjamin Kramer wrote: >> On 24.07.2015, at 08:06, zhi chen <zchenhn at gmail.com> wrote: >> >> It seems that it's hard to vectorize int64 in LLVM. For example, LLVM 3.4 generates very complicated code for the following IR. I am running on a Haswell processor. Is it because there are no alternative AVX/2 instructions for int64? The same thing
2010 Aug 31
5
[LLVMdev] "equivalent" .ll files diverge after optimizations are applied
Hi, I've attached 2 .ll files which are supposed to be equivalent, but 'unopt-fail.ll' causes a crash in webkit's test suite while 'unopt-pass.ll' does not. I can't give more details about the crash: when I run the crashing test in isolation it passes, and when I run the full suite it crashes; it boggles the mind. Below I provide the optimized asm that is produced from
2015 Jul 24
0
[LLVMdev] SIMD for sdiv <2 x i64>
------------------------------------ IR ------------------------------------------------------------------ if.then.i.i.i.i.i.i: ; preds = %if.then4 %S25_D = zext <2 x i32> %splatLDS17_D.splat to <2 x i64> %umul_with_overflow.i.iS26_D = shl <2 x i64> %S25_D, <i64 3, i64 3> %extumul_with_overflow.i.iS26_D = extractelement <2 x i64>
2006 Jan 16
3
new comer's question
I am new to R. I tried searching the web but could not find the answer, so I am posting here to ask for help. I have a csv file that looks like this: (between two ==== lines) =========================== Machine Name,"Resource, Type","Resource, Sub-type","Resource, Instance",Date,,Data ->,,,,,, ,0.041666667,,,,,,,,,,, Time (HH:MM)
2008 Feb 13
2
[Linux/Python 2.4.2] Forking Python doesn't work
Hello When a call comes in, I'd like to fork a Python script that broadcasts a message so that users see the CID name + number pop up on their computer screen, and simultaneously ring their phones. The following script doesn't work as planned: It waits until the script ends before moving on to the next step, which is Dial(): =========== exten =>
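The symptom described in this snippet, where the caller blocks until the script ends, is typical when a child process is started in the foreground. Below is a minimal sketch, assuming the goal is only to start a helper and return immediately. `launch_detached` is a hypothetical helper name, the sleeping child stands in for the broadcast script, and how the dialplan actually invokes the script is not visible in the truncated excerpt.

```python
import subprocess
import sys


def launch_detached(cmd):
    """Start cmd in the background and return without waiting for it."""
    return subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,   # do not share the caller's stdin
        stdout=subprocess.DEVNULL,  # do not hold the caller's output open
        stderr=subprocess.DEVNULL,
        start_new_session=True,     # POSIX: run the child in its own session
    )


if __name__ == "__main__":
    # Hypothetical stand-in for the broadcast script: it just sleeps briefly.
    child = launch_detached([sys.executable, "-c", "import time; time.sleep(1)"])
    print("started background child with pid", child.pid)
```

Redirecting the standard streams matters here because a caller that waits on the child's output would otherwise still block until the child exits.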