thr3ads.net - similar to: "[LLVMdev] Another missed optimization opportunity?"

Displaying 20 results from an estimated 10000 matches similar to: "[LLVMdev] Another missed optimization opportunity?"

[LLVMdev] Another missed optimization opportunity?

2013 Apr 24

Hi Scott, On 24/04/13 19:40, Scott Pakin wrote: > I was surprised to find that some bitcode I'm generating isn't getting > optimized. Here, I'm doing the equivalent of "myarray[5]++" (on an > "extern int *myarray"), repeated three times: does your bitcode contain data layout information? Ciao, Duncan. > > @myarray = external global i32* >

[LLVMdev] Another missed optimization opportunity?

2013 Apr 24

The semantic reason is that the optimizer is required to assume that the i32 stores could be storing to the storage of myarray. LLVM IR does not permit optimizers to optimize based on the nominal types of memory objects or memory accesses. This gets optimized in C, because the C compiler adds special TBAA metadata annotations to the loads and stores which say that the stores of "int" do
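The aliasing point in this reply can be sketched in C. This is only an illustration under stated assumptions, not code from the thread: `myarray` is defined locally as a stand-in for the `extern int *myarray` in the question, and `bump_reload`, `bump_cached`, and `variants_agree` are hypothetical names. Without type-based alias information, a store to `myarray[5]` might overwrite the pointer `myarray` itself, so each increment has to reload the pointer. Copying the pointer into a local first rules that out, because no store through the local pointer can change the local.

```c
#include <assert.h>

/* Hypothetical stand-in for the thread's "extern int *myarray". */
int *myarray;

/* Mirrors the pattern in the thread: each statement reads the global
   pointer myarray again before touching element 5. */
void bump_reload(void) {
    myarray[5]++;
    myarray[5]++;
    myarray[5]++;
}

/* Reads the global pointer once into a local whose address is never
   taken, so stores through p cannot modify p. The three increments are
   written here by hand as a single += 3. */
void bump_cached(void) {
    int *p = myarray;
    p[5] += 3;
}

/* Runs both variants on fresh buffers and returns 1 if they agree. */
int variants_agree(void) {
    int a[8] = {0}, b[8] = {0};
    myarray = a;
    bump_reload();
    myarray = b;
    bump_cached();
    assert(a[5] == 3);
    return a[5] == b[5];
}
```

Both functions give the same result whenever `myarray` points at ordinary `int` storage. They differ only in how much the optimizer must prove before it may merge the increments.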

[LLVMdev] Another missed optimization opportunity?

2013 Apr 24

On 04/24/2013 01:29 PM, Caldarale, Charles R wrote: > Is this a potential aliasing effect? Since myarray is defined as a pointer, not an array, it's theoretically possible that the address therein refers to the same memory location as the pointer itself. I was thinking along those lines, but I haven't been able to come up with a specific instance of what could possibly be aliased.

[LLVMdev] Another missed optimization opportunity?

2013 Apr 24

> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] > On Behalf Of Scott Pakin > Subject: [LLVMdev] Another missed optimization opportunity? > I'm doing the equivalent of "myarray[5]++" (on an > "extern int *myarray"), repeated three times: > I had expected the three increments by 1 to > be collapsed into a single increment

[LLVMdev] Another missed optimization opportunity?

2013 Apr 24

> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] > On Behalf Of Scott Pakin > Subject: Re: [LLVMdev] Another missed optimization opportunity? > > Is this a potential aliasing effect? Since myarray is defined as a > > pointer, not an array, it's theoretically possible that the address > > therein refers to the same memory location as

[LLVMdev] Another missed optimization opportunity?

2013 Apr 24

Hey Scott, On Wed, Apr 24, 2013 at 1:40 PM, Scott Pakin <pakin at lanl.gov> wrote: ... > > Is there some semantic reason that the increments aren't allowed to be > combined, or is this a missed optimization opportunity in LLVM? > > I believe that the wildcard is the extern keyword. Since the external symbol isn't resolved until link time, I suspect that it would be

RFC: Insertion of nops for performance stability

2016 Nov 17

Hi all, These days I am working on a feature designed to insert nops for IA code generation that will provide performance improvements and performance stability. This feature will not affect other architectures. It will, however, set up an infrastructure for other architectures to do the same, if ever needed. Here are some examples for cases in which nops can improve performance: 1. DSB

RFC: Insertion of nops for performance stability

2016 Nov 20

Hi Hal, A pre-emit pass will indeed be preferable. I originally thought of it, too; however, I could not figure out how such a pass could have access to information on instruction sizes and block alignments. I know that for X86, at least, the branch relaxation is happening during the layout phase in the Assembler, where I plan to integrate the nop insertion such that the new MCPerfNopFragment

RFC: Insertion of nops for performance stability

2016 Nov 21

Hi Hal, Thanks for the reference. I’ve looked at PPCBranchSelector and the PowerPC backend. It is very different from the X86 architecture and unfortunately the way branch relaxation and alignment related issues are handled in PPC cannot be copied to X86. This is because: 1. PPC instructions are of fixed length while X86 instructions are of variable length, and their length can change

Shell Script: Simple array usage = bad substitution?

2009 Jun 19

Hey Guys n Gals; I have some arrays that I can't seem to expand correctly (if that's the correct word?), imagine the following example: #!/bin/bash myArray=("First" "Second" "Third") First=("Monday" "Tuesdays" "Wednesday") Second=("One" "Two" "Three") Third=("A" "B"

[Codegen bug in LLVM 3.8?] br following `fcmp une` is present in ll, absent in asm

2017 Mar 01

Hi, We seem to have found a bug in the LLVM 3.8 code generator. We are using MCJIT and have isolated working.ll and broken.ll after middle-end optimizations -- in the block merge128, notice that broken.ll has a fcmp une comparison to zero and a jump based on that branch: merge128: ; preds = %true71, %false72 %_rtB_724 = load %B_repro_T*, %B_repro_T**

Expected constant simplification not happening

2016 Feb 11

Hi the appended IR code does not optimize to my liking :) this is the interesting part in x86_64, that got produced via clang -Os: --- movq -16(%r12), %rax movl -4(%rax), %ecx andl $2298949, %ecx ## imm = 0x231445 cmpq $2298949, (%rax,%rcx) ## imm = 0x231445 leaq 8(%rax,%rcx), %rax cmovneq %r15, %rax movl $2298949, %esi ## imm = 0x231445 movq %r12, %rdi movq %r14,

[LLVMdev] Need a clue to improve the optimization of some C code

2015 Mar 03

Hi I have some inline function C code, that llvm could be optimizing better. Since I am new to this, I wonder if someone could give me a few pointers, how to approach this in LLVM. Should I try to change the IR code -somehow- to get the code generator to generate better code, or should I rather go to the code generator and try to add an optimization pass ? Thanks for any feedback. Ciao

[BUG Report] -dead_strip, strips prefix data unconditionally on macOS

2017 Mar 07

Firstly, do you need "main.dsp" defined as an external symbol, or can all external references go via "main"? If the answer is the latter, that will make the solution simpler. If only the latter, you will need to make a change to LLVM here: http://llvm-cs.pcc.me.uk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp#650 Basically you would need to add a hook to the TargetLoweringObjectFile

Expected constant simplification not happening

2016 Dec 07

Hello Has there been any progress on this topic? The 3.9 optimizer output is still the same as I just looked. https://llvm.org/bugs/show_bug.cgi?id=24448 Ciao Nat! Sanjay Patel wrote: > [cc'ing Zia] > > We have this transform with -Os for some cases after: > http://reviews.llvm.org/rL244601 > http://reviews.llvm.org/D11363 > > but something in this example is

[LLVMdev] SIMD for sdiv <2 x i64>

2015 Jul 24

On 07/24/2015 03:42 AM, Benjamin Kramer wrote: >> On 24.07.2015, at 08:06, zhi chen <zchenhn at gmail.com> wrote: >> >> It seems that it's hard to vectorize int64 in LLVM. For example, LLVM 3.4 generates very complicated code for the following IR. I am running on a Haswell processor. Is it because there are no alternative AVX/2 instructions for int64? The same thing

[LLVMdev] "equivalent" .ll files diverge after optimizations are applied

2010 Aug 31

Hi, I've attached 2 .ll files which are supposed to be equivalent but 'unopt-fail.ll' causes a crash in webkit's test suite while 'unopt-pass.ll' does not. I can't give more details about the crash: when I run the crashing test in isolation it passes, when I run the full suite it crashes; it boggles the mind. Below I provide the optimized asm that is produced from

[LLVMdev] SIMD for sdiv <2 x i64>

2015 Jul 24

------------------------------------ IR ------------------------------------------------------------------ if.then.i.i.i.i.i.i: ; preds = %if.then4 %S25_D = zext <2 x i32> %splatLDS17_D.splat to <2 x i64> %umul_with_overflow.i.iS26_D = shl <2 x i64> %S25_D, <i64 3, i64 3> %extumul_with_overflow.i.iS26_D = extractelement <2 x i64>

new comer's question

2006 Jan 16

I am new to R. I tried searching the web but could not find the answer, so I am posting here to ask for help. I have a CSV file that looks like this: (between two ==== lines) =========================== Machine Name,"Resource, Type","Resource, Sub-type","Resource, Instance",Date,,Data ->,,,,,, ,0.041666667,,,,,,,,,,, Time (HH:MM)

[Linux/Python 2.4.2] Forking Python doesn't work

2008 Feb 13

Hello When a call comes in, I'd like to fork a Python script that broadcasts a message so that users see the CID name + number pop up on their computer screen, and simultaneously ring their phones. The following script doesn't work as planned: It waits until the script ends before moving on to the next step, which is Dial(): =========== exten =>
