similar to: Comparing Clang and GCC: only clang stores updated value in each iteration.

Displaying 20 results from an estimated 600 matches similar to: "Comparing Clang and GCC: only clang stores updated value in each iteration."

2018 Sep 21
2
Comparing Clang and GCC: only clang stores updated value in each iteration.
Hi Philip and Eli, > I think your example may be a bit over reduced.  Unless I'm misreading > this, a starts at 1, is incremented one each iteration, and then is > tested against zero.  The only way this loop can exit is if a has > wrapped around and C++ states that signed integers are assumed to not > overflow.  We can/should be replacing the whole loop with an
2018 Sep 20
3
Aliasing rules difference between GCC and Clang
Hi, I found a difference between Clang and GCC in alias handling. This was with a benchmark where Clang was considerably slower, and in a hot function which does many more loads from the same address due to stores between the uses. In other words, a value is loaded and used, another value is stored, and then the first value is loaded once again before its second use. This happens many times,
2018 Sep 21
2
Aliasing rules difference between GCC and Clang
> > I would say that GCC is wrong and should also have a version where a > could be equal to b. There is no restrict keyword, so they could be > equal. > This was between a/b and c, not between a and b. Could you explain your opinion a bit more in detail, please? /Jonas > Cheers, > > Matthieu > > Le jeu. 20 sept. 2018 à 16:56, Jonas Paulsson via llvm-dev >
2017 Dec 19
4
A code layout related side-effect introduced by rL318299
Hi, Recently 10% performance regression on an important benchmark showed up after we integrated https://reviews.llvm.org/rL318299. The analysis showed that rL318299 triggered loop rotation on an multi exits loop, and the loop rotation introduced code layout issue. The performance regression is a side-effect of rL318299. I got two testcases a.ll and b.ll attached to illustrate the problem. a.ll
2017 Dec 19
2
A code layout related side-effect introduced by rL318299
On Mon, Dec 18, 2017 at 5:46 PM Xinliang David Li <davidxl at google.com> wrote: > The introduction of cleanup.cond block in b.ll without loop-rotation > already makes the layout worse than a.ll. > > > Without introducing cleanup.cond block, the layout out is > > entry->while.cond -> while.body->ret > > All the arrows are hot fall through edges which is
2017 Jul 01
2
KNL Assembly Code for Matrix Multiplication
Thank You, It means vmovdqa64 zmm22, zmmword ptr [rip + .LCPI0_0] # zmm22 = [8,9,10,11,12,13,14,15] zmm22 will contain 64 bit constant values which are indexes here zmm22=8, 9, 10, 11, 12,13,14,15. not the values loaded from these locations. and zmm2 contains constant 4000. so, vpmuludq zmm14, zmm10, zmm2 ; will multiply the indexes values with 4000, as for array b the stride is 4000. zmm14=
2010 Oct 07
2
[LLVMdev] [Q] x86 peephole deficiency
Hi all, I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125) and now I am running into a deficiency of the x86 peephole optimizer (or jump-threader?). Here is what I get: andl $3, %edi je .LBB0_4 # BB#2: # %nz # in Loop: Header=BB0_1 Depth=1 cmpl $2, %edi
2012 Apr 25
3
[LLVMdev] Not enough optimisations in the SelectionDAG phase?
For the following code fragment, ; <label>:27 ; preds = %27, %entry %28 = load volatile i32* inttoptr (i64 2149581832 to i32*), align 8 %29 = icmp slt i32 %28, 0 br i1 %29, label %27, label %loop.exit loop.exit: ; preds = %27 llc will generate following MIPS code, $BB0_1: lui $3, 32800 ori $3, $3, 1032 lw
2013 Feb 20
3
[LLVMdev] Is va_arg correct on Mips backend?
I didn't have Mips board. I compile as the commands and check the asm output as below. 1. Question: The distance of caller arg[4] and arg[5] is 4 bytes. But the the callee get every arg[] by 8 bytes offset (arg_ptr1+8 or arg_ptr2+8). I assume the #BB#4 and #BB#5 are the arg_ptr which is the pointer to access the stack arguments. 2. Question: Stack memory 28($sp) has no initial value. If
2020 Jun 01
3
Aarch64: unaligned access despite -mstrict-align
Hi, I experienced a crash in code compiled with Clang 10.0.0 due to a misaligned 64-bit data access. The (ARMv8) CPU is configured with SCTL.A == 1 (alignment check enable). With SCTLR.A == 0 the code runs as expected. After some investigation I came up with the following reproducer: ---8<-------8<-------8<-------8<-------8<-------8<-------8<------- $ cat test.c extern char
2013 Feb 20
0
[LLVMdev] Is va_arg correct on Mips backend?
Does it make a difference if you give the "-target" option to clang? $ clang -target mips-linux-gnu ch8_3.cpp -o ch8_3.bc -emit-llvm -c The .s file generated this way looks quite different from the one in your email. On Tue, Feb 19, 2013 at 5:06 PM, Jonathan <gamma_chen at yahoo.com.tw> wrote: > I didn't have Mips board. I compile as the commands and check the asm >
2010 Oct 07
0
[LLVMdev] [Q] x86 peephole deficiency
On Oct 6, 2010, at 6:16 PM, Gabor Greif wrote: > Hi all, > > I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125) > and now I am running into a deficiency of the x86 > peephole optimizer (or jump-threader?). Here is what I get: > > > andl $3, %edi > je .LBB0_4 > # BB#2: # %nz >
2010 Dec 14
2
[LLVMdev] Branch delay slots broken.
The Sparc, Microblaze, and Mips code generators implement branch delay slots. They all seem to exhibit the same bug, which is not surprising since the code is very similar. If I compile code with this snippit: while (n--) *s++ = (char) c; I get this (for the Microblaze): swi r19, r1, 0 add r3, r0, r0 cmp r3, r3, r7 beqid r3,
2009 May 21
2
Naming a random effect in lmer
Dear guRus: I am using lmer for a mixed model that includes a random intercept for a set of effects that have the same distribution, Normal(0, sig2b). This set of effects is of variable size, so I am using an as.formula statement to create the formula for lmer. For example, if the set of random effects has dimension 8, then the lmer call is: Zs<-
2013 Aug 29
2
[LLVMdev] .globl
I need to be able to emit .globl for the soft float routines used by mips16. The routines are called but there is no .globl definition for them. How can I do this? Background: I have a strange issue that I encountered with mips16 hard float. Part of mips16 hard float is to emit calls to runtime routines with the same signature as usual soft float routines, except that they are implemented
2019 Jun 30
6
[hexagon][PowerPC] code regression (sub-optimal code) on LLVM 9 when generating hardware loops, and the "llvm.uadd" intrinsic.
Hi All, The following code : void hexagon2( int *a, int *res ) { int i = 100; while ( i-- ) { *res++ = *a++; } } gets compiled as a sub-optimal Software loop on LLVM 9.0 instead of a Hardware loop, whereas it was compiled as a Hardware Loop in LLVM 7.0. This is the final assembly code generated by LLVM 9.0 : .text .file "main.c" .globl hexagon2 // --
2012 Apr 29
0
[LLVMdev] Not enough optimisations in the SelectionDAG phase?
On Apr 24, 2012, at 11:48 PM, Fan Dawei wrote: > For the following code fragment, > > ; <label>:27 ; preds = %27, %entry > %28 = load volatile i32* inttoptr (i64 2149581832 to i32*), align 8 > %29 = icmp slt i32 %28, 0 > br i1 %29, label %27, label %loop.exit > > loop.exit: ; preds = %27
2013 Feb 19
0
[LLVMdev] Is va_arg correct on Mips backend?
Which part of the generated code do you think is not correct? Could you be more specific? I compiled this program with clang and ran it on a mips board. It returns the expected result (21). On Tue, Feb 19, 2013 at 4:15 AM, Jonathan <gamma_chen at yahoo.com.tw> wrote: > I check the Mips backend for the following C code fragment compile result. > It seems not correct. Is it my
2017 Mar 17
3
LoopVectorizer with ifconversion
On 17 March 2017 at 16:34, Hal Finkel <hfinkel at anl.gov> wrote: > In general, this is true everywhere. In a large vectorized loop, this cost > may well be worthwhile. The idea is that the cost model should account for > all of these costs. If it doesn't properly, we should fix that. Isn't this only worth when the SIMD instructions can be conditionalised per lane? I
2013 Feb 19
2
[LLVMdev] Is va_arg correct on Mips backend?
I check the Mips backend for the following C code fragment compile result. It seems not correct. Is it my misunderstand or it's a bug. //ch8_3.cpp #include <stdarg.h> int sum_i(int amount, ...) { int i = 0; int val = 0; int sum = 0; va_list vl; va_start(vl, amount); for (i = 0; i < amount; i++) { val = va_arg(vl, int); sum += val; } va_end(vl);