similar to: A code layout related side-effect introduced by rL318299

A code layout related side-effect introduced by rL318299

2017 Dec 19

2

On Mon, Dec 18, 2017 at 5:46 PM Xinliang David Li <davidxl at google.com> wrote:
> The introduction of cleanup.cond block in b.ll without loop-rotation
> already makes the layout worse than a.ll.
>
>
> Without introducing cleanup.cond block, the layout is
>
> entry->while.cond -> while.body->ret
>
> All the arrows are hot fall through edges which is

2017 Apr 28

3

Store unswitch

Hi Danny, Thanks for that :) However I've just updated the prototype patch to NewGVN and it didn't need any API changes - all I rely on is GVNExpression. Hongbin, I wanted to explain a little about what GVNSink can currently do, what it was designed for and hopefully how to make it handle your testcase. *Background* Common code sinking is more difficult to efficiently do than one might

[LLVMdev] Apparent indeterminism in PreVerifier

2013 Jan 29

2

Hello everybody, I have a case of suspected indeterminism and I would like to verify that it is not a known issue before I dig deep into it. It seems to happen during PreVerifier pass ("Preliminary module verification"). The little I understand/assume about it, a verifier pass is not supposed to change the code (or is it?) but in debug stream I see the following: Common predecessor:

[LLVMdev] Apparent indeterminism in PreVerifier

2013 Jan 29

0

Nadav, As I peel this onion, it looks like you might know something about InnerLoopVectorizer::addRuntimeCheck. What does it do, and can it be causing the below described issue? Could resuming somehow (indeterministically) switch the order of PHIs in the original code? Thanks a lot. Sergei. --- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

[LLVMdev] Apparent indeterminism in PreVerifier

2013 Jan 29

2

Hi Sergei, "addRuntimeCheck" inserts code that checks that two or more arrays are disjoint. I looked at the code and it looks fine. We generate PHIs in the order that they appear in a vector. The values are inserted in 'canVectorizeMemory', which also looks fine. Please let me know if you think I missed something. Thanks, Nadav On Jan 29, 2013, at 8:48 AM, Sergei Larin
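The disjointness check described in this reply can be pictured at source level. The following is only a rough sketch in C, assuming the check amounts to comparing the byte ranges of two arrays; the names ranges_disjoint and add_arrays are hypothetical, and this is not the code that addRuntimeCheck actually emits.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true when the byte ranges [a, a + a_bytes) and
 * [b, b + b_bytes) do not overlap. */
bool ranges_disjoint(const void *a, size_t a_bytes,
                     const void *b, size_t b_bytes) {
    uintptr_t a_lo = (uintptr_t)a, a_hi = a_lo + a_bytes;
    uintptr_t b_lo = (uintptr_t)b, b_hi = b_lo + b_bytes;
    return a_hi <= b_lo || b_hi <= a_lo;
}

/* Hypothetical guarded loop: the check picks between a path a compiler
 * could vectorize and a scalar fallback. Both paths compute dst[i] += src[i]. */
void add_arrays(int *dst, const int *src, size_t n) {
    if (ranges_disjoint(dst, n * sizeof *dst, src, n * sizeof *src)) {
        /* Arrays proven disjoint at run time: safe to process in wide chunks. */
        for (size_t i = 0; i < n; ++i)
            dst[i] += src[i];
    } else {
        /* Possible overlap: keep strict element-by-element order. */
        for (size_t i = 0; i < n; ++i)
            dst[i] += src[i];
    }
}
```

The two loop bodies are deliberately identical here; the point is only where the run-time guard sits relative to the loop.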

[LLVMdev] Apparent indeterminism in PreVerifier

2013 Jan 29

0

Nadav, Thanks for the quick response. By now I am convinced that the given loop ends up vectorized with enough difference to cause bad things later on, but I have not found the exact cause yet. To continue with my work I'll have to simply turn off vectorization for now, but I will come back and investigate. Again, there is some indeterminism in order of PHIs processing somewhere. I'll

[LLVMdev] Apparent indeterminism in PreVerifier

2013 Jan 29

1

Is there a test case that you can share?
On Jan 29, 2013, at 9:24 AM, Sergei Larin <slarin at codeaurora.org> wrote:
> Nadav,
>
> Thanks for the quick response. By now I am convinced that the given loop
> ends up vectorized with enough difference to cause bad things later on, but
> I have not found the exact cause yet. To continue with my work I'll have to
>

[LLVMdev] MP1: Gelementptr question

2002 Sep 14

1

The following is legal LLVM code in which ptr1, ptr2, and ptr3 are all aliases:
    %struct = type { int, int }
    implementation
    int %p() {
        %ptr1 = alloca %struct
        %ptr2 = getelementptr %struct* %ptr1
        %ptr3 = getelementptr %struct* %ptr2, uint 0
        %pint = getelementptr %struct* %ptr3, uint 0, ubyte 0
        %rval = load int* %pint
        ret int %rval
    }
Should our pass a) ignore this, not replace %ptr1,

Aliasing and forwarding optimization

2020 Jun 19

2

----Snip--
    struct st1 { int a; };
    struct st2 { int b; };
    struct st {
        struct st1 obj1;
        struct st2 obj2;
    } Obj;
    int test1(struct st1 *ptr1, struct st2 *ptr2, struct st2 *ptr3) {
        ptr1->a = 10;
        *ptr3 = *ptr2;
        return ptr1->a;
    }
--Snip---
For the above case GCC is able to store forward the value 10 to the return place. LLVM is not doing this.
GCC https://godbolt.org/z/FCjCXy
LLVM

MMX/mmxext optimisations

2004 Aug 24

5

quite some speed improvement indeed. attached the updated patch to apply to svn/trunk.
j
-------------- next part --------------
A non-text attachment was scrubbed...
Name: theora-mmx.patch.gz
Type: application/x-gzip
Size: 8648 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/theora-dev/attachments/20040824/5a5f2731/theora-mmx.patch-0001.bin

[LLVMdev] instruction scheduling issue

2013 Jan 07

0

Liu,
This is likely a better solution for you - you do not want to mess with the scheduler unless you really have to ;)
Sergei
---
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Krzysztof Parzyszek
> Sent:

[LLVMdev] instruction scheduling issue

2013 Jan 07

4

On 1/7/2013 2:15 PM, Xu Liu wrote:
>
> This would be ideal. How can I do the instrumentation pass after the
> instruction scheduling?
You could derive your own class from TargetPassConfig, and add the annotation pass in YourDerivedTargetPassConfig::addPreEmitPass. This will add your annotation pass very late, just before the final code is emitted. If you're using the X86 target,

Finding which registers the operand of a load maps to

2018 Mar 21

0

Appreciate all of the quick responses to my ridiculous questions so far. Hoping this one attracts similarly good discussion! Let's say I have the following series of instructions:
    %a = load i32, i32* %ptr1
    %b = load i32, i32* %ptr2
    %c = add i32 %a, %b
    store i32 %c, i32* %ptr3
This gets compiled (roughly) to
    mov eax, dword ptr [rsp - 4]
    add eax, dword ptr [rsp - 8]
    mov dword

[LLVMdev] [Polly] Assert in Scope construction

2013 Jul 03

2

Should have changed the subject line...
---
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Sergei Larin
> Sent: Wednesday, July 03, 2013 12:29 PM
> To: 'Tobias Grosser'
> Cc: 'llvmdev'

Patch to wobbly snap for outputs

2006 Dec 08

4

Here's a patch to wobbly.c to handle edge snapping with multiple outputs... Also, I tweaked the window edge snapping to include dock window types, to support the case where dock windows may be on the inner edges of multiple monitors (and thus currently ignored as struts in the output workarea setup). I personally think we should include these "inner" struts when calculating the

Legality of transformation

2020 Apr 04

4

Please consider the following C code:
    #define SZ 2048
    int main(void) {
        int A[SZ];
        int B[SZ];
        int i, tmp;
        for (i = 0; i < SZ; i++) {
            tmp = A[i];
            B[i] = tmp;
        }
        assert(A[SZ/2] == B[SZ/2]);
    }
On running -O1 followed by -reg2mem I get the following IR:
    define dso_local i32 @main() local_unnamed_addr #0 {
    entry:
      %A = alloca [2048

[LLVMdev] [LNT] Question about results reliability in LNT infrustructure

2013 Jul 02

0

On 07/01/2013 09:41 AM, Renato Golin wrote:
> On 1 July 2013 02:02, Chris Matthews <chris.matthews at apple.com> wrote:
>
>> One thing that LNT is doing to help “smooth” the results for you is by
>> presenting the min of the data at a particular revision, which (hopefully)
>> is approximating the actual runtime without noise.
>>
>
> That's an

[SCEV] Why is backedge-taken count <nsw> instead of <nuw>?

2018 Aug 15

2

Hello,
If I run clang on the following code:
    void func(unsigned n) {
        for (unsigned long x = 1; x < n; ++x)
            dummy(x);
    }
I get the following llvm ir:
    define void @func(i32 %n) {
    entry:
      %conv = zext i32 %n to i64
      %cmp5 = icmp ugt i32 %n, 1
      br i1 %cmp5, label %for.body, label %for.cond.cleanup
    for.cond.cleanup:
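For readers unfamiliar with the term in the subject line: the backedge-taken count is the number of times a loop branches back to its header, which in the guarded form of a loop is one less than the number of body executions. The sketch below counts it by brute force for the loop in this snippet. The function name backedge_taken_count is hypothetical, and the code only illustrates the definition; it says nothing about how SCEV derives the value or which wrap flags it attaches.

```c
#include <stdint.h>

/* Hypothetical helper: counts how many times the loop
 *   for (unsigned long x = 1; x < n; ++x) body;
 * jumps back to its body once it has been entered. */
uint64_t backedge_taken_count(unsigned n) {
    uint64_t backedges = 0;
    if (n > 1) {                 /* loop guard, cf. icmp ugt i32 %n, 1 */
        unsigned long x = 1;
        for (;;) {
            /* loop body would run here, e.g. dummy(x) */
            ++x;
            if (!(x < n))
                break;           /* exit edge */
            ++backedges;         /* backedge taken */
        }
    }
    return backedges;
}
```

For n = 5 the body runs four times (x = 1, 2, 3, 4) and the backedge is taken three times, i.e. n - 2.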

IR -> source pretty printing?

2016 Jul 13

3

Hi, I often find myself staring at IR and wanting to look at the C source code it corresponds to. To do so, I look up the debug identifier for the given IR line, scroll to the bottom of the IR file to find the debug identifier, look at the debug location (source and column), and then look at the source file. Too many steps. What would be great is a tool that took two files, i.e., a .c file and a

[LLVMdev] [LNT] Question about results reliability in LNT infrustructure

2013 Jul 01

2

On 1 July 2013 02:02, Chris Matthews <chris.matthews at apple.com> wrote:
> One thing that LNT is doing to help “smooth” the results for you is by
> presenting the min of the data at a particular revision, which (hopefully)
> is approximating the actual runtime without noise.
>
That's an interesting idea, as you said, if you run multiple times on every revision. On ARM,
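The "min of the data at a particular revision" idea quoted here can be sketched in a few lines of C. This is only an illustration of taking the minimum over repeated runs, under the assumption that noise only ever adds time; the function name min_sample is hypothetical and this is not LNT's implementation.

```c
#include <stddef.h>

/* Hypothetical helper: returns the smallest of n runtime samples taken
 * at one revision. The caller must pass n > 0. */
double min_sample(const double *samples, size_t n) {
    double best = samples[0];
    for (size_t i = 1; i < n; ++i) {
        if (samples[i] < best)
            best = samples[i];
    }
    return best;
}
```

With samples of 1.32, 1.25, 1.41 and 1.27 seconds for one revision, the reported value would be 1.25. As the reply notes, this only helps when each revision is run several times.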