thr3ads.net - similar to: "AliasAnalysis: may-alias subcategory"

Displaying 20 results from an estimated 8000 matches similar to: "AliasAnalysis: may-alias subcategory"

2017 Aug 07

AliasAnalysis: may-alias subcategory

There are functions that have optimization opportunities, but because of may-alias memory dependencies the optimization is sometimes not effective; the runtime checks may kill the gains of the optimization. For such cases I am aiming to do an interprocedural function-specialization optimization, where in the cloned function version no-alias can be assumed and the original function version will hold
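As a rough source-level illustration of the idea in this snippet (not the poster's actual transformation, which is cut off above), the sketch below keeps an original function whose pointers may alias and adds a hand-written clone that assumes no aliasing through `restrict`. The names `scale_add`, `scale_add_noalias`, and `demo` are hypothetical; the assumption is that a call site with provably distinct objects can be redirected to the clone without any runtime check.

```c
#include <assert.h>
#include <stddef.h>

/* Original version: dst and src may alias, so the compiler has to assume
   that a store to dst[i] can change a later src[] value. */
void scale_add(int *dst, const int *src, size_t n, int k) {
    for (size_t i = 0; i < n; ++i)
        dst[i] += k * src[i];
}

/* Hypothetical specialized clone: 'restrict' promises that dst and src do
   not overlap, so no runtime alias check is needed inside the function. */
void scale_add_noalias(int *restrict dst, const int *restrict src,
                       size_t n, int k) {
    /* Cheap sanity check only; it does not detect partial overlap. */
    assert(dst != src || n == 0);
    for (size_t i = 0; i < n; ++i)
        dst[i] += k * src[i];
}

/* Call sites: the first passes two distinct local arrays, so it can use
   the clone; the second really aliases and must keep the original. */
int demo(void) {
    int a[4] = {1, 2, 3, 4};
    int b[4] = {10, 20, 30, 40};
    scale_add_noalias(a, b, 4, 2); /* a becomes {21, 42, 63, 84} */
    scale_add(a, a, 4, 1);         /* a becomes {42, 84, 126, 168} */
    return a[3];
}
```

The design point is that the no-alias decision moves from a per-call runtime check to a per-call-site choice of which version to call.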

2014 Dec 22

[LLVMdev] Modularizing LICM

One way you could go is to expose the interface in include/llvm/Transforms/Utils/LoopUtils.h. There's a similar approach in the LCSSA and LoopSimplify passes, both of which define functions used by other passes (e.g. LoopUnroll and LICM). On Fri, Dec 19, 2014 at 10:58 PM, Philip Reames <listmail at philipreames.com> wrote: > I've come across similar use cases recently. In particular,

2015 May 25

[LLVMdev] Alias-based Loop Versioning

It’s a good thought in general, Adam, but I am worried about the following scenarios: 1) As Dibyendu already mentioned, Check1 + Check2 is not very clear. If your intent is a superset/union of check1 & check2, then I’m not sure it will always help passes that need smaller checks (e.g. loop distribution). Every pass has a different need for runtime checks, e.g. the vectorizer checks each memory against all

2015 May 23

[LLVMdev] Alias-based Loop Versioning

----- Original Message ----- > From: "Dibyendu Das" <Dibyendu.Das at amd.com> > To: "Adam Nemet" <anemet at apple.com>, "Dev" <llvmdev at cs.uiuc.edu>, "Ashutosh Nema" <Ashutosh.Nema at amd.com>, "Hal > Finkel" <hfinkel at anl.gov> > Sent: Saturday, May 23, 2015 5:45:27 AM > Subject: RE: [LLVMdev]

2015 May 28

[LLVMdev] Alias-based Loop Versioning

Thanks for the feedback. It sounds like at this point in time we can’t really settle on a single strategy. We probably want to support all of these use cases: 1. A common early loop-versioning pass, probably fairly conservative initially (e.g. if you need a single memcheck to remove all may-aliasing from a high-trip-count loop it’s probably a good idea to version). Even if the pass would

2014 Dec 15

[LLVMdev] Modularizing LICM

Hi, I'm writing a new loop pass and need to call LICM's (Loop Invariant Code Motion) 'PromoteAliasSet' on a modified loop. So far I haven't found any way to call 'PromoteAliasSet' from my pass explicitly. The only way is to schedule the LICM pass after my pass. For some reason my pass needs more control, and I would prefer to call LICM's 'PromoteAliasSet' instead of running

2016 Nov 18

Loop invariant not being optimized

I tried changing 'noalias' to 'restrict' in the code and I get: fma.c:17:12: warning: 'restrict' attribute only applies to return values that are pointers It seems like 'noalias' would be the correct attribute here, from the article you linked: "if a function is annotated as noalias, the optimizer can assume that, in addition to the parameters themselves,

2015 May 21

[LLVMdev] Alias-based Loop Versioning

There is work taking place by multiple people in this area, and more is expected to happen, so I’d like to make sure we’re working toward a common end goal. I tried to collect the use cases for run-time memory checks and the specific memchecks required for each: 1. Loop Vectorizer: each memory access is checked against all other memory accesses in the loop (except read vs read) 2. Loop
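The vectorizer rule quoted in item 1 (every access is checked against every other access, except read vs read) can be sketched in plain C as follows. This is only an illustrative model, not LLVM's implementation: the `AccessRange` type, the half-open address ranges, and both function names are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one memory access in a loop: the half-open
   address range [start, end) it touches over all iterations. */
typedef struct {
    uintptr_t start;
    uintptr_t end;
    bool is_write;
} AccessRange;

/* Two half-open ranges overlap when each begins before the other ends. */
static bool ranges_overlap(const AccessRange *a, const AccessRange *b) {
    return a->start < b->end && b->start < a->end;
}

/* Pairwise memcheck: compare every access with every other access, but
   skip read-vs-read pairs, since two loads cannot conflict. Returns true
   when the no-alias loop version would be safe to run. */
bool accesses_are_disjoint(const AccessRange *acc, size_t n) {
    assert(acc != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = i + 1; j < n; ++j) {
            if (!acc[i].is_write && !acc[j].is_write)
                continue; /* read vs read: no check needed */
            if (ranges_overlap(&acc[i], &acc[j]))
                return false;
        }
    }
    return true;
}
```

With n accesses this performs at most n(n-1)/2 comparisons, which illustrates why a pass that needs only a subset of these pairs may prefer a smaller check.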

2017 Aug 10

InstCombine GEP

> On Thu, Aug 10, 2017 at 12:22 AM, Nema, Ashutosh via llvm-dev <llvm-dev at lists.llvm.org> wrote: >> I’m not sure how transforming GEP offset to i8 type will help alias >> analysis & SROA for the mentioned test case. > > It should neither help nor hinder AA or SROA -- the two GEPs (the complex one and the simple one) are equivalent. > Since memory isn't

2015 Mar 24

[LLVMdev] RFC: Loop versioning for LICM

> On Mar 20, 2015, at 8:02 PM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote: > > > Yes, this is what I was proposing above and here ;): > Thanks Adam for confirming it :) NP :). > > > No, not hasLoopInvariantStore but hasAccessToLoopInvariantAddress. > It's only for invariant stores [not loads]. Using ‘hasLoopInvariantStore’ (or a name with invariant store)

2015 Mar 19

[LLVMdev] Cast to SCEVAddRecExpr

Yes, I can get "SCEVAddRecExpr" from the operands of "(sext i32 {2,+,2}<%for.body4> to i64)". So whenever the SCEV cast to "SCEVAddRecExpr" fails, we have to drill down for such patterns? Is that the right way? Regards, Ashutosh -----Original Message----- From: Nick Lewycky [mailto:nicholas at mxc.ca] Sent: Thursday, March 19, 2015 1:02 PM To: Nema, Ashutosh Cc:

2016 Jan 20

Why getFunction() of CallGraphNode return NULL function?

So, I won't know the called function statically, right? -------------------------------------------- Qiuping Yi Institute Of Software Chinese Academy of Sciences On Wed, Jan 20, 2016 at 2:24 PM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote: > Typically for a C++ virtual function you will see an indirect call site > (unless it has been devirtualized). > > > > Regards, >

2015 Mar 20

[LLVMdev] RFC: Loop versioning for LICM

> On Mar 19, 2015, at 9:46 PM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote: > > Thanks Adam for your reply. > > From: Adam Nemet [mailto:anemet at apple.com] > Sent: Friday, March 20, 2015 3:23 AM > To: Nema, Ashutosh > Cc: Hal Finkel; Philip Reames; llvmdev at cs.uiuc.edu > Subject:

2017 Aug 10

InstCombine GEP

Hi, I have a question about the GEP transformation in the instruction combiner. Consider the test case below: struct ABC { int A; int B[100]; struct XYZ { int X; int Y[100]; } OBJ; }; void Setup(struct ABC *); int foo(int offset) { struct ABC *Ptr = malloc(sizeof(struct ABC)); Setup(Ptr); return Ptr->OBJ.X + Ptr->OBJ.Y[33]; } Generated IR for the test case: define i32 @foo(i32

2016 Apr 12

X86 TRUNCATE cost for AVX & AVX2 mode

<Copied Cong> Thanks Elena. Mostly I was interested in why such a high cost of 30 is kept for TRUNCATE v16i32 to v16i8 in SSE41. Looking at the code, it appears that TRUNCATE v16i32 to v16i8 in SSE41 is very expensive compared to SSE2. I feel this number should be the same as or close to the cost given for the same operation in SSE2ConversionTbl. The patch below from Cong Hou reduces the cost for the same operation in SSE2

2016 Feb 25

how to force llvm generate gather intrinsic

It seems that http://reviews.llvm.org/D15690 only implemented gather/scatter for AVX-512, but not for AVX/AVX2. Is there any plan to enable gather for AVX/2? Thanks. Best, Zhi On Thu, Feb 25, 2016 at 8:28 AM, Sanjay Patel <spatel at rotateright.com> wrote: > I don't think gather has been enabled for AVX2 as of r261875. > Masked load/store were enabled for AVX with: >

2016 Feb 26

how to force llvm generate gather intrinsic

If I'm understanding correctly, you're saying that vgather* is slow on all of Excavator, Haswell, Broadwell, and Skylake (client). Therefore, we will not generate it for any of those machines. Even if that's true, we should not define "gatherIsSlow()" as "hasAVX2() && !hasAVX512()". It could break for some hypothetical future processor that manages to

2017 Mar 14

[Proposal][RFC] Epilog loop vectorization

Summarizing the discussion on the implementation approaches. Two approaches were discussed: the first runs ‘InnerLoopVectorizer’ again on the epilog loop immediately after vectorizing the original loop, within the same vectorization pass; the second re-runs the vectorization pass and limits the vectorization factor of the epilog loop through metadata. <Approach-2> Challenges with

2015 Mar 19

[LLVMdev] RFC: Loop versioning for LICM

Hi Ashutosh, > On Mar 16, 2015, at 9:06 PM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote: > > Hi Adam, > > From: Adam Nemet [mailto:anemet at apple.com] > Sent: Wednesday, March 11, 2015 10:48 AM > To: Nema, Ashutosh > Cc: llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] RFC: Loop

2017 Feb 28

[Proposal][RFC] Epilog loop vectorization

I have tried running both gvn and newgvn, but it did not help in hoisting the alias checks. Please check; maybe I have missed something. <TestCase> void foo (char *A, char *B, char *C, int len) { int i = 0; for (i=0 ; i< len; i++) A[i] = B[i] + C[i]; } <Command> $ opt -O3 -gvn test.ll -o test.opt.ll $ opt -O3 -newgvn test.ll -o test.opt.ll "test.ll" is attached, it
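For readers unfamiliar with the alias checks mentioned here, the following is a hand-written sketch of what a versioned form of the `foo` test case could look like at the source level: the overlap checks run once before the loop, and only the no-overlap path uses `restrict` pointers. The names `foo_versioned` and `overlaps` are hypothetical, and the code is an assumption about the general shape of such versioning, not the IR that `opt` actually produces.

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero if the len-byte regions starting at p and q overlap.
   Addresses are compared as integers so that unrelated objects can be
   compared without relying on pointer ordering. */
static int overlaps(const char *p, const char *q, int len) {
    uintptr_t a = (uintptr_t)p, b = (uintptr_t)q;
    return a < b + (uintptr_t)len && b < a + (uintptr_t)len;
}

/* Hand-versioned form of the test case: the alias checks are evaluated
   once, outside the loop, and select between two copies of the loop. */
void foo_versioned(char *A, char *B, char *C, int len) {
    assert(len >= 0);
    if (!overlaps(A, B, len) && !overlaps(A, C, len)) {
        /* Fast version: the stores to a[] cannot touch b[] or c[]. */
        char *restrict a = A;
        const char *restrict b = B;
        const char *restrict c = C;
        for (int i = 0; i < len; i++)
            a[i] = b[i] + c[i];
    } else {
        /* Fallback version: the original loop with may-alias semantics. */
        for (int i = 0; i < len; i++)
            A[i] = B[i] + C[i];
    }
}
```

Only the written array is checked against each read array; B and C are never checked against each other because both are only read.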
