thr3ads.net - search: "ashutosh"

Displaying 20 results from an estimated 205 matches for "ashutosh".

2015 Mar 24

[LLVMdev] RFC: Loop versioning for LICM

> On Mar 20, 2015, at 8:02 PM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote: > > > Yes, this is what I was proposing above and here ;): > Thanks Adam it’s for confirming :) NP :). > > > No, not hasLoopInvariantStore but hasAccessToLoopInvariantAddress. > Its only for invariant stores[not loads], Using ‘hasL...

2016 Jan 20

Why getFunction() of CallGraphNode return NULL function?

So, I won't know the called function statically, Right? -------------------------------------------- Qiuping Yi Institute Of Software Chinese Academy of Sciences On Wed, Jan 20, 2016 at 2:24 PM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote: > Typically for C++ virtual function you will see an indirect callSite > (unless not de-virtualized). > > > > Regards, > > Ashutosh > > > > *From:* Qiuping Yi [mailto:yiqiuping at gmail.com] > *Sent:* Wednesday, January...

2015 Mar 19

[LLVMdev] Cast to SCEVAddRecExpr

Yes, I can get "SCEVAddRecExpr" from operands of "(sext i32 {2,+,2}<%for.body4> to i64)". So whenever SCEV cast to "SCEVAddRecExpr" fails, we have drill down for such patterns ? Is that the right way ? Regards, Ashutosh -----Original Message----- From: Nick Lewycky [mailto:nicholas at mxc.ca] Sent: Thursday, March 19, 2015 1:02 PM To: Nema, Ashutosh Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Cast to SCEVAddRecExpr Nema, Ashutosh wrote: > Hi Nick, > > Thanks for looking into it. > > I have...

2015 Mar 20

[LLVMdev] RFC: Loop versioning for LICM

> On Mar 19, 2015, at 9:46 PM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote: > > Thanks Adam for your reply. > > From: Adam Nemet [mailto:anemet at apple.com <mailto:anemet at apple.com>] > Sent: Friday, March 20, 2015 3:23 AM > To: Nema, Ashutosh > Cc: Hal Finkel; Philip Reames; llvmdev at cs.uiuc.edu &...

2016 Apr 12

X86 TRUNCATE cost for AVX & AVX2 mode

...be same/close to the cost mentioned for same operation in SSE2ConversionTbl. Below patch from Cong Hou reduce cost for same operation in SSE2 mode. http://reviews.llvm.org/rL256194 Looks like as the part of same patch we should reduce cost for TRUNCATE v16i32 to v16i8 in SSE4.1 as well. Regards, Ashutosh From: Demikhovsky, Elena [mailto:elena.demikhovsky at intel.com] Sent: Monday, April 11, 2016 9:05 PM To: Nema, Ashutosh <Ashutosh.Nema at amd.com> Cc: llvm-dev <llvm-dev at lists.llvm.org>; Zuckerman, Michael <michael.zuckerman at intel.com> Subject: RE: X86 TRUNCATE cost for AV...

2015 Mar 19

[LLVMdev] RFC: Loop versioning for LICM

Hi Ashutosh, > On Mar 16, 2015, at 9:06 PM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote: > > Hi Adam, > > From: Adam Nemet [mailto:anemet at apple.com <mailto:anemet at apple.com>] > Sent: Wednesday, March 11, 2015 10:48 AM > To: Nema, Ashutosh > Cc: llvmdev at cs.u...

2018 May 16

ScalarEvolution questions

On Wed, May 16, 2018 at 1:24 AM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote: > Hi Sanjoy, > > Your inputs really helped. > > Using “isImpliedCond”, able to relate and find the min for cases like: > > SCEV1: (-1 + (sext i32 %n.addr.058 to i64))<nsw> > SCEV2: 0 > Extra-Condition: (n.addr.058 > 7) >...

2017 Mar 14

[Proposal][RFC] Epilog loop vectorization

...erLoopVectorizer:: vectorize’ again. 3) Block layout description with epilog loop vectorization is available at https://reviews.llvm.org/file/data/fxg5vx3capyj257rrn5j/PHID-FILE-x6thnbf6ub55ep5yhalu/LayoutDescription.png Approach-1 looks feasible, please comment if any objections. Regards, Ashutosh From: Nema, Ashutosh Sent: Wednesday, March 1, 2017 10:42 AM To: 'Daniel Berlin' <dberlin at dberlin.org> Cc: anemet at apple.com; Hal Finkel <hfinkel at anl.gov>; Zaks, Ayal <ayal.zaks at intel.com>; Renato Golin <renato.golin at linaro.org>; mkuper at google.com;...

2018 May 10

ScalarEvolution questions

Thanks Sanjoy, I'll look into this. -----Original Message----- From: Sanjoy Das [mailto:sanjoy at playingwithpointers.com] Sent: Thursday, May 10, 2018 8:07 AM To: Nema, Ashutosh <Ashutosh.Nema at amd.com> Cc: llvm-dev at lists.llvm.org Subject: Re: [llvm-dev] ScalarEvolution questions Hi Ashutosh, On Wed, May 9, 2018 at 3:28 AM, Nema, Ashutosh via llvm-dev <llvm-dev at lists.llvm.org> wrote: > I’m new to ScalarEvolution and wanted to explore its capabiliti...

2016 Feb 25

how to force llvm generate gather intrinsic

...gt; - * Elena* >> >> >> >> *From:* zhi chen [mailto:zchenhn at gmail.com] >> *Sent:* Thursday, February 25, 2016 01:20 >> *To:* Demikhovsky, Elena <elena.demikhovsky at intel.com> >> *Cc:* Sanjay Patel <spatel at rotateright.com>; Nema, Ashutosh < >> Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org> >> >> *Subject:* Re: [llvm-dev] how to force llvm generate gather intrinsic >> >> >> >> Hi Elena, >> >> >> >> Are the masked_load and gather working no...

2016 Feb 26

how to force llvm generate gather intrinsic

...processors. > > > > - * Elena* > > > > *From:* zhi chen [mailto:zchenhn at gmail.com] > *Sent:* Thursday, February 25, 2016 20:48 > *To:* Sanjay Patel <spatel at rotateright.com> > *Cc:* Demikhovsky, Elena <elena.demikhovsky at intel.com>; Nema, Ashutosh < > Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org> > > *Subject:* Re: [llvm-dev] how to force llvm generate gather intrinsic > > > > It seems that http://reviews.llvm.org/D15690 only implemented > gather/scatter for AVX-512, but not for AVX/AVX2...

2016 Jan 20

Why getFunction() of CallGraphNode return NULL function?

Dear Ashutosh, Thank you, I can handle some indirect callSites by getFunction() of InvokeInst and CallInst. However, when I am handling C++ programs, I found the calls of member functions are converted to some strange indirect calls. For example: table->truncate(sysTransaction); // from mysql are transl...

2015 Mar 19

[LLVMdev] Cast to SCEVAddRecExpr

Hi Nick, Thanks for looking into it. I have tried that as well but it didn't worked. "AddExpr->getOperand(0))" node is: " (4 * (sext i32 {2,+,2}<%for.body4> to i64))<nsw>" When I cast this to "SCEVAddRecExpr" it returns NULL. Regards, Ashutosh -----Original Message----- From: Nick Lewycky [mailto:nicholas at mxc.ca] Sent: Thursday, March 19, 2015 12:19 PM To: Nema, Ashutosh Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Cast to SCEVAddRecExpr Nema, Ashutosh wrote: > Hi, > I'm trying to cast one of the SCEV node to "S...

2016 Feb 25

how to force llvm generate gather intrinsic

Yes, masked load/store/gather/scatter are completed. - Elena From: zhi chen [mailto:zchenhn at gmail.com] Sent: Thursday, February 25, 2016 01:20 To: Demikhovsky, Elena <elena.demikhovsky at intel.com> Cc: Sanjay Patel <spatel at rotateright.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org> Subject: Re: [llvm-dev] how to force llvm generate gather intrinsic Hi Elena, Are the masked_load and gather working now? Best, Zhi On Sat, Jan 23, 2016 at 12:06 PM, Demikhovsky, Elena <elena.demikhovsky at intel.c...

2015 Mar 31

[LLVMdev] Cast to SCEVAddRecExpr

...* I expected SCEV will hoist AddRec to top level, but it's not doing that. Instead if I change "var[i << 1]" to "var[i * 2]" then I'm getting "SCEVAddRecExpr". What is your opinion on this, to me this looks like a missed opportunity in SCEV. Regards, Ashutosh -----Original Message----- From: Nema, Ashutosh Sent: Tuesday, March 31, 2015 9:09 AM To: 'Nick Lewycky' Cc: llvmdev at cs.uiuc.edu Subject: RE: [LLVMdev] Cast to SCEVAddRecExpr Hi Nick, Consider below case: for (j=1; j < itr; j++) { - - - - for (i=1; i < itr; i++) {...

2016 Feb 26

how to force llvm generate gather intrinsic

No. Gather operation is slow on AVX2 processors. - Elena From: zhi chen [mailto:zchenhn at gmail.com] Sent: Thursday, February 25, 2016 20:48 To: Sanjay Patel <spatel at rotateright.com> Cc: Demikhovsky, Elena <elena.demikhovsky at intel.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org> Subject: Re: [llvm-dev] how to force llvm generate gather intrinsic It seems that http://reviews.llvm.org/D15690 only implemented gather/scatter for AVX-512, but not for AVX/AVX2. Is there any plan to enable gather for A...

2016 Feb 26

how to force llvm generate gather intrinsic

...gt; - * Elena* >> >> >> >> *From:* zhi chen [mailto:zchenhn at gmail.com] >> *Sent:* Thursday, February 25, 2016 20:48 >> *To:* Sanjay Patel <spatel at rotateright.com> >> *Cc:* Demikhovsky, Elena <elena.demikhovsky at intel.com>; Nema, Ashutosh < >> Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org> >> >> *Subject:* Re: [llvm-dev] how to force llvm generate gather intrinsic >> >> >> >> It seems that http://reviews.llvm.org/D15690 only implemented >> gather/scatter f...

2017 Aug 07

AliasAnalysis: may-alias subcategory

...l loop version } void callFoo() { foo.clone(A1, B1, C1); // Call to optimal version foo.clone(A2, B2, C2); // Call to optimal version foo (A3, B3, C3); // Call to default version } For such cases I like to differentiate between “may-alias” and “may-alias-because-it's-input”. Regards, Ashutosh -----Original Message----- From: Nuno Lopes [mailto:nunoplopes at sapo.pt] Sent: Monday, August 7, 2017 6:12 PM To: Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org> Subject: Re: [llvm-dev] AliasAnalysis: may-alias subcategory You're right that stat...

2018 May 16

ScalarEvolution questions

...SCEV1. This is how using “isImpliedCond”: Result = SE->isImpliedCond(ICmpInst::ICMP_SGT, B, A, ICmpInst::ICMP_SGT, Var, Const); Inputs: A: (1 + (-1 * (sext i32 %n.addr.058 to i64)))<nsw> B: 0 Var: %n.addr.058 Const: 7 In this case "isImpliedCond" simply returns false. Thanks, Ashutosh -----Original Message----- From: Nema, Ashutosh Sent: Thursday, May 10, 2018 10:10 AM To: 'Sanjoy Das' <sanjoy at playingwithpointers.com> Cc: llvm-dev at lists.llvm.org Subject: RE: [llvm-dev] ScalarEvolution questions Thanks Sanjoy, I'll look into this. -----Original Message...

2017 Feb 28

[Proposal][RFC] Epilog loop vectorization

...;Command> $ opt -O3 -gvn test.ll -o test.opt.ll $ opt -O3 -newgvn test.ll -o test.opt.ll “test.ll” is attached, it got already vectorized by the approach running vectorizer twice by annotate the remainder loop with metadata to limit the vectorization factor for epilog vector loop. Regards, Ashutosh From: anemet at apple.com [mailto:anemet at apple.com] Sent: Tuesday, February 28, 2017 1:33 AM To: Hal Finkel <hfinkel at anl.gov> Cc: Daniel Berlin <dberlin at dberlin.org>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; Zaks, Ayal <ayal.zaks at intel.com>; Renato Golin <r...
