similar to: Speculation and control dependent no wrap flags

Displaying 20 results from an estimated 6000 matches similar to: "Speculation and control dependent no wrap flags"

2016 Sep 27
4
Inferring nsw/nuw flags for increment/decrement based on relational comparisons
On 2016-09-27 02:28, Philip Reames wrote: > On 09/20/2016 12:05 PM, Matti Niemenmaa via llvm-dev wrote: >> I posted some questions related to implementing inference of nsw/nuw >> flags based on known icmp results to Bug 30428 ( >> https://llvm.org/bugs/show_bug.cgi?id=30428 ) and it was recommended >> that I engage a wider audience by coming here. The minimal context is
2016 Sep 28
4
Load combine pass
One of the arguments for doing this earlier is inline cost perception of the original pattern. Reading i32/i64 by bytes look much more expensive than it is and can prevent inlining of interesting function. Inhibiting other optimizations concern can be addressed by careful selection of the pattern we’d like to match. I limit the transformation to the case when all the individual have no uses other
2019 Sep 11
2
Load combine pass
Hi, Can I ask what is the status of load widening. It seems there is no load widening on IR at all. // Paweł On Wed, Oct 5, 2016 at 1:49 PM Artur Pilipenko via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Philip and I talked about this is person. Given the fact that load > widening in presence of atomics is irreversible transformation we agreed > that we don't want to do
2019 Sep 12
2
Load combine pass
Ok, thanks. Are there any plans to reintroduce it on the IR level? I'm not confident this is strictly necessary, but in some cases not having load widening ends up really bad. Like in the case where vectorizer tries to do something about it: https://godbolt.org/z/60RuEw https://bugs.llvm.org/show_bug.cgi?id=42708 At the current state I'm forced to use memset() to express uint64 load from
2019 Sep 25
2
Load combine pass
If we do load combining at the IR level, one thing we'll need to give some thought to is atomicity.  Combining two atomic loads into a wider (legal) atomic load is not a reversible transformation given our current specification. I've been thinking about a concept I've been tentatively calling "element wise atomicity" which would make this a reversible transform by
2016 Sep 29
2
Load combine pass
> On 29 Sep 2016, at 03:23, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > Hi Artur, > > Artur Pilipenko via llvm-dev wrote: > > One of the arguments for doing this earlier is inline cost > > perception of the original pattern. Reading i32/i64 by bytes look much > > more expensive than it is and can prevent inlining of interesting > >
2019 Dec 01
4
ConstantRange modelling precision?
Hello. This question has come up in https://reviews.llvm.org/D70043 There, i'm teaching ConstantRange how no-wrap flags affect the range of `mul` instruction, with end goal of exploiting this in LVI/CVP. There are certain combinations of ranges and no-wrap flags that result in always-overflowing `mul`. For example, `mul nuw nsw i4 [2,0), [4,0)` always overflows:
2016 Sep 28
3
Load combine pass
Hi, I'm trying to optimize a pattern like this into a single i16 load: %1 = bitcast i16* %pData to i8* %2 = load i8, i8* %1, align 1 %3 = zext i8 %2 to i16 %4 = shl nuw i16 %3, 8 %5 = getelementptr inbounds i8, i8* %1, i16 1 %6 = load i8, i8* %5, align 1 %7 = zext i8 %6 to i16 %8 = shl nuw nsw i16 %7, 0 %9 = or i16 %8, %4 I came across load combine pass which is motivated
2016 Sep 29
3
Load combine pass
> On 29 Sep 2016, at 21:01, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > Hi Artur, > > Artur Pilipenko wrote: > > > BTW, do we really need to emit an atomic load if all the individual > > components are bytes? > > Depends -- do you mean at the at the hardware level or at the IR > level? > > If you mean at the IR level, then I
2016 Sep 23
6
Improving SCEV's behavior around IR level no-wrap flags
Hi all, This is about a project I've been prototyping on-and-off for a while that has finally reached a point where I can claim it to be "potentially viable". I'd like to gather some input from the community before moving too far ahead. # The problem There is a representation issue within SCEV that prevents it from fully using information from nsw/nuw flags present in the
2017 Aug 08
2
Improving SCEV's behavior around IR level no-wrap
Hi Sanjoy, Any update on this? Are there plans to implement this proposal? Thanks, Pankaj -----Original Message----- Date: Fri, 23 Sep 2016 02:09:19 -0700 From: Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> To: llvm-dev <llvm-dev at lists.llvm.org>, Andrew Trick <atrick at apple.com>, Dan Gohman <dan433584 at gmail.com>, Hal Finkel <hfinkel at anl.gov>,
2015 Dec 21
3
Hash of a module
Yes, I'm running all the existing passes that I know how to run. I didn't know they returned change-made. Thanks! On Mon, Dec 21, 2015 at 12:36 PM, Artur Pilipenko < apilipenko at azulsystems.com> wrote: > Are you going to run some of the existing passes? Why can’t you just use > the returned change-made value from the passes? > > Artur > > > On 20 Dec 2015, at
2015 Apr 24
2
[LLVMdev] Speculative loads and alignment
Hi, There are several optimizations where we try to load speculatively. There are also two similar functions to determine whether it's a safe transformation: * isSafeToLoadUnconditionally * isSafeToSpeculativelyExecute isSafeToLoadUnconditionally tries to take load alignment into account but fails to do this in some cases. It checks alignment for pointers derived from allocas and global
2016 Feb 15
5
Masked intrinsics and non-default address spaces
Masked load/store are overloaded intrinsics, the only generic type is the type of the value being loaded/stored. The signature of the intrinsic is generated based on this type. The type of the pointer argument is generated as a pointer to the return type with default addrspace. E.g.: declare <8 x i32> @llvm.masked.load.v8i32(<8 x i32>*, i32, <8 x i1>, <8 x i32>) The
2017 Feb 06
2
Adding Extended-SSA to LLVM
On Sun, Feb 5, 2017 at 3:41 PM, Nuno Lopes <nunoplopes at sapo.pt> wrote: > Thanks for the answers! The plan makes sense to me. > > Regarding phis, what about diamonds, e.g.: > > > define i32 @f(i32 %x) { > br .., label %bb0, label %bb1 > bb0: > %cmp = icmp sge i32 %x, 0 ; x > 0 > br i1 %cmp, label %bb2, label %bb3 > bb1: > %x2 = add nsw
2017 Aug 09
2
Improving SCEV's behavior around IR level no-wrap
> On Aug 8, 2017, at 5:34 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > Hi Pankaj, > > IIRC there was pushback on this proposal so I did not proceed further. > Are you blocked on this? > > [+CC Andy, who I remember had some objections.] > > — Sanjoy Off the top of my head, my concern is that expression comparison is no longer constant time,
2016 Jul 19
3
X86ISelLowering: Promote 'add nsw' to a wider type
Hi Sanjay, Some time ago you implemented a sext(add_nsw(x, C)) --> add(sext(x), C_sext) transformation in X86ISelLowering https://reviews.llvm.org/D13757 Is there any reason why this transformation is limited to sexts and doesn’t support zexts? Thanks, Artur -------------- next part -------------- An HTML attachment was scrubbed... URL:
2015 Jul 13
2
[LLVMdev] String attributes for function arguments and return values
Hi, On 13 Jul 2015, at 15:59, Hal Finkel <hfinkel at anl.gov<mailto:hfinkel at anl.gov>> wrote: ----- Original Message ----- From: "Artur Pilipenko" <apilipenko at azulsystems.com<mailto:apilipenko at azulsystems.com>> To: llvmdev at cs.uiuc.edu<mailto:llvmdev at cs.uiuc.edu> Cc: "Hal Finkel" <hfinkel at anl.gov<mailto:hfinkel at
2017 Mar 20
2
Is it a valid fp transformation?
I agree. There’s implementation-defined behavior on the conversion of (arg*58) to int, but that shouldn’t be at issue here. The transform of (float)x + 1 => (float)(x + 1) is bogus. > On Mar 20, 2017, at 10:41 AM, Sanjay Patel via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Looks broken to me; I don't think there's UB in the original program. > > The fold in
2015 Apr 16
3
[LLVMdev] LazyValueInfo.getPredicateAt
Hi, Is it intentional that LazyValueInfo.getPredicateAt doesn't solve for the value and only takes assumptions into account? getPredicateAt gets lattice value from cache using getValueAt call: LVILatticeVal LazyValueInfoCache::getValueAt(Value *V, Instruction *CxtI) { ... LVILatticeVal Result; mergeAssumeBlockValueConstantRange(V, Result, CxtI); ... return Result; } Other