Displaying 20 results from an estimated 900 matches similar to: "Early CSE clobbering llvm.assume"
2016 Jun 10
2
Early CSE clobbering llvm.assume
Yeah, that change is completely unrelated, that is about correctness, this
is about optimization.
I'm working on a proposal to just fix assume at some point to deal with the
former issue.
The problem with this testcase is that all the ways assume is propagated
expect the variable in the assume to later be used.
<This is the main way assume constants are propagated>
bool
2016 Jun 10
3
Early CSE clobbering llvm.assume
Maybe. It may not fix it directly because you never use %1 or %2 again.
I haven't looked to see how good the lookup is.
On Fri, Jun 10, 2016, 3:45 PM Josh Klontz <josh.klontz at gmail.com> wrote:
> Thanks Daniel, with that knowledge I think I can at least work around the
> issue in my frontend.
>
> Ignoring GVN for a second though, and just looking at Early CSE, it seems
2016 Jun 11
3
Early CSE clobbering llvm.assume
My (dumb?) question would be: why is llvm.assume being handled any differently than llvm.assert?
Other than one trapping and one not trapping, they should be identical; in both cases they are giving
the optimizer information, and that shouldn't be any different from being inside an "if" statement with the same condition?
--Peter Lawrence.
2016 Jun 11
4
Early CSE clobbering llvm.assume
Daniel,
Well then my next (dumb?) question is why aren’t we using source level assert information
for optimization?
--Peter Lawrence.
From: Daniel Berlin [mailto:dberlin at dberlin.org]
Sent: Friday, June 10, 2016 5:39 PM
To: Lawrence, Peter <c_plawre at qca.qualcomm.com>
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Early CSE clobbering llvm.assume
On Fri, Jun
2016 Jun 11
2
Early CSE clobbering llvm.assume
Daniel,
My point is this,
If (cond) ---- optimizer takes advantage of knowing cond == true within the “then” part
Assert(cond) ---- optimizer takes advantage of knowing cond == true for the rest of the scope
Assume(cond) ---- optimizer takes advantage of knowing cond == true for the rest of the scope
If we aren’t implementing these in a consistent manner (like using an intrinsic for
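To make the three cases in this message concrete, here is a minimal source-level sketch in C++. It is only an illustration, not how LLVM represents any of these internally. The `ASSUME` macro and the `div_*` function names are hypothetical; the macro wraps Clang's `__builtin_assume` and falls back to `__builtin_unreachable` on GCC.

```cpp
#include <cassert>

// Hypothetical portable wrapper for an "assume" hint.
#if defined(__clang__)
#define ASSUME(cond) __builtin_assume(cond)
#elif defined(__GNUC__)
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#define ASSUME(cond) ((void)0)
#endif

// if: the condition is known to hold only inside the "then" part.
int div_if(int x, int d) {
    if (d > 0) {
        return x / d;
    }
    return 0;
}

// assert: checked at run time (unless NDEBUG is defined); aborts when false,
// so code after it only runs when the condition held.
int div_assert(int x, int d) {
    assert(d > 0);
    return x / d;
}

// assume: never checked; a false condition is undefined behavior, so the
// optimizer may treat it as true for the rest of the scope.
int div_assume(int x, int d) {
    ASSUME(d > 0);
    return x / d;
}
```

Note that when `NDEBUG` is defined the standard `assert` expands to nothing, so in release builds it carries no information to the optimizer at all.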
2016 Jun 12
2
Early CSE clobbering llvm.assume
On Fri, Jun 10, 2016 at 9:58 PM, Mehdi Amini via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi,
>
> On Jun 10, 2016, at 7:00 PM, Lawrence, Peter via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Daniel,
> My point is this,
>
> If (cond) ---- optimizer takes advantage of knowing cond == true within
> the “then” part
> Assert(cond)
2016 Jun 12
2
Early CSE clobbering llvm.assume
What he said :)
It also, representationally, has a related issue our current assume does
in terms of figuring out the set of assumptions applied. Given an
instruction, in extended SSA, because "assume" produces a value used by
things, it's trivial to find the chain of assumptions you can use for it.
In a straight control flow representation, it requires finding which side
of the
2016 Jun 14
3
Early CSE clobbering llvm.assume
On Tue, Jun 14, 2016 at 10:36 AM, Lawrence, Peter <c_plawre at qca.qualcomm.com
> wrote:
> Daniel,
>
> What am I missing in the following chain of logic:
>
>
>
> As far as constant-prop, value-prop, range-prop, and general
> property-propagation,
>
>
>
> 1. the compiler/optimizer **has** to get it right for if-then-else and
> while-do or
2016 Jun 14
4
Early CSE clobbering llvm.assume
>
>
>> Sanjoy’s argument is faulty, if it were true we would also find our
>> handling of “assert” to be unacceptable
>>
>> but this is not the case, no one is arguing that we need to re-design
>> “assert”
>>
> Sure, but no one should make this argument anyway: assert is not for
> optimization. In fact, we don't really want it to be used for
2014 Dec 26
3
[LLVMdev] Correct usage of `llvm.assume` for loop vectorization alignment?
Using LLVM ToT and Hal's helpful slide deck [1], I've been trying to use
`llvm.assume` to communicate pointer alignment guarantees to vector load
and store instructions. For example, in [2] %5 and %9 are guaranteed to be
32-byte aligned. However, if I run this IR through `opt -O3 -datalayout
-S`, the vectorized loads and stores are still 1-byte aligned [3]. What's
going wrong? Do I
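For readers coming from the source level, the same kind of alignment guarantee can be sketched in C++ with the GCC/Clang builtin `__builtin_assume_aligned`. This is a hedged illustration of the idea only, not the IR from the post; the function names `is_aligned` and `add_aligned` and the 32-byte figure used here are assumptions for the example. Whether the resulting vector loads and stores end up marked as aligned still depends on the optimizer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when p is a multiple of `alignment` bytes.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Adds src into dst element-wise.
// Precondition (assumed, not enforced in release builds): both pointers
// are 32-byte aligned.
void add_aligned(float* dst, const float* src, std::size_t n) {
    // Debug-only check of the precondition the hint below relies on.
    assert(is_aligned(dst, 32) && is_aligned(src, 32));
#if defined(__GNUC__) || defined(__clang__)
    // Tell the compiler the pointers are 32-byte aligned.
    float* d = static_cast<float*>(__builtin_assume_aligned(dst, 32));
    const float* s =
        static_cast<const float*>(__builtin_assume_aligned(src, 32));
#else
    // No hint available; fall back to the plain pointers.
    float* d = dst;
    const float* s = src;
#endif
    for (std::size_t i = 0; i < n; ++i) {
        d[i] += s[i];
    }
}
```

Passing a pointer that is not actually aligned is undefined behavior once the hint is in place, which is why the debug assertion is kept next to it.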
2016 Jun 14
4
Early CSE clobbering llvm.assume
Hal,
To simplify this discussion, let's first just focus on code without asserts and assumes.
I don’t follow your logic; you seem to be implying we don’t optimize property-propagation through “if-then” and “while-do” well?
--Peter.
From: Hal Finkel [mailto:hfinkel at anl.gov]
Sent: Tuesday, June 14, 2016 11:12 AM
To: Lawrence, Peter <c_plawre at qca.qualcomm.com>
Cc: llvm-dev
2013 Nov 15
4
[LLVMdev] Limit loop vectorizer to SSE
Something like:
index 6db7f68..68564cb 100644
--- a/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -1208,6 +1208,8 @@ void InnerLoopVectorizer::vectorizeMemoryInstruction(Instr
Type *DataTy = VectorType::get(ScalarDataTy, VF);
Value *Ptr = LI ? LI->getPointerOperand() : SI->getPointerOperand();
unsigned Alignment = LI ?
2013 Nov 15
6
[LLVMdev] Limit loop vectorizer to SSE
On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org> wrote:
> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote:
> Agreed, is there a pass that will insert a runtime alignment check? Also, what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() so I don't have to hard code 32? Thanks!
>
> I think
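As a rough illustration of what a hand-written runtime alignment check could look like (this is not an LLVM pass, and not what the thread settled on), the sketch below computes how many leading elements must be handled by scalar code before a byte pointer reaches a vector-width boundary. The constant `kVectorBytes` and the function `peel_count` are hypothetical names, and the value 32 is hard-coded here purely as an assumption.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed vector width in bytes; hard-coded for the sketch.
constexpr std::size_t kVectorBytes = 32;

// Number of leading elements to process with scalar code before `p`
// becomes kVectorBytes-aligned, capped at the total element count n.
std::size_t peel_count(const std::uint8_t* p, std::size_t n) {
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    const std::size_t misalign = addr % kVectorBytes;
    const std::size_t peel = (misalign == 0) ? 0 : kVectorBytes - misalign;
    return peel < n ? peel : n;
}
```

A caller would run a scalar loop over the first `peel_count(p, n)` elements and then use aligned vector accesses for the remainder.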
2013 Nov 15
2
[LLVMdev] Limit loop vectorizer to SSE
Yes,
I was just about to send out:
DL->getABITypeAlignment(ScalarDataTy);
The question is:
“… ABI alignment for the target …"
is that
getPrefTypeAlignment
or
getABITypeAlignment
I would have thought the latter.
On Nov 15, 2013, at 4:12 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
>> From: "Arnold Schwaighofer"
2013 Nov 15
0
[LLVMdev] Limit loop vectorizer to SSE
----- Original Message -----
> From: "Arnold Schwaighofer" <aschwaighofer at apple.com>
> To: "Joshua Klontz" <josh.klontz at gmail.com>
> Cc: "LLVM Dev" <llvmdev at cs.uiuc.edu>
> Sent: Friday, November 15, 2013 4:05:53 PM
> Subject: Re: [LLVMdev] Limit loop vectorizer to SSE
>
>
> Something like:
>
> index
2013 Nov 15
0
[LLVMdev] Limit loop vectorizer to SSE
Nadav,
I believe aligned accesses to unaligned pointers is precisely the issue.
Consider the function `add_u8S` before[1] and after[2] the loop vectorizer
pass. There is no alignment assumption associated with %kernel_data prior
to vectorization. I can't tell if it's the loop vectorizer or the codegen
at fault, but the alignment assumption seems to sneak in somewhere.
v/r,
Josh
[1]
2015 Mar 19
2
[LLVMdev] [LV] possible `vector.memcheck` regression when using `llvm.loop` and `llvm.mem.parallel_loop_access`
Adam,
Please find the attached test case (run with ToT opt -O3). As you can see,
`y_body` is successfully vectorized, though %33 and %46 are deemed MayAlias
despite their exclusive use in loads and stores marked with
`llvm.mem.parallel_loop_access`.
Many Thanks,
Josh
On Thu, Mar 19, 2015 at 12:55 PM, Adam Nemet <anemet at apple.com> wrote:
>
> > On Mar 19, 2015, at 9:43 AM,
2013 Nov 15
2
[LLVMdev] Limit loop vectorizer to SSE
A fix for this is in r194876.
Thanks for reporting this!
On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> wrote:
> Nadav,
>
> I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prior to
2013 May 10
2
[LLVMdev] Simple Loop Vectorize Question
Nadav,
Please forgive my ignorance, but 'opt -mcpu=corei7 -loop-vectorize -S
-debug double.ll' doesn't appear to make a difference. In fact it seems to
be ignored as garbage values for -mcpu don't raise an error. Am I
overlooking something else also?
Many Thanks,
Josh
On Thu, May 9, 2013 at 6:06 PM, Nadav Rotem <nrotem at apple.com> wrote:
> Hi Josh,
>
> Your
2014 Aug 09
3
[LLVMdev] Heuristic for choosing between MCJIT and Interpreter
I'm facing a situation where I have generated IR that only needs to be
executed once. I've noticed for simple IR it's faster to run the
interpreter on it, but for complex IR it's much better to JIT compile and
execute it. I'm seeking suggestions for a good heuristic to decide which
approach to take for any given IR. I'm leaning in favor of deciding based
on the