thr3ads.net - llvm dev - [llvm-dev] RFC: A change in InstCombine canonical form [Mar 2016]

If this information is useful, please help other people find it:
Share via:

David Blaikie via llvm-dev

2016-Mar-22 21:30 UTC

[llvm-dev] RFC: A change in InstCombine canonical form

On Tue, Mar 22, 2016 at 1:41 PM, Mehdi Amini <mehdi.amini at apple.com>
wrote:
> Sorry I should have been more clear (writing to many email in parallel)
>
> You're right. I was adding a +1 to you here.
>
> Especially I wrote "unless there is an acknowledgement that typeless
> pointers won't be there before a couple of years" with  the
PassManager in
> mind, and I was expecting from David some good indication of a timeframe
> for the typeless pointers.
> If the typeless pointer work is stalled or if it is not planned for LLVM
> 3.9,
>
It's neither stalled nor planned, as such.

> I agree with Philip to not block anything.
>
All I'm suggesting is that if there are people who want to fix these bugs,
I'd really appreciate them helping out on the typeless pointer work -
it's
totally parallelizable/shardable/shareable work & the project as a whole
seemed to agree it was the right direction. Why not just get it done?

The Pass Manager work is a bit different & harder to share at certain
points (I don't tihnk there's ever been a point at which someone could
ask
me "what can I do to help with the typeless pointer work" I
couldn't've
given some pretty large areas I wasn't anywhere near touching that they
could've gotten their teeth into) - I think it's reaching the point
where
multiple passes can be ported independently & seen some work like that in
Polly, etc. So at this point it seems like if people want to address the
issues the new pass manager is aimed at addressing, they could pitch in on
that effort.

That said, I'm not in a position (nor would I do so, even if I were) to
block other patches, just to encourage people to help out on the bigger
efforts in whatever way they can (either directly, or indirectly through
ensuring stopgaps/new work is done in a way that makes that future work
easier (where it's reasonable to judge what might be easier or harder,
etc), etc)

- Dave

>
> --
> Mehdi
>
>
> On Mar 22, 2016, at 1:37 PM, Philip Reames <listmail at
philipreames.com>
> wrote:
>
> But not what David was stating, unless I misread?  I was specifically
> responding to David's wording:
> "If we're talking about making an optimization able to ignore the
bitcast
> instructions - yes, that work is unnecessary & perhaps questionable
given
> the typeless pointer work. Not outright off limits, but the same time might
> be better invested in moving typeless pointers forward if the contributor
> is so inclined/able to shift in that direction."
>
> Both "perhaps questionable" and "not outright off
limits" seem to strongly
> imply such work should be discouraged.  I disagree with that view which is
> why I wrote my response.
>
> Philip
>
> On 03/22/2016 01:34 PM, Mehdi Amini wrote:
>
> This is roughly what I wrote...
>
> On Mar 22, 2016, at 1:31 PM, Philip Reames <listmail at
philipreames.com>
> wrote:
>
> I feel very strongly that blocking work on making optimization
> bitcast-ignorant on the typeless pointer work would be a major mistake.
> Unless we expected the typeless pointer work to be concluded within the
> near term (say 3-6 months maximum), we should not block any development
> which would be accepted in the typeless pointer work wasn't planned.
>
> In my view, this is one of the largest mistakes we've made with the
pass
> manager work, it has seriously cost us, and I don't want to see it
> happening again.
>
> Philip
>
> On 03/22/2016 01:09 PM, David Blaikie wrote:
>
> Ultimately everything is going to be made to not rely on the types of
> pointers - that's nearly equivalent to bitcast-ignorant (the difference
> being that the presence of an extra instruction (the bitcast) might trip up
> some optimizations - but the presence of the /type/ information implied by
> the bitcast should not trip up or be necessary for optimizations (two sides
> of the same coin))
>
> If we're talking about making an optimization able to ignore the
bitcast
> instructions - yes, that work is unnecessary & perhaps questionable
given
> the typeless pointer work. Not outright off limits, but the same time might
> be better invested in moving typeless pointers forward if the contributor
> is so inclined/able to shift in that direction.
>
> But if we're talking about the work to make the optimization not use
the
> type of pointers as a crutch - that work is a necessary precursor to the
> typeless pointer work and would be most welcome.
>
> - David
>
> On Tue, Mar 22, 2016 at 12:39 PM, Mehdi Amini < <mehdi.amini at
apple.com>
> mehdi.amini at apple.com> wrote:
>
>> I don't really mind, but the intermediate stage will not be very
nice:
>> that a lot of code / tests that needs to be written with bitcast, and
all
>> of that while they are deemed to disappear. The added value isn't
clear to
>> me considering the added work. I'm not sure it wouldn't add
more work for
>> all the cleanup required by the "typeless pointer", but
I'm not sure what's
>> involved here and if David thinks the intermediate steps of handling
bit
>> casts everywhere is not making it harder I'm fine with it.
>>
>> --
>> Mehdi
>>
>> On Mar 22, 2016, at 12:36 PM, Philip Reames < <listmail at
philipreames.com>
>> listmail at philipreames.com> wrote:
>>
>> I'd phrase this differently: being pointer-bitcast agnostic is a
step
>> towards support typeless pointers.  :)  We can either become bitcast
>> agnostic all in one big change or incrementally.  Personally, I'd
prefer
>> the later since it reduces the risk associated with enabling typeless
>> pointers in the end.
>>
>> Philip
>>
>> On 03/22/2016 12:16 PM, Mehdi Amini via llvm-dev wrote:
>>
>> I don't know enough about the tradeoff for 1, but 2 seems like a
bandaid
>> for something that is not a correctness issue neither a regression.
I'm not
>> sure it justifies "bandaid patches" while there is a clear
path forward,
>> i.e. typeless pointers, unless there is an acknowledgement that
typeless
>> pointers won't be there before a couple of years.
>>
>> --
>> Mehdi
>>
>> On Mar 22, 2016, at 11:32 AM, Ehsan Amiri < <ehsanamiri at
gmail.com>
>> ehsanamiri at gmail.com> wrote:
>>
>> Back to the discussion on the RFC, I still see some advantage in
>> following the proposed solution. I see two paths forward:
>>
>> 1- Change canonical form, possibly lower memcpy to non-integer load and
>> store in InstCombine. Then teach the backends to convert that to
integer
>> load and store if that is more profitable. Notice that we are talking
about
>> loads that have no use other than store. So it is a fairly simple
change in
>> the backends.
>>
>> 2- Do not change the canonical form. Then for this bug, we need to
teach
>> select speculation to see through bitcasts. We will probably need to
teach
>> other optimizations to see though bitcasts in the future as problems
are
>> uncovered. That is until typeless pointer work is complete. Once the
>> typeless pointer work is complete, we have some extra code in each
>> optimization for seeing through bitcasts which is possibly no longer
needed.
>>
>> Based on this I think (1) is the right thing to do. But there might be
>> other reasons for the current canonical form that I am not aware of.
Please
>> let me know what you think.
>>
>> Thanks
>> Ehsan
>>
>> On Wed, Mar 16, 2016 at 2:13 PM, David Blaikie < <dblaikie at
gmail.com>
>> dblaikie at gmail.com> wrote:
>>
>>>
>>>
>>> On Wed, Mar 16, 2016 at 11:00 AM, Ehsan Amiri < <ehsanamiri
at gmail.com>
>>> ehsanamiri at gmail.com> wrote:
>>>
>>>> David,
>>>>
>>>> Could you give us an update on the status of typeless pointer
work? How
>>>> much work is left and when you think it might be ready?
>>>>
>>>
>>> It's a bit of an onion peel, really - since it will eventually
involve
>>> generalizing/fixing every optimization that's currently leaning
on typed
>>> pointers to keep the performance while removing the crutch
they're
>>> currently leaning on. (in cases where bitcasts are literally just
getting
>>> in the way, those won't require cleaning up & should just
become "free
>>> performance wins" once we remove them, though)
>>>
>>> At the moment we can roundtrip every LLVM IR test case through
bitcode
>>> and textual IR (reading/writing both formats) while using only a
narrow
>>> whitelist of places that request the type of a pointer (things like
the
>>> verifier, the parser/printer where it actually needs the typed
pointer to
>>> verify it matches the explicit type, etc).
>>>
>>> The next thing on the list is probably figuring out the
byval/inalloca
>>> representation (add an explicit pointee type? just make the number
of bytes
>>> explicit with no type information?).
>>>
>>> Then start migrating optimizations over - doing the same sort of
testing
>>> I did for the IR/bitcode roundtripping - assert that the pointee
type is
>>> not accessed, whitelist places that need it until the bitcasts go
away, fix
>>> anything else... it'll still be a fair bit of work & I
don't really know
>>> how much. It should parallelize pretty well (doing any of this work
is
>>> really helpful, each optimization is indepednent, etc) if anyone
wants
>>> to/is able to help.
>>>
>>> - Dave
>>>
>>>
>>>>
>>>> Thanks
>>>> Ehsan
>>>>
>>>>
>>>> On Wed, Mar 16, 2016 at 1:12 PM, David Blaikie <
<dblaikie at gmail.com>
>>>> dblaikie at gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Wed, Mar 16, 2016 at 8:34 AM, Mehdi Amini via llvm-dev
<
>>>>> <llvm-dev at lists.llvm.org>llvm-dev at
lists.llvm.org> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> How do it interact with the "typeless
pointers" work?
>>>>>>
>>>>>
>>>>> Right - the goal of the typeless pointer work is to fix all
these bugs
>>>>> related to "didn't look through bitcasts" in
optimizations. Sometimes
>>>>> that's going to mean more work (because the code is
leaning on the absence
>>>>> of bitcasts & the presence of convenient (but not
guaranteed) type
>>>>> information to inform optimization decisions) but if we
remove typed
>>>>> pointers while keeping optimization quality in the cases we
have today,
>>>>> then we should've also fixed the cases that were broken
because the type
>>>>> information didn't end up aligning to produce the
optimal output.
>>>>>
>>>>> & I know I've been off the typeless pointer stuff
for a few months
>>>>> working on llvm-dwp - but happy for any help (the next
immediate piece is
>>>>> probably figuring out teh right representation for byval
and inalloca -
>>>>> there were some code reviews sent out for that that
I'll need to come back
>>>>> around to - but also any optimizations people want to help
rework/improve
>>>>> would be great too & I can provide some
techniques/tools to help people
>>>>> approach those)
>>>>>
>>>>> - Dave
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> --
>>>>>> Mehdi
>>>>>>
>>>>>> On Mar 16, 2016, at 6:41 AM, Ehsan Amiri via llvm-dev
<
>>>>>> <llvm-dev at lists.llvm.org>llvm-dev at
lists.llvm.org> wrote:
>>>>>>
>>>>>> === PROBLEM === (See this bug
>>>>>> <https://llvm.org/bugs/show_bug.cgi?id=26445>
>>>>>> https://llvm.org/bugs/show_bug.cgi?id=26445)
>>>>>>
>>>>>> IR contains code for loading a float from float * and
storing it to a
>>>>>> float * address. After canonicalization of load in
InstCombine [1], new
>>>>>> bitcasts are added to the IR (see bottom of the email
for code samples).
>>>>>> This prevents select speculation in SROA to work. Also
after SROA we have
>>>>>> bitcasts from int32 to float. (Whereas originally after
instCombine,
>>>>>> bitcasts are only done on pointer types).
>>>>>>
>>>>>> === PROPOSED SOLUTION==>>>>>>
>>>>>> [1] implies that we need load canonicalization when we
load a value
>>>>>> only to store it again. The reason is to avoid
generating slightly
>>>>>> different code (due to different ways of adding
bitcasts), in different
>>>>>> situations. In all examples presented in [1] there is a
non-zero number of
>>>>>> bitcasts. I think when we load a value of type T from a
T* address and
>>>>>> store it as a type T value to one or more T* address
(and there is no other
>>>>>> use or store), we can redefine canonical form to mean
there should not be
>>>>>> any bitcasts. So we still have a canonical form, but
its definition is
>>>>>> slightly different.
>>>>>>
>>>>>> === REASONS FOR / AGAINST==>>>>>>
>>>>>> Hal Finkel warns that while this may be useful for
power pc, this may
>>>>>> hurt more than one other platform and become a very
large project. Despite
>>>>>> this he is fine with bringing up the issue to the
mailing list to get
>>>>>> feedback, mostly because this seems inline with our
future direction of
>>>>>> having a unique type for all pointers.  (Hal please
correct me if I
>>>>>> misunderstood your comment)
>>>>>>
>>>>>> This is a much simpler fix compared to alternatives.
(ignoring
>>>>>> potential regressions)
>>>>>>
>>>>>> === ALTERNATIVE SOLUTION ==>>>>>>
>>>>>> Fix select speculation in SROA to see through bitcasts.
Handle
>>>>>> remaining bitcasts during code gen. Other alternative
solutions are welcome.
>>>>>>
>>>>>> Should I implement the proposed solution or is it too
risky? I
>>>>>> understand that we may need to undo it if it breaks too
many things.
>>>>>> Comments are welcome.
>>>>>>
>>>>>>
>>>>>> [1]
>>>>>>
<http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html>
>>>>>>
http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html
>>>>>> r226781  git commit id: b778cbc0c8
>>>>>>
>>>>>>
>>>>>>
>>>>>> Code Samples (only relevant part is copied):
>>>>>>
>>>>>> --------------------  Before Canonicalization (contains
call to
>>>>>> std::max): --------------------
>>>>>> entry:
>>>>>>   %max_value = alloca float, align 4
>>>>>>   %1 = load float, float* %input, align 4, !tbaa !1
>>>>>>   store float %1, float* %max_value, align 4, !tbaa !1
>>>>>>
>>>>>> for.body:
>>>>>>   %call = call dereferenceable(4) float*
>>>>>> @_ZSt3maxIfERKT_S2_S2_(float* dereferenceable(4)
%max_value, float*
>>>>>> dereferenceable(4) %arrayidx1)
>>>>>>   %3 = load float, float* %call, align 4, !tbaa !1
>>>>>>   store float %3, float* %max_value, align 4, !tbaa !1
>>>>>>
>>>>>> --------------------  After Canonicalization (contains
call to
>>>>>> std::max):--------------------
>>>>>>
>>>>>> entry:
>>>>>>   %max_value = alloca float, align 4
>>>>>>   %1 = bitcast float* %input to i32*
>>>>>>   %2 = load i32, i32* %1, align 4, !tbaa !1
>>>>>>   %3 = bitcast float* %max_value to i32*
>>>>>>   store i32 %2, i32* %3, align 4, !tbaa !1
>>>>>>
>>>>>> for.body:
>>>>>>   %call = call dereferenceable(4) float*
>>>>>> @_ZSt3maxIfERKT_S2_S2_(float* nonnull
dereferenceable(4) %max_value, float*
>>>>>> dereferenceable(4) %arrayidx1)
>>>>>>   %5 = bitcast float* %call to i32*
>>>>>>   %6 = load i32, i32* %5, align 4, !tbaa !1
>>>>>>   %7 = bitcast float* %max_value to i32*
>>>>>>   store i32 %6, i32* %7, align 4, !tbaa !1
>>>>>>
>>>>>> -------------------- After SROA (the call to std::max
is inlined
>>>>>> now):--------------------
>>>>>> entry:
>>>>>>   %max_value.sroa.0 = alloca i32
>>>>>>   %0 = bitcast float* %input to i32*
>>>>>>   %1 = load i32, i32* %0, align 4, !tbaa !1
>>>>>>   store i32 %1, i32* %max_value.sroa.0
>>>>>>
>>>>>> for.body:
>>>>>>   %max_value.sroa.0.0.max_value.sroa.0.0.6 = load i32,
i32*
>>>>>> %max_value.sroa.0
>>>>>>   %3 = bitcast i32
%max_value.sroa.0.0.max_value.sroa.0.0.6 to float
>>>>>>   %max_value.sroa.0.0.max_value.sroa_cast8 = bitcast
i32*
>>>>>> %max_value.sroa.0 to float*
>>>>>>   %__b.__a.i = select i1 %cmp.i, float* %arrayidx1,
float*
>>>>>> %max_value.sroa.0.0.max_value.sroa_cast8
>>>>>>   %5 = bitcast float* %__b.__a.i to i32*
>>>>>>   %6 = load i32, i32* %5, align 4, !tbaa !1
>>>>>>   store i32 %6, i32* %max_value.sroa.0
>>>>>>
>>>>>> -------------------- After SROA when Canonicalization
is turned
>>>>>> off--------------------
>>>>>> entry:
>>>>>>   %0 = load float, float* %input, align 4, !tbaa !1
>>>>>>
>>>>>> for.cond:                                         ;
preds >>>>>> %for.body, %entry
>>>>>>   %max_value.0 = phi float [ %0, %entry ], [
%.sroa.speculated,
>>>>>> %for.body ]
>>>>>>
>>>>>> for.body:
>>>>>>   %1 = load float, float* %arrayidx1, align 4, !tbaa !1
>>>>>>   %cmp.i = fcmp olt float %max_value.0, %1
>>>>>>   %.sroa.speculate.load.true = load float, float*
%arrayidx1, align
>>>>>> 4, !tbaa !1
>>>>>>   %.sroa.speculated = select i1 %cmp.i, float
>>>>>> %.sroa.speculate.load.true, float %max_value.0
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> <llvm-dev at lists.llvm.org>llvm-dev at
lists.llvm.org
>>>>>>
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> <llvm-dev at lists.llvm.org>llvm-dev at
lists.llvm.org
>>>>>>
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>> _______________________________________________
>> LLVM Developers mailing listllvm-dev at
lists.llvm.orghttp://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>>
>>
>
>
>
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160322/48cc4d32/attachment-0001.html>

Mehdi Amini via llvm-dev

2016-Mar-22 21:42 UTC

head link

[llvm-dev] RFC: A change in InstCombine canonical form

> On Mar 22, 2016, at 2:30 PM, David Blaikie <dblaikie at gmail.com>
wrote:
> 
> 
> 
> On Tue, Mar 22, 2016 at 1:41 PM, Mehdi Amini <mehdi.amini at apple.com
<mailto:mehdi.amini at apple.com>> wrote:
> Sorry I should have been more clear (writing to many email in parallel)
> 
> You're right. I was adding a +1 to you here.
> 
> Especially I wrote "unless there is an acknowledgement that typeless
pointers won't be there before a couple of years" with  the PassManager
in mind, and I was expecting from David some good indication of a timeframe for
the typeless pointers.
> If the typeless pointer work is stalled or if it is not planned for LLVM
3.9,
> 
> It's neither stalled nor planned, as such.
>  
> I agree with Philip to not block anything.
> 
> All I'm suggesting is that if there are people who want to fix these
bugs, I'd really appreciate them helping out on the typeless pointer work -
it's totally parallelizable/shardable/shareable work & the project as a
whole seemed to agree it was the right direction. Why not just get it done?
Of course that's the right to go, however there might be a different amount
of work involved in the two case. For instance maybe handling bitcast everywhere
is 20 times less work than finishing the typeless pointers, but maybe it is only
2 times less work. I have no clue and this is the kind of information that I
think prevents someone who needs to have an issue solved from making the right
choice, and conservatively choosing to go with the "bandaid".

-- 
Mehdi


> 
> The Pass Manager work is a bit different & harder to share at certain
points (I don't tihnk there's ever been a point at which someone could
ask me "what can I do to help with the typeless pointer work" I
couldn't've given some pretty large areas I wasn't anywhere near
touching that they could've gotten their teeth into) - I think it's
reaching the point where multiple passes can be ported independently & seen
some work like that in Polly, etc. So at this point it seems like if people want
to address the issues the new pass manager is aimed at addressing, they could
pitch in on that effort.
> 
> That said, I'm not in a position (nor would I do so, even if I were) to
block other patches, just to encourage people to help out on the bigger efforts
in whatever way they can (either directly, or indirectly through ensuring
stopgaps/new work is done in a way that makes that future work easier (where
it's reasonable to judge what might be easier or harder, etc), etc)
> 
> - Dave
>  
> 
> -- 
> Mehdi
> 
> 
>> On Mar 22, 2016, at 1:37 PM, Philip Reames <listmail at
philipreames.com <mailto:listmail at philipreames.com>> wrote:
>> 
>> But not what David was stating, unless I misread?  I was specifically
responding to David's wording:
>> "If we're talking about making an optimization able to ignore
the bitcast instructions - yes, that work is unnecessary & perhaps
questionable given the typeless pointer work. Not outright off limits, but the
same time might be better invested in moving typeless pointers forward if the
contributor is so inclined/able to shift in that direction."
>> 
>> Both "perhaps questionable" and "not outright off
limits" seem to strongly imply such work should be discouraged.  I disagree
with that view which is why I wrote my response.
>> 
>> Philip
>> 
>> On 03/22/2016 01:34 PM, Mehdi Amini wrote:
>>> This is roughly what I wrote...
>>> 
>>>> On Mar 22, 2016, at 1:31 PM, Philip Reames <listmail at
philipreames.com <mailto:listmail at philipreames.com>> wrote:
>>>> 
>>>> I feel very strongly that blocking work on making optimization
bitcast-ignorant on the typeless pointer work would be a major mistake.  Unless
we expected the typeless pointer work to be concluded within the near term (say
3-6 months maximum), we should not block any development which would be accepted
in the typeless pointer work wasn't planned.
>>>> 
>>>> In my view, this is one of the largest mistakes we've made
with the pass manager work, it has seriously cost us, and I don't want to
see it happening again.
>>>> 
>>>> Philip
>>>> 
>>>> On 03/22/2016 01:09 PM, David Blaikie wrote:
>>>>> Ultimately everything is going to be made to not rely on
the types of pointers - that's nearly equivalent to bitcast-ignorant (the
difference being that the presence of an extra instruction (the bitcast) might
trip up some optimizations - but the presence of the /type/ information implied
by the bitcast should not trip up or be necessary for optimizations (two sides
of the same coin))
>>>>> 
>>>>> If we're talking about making an optimization able to
ignore the bitcast instructions - yes, that work is unnecessary & perhaps
questionable given the typeless pointer work. Not outright off limits, but the
same time might be better invested in moving typeless pointers forward if the
contributor is so inclined/able to shift in that direction.
>>>>> 
>>>>> But if we're talking about the work to make the
optimization not use the type of pointers as a crutch - that work is a necessary
precursor to the typeless pointer work and would be most welcome.
>>>>> 
>>>>> - David
>>>>> 
>>>>> On Tue, Mar 22, 2016 at 12:39 PM, Mehdi Amini <
<mailto:mehdi.amini at apple.com>mehdi.amini at apple.com
<mailto:mehdi.amini at apple.com>> wrote:
>>>>> I don't really mind, but the intermediate stage will
not be very nice: that a lot of code / tests that needs to be written with
bitcast, and all of that while they are deemed to disappear. The added value
isn't clear to me considering the added work. I'm not sure it
wouldn't add more work for all the cleanup required by the "typeless
pointer", but I'm not sure what's involved here and if David thinks
the intermediate steps of handling bit casts everywhere is not making it harder
I'm fine with it.
>>>>> 
>>>>> -- 
>>>>> Mehdi
>>>>> 
>>>>>> On Mar 22, 2016, at 12:36 PM, Philip Reames <
<mailto:listmail at philipreames.com>listmail at philipreames.com
<mailto:listmail at philipreames.com>> wrote:
>>>>>> 
>>>>>> I'd phrase this differently: being pointer-bitcast
agnostic is a step towards support typeless pointers.  :)  We can either become
bitcast agnostic all in one big change or incrementally.  Personally, I'd
prefer the later since it reduces the risk associated with enabling typeless
pointers in the end.
>>>>>> 
>>>>>> Philip
>>>>>> 
>>>>>> On 03/22/2016 12:16 PM, Mehdi Amini via llvm-dev wrote:
>>>>>>> I don't know enough about the tradeoff for 1,
but 2 seems like a bandaid for something that is not a correctness issue neither
a regression. I'm not sure it justifies "bandaid patches" while
there is a clear path forward, i.e. typeless pointers, unless there is an
acknowledgement that typeless pointers won't be there before a couple of
years.
>>>>>>> 
>>>>>>> -- 
>>>>>>> Mehdi
>>>>>>> 
>>>>>>>> On Mar 22, 2016, at 11:32 AM, Ehsan Amiri <
<mailto:ehsanamiri at gmail.com>ehsanamiri at gmail.com
<mailto:ehsanamiri at gmail.com>> wrote:
>>>>>>>> 
>>>>>>>> Back to the discussion on the RFC, I still see
some advantage in following the proposed solution. I see two paths forward:
>>>>>>>> 
>>>>>>>> 1- Change canonical form, possibly lower memcpy
to non-integer load and store in InstCombine. Then teach the backends to convert
that to integer load and store if that is more profitable. Notice that we are
talking about loads that have no use other than store. So it is a fairly simple
change in the backends.
>>>>>>>> 
>>>>>>>> 2- Do not change the canonical form. Then for
this bug, we need to teach select speculation to see through bitcasts. We will
probably need to teach other optimizations to see though bitcasts in the future
as problems are uncovered. That is until typeless pointer work is complete. Once
the typeless pointer work is complete, we have some extra code in each
optimization for seeing through bitcasts which is possibly no longer needed.
>>>>>>>> 
>>>>>>>> Based on this I think (1) is the right thing to
do. But there might be other reasons for the current canonical form that I am
not aware of. Please let me know what you think.
>>>>>>>> 
>>>>>>>> Thanks
>>>>>>>> Ehsan
>>>>>>>> 
>>>>>>>> On Wed, Mar 16, 2016 at 2:13 PM, David Blaikie
< <mailto:dblaikie at gmail.com>dblaikie at gmail.com
<mailto:dblaikie at gmail.com>> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Wed, Mar 16, 2016 at 11:00 AM, Ehsan Amiri
< <mailto:ehsanamiri at gmail.com>ehsanamiri at gmail.com
<mailto:ehsanamiri at gmail.com>> wrote:
>>>>>>>> David,
>>>>>>>> 
>>>>>>>> Could you give us an update on the status of
typeless pointer work? How much work is left and when you think it might be
ready?
>>>>>>>> 
>>>>>>>> It's a bit of an onion peel, really - since
it will eventually involve generalizing/fixing every optimization that's
currently leaning on typed pointers to keep the performance while removing the
crutch they're currently leaning on. (in cases where bitcasts are literally
just getting in the way, those won't require cleaning up & should just
become "free performance wins" once we remove them, though)
>>>>>>>> 
>>>>>>>> At the moment we can roundtrip every LLVM IR
test case through bitcode and textual IR (reading/writing both formats) while
using only a narrow whitelist of places that request the type of a pointer
(things like the verifier, the parser/printer where it actually needs the typed
pointer to verify it matches the explicit type, etc).
>>>>>>>> 
>>>>>>>> The next thing on the list is probably figuring
out the byval/inalloca representation (add an explicit pointee type? just make
the number of bytes explicit with no type information?).
>>>>>>>> 
>>>>>>>> Then start migrating optimizations over - doing
the same sort of testing I did for the IR/bitcode roundtripping - assert that
the pointee type is not accessed, whitelist places that need it until the
bitcasts go away, fix anything else... it'll still be a fair bit of work
& I don't really know how much. It should parallelize pretty well (doing
any of this work is really helpful, each optimization is indepednent, etc) if
anyone wants to/is able to help.
>>>>>>>> 
>>>>>>>> - Dave
>>>>>>>>  
>>>>>>>> 
>>>>>>>> Thanks
>>>>>>>> Ehsan
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Wed, Mar 16, 2016 at 1:12 PM, David Blaikie
< <mailto:dblaikie at gmail.com>dblaikie at gmail.com
<mailto:dblaikie at gmail.com>> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Wed, Mar 16, 2016 at 8:34 AM, Mehdi Amini
via llvm-dev < <mailto:llvm-dev at lists.llvm.org>llvm-dev at
lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> How do it interact with the "typeless
pointers" work?
>>>>>>>> 
>>>>>>>> Right - the goal of the typeless pointer work
is to fix all these bugs related to "didn't look through bitcasts"
in optimizations. Sometimes that's going to mean more work (because the code
is leaning on the absence of bitcasts & the presence of convenient (but not
guaranteed) type information to inform optimization decisions) but if we remove
typed pointers while keeping optimization quality in the cases we have today,
then we should've also fixed the cases that were broken because the type
information didn't end up aligning to produce the optimal output.
>>>>>>>> 
>>>>>>>> & I know I've been off the typeless
pointer stuff for a few months working on llvm-dwp - but happy for any help (the
next immediate piece is probably figuring out teh right representation for byval
and inalloca - there were some code reviews sent out for that that I'll need
to come back around to - but also any optimizations people want to help
rework/improve would be great too & I can provide some techniques/tools to
help people approach those)
>>>>>>>> 
>>>>>>>> - Dave
>>>>>>>>  
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> 
>>>>>>>> -- 
>>>>>>>> Mehdi
>>>>>>>> 
>>>>>>>>> On Mar 16, 2016, at 6:41 AM, Ehsan Amiri
via llvm-dev < <mailto:llvm-dev at lists.llvm.org>llvm-dev at
lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>>>>>>>> 
>>>>>>>>> === PROBLEM === (See this bug 
<https://llvm.org/bugs/show_bug.cgi?id=26445>https://llvm.org/bugs/show_bug.cgi?id=26445
<https://llvm.org/bugs/show_bug.cgi?id=26445>)
>>>>>>>>> 
>>>>>>>>> IR contains code for loading a float from
float * and storing it to a float * address. After canonicalization of load in
InstCombine [1], new bitcasts are added to the IR (see bottom of the email for
code samples). This prevents select speculation in SROA to work. Also after SROA
we have bitcasts from int32 to float. (Whereas originally after instCombine,
bitcasts are only done on pointer types).
>>>>>>>>> 
>>>>>>>>> === PROPOSED
SOLUTION==>>>>>>>>>
>>>>>>>>> [1] implies that we need load
canonicalization when we load a value only to store it again. The reason is to
avoid generating slightly different code (due to different ways of adding
bitcasts), in different situations. In all examples presented in [1] there is a
non-zero number of bitcasts. I think when we load a value of type T from a T*
address and store it as a type T value to one or more T* address (and there is
no other use or store), we can redefine canonical form to mean there should not
be any bitcasts. So we still have a canonical form, but its definition is
slightly different.
>>>>>>>>> 
>>>>>>>>> === REASONS FOR /
AGAINST==>>>>>>>>>
>>>>>>>>> Hal Finkel warns that while this may be
useful for power pc, this may hurt more than one other platform and become a
very large project. Despite this he is fine with bringing up the issue to the
mailing list to get feedback, mostly because this seems inline with our future
direction of having a unique type for all pointers.  (Hal please correct me if I
misunderstood your comment)
>>>>>>>>> 
>>>>>>>>> This is a much simpler fix compared to
alternatives. (ignoring potential regressions)
>>>>>>>>> 
>>>>>>>>> === ALTERNATIVE SOLUTION
==>>>>>>>>>
>>>>>>>>> Fix select speculation in SROA to see
through bitcasts. Handle remaining bitcasts during code gen. Other alternative
solutions are welcome.
>>>>>>>>> 
>>>>>>>>> Should I implement the proposed solution or
is it too risky? I understand that we may need to undo it if it breaks too many
things. Comments are welcome.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> [1] 
<http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html>http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html
<http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html> 
r226781  git commit id: b778cbc0c8
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Code Samples (only relevant part is
copied):
>>>>>>>>> 
>>>>>>>>> --------------------  Before
Canonicalization (contains call to std::max): --------------------
>>>>>>>>> entry:
>>>>>>>>>   %max_value = alloca float, align 4
>>>>>>>>>   %1 = load float, float* %input, align 4,
!tbaa !1
>>>>>>>>>   store float %1, float* %max_value, align
4, !tbaa !1
>>>>>>>>> 
>>>>>>>>> for.body:
>>>>>>>>>   %call = call dereferenceable(4) float*
@_ZSt3maxIfERKT_S2_S2_(float* dereferenceable(4) %max_value, float*
dereferenceable(4) %arrayidx1)
>>>>>>>>>   %3 = load float, float* %call, align 4,
!tbaa !1
>>>>>>>>>   store float %3, float* %max_value, align
4, !tbaa !1
>>>>>>>>> 
>>>>>>>>> --------------------  After
Canonicalization (contains call to std::max):--------------------
>>>>>>>>> 
>>>>>>>>> entry:
>>>>>>>>>   %max_value = alloca float, align 4
>>>>>>>>>   %1 = bitcast float* %input to i32*
>>>>>>>>>   %2 = load i32, i32* %1, align 4, !tbaa !1
>>>>>>>>>   %3 = bitcast float* %max_value to i32*
>>>>>>>>>   store i32 %2, i32* %3, align 4, !tbaa !1
>>>>>>>>> 
>>>>>>>>> for.body:
>>>>>>>>>   %call = call dereferenceable(4) float*
@_ZSt3maxIfERKT_S2_S2_(float* nonnull dereferenceable(4) %max_value, float*
dereferenceable(4) %arrayidx1)
>>>>>>>>>   %5 = bitcast float* %call to i32*
>>>>>>>>>   %6 = load i32, i32* %5, align 4, !tbaa !1
>>>>>>>>>   %7 = bitcast float* %max_value to i32*
>>>>>>>>>   store i32 %6, i32* %7, align 4, !tbaa !1
>>>>>>>>> 
>>>>>>>>> -------------------- After SROA (the call
to std::max is inlined now):--------------------
>>>>>>>>> entry:
>>>>>>>>>   %max_value.sroa.0 = alloca i32
>>>>>>>>>   %0 = bitcast float* %input to i32*
>>>>>>>>>   %1 = load i32, i32* %0, align 4, !tbaa !1
>>>>>>>>>   store i32 %1, i32* %max_value.sroa.0
>>>>>>>>> 
>>>>>>>>> for.body: 
>>>>>>>>>   %max_value.sroa.0.0.max_value.sroa.0.0.6
= load i32, i32* %max_value.sroa.0
>>>>>>>>>   %3 = bitcast i32
%max_value.sroa.0.0.max_value.sroa.0.0.6 to float
>>>>>>>>>   %max_value.sroa.0.0.max_value.sroa_cast8
= bitcast i32* %max_value.sroa.0 to float*
>>>>>>>>>   %__b.__a.i = select i1 %cmp.i, float*
%arrayidx1, float* %max_value.sroa.0.0.max_value.sroa_cast8
>>>>>>>>>   %5 = bitcast float* %__b.__a.i to i32*
>>>>>>>>>   %6 = load i32, i32* %5, align 4, !tbaa !1
>>>>>>>>>   store i32 %6, i32* %max_value.sroa.0
>>>>>>>>> 
>>>>>>>>> -------------------- After SROA when
Canonicalization is turned off--------------------
>>>>>>>>> entry:
>>>>>>>>>   %0 = load float, float* %input, align 4,
!tbaa !1
>>>>>>>>> 
>>>>>>>>> for.cond:                                  
; preds = %for.body, %entry
>>>>>>>>>   %max_value.0 = phi float [ %0, %entry ],
[ %.sroa.speculated, %for.body ]
>>>>>>>>> 
>>>>>>>>> for.body:
>>>>>>>>>   %1 = load float, float* %arrayidx1, align
4, !tbaa !1
>>>>>>>>>   %cmp.i = fcmp olt float %max_value.0, %1
>>>>>>>>>   %.sroa.speculate.load.true = load float,
float* %arrayidx1, align 4, !tbaa !1
>>>>>>>>>   %.sroa.speculated = select i1 %cmp.i,
float %.sroa.speculate.load.true, float %max_value.0
>>>>>>>>>
_______________________________________________
>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>  <mailto:llvm-dev at
lists.llvm.org>llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
>>>>>>>>> 
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>>>>>>>> 
>>>>>>>> 
>>>>>>>> _______________________________________________
>>>>>>>> LLVM Developers mailing list
>>>>>>>>  <mailto:llvm-dev at
lists.llvm.org>llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
>>>>>>>> 
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 
> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160322/7f4e3571/attachment-0001.html>

Ehsan Amiri via llvm-dev

2016-Mar-22 21:44 UTC

head link

[llvm-dev] RFC: A change in InstCombine canonical form

Thanks.

*Phillip, *As Hal said I do not think (1) is a very large item. Please let
me know if I am mistaken.

*David *I think (1) is more inline with typeless pointer work than (2).
Contributing to typeless pointer work will be great, but given its unknown
time frame we cannot stop fixing existing problems. Of course, we should
follow an approach consistent with the long-term solution.

On Tue, Mar 22, 2016 at 5:30 PM, David Blaikie <dblaikie at gmail.com>
wrote:
>
>
> On Tue, Mar 22, 2016 at 1:41 PM, Mehdi Amini <mehdi.amini at
apple.com>
> wrote:
>
>> Sorry I should have been more clear (writing to many email in parallel)
>>
>> You're right. I was adding a +1 to you here.
>>
>> Especially I wrote "unless there is an acknowledgement that
typeless
>> pointers won't be there before a couple of years" with  the
PassManager in
>> mind, and I was expecting from David some good indication of a
timeframe
>> for the typeless pointers.
>> If the typeless pointer work is stalled or if it is not planned for
LLVM
>> 3.9,
>>
>
> It's neither stalled nor planned, as such.
>
>
>> I agree with Philip to not block anything.
>>
>
> All I'm suggesting is that if there are people who want to fix these
bugs,
> I'd really appreciate them helping out on the typeless pointer work -
it's
> totally parallelizable/shardable/shareable work & the project as a
whole
> seemed to agree it was the right direction. Why not just get it done?
>
> The Pass Manager work is a bit different & harder to share at certain
> points (I don't tihnk there's ever been a point at which someone
could ask
> me "what can I do to help with the typeless pointer work" I
couldn't've
> given some pretty large areas I wasn't anywhere near touching that they
> could've gotten their teeth into) - I think it's reaching the point
where
> multiple passes can be ported independently & seen some work like that
in
> Polly, etc. So at this point it seems like if people want to address the
> issues the new pass manager is aimed at addressing, they could pitch in on
> that effort.
>
> That said, I'm not in a position (nor would I do so, even if I were) to
> block other patches, just to encourage people to help out on the bigger
> efforts in whatever way they can (either directly, or indirectly through
> ensuring stopgaps/new work is done in a way that makes that future work
> easier (where it's reasonable to judge what might be easier or harder,
> etc), etc)
>
> - Dave
>
>
>>
>> --
>> Mehdi
>>
>>
>> On Mar 22, 2016, at 1:37 PM, Philip Reames <listmail at
philipreames.com>
>> wrote:
>>
>> But not what David was stating, unless I misread?  I was specifically
>> responding to David's wording:
>> "If we're talking about making an optimization able to ignore
the bitcast
>> instructions - yes, that work is unnecessary & perhaps questionable
given
>> the typeless pointer work. Not outright off limits, but the same time
might
>> be better invested in moving typeless pointers forward if the
contributor
>> is so inclined/able to shift in that direction."
>>
>> Both "perhaps questionable" and "not outright off
limits" seem to
>> strongly imply such work should be discouraged.  I disagree with that
view
>> which is why I wrote my response.
>>
>> Philip
>>
>> On 03/22/2016 01:34 PM, Mehdi Amini wrote:
>>
>> This is roughly what I wrote...
>>
>> On Mar 22, 2016, at 1:31 PM, Philip Reames <listmail at
philipreames.com>
>> wrote:
>>
>> I feel very strongly that blocking work on making optimization
>> bitcast-ignorant on the typeless pointer work would be a major mistake.
>> Unless we expected the typeless pointer work to be concluded within the
>> near term (say 3-6 months maximum), we should not block any development
>> which would be accepted in the typeless pointer work wasn't
planned.
>>
>> In my view, this is one of the largest mistakes we've made with the
pass
>> manager work, it has seriously cost us, and I don't want to see it
>> happening again.
>>
>> Philip
>>
>> On 03/22/2016 01:09 PM, David Blaikie wrote:
>>
>> Ultimately everything is going to be made to not rely on the types of
>> pointers - that's nearly equivalent to bitcast-ignorant (the
difference
>> being that the presence of an extra instruction (the bitcast) might
trip up
>> some optimizations - but the presence of the /type/ information implied
by
>> the bitcast should not trip up or be necessary for optimizations (two
sides
>> of the same coin))
>>
>> If we're talking about making an optimization able to ignore the
bitcast
>> instructions - yes, that work is unnecessary & perhaps questionable
given
>> the typeless pointer work. Not outright off limits, but the same time
might
>> be better invested in moving typeless pointers forward if the
contributor
>> is so inclined/able to shift in that direction.
>>
>> But if we're talking about the work to make the optimization not
use the
>> type of pointers as a crutch - that work is a necessary precursor to
the
>> typeless pointer work and would be most welcome.
>>
>> - David
>>
>> On Tue, Mar 22, 2016 at 12:39 PM, Mehdi Amini < <mehdi.amini at
apple.com>
>> mehdi.amini at apple.com> wrote:
>>
>>> I don't really mind, but the intermediate stage will not be
very nice:
>>> that a lot of code / tests that needs to be written with bitcast,
and all
>>> of that while they are deemed to disappear. The added value
isn't clear to
>>> me considering the added work. I'm not sure it wouldn't add
more work for
>>> all the cleanup required by the "typeless pointer", but
I'm not sure what's
>>> involved here and if David thinks the intermediate steps of
handling bit
>>> casts everywhere is not making it harder I'm fine with it.
>>>
>>> --
>>> Mehdi
>>>
>>> On Mar 22, 2016, at 12:36 PM, Philip Reames <
>>> <listmail at philipreames.com>listmail at
philipreames.com> wrote:
>>>
>>> I'd phrase this differently: being pointer-bitcast agnostic is
a step
>>> towards support typeless pointers.  :)  We can either become
bitcast
>>> agnostic all in one big change or incrementally.  Personally,
I'd prefer
>>> the later since it reduces the risk associated with enabling
typeless
>>> pointers in the end.
>>>
>>> Philip
>>>
>>> On 03/22/2016 12:16 PM, Mehdi Amini via llvm-dev wrote:
>>>
>>> I don't know enough about the tradeoff for 1, but 2 seems like
a bandaid
>>> for something that is not a correctness issue neither a regression.
I'm not
>>> sure it justifies "bandaid patches" while there is a
clear path forward,
>>> i.e. typeless pointers, unless there is an acknowledgement that
typeless
>>> pointers won't be there before a couple of years.
>>>
>>> --
>>> Mehdi
>>>
>>> On Mar 22, 2016, at 11:32 AM, Ehsan Amiri < <ehsanamiri at
gmail.com>
>>> ehsanamiri at gmail.com> wrote:
>>>
>>> Back to the discussion on the RFC, I still see some advantage in
>>> following the proposed solution. I see two paths forward:
>>>
>>> 1- Change canonical form, possibly lower memcpy to non-integer load
and
>>> store in InstCombine. Then teach the backends to convert that to
integer
>>> load and store if that is more profitable. Notice that we are
talking about
>>> loads that have no use other than store. So it is a fairly simple
change in
>>> the backends.
>>>
>>> 2- Do not change the canonical form. Then for this bug, we need to
teach
>>> select speculation to see through bitcasts. We will probably need
to teach
>>> other optimizations to see though bitcasts in the future as
problems are
>>> uncovered. That is until typeless pointer work is complete. Once
the
>>> typeless pointer work is complete, we have some extra code in each
>>> optimization for seeing through bitcasts which is possibly no
longer needed.
>>>
>>> Based on this I think (1) is the right thing to do. But there might
be
>>> other reasons for the current canonical form that I am not aware
of. Please
>>> let me know what you think.
>>>
>>> Thanks
>>> Ehsan
>>>
>>> On Wed, Mar 16, 2016 at 2:13 PM, David Blaikie < <dblaikie at
gmail.com>
>>> dblaikie at gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Mar 16, 2016 at 11:00 AM, Ehsan Amiri <
<ehsanamiri at gmail.com>
>>>> ehsanamiri at gmail.com> wrote:
>>>>
>>>>> David,
>>>>>
>>>>> Could you give us an update on the status of typeless
pointer work?
>>>>> How much work is left and when you think it might be ready?
>>>>>
>>>>
>>>> It's a bit of an onion peel, really - since it will
eventually involve
>>>> generalizing/fixing every optimization that's currently
leaning on typed
>>>> pointers to keep the performance while removing the crutch
they're
>>>> currently leaning on. (in cases where bitcasts are literally
just getting
>>>> in the way, those won't require cleaning up & should
just become "free
>>>> performance wins" once we remove them, though)
>>>>
>>>> At the moment we can roundtrip every LLVM IR test case through
bitcode
>>>> and textual IR (reading/writing both formats) while using only
a narrow
>>>> whitelist of places that request the type of a pointer (things
like the
>>>> verifier, the parser/printer where it actually needs the typed
pointer to
>>>> verify it matches the explicit type, etc).
>>>>
>>>> The next thing on the list is probably figuring out the
byval/inalloca
>>>> representation (add an explicit pointee type? just make the
number of bytes
>>>> explicit with no type information?).
>>>>
>>>> Then start migrating optimizations over - doing the same sort
of
>>>> testing I did for the IR/bitcode roundtripping - assert that
the pointee
>>>> type is not accessed, whitelist places that need it until the
bitcasts go
>>>> away, fix anything else... it'll still be a fair bit of
work & I don't
>>>> really know how much. It should parallelize pretty well (doing
any of this
>>>> work is really helpful, each optimization is indepednent, etc)
if anyone
>>>> wants to/is able to help.
>>>>
>>>> - Dave
>>>>
>>>>
>>>>>
>>>>> Thanks
>>>>> Ehsan
>>>>>
>>>>>
>>>>> On Wed, Mar 16, 2016 at 1:12 PM, David Blaikie <
<dblaikie at gmail.com>
>>>>> dblaikie at gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 16, 2016 at 8:34 AM, Mehdi Amini via
llvm-dev <
>>>>>> <llvm-dev at lists.llvm.org>llvm-dev at
lists.llvm.org> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> How do it interact with the "typeless
pointers" work?
>>>>>>>
>>>>>>
>>>>>> Right - the goal of the typeless pointer work is to fix
all these
>>>>>> bugs related to "didn't look through
bitcasts" in optimizations. Sometimes
>>>>>> that's going to mean more work (because the code is
leaning on the absence
>>>>>> of bitcasts & the presence of convenient (but not
guaranteed) type
>>>>>> information to inform optimization decisions) but if we
remove typed
>>>>>> pointers while keeping optimization quality in the
cases we have today,
>>>>>> then we should've also fixed the cases that were
broken because the type
>>>>>> information didn't end up aligning to produce the
optimal output.
>>>>>>
>>>>>> & I know I've been off the typeless pointer
stuff for a few months
>>>>>> working on llvm-dwp - but happy for any help (the next
immediate piece is
>>>>>> probably figuring out teh right representation for
byval and inalloca -
>>>>>> there were some code reviews sent out for that that
I'll need to come back
>>>>>> around to - but also any optimizations people want to
help rework/improve
>>>>>> would be great too & I can provide some
techniques/tools to help people
>>>>>> approach those)
>>>>>>
>>>>>> - Dave
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> --
>>>>>>> Mehdi
>>>>>>>
>>>>>>> On Mar 16, 2016, at 6:41 AM, Ehsan Amiri via
llvm-dev <
>>>>>>> <llvm-dev at lists.llvm.org>llvm-dev at
lists.llvm.org> wrote:
>>>>>>>
>>>>>>> === PROBLEM === (See this bug
>>>>>>> <https://llvm.org/bugs/show_bug.cgi?id=26445>
>>>>>>> https://llvm.org/bugs/show_bug.cgi?id=26445)
>>>>>>>
>>>>>>> IR contains code for loading a float from float *
and storing it to
>>>>>>> a float * address. After canonicalization of load
in InstCombine [1], new
>>>>>>> bitcasts are added to the IR (see bottom of the
email for code samples).
>>>>>>> This prevents select speculation in SROA to work.
Also after SROA we have
>>>>>>> bitcasts from int32 to float. (Whereas originally
after instCombine,
>>>>>>> bitcasts are only done on pointer types).
>>>>>>>
>>>>>>> === PROPOSED SOLUTION==>>>>>>>
>>>>>>> [1] implies that we need load canonicalization when
we load a value
>>>>>>> only to store it again. The reason is to avoid
generating slightly
>>>>>>> different code (due to different ways of adding
bitcasts), in different
>>>>>>> situations. In all examples presented in [1] there
is a non-zero number of
>>>>>>> bitcasts. I think when we load a value of type T
from a T* address and
>>>>>>> store it as a type T value to one or more T*
address (and there is no other
>>>>>>> use or store), we can redefine canonical form to
mean there should not be
>>>>>>> any bitcasts. So we still have a canonical form,
but its definition is
>>>>>>> slightly different.
>>>>>>>
>>>>>>> === REASONS FOR /
AGAINST==>>>>>>>
>>>>>>> Hal Finkel warns that while this may be useful for
power pc, this
>>>>>>> may hurt more than one other platform and become a
very large project.
>>>>>>> Despite this he is fine with bringing up the issue
to the mailing list to
>>>>>>> get feedback, mostly because this seems inline with
our future direction of
>>>>>>> having a unique type for all pointers.  (Hal please
correct me if I
>>>>>>> misunderstood your comment)
>>>>>>>
>>>>>>> This is a much simpler fix compared to
alternatives. (ignoring
>>>>>>> potential regressions)
>>>>>>>
>>>>>>> === ALTERNATIVE SOLUTION
==>>>>>>>
>>>>>>> Fix select speculation in SROA to see through
bitcasts. Handle
>>>>>>> remaining bitcasts during code gen. Other
alternative solutions are welcome.
>>>>>>>
>>>>>>> Should I implement the proposed solution or is it
too risky? I
>>>>>>> understand that we may need to undo it if it breaks
too many things.
>>>>>>> Comments are welcome.
>>>>>>>
>>>>>>>
>>>>>>> [1]
>>>>>>>
<http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html>
>>>>>>>
http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html
>>>>>>> r226781  git commit id: b778cbc0c8
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Code Samples (only relevant part is copied):
>>>>>>>
>>>>>>> --------------------  Before Canonicalization
(contains call to
>>>>>>> std::max): --------------------
>>>>>>> entry:
>>>>>>>   %max_value = alloca float, align 4
>>>>>>>   %1 = load float, float* %input, align 4, !tbaa !1
>>>>>>>   store float %1, float* %max_value, align 4, !tbaa
!1
>>>>>>>
>>>>>>> for.body:
>>>>>>>   %call = call dereferenceable(4) float*
>>>>>>> @_ZSt3maxIfERKT_S2_S2_(float* dereferenceable(4)
%max_value, float*
>>>>>>> dereferenceable(4) %arrayidx1)
>>>>>>>   %3 = load float, float* %call, align 4, !tbaa !1
>>>>>>>   store float %3, float* %max_value, align 4, !tbaa
!1
>>>>>>>
>>>>>>> --------------------  After Canonicalization
(contains call to
>>>>>>> std::max):--------------------
>>>>>>>
>>>>>>> entry:
>>>>>>>   %max_value = alloca float, align 4
>>>>>>>   %1 = bitcast float* %input to i32*
>>>>>>>   %2 = load i32, i32* %1, align 4, !tbaa !1
>>>>>>>   %3 = bitcast float* %max_value to i32*
>>>>>>>   store i32 %2, i32* %3, align 4, !tbaa !1
>>>>>>>
>>>>>>> for.body:
>>>>>>>   %call = call dereferenceable(4) float*
>>>>>>> @_ZSt3maxIfERKT_S2_S2_(float* nonnull
dereferenceable(4) %max_value, float*
>>>>>>> dereferenceable(4) %arrayidx1)
>>>>>>>   %5 = bitcast float* %call to i32*
>>>>>>>   %6 = load i32, i32* %5, align 4, !tbaa !1
>>>>>>>   %7 = bitcast float* %max_value to i32*
>>>>>>>   store i32 %6, i32* %7, align 4, !tbaa !1
>>>>>>>
>>>>>>> -------------------- After SROA (the call to
std::max is inlined
>>>>>>> now):--------------------
>>>>>>> entry:
>>>>>>>   %max_value.sroa.0 = alloca i32
>>>>>>>   %0 = bitcast float* %input to i32*
>>>>>>>   %1 = load i32, i32* %0, align 4, !tbaa !1
>>>>>>>   store i32 %1, i32* %max_value.sroa.0
>>>>>>>
>>>>>>> for.body:
>>>>>>>   %max_value.sroa.0.0.max_value.sroa.0.0.6 = load
i32, i32*
>>>>>>> %max_value.sroa.0
>>>>>>>   %3 = bitcast i32
%max_value.sroa.0.0.max_value.sroa.0.0.6 to float
>>>>>>>   %max_value.sroa.0.0.max_value.sroa_cast8 =
bitcast i32*
>>>>>>> %max_value.sroa.0 to float*
>>>>>>>   %__b.__a.i = select i1 %cmp.i, float* %arrayidx1,
float*
>>>>>>> %max_value.sroa.0.0.max_value.sroa_cast8
>>>>>>>   %5 = bitcast float* %__b.__a.i to i32*
>>>>>>>   %6 = load i32, i32* %5, align 4, !tbaa !1
>>>>>>>   store i32 %6, i32* %max_value.sroa.0
>>>>>>>
>>>>>>> -------------------- After SROA when
Canonicalization is turned
>>>>>>> off--------------------
>>>>>>> entry:
>>>>>>>   %0 = load float, float* %input, align 4, !tbaa !1
>>>>>>>
>>>>>>> for.cond:                                         ;
preds >>>>>>> %for.body, %entry
>>>>>>>   %max_value.0 = phi float [ %0, %entry ], [
%.sroa.speculated,
>>>>>>> %for.body ]
>>>>>>>
>>>>>>> for.body:
>>>>>>>   %1 = load float, float* %arrayidx1, align 4,
!tbaa !1
>>>>>>>   %cmp.i = fcmp olt float %max_value.0, %1
>>>>>>>   %.sroa.speculate.load.true = load float, float*
%arrayidx1, align
>>>>>>> 4, !tbaa !1
>>>>>>>   %.sroa.speculated = select i1 %cmp.i, float
>>>>>>> %.sroa.speculate.load.true, float %max_value.0
>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> <llvm-dev at lists.llvm.org>llvm-dev at
lists.llvm.org
>>>>>>>
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> <llvm-dev at lists.llvm.org>llvm-dev at
lists.llvm.org
>>>>>>>
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing listllvm-dev at
lists.llvm.orghttp://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160322/d044cc07/attachment.html>

Philip Reames via llvm-dev

2016-Mar-22 22:42 UTC

head link

[llvm-dev] RFC: A change in InstCombine canonical form

On 03/22/2016 02:44 PM, Ehsan Amiri wrote:> Thanks.
>
> *Phillip, *As Hal said I do not think (1) is a very large item. Please 
> let me know if I am mistaken.I have no specific reason to believe it will be a large amount of work, 
but prior experience tells me changes to canonical form have a tendency 
of exposing unexpected issues.  To be clear, I am supportive of you 
implementing solution 1.>
> *David *I think (1) is more inline with typeless pointer work than 
> (2). Contributing to typeless pointer work will be great, but given 
> its unknown time frame we cannot stop fixing existing problems. Of 
> course, we should follow an approach consistent with the long-term 
> solution.
>
> On Tue, Mar 22, 2016 at 5:30 PM, David Blaikie <dblaikie at gmail.com 
> <mailto:dblaikie at gmail.com>> wrote:
>
>
>
>     On Tue, Mar 22, 2016 at 1:41 PM, Mehdi Amini
>     <mehdi.amini at apple.com <mailto:mehdi.amini at
apple.com>> wrote:
>
>         Sorry I should have been more clear (writing to many email in
>         parallel)
>
>         You're right. I was adding a +1 to you here.
>
>         Especially I wrote "unless there is an acknowledgement that
>         typeless pointers won't be there before a couple of years"
>         with  the PassManager in mind, and I was expecting from David
>         some good indication of a timeframe for the typeless pointers.
>         If the typeless pointer work is stalled or if it is not
>         planned for LLVM 3.9,
>
>
>     It's neither stalled nor planned, as such.
>
>         I agree with Philip to not block anything.
>
>
>     All I'm suggesting is that if there are people who want to fix
>     these bugs, I'd really appreciate them helping out on the typeless
>     pointer work - it's totally parallelizable/shardable/shareable
>     work & the project as a whole seemed to agree it was the right
>     direction. Why not just get it done?
>
>     The Pass Manager work is a bit different & harder to share at
>     certain points (I don't tihnk there's ever been a point at
which
>     someone could ask me "what can I do to help with the typeless
>     pointer work" I couldn't've given some pretty large areas
I wasn't
>     anywhere near touching that they could've gotten their teeth into)
>     - I think it's reaching the point where multiple passes can be
>     ported independently & seen some work like that in Polly, etc. So
>     at this point it seems like if people want to address the issues
>     the new pass manager is aimed at addressing, they could pitch in
>     on that effort.
>
>     That said, I'm not in a position (nor would I do so, even if I
>     were) to block other patches, just to encourage people to help out
>     on the bigger efforts in whatever way they can (either directly,
>     or indirectly through ensuring stopgaps/new work is done in a way
>     that makes that future work easier (where it's reasonable to judge
>     what might be easier or harder, etc), etc)
>
>     - Dave
>
>
>         -- 
>         Mehdi
>
>
>>         On Mar 22, 2016, at 1:37 PM, Philip Reames
>>         <listmail at philipreames.com
>>         <mailto:listmail at philipreames.com>> wrote:
>>
>>         But not what David was stating, unless I misread?  I was
>>         specifically responding to David's wording:
>>         "If we're talking about making an optimization able to
ignore
>>         the bitcast instructions - yes, that work is unnecessary &
>>         perhaps questionable given the typeless pointer work. Not
>>         outright off limits, but the same time might be better
>>         invested in moving typeless pointers forward if the
>>         contributor is so inclined/able to shift in that
direction."
>>
>>         Both "perhaps questionable" and "not outright
off limits"
>>         seem to strongly imply such work should be discouraged.  I
>>         disagree with that view which is why I wrote my response.
>>
>>         Philip
>>
>>         On 03/22/2016 01:34 PM, Mehdi Amini wrote:
>>>         This is roughly what I wrote...
>>>
>>>>         On Mar 22, 2016, at 1:31 PM, Philip Reames
>>>>         <listmail at philipreames.com
>>>>         <mailto:listmail at philipreames.com>> wrote:
>>>>
>>>>         I feel very strongly that blocking work on making
>>>>         optimization bitcast-ignorant on the typeless pointer
work
>>>>         would be a major mistake.  Unless we expected the
typeless
>>>>         pointer work to be concluded within the near term (say
3-6
>>>>         months maximum), we should not block any development
which
>>>>         would be accepted in the typeless pointer work
wasn't planned.
>>>>
>>>>         In my view, this is one of the largest mistakes
we've made
>>>>         with the pass manager work, it has seriously cost us,
and I
>>>>         don't want to see it happening again.
>>>>
>>>>         Philip
>>>>
>>>>         On 03/22/2016 01:09 PM, David Blaikie wrote:
>>>>>         Ultimately everything is going to be made to not
rely on
>>>>>         the types of pointers - that's nearly
equivalent to
>>>>>         bitcast-ignorant (the difference being that the
presence
>>>>>         of an extra instruction (the bitcast) might trip up
some
>>>>>         optimizations - but the presence of the /type/
information
>>>>>         implied by the bitcast should not trip up or be
necessary
>>>>>         for optimizations (two sides of the same coin))
>>>>>
>>>>>         If we're talking about making an optimization
able to
>>>>>         ignore the bitcast instructions - yes, that work is
>>>>>         unnecessary & perhaps questionable given the
typeless
>>>>>         pointer work. Not outright off limits, but the same
time
>>>>>         might be better invested in moving typeless
pointers
>>>>>         forward if the contributor is so inclined/able to
shift in
>>>>>         that direction.
>>>>>
>>>>>         But if we're talking about the work to make the
>>>>>         optimization not use the type of pointers as a
crutch -
>>>>>         that work is a necessary precursor to the typeless
pointer
>>>>>         work and would be most welcome.
>>>>>
>>>>>         - David
>>>>>
>>>>>         On Tue, Mar 22, 2016 at 12:39 PM, Mehdi Amini
>>>>>         <mehdi.amini at apple.com <mailto:mehdi.amini
at apple.com>> wrote:
>>>>>
>>>>>             I don't really mind, but the intermediate
stage will
>>>>>             not be very nice: that a lot of code / tests
that
>>>>>             needs to be written with bitcast, and all of
that
>>>>>             while they are deemed to disappear. The added
value
>>>>>             isn't clear to me considering the added
work. I'm not
>>>>>             sure it wouldn't add more work for all the
cleanup
>>>>>             required by the "typeless pointer",
but I'm not sure
>>>>>             what's involved here and if David thinks
the
>>>>>             intermediate steps of handling bit casts
everywhere is
>>>>>             not making it harder I'm fine with it.
>>>>>
>>>>>             -- 
>>>>>             Mehdi
>>>>>
>>>>>>             On Mar 22, 2016, at 12:36 PM, Philip Reames
>>>>>>             <listmail at philipreames.com
>>>>>>             <mailto:listmail at
philipreames.com>> wrote:
>>>>>>
>>>>>>             I'd phrase this differently: being
pointer-bitcast
>>>>>>             agnostic is a step towards support typeless
>>>>>>             pointers.  :) We can either become bitcast
agnostic
>>>>>>             all in one big change or incrementally.
Personally,
>>>>>>             I'd prefer the later since it reduces
the risk
>>>>>>             associated with enabling typeless pointers
in the end.
>>>>>>
>>>>>>             Philip
>>>>>>
>>>>>>             On 03/22/2016 12:16 PM, Mehdi Amini via
llvm-dev wrote:
>>>>>>>             I don't know enough about the
tradeoff for 1, but 2
>>>>>>>             seems like a bandaid for something that
is not a
>>>>>>>             correctness issue neither a regression.
I'm not sure
>>>>>>>             it justifies "bandaid
patches" while there is a
>>>>>>>             clear path forward, i.e. typeless
pointers, unless
>>>>>>>             there is an acknowledgement that
typeless pointers
>>>>>>>             won't be there before a couple of
years.
>>>>>>>
>>>>>>>             -- 
>>>>>>>             Mehdi
>>>>>>>
>>>>>>>>             On Mar 22, 2016, at 11:32 AM, Ehsan
Amiri
>>>>>>>>             <ehsanamiri at gmail.com
>>>>>>>>             <mailto:ehsanamiri at
gmail.com>> wrote:
>>>>>>>>
>>>>>>>>             Back to the discussion on the RFC,
I still see some
>>>>>>>>             advantage in following the proposed
solution. I see
>>>>>>>>             two paths forward:
>>>>>>>>
>>>>>>>>             1- Change canonical form, possibly
lower memcpy to
>>>>>>>>             non-integer load and store in
InstCombine. Then
>>>>>>>>             teach the backends to convert that
to integer load
>>>>>>>>             and store if that is more
profitable. Notice that
>>>>>>>>             we are talking about loads that
have no use other
>>>>>>>>             than store. So it is a fairly
simple change in the
>>>>>>>>             backends.
>>>>>>>>
>>>>>>>>             2- Do not change the canonical
form. Then for this
>>>>>>>>             bug, we need to teach select
speculation to see
>>>>>>>>             through bitcasts. We will probably
need to teach
>>>>>>>>             other optimizations to see though
bitcasts in the
>>>>>>>>             future as problems are uncovered.
That is until
>>>>>>>>             typeless pointer work is complete.
Once the
>>>>>>>>             typeless pointer work is complete,
we have some
>>>>>>>>             extra code in each optimization for
seeing through
>>>>>>>>             bitcasts which is possibly no
longer needed.
>>>>>>>>
>>>>>>>>             Based on this I think (1) is the
right thing to do.
>>>>>>>>             But there might be other reasons
for the current
>>>>>>>>             canonical form that I am not aware
of. Please let
>>>>>>>>             me know what you think.
>>>>>>>>
>>>>>>>>             Thanks
>>>>>>>>             Ehsan
>>>>>>>>
>>>>>>>>             On Wed, Mar 16, 2016 at 2:13 PM,
David Blaikie
>>>>>>>>             <dblaikie at gmail.com
<mailto:dblaikie at gmail.com>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                 On Wed, Mar 16, 2016 at 11:00
AM, Ehsan Amiri
>>>>>>>>                 <ehsanamiri at gmail.com
>>>>>>>>                 <mailto:ehsanamiri at
gmail.com>> wrote:
>>>>>>>>
>>>>>>>>                     David,
>>>>>>>>
>>>>>>>>                     Could you give us an update
on the status
>>>>>>>>                     of typeless pointer work?
How much work is
>>>>>>>>                     left and when you think it
might be ready?
>>>>>>>>
>>>>>>>>
>>>>>>>>                 It's a bit of an onion
peel, really - since it
>>>>>>>>                 will eventually involve
generalizing/fixing
>>>>>>>>                 every optimization that's
currently leaning on
>>>>>>>>                 typed pointers to keep the
performance while
>>>>>>>>                 removing the crutch they're
currently leaning
>>>>>>>>                 on. (in cases where bitcasts
are literally just
>>>>>>>>                 getting in the way, those
won't require
>>>>>>>>                 cleaning up & should just
become "free
>>>>>>>>                 performance wins" once we
remove them, though)
>>>>>>>>
>>>>>>>>                 At the moment we can roundtrip
every LLVM IR
>>>>>>>>                 test case through bitcode and
textual IR
>>>>>>>>                 (reading/writing both formats)
while using only
>>>>>>>>                 a narrow whitelist of places
that request the
>>>>>>>>                 type of a pointer (things like
the verifier,
>>>>>>>>                 the parser/printer where it
actually needs the
>>>>>>>>                 typed pointer to verify it
matches the explicit
>>>>>>>>                 type, etc).
>>>>>>>>
>>>>>>>>                 The next thing on the list is
probably figuring
>>>>>>>>                 out the byval/inalloca
representation (add an
>>>>>>>>                 explicit pointee type? just
make the number of
>>>>>>>>                 bytes explicit with no type
information?).
>>>>>>>>
>>>>>>>>                 Then start migrating
optimizations over - doing
>>>>>>>>                 the same sort of testing I did
for the
>>>>>>>>                 IR/bitcode roundtripping -
assert that the
>>>>>>>>                 pointee type is not accessed,
whitelist places
>>>>>>>>                 that need it until the bitcasts
go away, fix
>>>>>>>>                 anything else... it'll
still be a fair bit of
>>>>>>>>                 work & I don't really
know how much. It should
>>>>>>>>                 parallelize pretty well (doing
any of this work
>>>>>>>>                 is really helpful, each
optimization is
>>>>>>>>                 indepednent, etc) if anyone
wants to/is able to
>>>>>>>>                 help.
>>>>>>>>
>>>>>>>>                 - Dave
>>>>>>>>
>>>>>>>>
>>>>>>>>                     Thanks
>>>>>>>>                     Ehsan
>>>>>>>>
>>>>>>>>
>>>>>>>>                     On Wed, Mar 16, 2016 at
1:12 PM, David
>>>>>>>>                     Blaikie <dblaikie at
gmail.com
>>>>>>>>                     <mailto:dblaikie at
gmail.com>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                         On Wed, Mar 16, 2016 at
8:34 AM, Mehdi
>>>>>>>>                         Amini via llvm-dev
>>>>>>>>                         <llvm-dev at
lists.llvm.org
>>>>>>>>                         <mailto:llvm-dev at
lists.llvm.org>> wrote:
>>>>>>>>
>>>>>>>>                             Hi,
>>>>>>>>
>>>>>>>>                             How do it interact
with the
>>>>>>>>                             "typeless
pointers" work?
>>>>>>>>
>>>>>>>>
>>>>>>>>                         Right - the goal of the
typeless
>>>>>>>>                         pointer work is to fix
all these bugs
>>>>>>>>                         related to
"didn't look through
>>>>>>>>                         bitcasts" in
optimizations. Sometimes
>>>>>>>>                         that's going to
mean more work (because
>>>>>>>>                         the code is leaning on
the absence of
>>>>>>>>                         bitcasts & the
presence of convenient
>>>>>>>>                         (but not guaranteed)
type information
>>>>>>>>                         to inform optimization
decisions) but
>>>>>>>>                         if we remove typed
pointers while
>>>>>>>>                         keeping optimization
quality in the
>>>>>>>>                         cases we have today,
then we should've
>>>>>>>>                         also fixed the cases
that were broken
>>>>>>>>                         because the type
information didn't end
>>>>>>>>                         up aligning to produce
the optimal output.
>>>>>>>>
>>>>>>>>                         & I know I've
been off the typeless
>>>>>>>>                         pointer stuff for a few
months working
>>>>>>>>                         on llvm-dwp - but happy
for any help
>>>>>>>>                         (the next immediate
piece is probably
>>>>>>>>                         figuring out teh right
representation
>>>>>>>>                         for byval and inalloca
- there were
>>>>>>>>                         some code reviews sent
out for that
>>>>>>>>                         that I'll need to
come back around to -
>>>>>>>>                         but also any
optimizations people want
>>>>>>>>                         to help rework/improve
would be great
>>>>>>>>                         too & I can provide
some
>>>>>>>>                         techniques/tools to
help people
>>>>>>>>                         approach those)
>>>>>>>>
>>>>>>>>                         - Dave
>>>>>>>>
>>>>>>>>
>>>>>>>>                             Thanks,
>>>>>>>>
>>>>>>>>                             -- 
>>>>>>>>                             Mehdi
>>>>>>>>
>>>>>>>>>                             On Mar 16,
2016, at 6:41 AM, Ehsan
>>>>>>>>>                             Amiri via
llvm-dev
>>>>>>>>>                             <llvm-dev at
lists.llvm.org
>>>>>>>>>                            
<mailto:llvm-dev at lists.llvm.org>>
>>>>>>>>>                             wrote:
>>>>>>>>>
>>>>>>>>>                             === PROBLEM ===
(See this bug
>>>>>>>>>                            
https://llvm.org/bugs/show_bug.cgi?id=26445)
>>>>>>>>>
>>>>>>>>>                             IR contains
code for loading a
>>>>>>>>>                             float from
float * and storing it
>>>>>>>>>                             to a float *
address. After
>>>>>>>>>                            
canonicalization of load in
>>>>>>>>>                             InstCombine
[1], new bitcasts are
>>>>>>>>>                             added to the IR
(see bottom of the
>>>>>>>>>                             email for code
samples). This
>>>>>>>>>                             prevents select
speculation in
>>>>>>>>>                             SROA to work.
Also after SROA we
>>>>>>>>>                             have bitcasts
from int32 to float.
>>>>>>>>>                             (Whereas
originally after
>>>>>>>>>                             instCombine,
bitcasts are only
>>>>>>>>>                             done on pointer
types).
>>>>>>>>>
>>>>>>>>>                             === PROPOSED
SOLUTION==>>>>>>>>>
>>>>>>>>>                             [1] implies
that we need load
>>>>>>>>>                            
canonicalization when we load a
>>>>>>>>>                             value only to
store it again. The
>>>>>>>>>                             reason is to
avoid generating
>>>>>>>>>                             slightly
different code (due to
>>>>>>>>>                             different ways
of adding
>>>>>>>>>                             bitcasts), in
different
>>>>>>>>>                             situations. In
all examples
>>>>>>>>>                             presented in
[1] there is a
>>>>>>>>>                             non-zero number
of bitcasts. I
>>>>>>>>>                             think when we
load a value of type
>>>>>>>>>                             T from a T*
address and store it
>>>>>>>>>                             as a type T
value to one or more
>>>>>>>>>                             T* address (and
there is no other
>>>>>>>>>                             use or store),
we can redefine
>>>>>>>>>                             canonical form
to mean there
>>>>>>>>>                             should not be
any bitcasts. So we
>>>>>>>>>                             still have a
canonical form, but
>>>>>>>>>                             its definition
is slightly different.
>>>>>>>>>
>>>>>>>>>                             === REASONS FOR
/ AGAINST==>>>>>>>>>
>>>>>>>>>                             Hal Finkel
warns that while this
>>>>>>>>>                             may be useful
for power pc, this
>>>>>>>>>                             may hurt more
than one other
>>>>>>>>>                             platform and
become a very large
>>>>>>>>>                             project.
Despite this he is fine
>>>>>>>>>                             with bringing
up the issue to the
>>>>>>>>>                             mailing list to
get feedback,
>>>>>>>>>                             mostly because
this seems inline
>>>>>>>>>                             with our future
direction of
>>>>>>>>>                             having a unique
type for all
>>>>>>>>>                             pointers. (Hal
please correct me
>>>>>>>>>                             if I
misunderstood your comment)
>>>>>>>>>
>>>>>>>>>                             This is a much
simpler fix
>>>>>>>>>                             compared to
alternatives.
>>>>>>>>>                             (ignoring
potential regressions)
>>>>>>>>>
>>>>>>>>>                             === ALTERNATIVE
SOLUTION ==>>>>>>>>>
>>>>>>>>>                             Fix select
speculation in SROA to
>>>>>>>>>                             see through
bitcasts. Handle
>>>>>>>>>                             remaining
bitcasts during code
>>>>>>>>>                             gen. Other
alternative solutions
>>>>>>>>>                             are welcome.
>>>>>>>>>
>>>>>>>>>                             Should I
implement the proposed
>>>>>>>>>                             solution or is
it too risky? I
>>>>>>>>>                             understand that
we may need to
>>>>>>>>>                             undo it if it
breaks too many
>>>>>>>>>                             things.
Comments are welcome.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                             [1]
>>>>>>>>>                            
http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html
>>>>>>>>>                             r226781  git
commit id: b778cbc0c8
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                             Code Samples
(only relevant part
>>>>>>>>>                             is copied):
>>>>>>>>>
>>>>>>>>>                            
-------------------- Before
>>>>>>>>>                            
Canonicalization (contains call to
>>>>>>>>>                             std::max):
--------------------
>>>>>>>>>                             entry:
>>>>>>>>>                               %max_value =
alloca float, align 4
>>>>>>>>>                               %1 = load
float, float* %input,
>>>>>>>>>                             align 4, !tbaa
!1
>>>>>>>>>                               store float
%1, float*
>>>>>>>>>                             %max_value,
align 4, !tbaa !1
>>>>>>>>>
>>>>>>>>>                             for.body:
>>>>>>>>>                               %call = call
dereferenceable(4)
>>>>>>>>>                             float*
>>>>>>>>>                            
@_ZSt3maxIfERKT_S2_S2_(float*
>>>>>>>>>                            
dereferenceable(4) %max_value,
>>>>>>>>>                             float*
dereferenceable(4) %arrayidx1)
>>>>>>>>>                               %3 = load
float, float* %call,
>>>>>>>>>                             align 4, !tbaa
!1
>>>>>>>>>                               store float
%3, float*
>>>>>>>>>                             %max_value,
align 4, !tbaa !1
>>>>>>>>>
>>>>>>>>>                            
-------------------- After
>>>>>>>>>                            
Canonicalization (contains call to
>>>>>>>>>                            
std::max):--------------------
>>>>>>>>>
>>>>>>>>>                             entry:
>>>>>>>>>                               %max_value =
alloca float, align 4
>>>>>>>>>                               %1 = bitcast
float* %input to i32*
>>>>>>>>>                               %2 = load
i32, i32* %1, align 4,
>>>>>>>>>                             !tbaa !1
>>>>>>>>>                               %3 = bitcast
float* %max_value
>>>>>>>>>                             to i32*
>>>>>>>>>                               store i32 %2,
i32* %3, align 4,
>>>>>>>>>                             !tbaa !1
>>>>>>>>>
>>>>>>>>>                             for.body:
>>>>>>>>>                               %call = call
dereferenceable(4)
>>>>>>>>>                             float*
>>>>>>>>>                            
@_ZSt3maxIfERKT_S2_S2_(float*
>>>>>>>>>                             nonnull
dereferenceable(4)
>>>>>>>>>                             %max_value,
float*
>>>>>>>>>                            
dereferenceable(4) %arrayidx1)
>>>>>>>>>                               %5 = bitcast
float* %call to i32*
>>>>>>>>>                               %6 = load
i32, i32* %5, align 4,
>>>>>>>>>                             !tbaa !1
>>>>>>>>>                               %7 = bitcast
float* %max_value
>>>>>>>>>                             to i32*
>>>>>>>>>                               store i32 %6,
i32* %7, align 4,
>>>>>>>>>                             !tbaa !1
>>>>>>>>>
>>>>>>>>>                            
-------------------- After SROA
>>>>>>>>>                             (the call to
std::max is inlined
>>>>>>>>>                            
now):--------------------
>>>>>>>>>                             entry:
>>>>>>>>>                            
%max_value.sroa.0 = alloca i32
>>>>>>>>>                               %0 = bitcast
float* %input to i32*
>>>>>>>>>                               %1 = load
i32, i32* %0, align 4,
>>>>>>>>>                             !tbaa !1
>>>>>>>>>                               store i32 %1,
i32* %max_value.sroa.0
>>>>>>>>>
>>>>>>>>>                             for.body:
>>>>>>>>>                            
%max_value.sroa.0.0.max_value.sroa.0.0.6
>>>>>>>>>                             = load i32,
i32* %max_value.sroa.0
>>>>>>>>>                               %3 = bitcast
i32
>>>>>>>>>                            
%max_value.sroa.0.0.max_value.sroa.0.0.6
>>>>>>>>>                             to float
>>>>>>>>>                            
%max_value.sroa.0.0.max_value.sroa_cast8
>>>>>>>>>                             = bitcast i32*
%max_value.sroa.0
>>>>>>>>>                             to float*
>>>>>>>>>                               %__b.__a.i =
select i1 %cmp.i,
>>>>>>>>>                             float*
%arrayidx1, float*
>>>>>>>>>                            
%max_value.sroa.0.0.max_value.sroa_cast8
>>>>>>>>>                               %5 = bitcast
float* %__b.__a.i
>>>>>>>>>                             to i32*
>>>>>>>>>                               %6 = load
i32, i32* %5, align 4,
>>>>>>>>>                             !tbaa !1
>>>>>>>>>                               store i32 %6,
i32* %max_value.sroa.0
>>>>>>>>>
>>>>>>>>>                            
-------------------- After SROA
>>>>>>>>>                             when
Canonicalization is turned
>>>>>>>>>                            
off--------------------
>>>>>>>>>                             entry:
>>>>>>>>>                               %0 = load
float, float* %input,
>>>>>>>>>                             align 4, !tbaa
!1
>>>>>>>>>
>>>>>>>>>                             for.cond: ;
preds = %for.body, %entry
>>>>>>>>>                               %max_value.0
= phi float [ %0,
>>>>>>>>>                             %entry ], [
%.sroa.speculated,
>>>>>>>>>                             %for.body ]
>>>>>>>>>
>>>>>>>>>                             for.body:
>>>>>>>>>                               %1 = load
float, float*
>>>>>>>>>                             %arrayidx1,
align 4, !tbaa !1
>>>>>>>>>                               %cmp.i = fcmp
olt float
>>>>>>>>>                             %max_value.0,
%1
>>>>>>>>>                            
%.sroa.speculate.load.true = load
>>>>>>>>>                             float, float*
%arrayidx1, align 4,
>>>>>>>>>                             !tbaa !1
>>>>>>>>>                            
%.sroa.speculated = select i1
>>>>>>>>>                             %cmp.i, float
>>>>>>>>>                            
%.sroa.speculate.load.true, float
>>>>>>>>>                             %max_value.0
>>>>>>>>>                            
_______________________________________________
>>>>>>>>>                             LLVM Developers
mailing list
>>>>>>>>>                             llvm-dev at
lists.llvm.org
>>>>>>>>>                            
<mailto:llvm-dev at lists.llvm.org>
>>>>>>>>>                            
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>
>>>>>>>>
>>>>>>>>                            
_______________________________________________
>>>>>>>>                             LLVM Developers
mailing list
>>>>>>>>                             llvm-dev at
lists.llvm.org
>>>>>>>>                             <mailto:llvm-dev
at lists.llvm.org>
>>>>>>>>                            
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>            
_______________________________________________
>>>>>>>             LLVM Developers mailing list
>>>>>>>             llvm-dev at lists.llvm.org
<mailto:llvm-dev at lists.llvm.org>
>>>>>>>            
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160322/1030d65c/attachment-0001.html>

James Y Knight via llvm-dev

2016-Mar-22 22:58 UTC

head link

[llvm-dev] RFC: A change in InstCombine canonical form

On Tue, Mar 22, 2016 at 5:44 PM, Ehsan Amiri via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Thanks.
>
> *Phillip, *As Hal said I do not think (1) is a very large item. Please
> let me know if I am mistaken.
>
> *David *I think (1) is more inline with typeless pointer work than (2).
> Contributing to typeless pointer work will be great, but given its unknown
> time frame we cannot stop fixing existing problems. Of course, we should
> follow an approach consistent with the long-term solution.
>
It seems to me that the question to ask is what would be the best state of
the code, assuming that the typeless pointers work had already been done.
Is it the current canonical form? Or the newly proposed one?

I think it'd be the current one? If so, I'd suggest that proposal #2 is
more compatible with the typeless pointer work. That is because (if done
properly), code which knows how to look through pointer bitcast nodes is
something which will be much more easily deletable once pointer bitcast
nodes cease to exist, than to change the canonicalization of memcpy back
and remove any backend code which was added only to compensate for that
change.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160322/9e6d4adb/attachment.html>

Mehdi Amini via llvm-dev

2016-Mar-22 23:33 UTC

head link

[llvm-dev] RFC: A change in InstCombine canonical form

> On Mar 22, 2016, at 4:15 PM, Ehsan Amiri via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> James,
> 
> I think (1) reduces the number of "do-not-see-through-bitcast"
bugs that we need to uncover and fix between now and the time that typeless
pointer is available. That means it is likely that we have multiple such fixes
in the code and then we have to remove each one of them. (And means each one of
those has to be done properly to be easily remove-able).
> 
> Changing canonicaliztion of memcpy, will be removing a couple of lines of
code. I am not sure about the size of backend changes to optimize load-store
patterns. But I expect it to be small.
Are you saying that the canonicalization you want to change is temporary till we
get the typeless pointers?

-- 
Mehdi


> 
> 
> On Tue, Mar 22, 2016 at 6:58 PM, James Y Knight <jyknight at google.com
<mailto:jyknight at google.com>> wrote:
> 
> 
> On Tue, Mar 22, 2016 at 5:44 PM, Ehsan Amiri via llvm-dev <llvm-dev at
lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
> Thanks. 
> 
> Phillip, As Hal said I do not think (1) is a very large item. Please let me
know if I am mistaken.
> 
> David I think (1) is more inline with typeless pointer work than (2).
Contributing to typeless pointer work will be great, but given its unknown time
frame we cannot stop fixing existing problems. Of course, we should follow an
approach consistent with the long-term solution.
> 
> It seems to me that the question to ask is what would be the best state of
the code, assuming that the typeless pointers work had already been done. Is it
the current canonical form? Or the newly proposed one?
> 
> I think it'd be the current one? If so, I'd suggest that proposal
#2 is more compatible with the typeless pointer work. That is because (if done
properly), code which knows how to look through pointer bitcast nodes is
something which will be much more easily deletable once pointer bitcast nodes
cease to exist, than to change the canonicalization of memcpy back and remove
any backend code which was added only to compensate for that change.
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160322/08d9f2fd/attachment.html>

Ehsan Amiri via llvm-dev

2016-Mar-22 23:38 UTC

head link

[llvm-dev] RFC: A change in InstCombine canonical form

On Tue, Mar 22, 2016 at 7:33 PM, Mehdi Amini <mehdi.amini at apple.com>
wrote:
>
> On Mar 22, 2016, at 4:15 PM, Ehsan Amiri via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> James,
>
> I think (1) reduces the number of "do-not-see-through-bitcast"
bugs that
> we need to uncover and fix between now and the time that typeless pointer
> is available. That means it is likely that we have multiple such fixes in
> the code and then we have to remove each one of them. (And means each one
> of those has to be done properly to be easily remove-able).
>
> Changing canonicaliztion of memcpy, will be removing a couple of lines of
> code. I am not sure about the size of backend changes to optimize
> load-store patterns. But I expect it to be small.
>
>
> Are you saying that the canonicalization you want to change is temporary
> till we get the typeless pointers?
>
>I think typeless pointer, will automatically makes it obsolete. Remember, I
proposed to make a change "when we load a value of type T from a type T*
memory address". There will not be a type T * memory address, once typeless
pointer work is in.



> --
> Mehdi
>
>
>
>
>
> On Tue, Mar 22, 2016 at 6:58 PM, James Y Knight <jyknight at
google.com>
> wrote:
>
>>
>>
>> On Tue, Mar 22, 2016 at 5:44 PM, Ehsan Amiri via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Thanks.
>>>
>>> *Phillip, *As Hal said I do not think (1) is a very large item.
Please
>>> let me know if I am mistaken.
>>>
>>> *David *I think (1) is more inline with typeless pointer work than
(2).
>>> Contributing to typeless pointer work will be great, but given its
unknown
>>> time frame we cannot stop fixing existing problems. Of course, we
should
>>> follow an approach consistent with the long-term solution.
>>>
>>
>> It seems to me that the question to ask is what would be the best state
>> of the code, assuming that the typeless pointers work had already been
>> done. Is it the current canonical form? Or the newly proposed one?
>>
>> I think it'd be the current one? If so, I'd suggest that
proposal #2 is
>> more compatible with the typeless pointer work. That is because (if
done
>> properly), code which knows how to look through pointer bitcast nodes
is
>> something which will be much more easily deletable once pointer bitcast
>> nodes cease to exist, than to change the canonicalization of memcpy
back
>> and remove any backend code which was added only to compensate for that
>> change.
>>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160322/1febbe40/attachment.html>

Philip Reames via llvm-dev

2016-Mar-23 03:57 UTC

head link

[llvm-dev] RFC: A change in InstCombine canonical form

On 03/22/2016 06:50 PM, Ehsan Amiri wrote:>
>
> On Tue, Mar 22, 2016 at 6:42 PM, Philip Reames 
> <listmail at philipreames.com <mailto:listmail at
philipreames.com>> wrote:
>
>
>
>     On 03/22/2016 02:44 PM, Ehsan Amiri wrote:
>>     Thanks.
>>
>>     *Phillip, *As Hal said I do not think (1) is a very large item.
>>     Please let me know if I am mistaken.
>     I have no specific reason to believe it will be a large amount of
>     work, but prior experience tells me changes to canonical form have
>     a tendency of exposing unexpected issues.  To be clear, I am
>     supportive of you implementing solution 1.
>>
>
> Thanks Phillip. Do you mean exposing bugs that were unknown before? or 
> something else?
>
> If the problem is that some transformations will start to have 
> problems because they do not know how to deal with the new canonical 
> form, that's alarming. If the chances of exposing such problem is not 
> too high, then we can possibly go ahead with (1) and fallback to (2) 
> if (1) starts to be complicated. But if the risk is too high, then we 
> may want to prefer (2) to (1).Most are not correctness bugs, but performance bugs.  Later passes 
(particular CGP and later) do not do well when presented arbitrary 
non-canonical IR.  They won't crash, but they may not be as effective as 
expected and may miss obvious performance wins.  When you change what 
canonical is, you sometimes miss the existing sweat spot and have to 
adjust the code to handle the new canonical form.

My suggestion would be to implement your change, run a reasonable set of 
performance tests, fix any problems, and iterate.  Just don't be 
surprised if what seems like a simple patch snowballs into a series of 
mostly minor patches.  I'm not trying to warn you off or scare you; I 
just want to make sure you have reasonable expectations going
in.>
> Any suggestions?
>
>
>
>
>
>
>>     *David *I think (1) is more inline with typeless pointer work
>>     than (2). Contributing to typeless pointer work will be great,
>>     but given its unknown time frame we cannot stop fixing existing
>>     problems. Of course, we should follow an approach consistent with
>>     the long-term solution.
>>
>>     On Tue, Mar 22, 2016 at 5:30 PM, David Blaikie
>>     <dblaikie at gmail.com <mailto:dblaikie at gmail.com>>
wrote:
>>
>>
>>
>>         On Tue, Mar 22, 2016 at 1:41 PM, Mehdi Amini
>>         <mehdi.amini at apple.com <mailto:mehdi.amini at
apple.com>> wrote:
>>
>>             Sorry I should have been more clear (writing to many
>>             email in parallel)
>>
>>             You're right. I was adding a +1 to you here.
>>
>>             Especially I wrote "unless there is an acknowledgement
>>             that typeless pointers won't be there before a couple
of
>>             years" with  the PassManager in mind, and I was
expecting
>>             from David some good indication of a timeframe for the
>>             typeless pointers.
>>             If the typeless pointer work is stalled or if it is not
>>             planned for LLVM 3.9,
>>
>>
>>         It's neither stalled nor planned, as such.
>>
>>             I agree with Philip to not block anything.
>>
>>
>>         All I'm suggesting is that if there are people who want to
>>         fix these bugs, I'd really appreciate them helping out on
the
>>         typeless pointer work - it's totally
>>         parallelizable/shardable/shareable work & the project as a
>>         whole seemed to agree it was the right direction. Why not
>>         just get it done?
>>
>>         The Pass Manager work is a bit different & harder to share
at
>>         certain points (I don't tihnk there's ever been a point
at
>>         which someone could ask me "what can I do to help with the
>>         typeless pointer work" I couldn't've given some
pretty large
>>         areas I wasn't anywhere near touching that they
could've
>>         gotten their teeth into) - I think it's reaching the point
>>         where multiple passes can be ported independently & seen
some
>>         work like that in Polly, etc. So at this point it seems like
>>         if people want to address the issues the new pass manager is
>>         aimed at addressing, they could pitch in on that effort.
>>
>>         That said, I'm not in a position (nor would I do so, even
if
>>         I were) to block other patches, just to encourage people to
>>         help out on the bigger efforts in whatever way they can
>>         (either directly, or indirectly through ensuring stopgaps/new
>>         work is done in a way that makes that future work easier
>>         (where it's reasonable to judge what might be easier or
>>         harder, etc), etc)
>>
>>         - Dave
>>
>>
>>             -- 
>>             Mehdi
>>
>>
>>>             On Mar 22, 2016, at 1:37 PM, Philip Reames
>>>             <listmail at philipreames.com
>>>             <mailto:listmail at philipreames.com>> wrote:
>>>
>>>             But not what David was stating, unless I misread?  I
was
>>>             specifically responding to David's wording:
>>>             "If we're talking about making an optimization
able to
>>>             ignore the bitcast instructions - yes, that work is
>>>             unnecessary & perhaps questionable given the
typeless
>>>             pointer work. Not outright off limits, but the same
time
>>>             might be better invested in moving typeless pointers
>>>             forward if the contributor is so inclined/able to shift
>>>             in that direction."
>>>
>>>             Both "perhaps questionable" and "not
outright off
>>>             limits" seem to strongly imply such work should be
>>>             discouraged. I disagree with that view which is why I
>>>             wrote my response.
>>>
>>>             Philip
>>>
>>>             On 03/22/2016 01:34 PM, Mehdi Amini wrote:
>>>>             This is roughly what I wrote...
>>>>
>>>>>             On Mar 22, 2016, at 1:31 PM, Philip Reames
>>>>>             <listmail at philipreames.com
>>>>>             <mailto:listmail at philipreames.com>>
wrote:
>>>>>
>>>>>             I feel very strongly that blocking work on
making
>>>>>             optimization bitcast-ignorant on the typeless
pointer
>>>>>             work would be a major mistake. Unless we
expected the
>>>>>             typeless pointer work to be concluded within
the near
>>>>>             term (say 3-6 months maximum), we should not
block any
>>>>>             development which would be accepted in the
typeless
>>>>>             pointer work wasn't planned.
>>>>>
>>>>>             In my view, this is one of the largest mistakes
we've
>>>>>             made with the pass manager work, it has
seriously cost
>>>>>             us, and I don't want to see it happening
again.
>>>>>
>>>>>             Philip
>>>>>
>>>>>             On 03/22/2016 01:09 PM, David Blaikie wrote:
>>>>>>             Ultimately everything is going to be made
to not rely
>>>>>>             on the types of pointers - that's
nearly equivalent
>>>>>>             to bitcast-ignorant (the difference being
that the
>>>>>>             presence of an extra instruction (the
bitcast) might
>>>>>>             trip up some optimizations - but the
presence of the
>>>>>>             /type/ information implied by the bitcast
should not
>>>>>>             trip up or be necessary for optimizations
(two sides
>>>>>>             of the same coin))
>>>>>>
>>>>>>             If we're talking about making an
optimization able to
>>>>>>             ignore the bitcast instructions - yes, that
work is
>>>>>>             unnecessary & perhaps questionable
given the typeless
>>>>>>             pointer work. Not outright off limits, but
the same
>>>>>>             time might be better invested in moving
typeless
>>>>>>             pointers forward if the contributor is so
>>>>>>             inclined/able to shift in that direction.
>>>>>>
>>>>>>             But if we're talking about the work to
make the
>>>>>>             optimization not use the type of pointers
as a crutch
>>>>>>             - that work is a necessary precursor to the
typeless
>>>>>>             pointer work and would be most welcome.
>>>>>>
>>>>>>             - David
>>>>>>
>>>>>>             On Tue, Mar 22, 2016 at 12:39 PM, Mehdi
Amini
>>>>>>             <mehdi.amini at apple.com
>>>>>>             <mailto:mehdi.amini at apple.com>>
wrote:
>>>>>>
>>>>>>                 I don't really mind, but the
intermediate stage
>>>>>>                 will not be very nice: that a lot of
code / tests
>>>>>>                 that needs to be written with bitcast,
and all of
>>>>>>                 that while they are deemed to
disappear. The
>>>>>>                 added value isn't clear to me
considering the
>>>>>>                 added work. I'm not sure it
wouldn't add more
>>>>>>                 work for all the cleanup required by
the
>>>>>>                 "typeless pointer", but
I'm not sure what's
>>>>>>                 involved here and if David thinks the
>>>>>>                 intermediate steps of handling bit
casts
>>>>>>                 everywhere is not making it harder
I'm fine with it.
>>>>>>
>>>>>>                 -- 
>>>>>>                 Mehdi
>>>>>>
>>>>>>>                 On Mar 22, 2016, at 12:36 PM,
Philip Reames
>>>>>>>                 <listmail at philipreames.com
>>>>>>>                 <mailto:listmail at
philipreames.com>> wrote:
>>>>>>>
>>>>>>>                 I'd phrase this differently:
being
>>>>>>>                 pointer-bitcast agnostic is a step
towards
>>>>>>>                 support typeless pointers.  :) We
can either
>>>>>>>                 become bitcast agnostic all in one
big change or
>>>>>>>                 incrementally. Personally, I'd
prefer the later
>>>>>>>                 since it reduces the risk
associated with
>>>>>>>                 enabling typeless pointers in the
end.
>>>>>>>
>>>>>>>                 Philip
>>>>>>>
>>>>>>>                 On 03/22/2016 12:16 PM, Mehdi Amini
via llvm-dev
>>>>>>>                 wrote:
>>>>>>>>                 I don't know enough about
the tradeoff for 1,
>>>>>>>>                 but 2 seems like a bandaid for
something that
>>>>>>>>                 is not a correctness issue
neither a
>>>>>>>>                 regression. I'm not sure it
justifies "bandaid
>>>>>>>>                 patches" while there is a
clear path forward,
>>>>>>>>                 i.e. typeless pointers, unless
there is an
>>>>>>>>                 acknowledgement that typeless
pointers won't be
>>>>>>>>                 there before a couple of years.
>>>>>>>>
>>>>>>>>                 -- 
>>>>>>>>                 Mehdi
>>>>>>>>
>>>>>>>>>                 On Mar 22, 2016, at 11:32
AM, Ehsan Amiri
>>>>>>>>>                 <ehsanamiri at gmail.com
>>>>>>>>>                 <mailto:ehsanamiri at
gmail.com>> wrote:
>>>>>>>>>
>>>>>>>>>                 Back to the discussion on
the RFC, I still see
>>>>>>>>>                 some advantage in following
the proposed
>>>>>>>>>                 solution. I see two paths
forward:
>>>>>>>>>
>>>>>>>>>                 1- Change canonical form,
possibly lower
>>>>>>>>>                 memcpy to non-integer load
and store in
>>>>>>>>>                 InstCombine. Then teach the
backends to
>>>>>>>>>                 convert that to integer
load and store if that
>>>>>>>>>                 is more profitable. Notice
that we are talking
>>>>>>>>>                 about loads that have no
use other than store.
>>>>>>>>>                 So it is a fairly simple
change in the backends.
>>>>>>>>>
>>>>>>>>>                 2- Do not change the
canonical form. Then for
>>>>>>>>>                 this bug, we need to teach
select speculation
>>>>>>>>>                 to see through bitcasts. We
will probably need
>>>>>>>>>                 to teach other
optimizations to see though
>>>>>>>>>                 bitcasts in the future as
problems are
>>>>>>>>>                 uncovered. That is until
typeless pointer work
>>>>>>>>>                 is complete. Once the
typeless pointer work is
>>>>>>>>>                 complete, we have some
extra code in each
>>>>>>>>>                 optimization for seeing
through bitcasts which
>>>>>>>>>                 is possibly no longer
needed.
>>>>>>>>>
>>>>>>>>>                 Based on this I think (1)
is the right thing
>>>>>>>>>                 to do. But there might be
other reasons for
>>>>>>>>>                 the current canonical form
that I am not aware
>>>>>>>>>                 of. Please let me know what
you think.
>>>>>>>>>
>>>>>>>>>                 Thanks
>>>>>>>>>                 Ehsan
>>>>>>>>>
>>>>>>>>>                 On Wed, Mar 16, 2016 at
2:13 PM, David Blaikie
>>>>>>>>>                 <dblaikie at gmail.com
>>>>>>>>>                 <mailto:dblaikie at
gmail.com>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                     On Wed, Mar 16, 2016 at
11:00 AM, Ehsan
>>>>>>>>>                     Amiri <ehsanamiri at
gmail.com
>>>>>>>>>                     <mailto:ehsanamiri
at gmail.com>> wrote:
>>>>>>>>>
>>>>>>>>>                         David,
>>>>>>>>>
>>>>>>>>>                         Could you give us
an update on the
>>>>>>>>>                         status of typeless
pointer work? How
>>>>>>>>>                         much work is left
and when you think
>>>>>>>>>                         it might be ready?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                     It's a bit of an
onion peel, really -
>>>>>>>>>                     since it will
eventually involve
>>>>>>>>>                     generalizing/fixing
every optimization
>>>>>>>>>                     that's currently
leaning on typed pointers
>>>>>>>>>                     to keep the performance
while removing the
>>>>>>>>>                     crutch they're
currently leaning on. (in
>>>>>>>>>                     cases where bitcasts
are literally just
>>>>>>>>>                     getting in the way,
those won't require
>>>>>>>>>                     cleaning up &
should just become "free
>>>>>>>>>                     performance wins"
once we remove them, though)
>>>>>>>>>
>>>>>>>>>                     At the moment we can
roundtrip every LLVM
>>>>>>>>>                     IR test case through
bitcode and textual
>>>>>>>>>                     IR (reading/writing
both formats) while
>>>>>>>>>                     using only a narrow
whitelist of places
>>>>>>>>>                     that request the type
of a pointer (things
>>>>>>>>>                     like the verifier, the
parser/printer
>>>>>>>>>                     where it actually needs
the typed pointer
>>>>>>>>>                     to verify it matches
the explicit type, etc).
>>>>>>>>>
>>>>>>>>>                     The next thing on the
list is probably
>>>>>>>>>                     figuring out the
byval/inalloca
>>>>>>>>>                     representation (add an
explicit pointee
>>>>>>>>>                     type? just make the
number of bytes
>>>>>>>>>                     explicit with no type
information?).
>>>>>>>>>
>>>>>>>>>                     Then start migrating
optimizations over -
>>>>>>>>>                     doing the same sort of
testing I did for
>>>>>>>>>                     the IR/bitcode
roundtripping - assert that
>>>>>>>>>                     the pointee type is not
accessed,
>>>>>>>>>                     whitelist places that
need it until the
>>>>>>>>>                     bitcasts go away, fix
anything else...
>>>>>>>>>                     it'll still be a
fair bit of work & I
>>>>>>>>>                     don't really know
how much. It should
>>>>>>>>>                     parallelize pretty well
(doing any of this
>>>>>>>>>                     work is really helpful,
each optimization
>>>>>>>>>                     is indepednent, etc) if
anyone wants to/is
>>>>>>>>>                     able to help.
>>>>>>>>>
>>>>>>>>>                     - Dave
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                         Thanks
>>>>>>>>>                         Ehsan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                         On Wed, Mar 16,
2016 at 1:12 PM, David
>>>>>>>>>                         Blaikie
<dblaikie at gmail.com
>>>>>>>>>                         <mailto:dblaikie
at gmail.com>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                             On Wed, Mar 16,
2016 at 8:34 AM,
>>>>>>>>>                             Mehdi Amini via
llvm-dev
>>>>>>>>>                             <llvm-dev at
lists.llvm.org
>>>>>>>>>                            
<mailto:llvm-dev at lists.llvm.org>>
>>>>>>>>>                             wrote:
>>>>>>>>>
>>>>>>>>>                                 Hi,
>>>>>>>>>
>>>>>>>>>                                 How do it
interact with the
>>>>>>>>>                                
"typeless pointers" work?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                             Right - the
goal of the typeless
>>>>>>>>>                             pointer work is
to fix all these
>>>>>>>>>                             bugs related to
"didn't look
>>>>>>>>>                             through
bitcasts" in
>>>>>>>>>                             optimizations.
Sometimes that's
>>>>>>>>>                             going to mean
more work (because
>>>>>>>>>                             the code is
leaning on the absence
>>>>>>>>>                             of bitcasts
& the presence of
>>>>>>>>>                             convenient (but
not guaranteed)
>>>>>>>>>                             type
information to inform
>>>>>>>>>                             optimization
decisions) but if we
>>>>>>>>>                             remove typed
pointers while
>>>>>>>>>                             keeping
optimization quality in
>>>>>>>>>                             the cases we
have today, then we
>>>>>>>>>                             should've
also fixed the cases
>>>>>>>>>                             that were
broken because the type
>>>>>>>>>                             information
didn't end up aligning
>>>>>>>>>                             to produce the
optimal output.
>>>>>>>>>
>>>>>>>>>                             & I know
I've been off the
>>>>>>>>>                             typeless
pointer stuff for a few
>>>>>>>>>                             months working
on llvm-dwp - but
>>>>>>>>>                             happy for any
help (the next
>>>>>>>>>                             immediate piece
is probably
>>>>>>>>>                             figuring out
teh right
>>>>>>>>>                             representation
for byval and
>>>>>>>>>                             inalloca -
there were some code
>>>>>>>>>                             reviews sent
out for that that
>>>>>>>>>                             I'll need
to come back around to -
>>>>>>>>>                             but also any
optimizations people
>>>>>>>>>                             want to help
rework/improve would
>>>>>>>>>                             be great too
& I can provide some
>>>>>>>>>                            
techniques/tools to help people
>>>>>>>>>                             approach those)
>>>>>>>>>
>>>>>>>>>                             - Dave
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                                 Thanks,
>>>>>>>>>
>>>>>>>>>                                 -- 
>>>>>>>>>                                 Mehdi
>>>>>>>>>
>>>>>>>>>>                                 On Mar
16, 2016, at 6:41 AM,
>>>>>>>>>>                                 Ehsan
Amiri via llvm-dev
>>>>>>>>>>                                
<llvm-dev at lists.llvm.org
>>>>>>>>>>                                
<mailto:llvm-dev at lists.llvm.org>>
>>>>>>>>>>                                 wrote:
>>>>>>>>>>
>>>>>>>>>>                                 ===
PROBLEM === (See this bug
>>>>>>>>>>                                
https://llvm.org/bugs/show_bug.cgi?id=26445)
>>>>>>>>>>
>>>>>>>>>>                                 IR
contains code for loading
>>>>>>>>>>                                 a float
from float * and
>>>>>>>>>>                                 storing
it to a float *
>>>>>>>>>>                                
address. After
>>>>>>>>>>                                
canonicalization of load in
>>>>>>>>>>                                
InstCombine [1], new bitcasts
>>>>>>>>>>                                 are
added to the IR (see
>>>>>>>>>>                                 bottom
of the email for code
>>>>>>>>>>                                
samples). This prevents
>>>>>>>>>>                                 select
speculation in SROA to
>>>>>>>>>>                                 work.
Also after SROA we have
>>>>>>>>>>                                
bitcasts from int32 to float.
>>>>>>>>>>                                
(Whereas originally after
>>>>>>>>>>                                
instCombine, bitcasts are
>>>>>>>>>>                                 only
done on pointer types).
>>>>>>>>>>
>>>>>>>>>>                                 ===
PROPOSED SOLUTION==>>>>>>>>>>
>>>>>>>>>>                                 [1]
implies that we need load
>>>>>>>>>>                                
canonicalization when we load
>>>>>>>>>>                                 a value
only to store it
>>>>>>>>>>                                 again.
The reason is to avoid
>>>>>>>>>>                                
generating slightly different
>>>>>>>>>>                                 code
(due to different ways
>>>>>>>>>>                                 of
adding bitcasts), in
>>>>>>>>>>                                
different situations. In all
>>>>>>>>>>                                
examples presented in [1]
>>>>>>>>>>                                 there
is a non-zero number of
>>>>>>>>>>                                
bitcasts. I think when we
>>>>>>>>>>                                 load a
value of type T from a
>>>>>>>>>>                                 T*
address and store it as a
>>>>>>>>>>                                 type T
value to one or more
>>>>>>>>>>                                 T*
address (and there is no
>>>>>>>>>>                                 other
use or store), we can
>>>>>>>>>>                                
redefine canonical form to
>>>>>>>>>>                                 mean
there should not be any
>>>>>>>>>>                                
bitcasts. So we still have a
>>>>>>>>>>                                
canonical form, but its
>>>>>>>>>>                                
definition is slightly different.
>>>>>>>>>>
>>>>>>>>>>                                 ===
REASONS FOR / AGAINST==>>>>>>>>>>
>>>>>>>>>>                                 Hal
Finkel warns that while
>>>>>>>>>>                                 this
may be useful for power
>>>>>>>>>>                                 pc,
this may hurt more than
>>>>>>>>>>                                 one
other platform and become
>>>>>>>>>>                                 a very
large project. Despite
>>>>>>>>>>                                 this he
is fine with bringing
>>>>>>>>>>                                 up the
issue to the mailing
>>>>>>>>>>                                 list to
get feedback, mostly
>>>>>>>>>>                                 because
this seems inline
>>>>>>>>>>                                 with
our future direction of
>>>>>>>>>>                                 having
a unique type for all
>>>>>>>>>>                                
pointers. (Hal please correct
>>>>>>>>>>                                 me if I
misunderstood your
>>>>>>>>>>                                
comment)
>>>>>>>>>>
>>>>>>>>>>                                 This is
a much simpler fix
>>>>>>>>>>                                
compared to alternatives.
>>>>>>>>>>                                
(ignoring potential regressions)
>>>>>>>>>>
>>>>>>>>>>                                 ===
ALTERNATIVE SOLUTION ==>>>>>>>>>>
>>>>>>>>>>                                 Fix
select speculation in
>>>>>>>>>>                                 SROA to
see through bitcasts.
>>>>>>>>>>                                 Handle
remaining bitcasts
>>>>>>>>>>                                 during
code gen. Other
>>>>>>>>>>                                
alternative solutions are
>>>>>>>>>>                                
welcome.
>>>>>>>>>>
>>>>>>>>>>                                 Should
I implement the
>>>>>>>>>>                                
proposed solution or is it
>>>>>>>>>>                                 too
risky? I understand that
>>>>>>>>>>                                 we may
need to undo it if it
>>>>>>>>>>                                 breaks
too many things.
>>>>>>>>>>                                
Comments are welcome.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                                 [1]
>>>>>>>>>>                                
http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html
>>>>>>>>>>                                 r226781
git commit id:
>>>>>>>>>>                                
b778cbc0c8
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                                 Code
Samples (only relevant
>>>>>>>>>>                                 part is
copied):
>>>>>>>>>>
>>>>>>>>>>                                
-------------------- Before
>>>>>>>>>>                                
Canonicalization (contains
>>>>>>>>>>                                 call to
std::max):
>>>>>>>>>>                                
--------------------
>>>>>>>>>>                                 entry:
>>>>>>>>>>                                  
%max_value = alloca float,
>>>>>>>>>>                                 align 4
>>>>>>>>>>                                   %1 =
load float, float*
>>>>>>>>>>                                 %input,
align 4, !tbaa !1
>>>>>>>>>>                                   store
float %1, float*
>>>>>>>>>>                                
%max_value, align 4, !tbaa !1
>>>>>>>>>>
>>>>>>>>>>                                
for.body:
>>>>>>>>>>                                   %call
= call
>>>>>>>>>>                                
dereferenceable(4) float*
>>>>>>>>>>                                
@_ZSt3maxIfERKT_S2_S2_(float*
>>>>>>>>>>                                
dereferenceable(4)
>>>>>>>>>>                                
%max_value, float*
>>>>>>>>>>                                
dereferenceable(4) %arrayidx1)
>>>>>>>>>>                                   %3 =
load float, float*
>>>>>>>>>>                                 %call,
align 4, !tbaa !1
>>>>>>>>>>                                   store
float %3, float*
>>>>>>>>>>                                
%max_value, align 4, !tbaa !1
>>>>>>>>>>
>>>>>>>>>>                                
-------------------- After
>>>>>>>>>>                                
Canonicalization (contains
>>>>>>>>>>                                 call to
>>>>>>>>>>                                
std::max):--------------------
>>>>>>>>>>
>>>>>>>>>>                                 entry:
>>>>>>>>>>                                  
%max_value = alloca float,
>>>>>>>>>>                                 align 4
>>>>>>>>>>                                   %1 =
bitcast float* %input
>>>>>>>>>>                                 to i32*
>>>>>>>>>>                                   %2 =
load i32, i32* %1,
>>>>>>>>>>                                 align
4, !tbaa !1
>>>>>>>>>>                                   %3 =
bitcast float*
>>>>>>>>>>                                
%max_value to i32*
>>>>>>>>>>                                   store
i32 %2, i32* %3,
>>>>>>>>>>                                 align
4, !tbaa !1
>>>>>>>>>>
>>>>>>>>>>                                
for.body:
>>>>>>>>>>                                   %call
= call
>>>>>>>>>>                                
dereferenceable(4) float*
>>>>>>>>>>                                
@_ZSt3maxIfERKT_S2_S2_(float*
>>>>>>>>>>                                 nonnull
dereferenceable(4)
>>>>>>>>>>                                
%max_value, float*
>>>>>>>>>>                                
dereferenceable(4) %arrayidx1)
>>>>>>>>>>                                   %5 =
bitcast float* %call
>>>>>>>>>>                                 to i32*
>>>>>>>>>>                                   %6 =
load i32, i32* %5,
>>>>>>>>>>                                 align
4, !tbaa !1
>>>>>>>>>>                                   %7 =
bitcast float*
>>>>>>>>>>                                
%max_value to i32*
>>>>>>>>>>                                   store
i32 %6, i32* %7,
>>>>>>>>>>                                 align
4, !tbaa !1
>>>>>>>>>>
>>>>>>>>>>                                
-------------------- After
>>>>>>>>>>                                 SROA
(the call to std::max is
>>>>>>>>>>                                 inlined
>>>>>>>>>>                                
now):--------------------
>>>>>>>>>>                                 entry:
>>>>>>>>>>                                
%max_value.sroa.0 = alloca i32
>>>>>>>>>>                                   %0 =
bitcast float* %input
>>>>>>>>>>                                 to i32*
>>>>>>>>>>                                   %1 =
load i32, i32* %0,
>>>>>>>>>>                                 align
4, !tbaa !1
>>>>>>>>>>                                   store
i32 %1, i32*
>>>>>>>>>>                                
%max_value.sroa.0
>>>>>>>>>>
>>>>>>>>>>                                
for.body:
>>>>>>>>>>                                
%max_value.sroa.0.0.max_value.sroa.0.0.6
>>>>>>>>>>                                 = load
i32, i32*
>>>>>>>>>>                                
%max_value.sroa.0
>>>>>>>>>>                                   %3 =
bitcast i32
>>>>>>>>>>                                
%max_value.sroa.0.0.max_value.sroa.0.0.6
>>>>>>>>>>                                 to
float
>>>>>>>>>>                                
%max_value.sroa.0.0.max_value.sroa_cast8
>>>>>>>>>>                                 =
bitcast i32*
>>>>>>>>>>                                
%max_value.sroa.0 to float*
>>>>>>>>>>                                  
%__b.__a.i = select i1
>>>>>>>>>>                                 %cmp.i,
float* %arrayidx1,
>>>>>>>>>>                                 float*
>>>>>>>>>>                                
%max_value.sroa.0.0.max_value.sroa_cast8
>>>>>>>>>>                                   %5 =
bitcast float*
>>>>>>>>>>                                
%__b.__a.i to i32*
>>>>>>>>>>                                   %6 =
load i32, i32* %5,
>>>>>>>>>>                                 align
4, !tbaa !1
>>>>>>>>>>                                   store
i32 %6, i32*
>>>>>>>>>>                                
%max_value.sroa.0
>>>>>>>>>>
>>>>>>>>>>                                
-------------------- After
>>>>>>>>>>                                 SROA
when Canonicalization is
>>>>>>>>>>                                 turned
off--------------------
>>>>>>>>>>                                 entry:
>>>>>>>>>>                                   %0 =
load float, float*
>>>>>>>>>>                                 %input,
align 4, !tbaa !1
>>>>>>>>>>
>>>>>>>>>>                                
for.cond: ; preds >>>>>>>>>>                      
%for.body, %entry
>>>>>>>>>>                                  
%max_value.0 = phi float [
>>>>>>>>>>                                 %0,
%entry ], [
>>>>>>>>>>                                
%.sroa.speculated, %for.body ]
>>>>>>>>>>
>>>>>>>>>>                                
for.body:
>>>>>>>>>>                                   %1 =
load float, float*
>>>>>>>>>>                                
%arrayidx1, align 4, !tbaa !1
>>>>>>>>>>                                  
%cmp.i = fcmp olt float
>>>>>>>>>>                                
%max_value.0, %1
>>>>>>>>>>                                
%.sroa.speculate.load.true >>>>>>>>>>             
load float, float*
>>>>>>>>>>                                
%arrayidx1, align 4, !tbaa !1
>>>>>>>>>>                                
%.sroa.speculated = select i1
>>>>>>>>>>                                 %cmp.i,
float
>>>>>>>>>>                                
%.sroa.speculate.load.true,
>>>>>>>>>>                                 float
%max_value.0
>>>>>>>>>>                                
_______________________________________________
>>>>>>>>>>                                 LLVM
Developers mailing list
>>>>>>>>>>                                
llvm-dev at lists.llvm.org
>>>>>>>>>>                                
<mailto:llvm-dev at lists.llvm.org>
>>>>>>>>>>                                
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                                
_______________________________________________
>>>>>>>>>                                 LLVM
Developers mailing list
>>>>>>>>>                                 llvm-dev at
lists.llvm.org
>>>>>>>>>                                
<mailto:llvm-dev at lists.llvm.org>
>>>>>>>>>                                
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                
_______________________________________________
>>>>>>>>                 LLVM Developers mailing list
>>>>>>>>                 llvm-dev at lists.llvm.org
>>>>>>>>                 <mailto:llvm-dev at
lists.llvm.org>
>>>>>>>>                
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160322/dc9d4c6a/attachment-0001.html>

Ehsan Amiri via llvm-dev

2016-Mar-23 15:58 UTC

head link

[llvm-dev] RFC: A change in InstCombine canonical form

OK. I will do some experiments with (1) on Power PC. Will update this email
chain about the results.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160323/f05c1bad/attachment.html>

Seemingly Similar Threads

Search for more apparently analagous threads

llvm dev - Mar 2016 - RFC: A change in InstCombine canonical form

[llvm-dev] RFC: A change in InstCombine canonical form

[llvm-dev] RFC: A change in InstCombine canonical form

[llvm-dev] RFC: A change in InstCombine canonical form

[llvm-dev] RFC: A change in InstCombine canonical form

[llvm-dev] RFC: A change in InstCombine canonical form

[llvm-dev] RFC: A change in InstCombine canonical form

[llvm-dev] RFC: A change in InstCombine canonical form

[llvm-dev] RFC: A change in InstCombine canonical form

[llvm-dev] RFC: A change in InstCombine canonical form

Seemingly Similar Threads