thr3ads.net - llvm dev - [LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering? [Jul 2013]

If this information is useful, please help other people find it:
Share via:

Quentin Colombet

2013-Jul-01 20:33 UTC

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

On Jul 1, 2013, at 11:52 AM, Eli Friedman <eli.friedman at gmail.com>
wrote:
> On Mon, Jul 1, 2013 at 11:30 AM, Quentin Colombet <qcolombet at
apple.com> wrote:
> Hi,
> 
> ** Problematic **
> I am looking for advices to share some logic between DAG combine and target
lowering.
> 
> Basically, I need to know if a bitcast that is about to be inserted during
target specific isel lowering will be eliminated during DAG combine.
> 
> Let me know if there is another, better supported, approach for this kind
of problems.
> 
> ** Motivating Example **
> The motivating example comes form the lowering of vector code on armv7.
> More specifically, the build_vector node is lowered to a target specific
ARMISD::build_vector where all the parameters are bitcasted to floating point
types.
> 
> This works well, unless the inserted bitcasts survive until instruction
selection. In that case, they incur moves between integer unit and floating
point unit that may result in inefficient code.
> 
> Attached motivating_example.ll shows such a case:
> llc -O3 -mtriple thumbv7-apple-ios3 motivating_example.ll -o -
> 	ldr	r0, [r1]
> 	ldr	r1, [r2]
> 	vmov	s1, r1
> 	vmov	s0, r0
> Here each ldr, vmov sequences could have been replaced by a simple vld1.32.
> 
> ** Proposed Solution **
> Lower to more vector friendly code (using a sequence of insert_vector_elt),
when bit casts will not be free.
> The attached patch demonstrates that, but is missing the proper check to
know what DAG combine will do (see TODO).
> 
> I think you're approaching this backwards: the obvious thing to do is
to generate the insert_vector_elt sequence unconditionally, and DAGCombine that
sequence to a build_vector when appropriate.Hi Eli,

I have started to look into the direction you gave me.

I may have miss something but I do not see how the proposed direction solves the
issue. Indeed to be able to DAGCombine a insert_vector_elt sequences into a
ARMISD::build_vector, I still need to know if it would be profitable, i.e., if
DAGCombine will remove the bitcasts that combining/lowering is about to insert.

Since target specific DAGCombine are also done in TargetLowering I do not have
access to more DAGCombine logic (at least DAGCombineInfo is not providing the
require information).

What did I miss?

Thanks,

-Quentin
> 
> -Eli
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130701/5a04d7c5/attachment.html>

Eli Friedman

2013-Jul-01 20:45 UTC

head link

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

On Mon, Jul 1, 2013 at 1:33 PM, Quentin Colombet <qcolombet at
apple.com>wrote:
> On Jul 1, 2013, at 11:52 AM, Eli Friedman <eli.friedman at gmail.com>
wrote:
>
> On Mon, Jul 1, 2013 at 11:30 AM, Quentin Colombet <qcolombet at
apple.com>
> wrote:
>
>> Hi,
>>
>> ** Problematic **
>> I am looking for advices to share some logic between DAG combine and
>> target lowering.
>>
>> Basically, I need to know if a bitcast that is about to be inserted
>> during target specific isel lowering will be eliminated during DAG
combine.
>>
>> Let me know if there is another, better supported, approach for this
kind
>> of problems.
>>
>> ** Motivating Example **
>> The motivating example comes form the lowering of vector code on armv7.
>> More specifically, the build_vector node is lowered to a target
specific
>> ARMISD::build_vector where all the parameters are bitcasted to floating
>> point types.
>>
>> This works well, unless the inserted bitcasts survive until instruction
>> selection. In that case, they incur moves between integer unit and
floating
>> point unit that may result in inefficient code.
>>
>> Attached motivating_example.ll shows such a case:
>> llc -O3 -mtriple thumbv7-apple-ios3 motivating_example.ll -o -
>> ldr r0, [r1]
>> ldr r1, [r2]
>> vmov s1, r1
>> vmov s0, r0
>> Here each ldr, vmov sequences could have been replaced by a simple
>> vld1.32.
>>
>> ** Proposed Solution **
>> Lower to more vector friendly code (using a sequence of
>> insert_vector_elt), when bit casts will not be free.
>> The attached patch demonstrates that, but is missing the proper check
to
>> know what DAG combine will do (see TODO).
>>
>
> I think you're approaching this backwards: the obvious thing to do is
to
> generate the insert_vector_elt sequence unconditionally, and DAGCombine
> that sequence to a build_vector when appropriate.
>
> Hi Eli,
>
> I have started to look into the direction you gave me.
>
> I may have miss something but I do not see how the proposed direction
> solves the issue. Indeed to be able to DAGCombine a insert_vector_elt
> sequences into a ARMISD::build_vector, I still need to know if it would be
> profitable, i.e., if DAGCombine will remove the bitcasts that
> combining/lowering is about to insert.
>
> Since target specific DAGCombine are also done in TargetLowering I do not
> have access to more DAGCombine logic (at least DAGCombineInfo is not
> providing the require information).
>
> What did I miss?
>
>Err, wait, sorry, my fault; I missed that you only insert the bitcasts on
the other side of the branch.

You should be able to do it the other way, though: generate the
build_vector unconditionally, and pull insert_vector_elts out of it in a
DAGCombine.  (At this point, you know whether DAGCombine will remove the
bit casts because if it could, it would have already done it.)

-Eli
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130701/dd67c722/attachment.html>

Quentin Colombet

2013-Jul-01 20:58 UTC

head link

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

On Jul 1, 2013, at 1:45 PM, Eli Friedman <eli.friedman at gmail.com>
wrote:
> On Mon, Jul 1, 2013 at 1:33 PM, Quentin Colombet <qcolombet at
apple.com> wrote:
> On Jul 1, 2013, at 11:52 AM, Eli Friedman <eli.friedman at gmail.com>
wrote:
> 
>> On Mon, Jul 1, 2013 at 11:30 AM, Quentin Colombet <qcolombet at
apple.com> wrote:
>> Hi,
>> 
>> ** Problematic **
>> I am looking for advices to share some logic between DAG combine and
target lowering.
>> 
>> Basically, I need to know if a bitcast that is about to be inserted
during target specific isel lowering will be eliminated during DAG combine.
>> 
>> Let me know if there is another, better supported, approach for this
kind of problems.
>> 
>> ** Motivating Example **
>> The motivating example comes form the lowering of vector code on armv7.
>> More specifically, the build_vector node is lowered to a target
specific ARMISD::build_vector where all the parameters are bitcasted to floating
point types.
>> 
>> This works well, unless the inserted bitcasts survive until instruction
selection. In that case, they incur moves between integer unit and floating
point unit that may result in inefficient code.
>> 
>> Attached motivating_example.ll shows such a case:
>> llc -O3 -mtriple thumbv7-apple-ios3 motivating_example.ll -o -
>> 	ldr	r0, [r1]
>> 	ldr	r1, [r2]
>> 	vmov	s1, r1
>> 	vmov	s0, r0
>> Here each ldr, vmov sequences could have been replaced by a simple
vld1.32.
>> 
>> ** Proposed Solution **
>> Lower to more vector friendly code (using a sequence of
insert_vector_elt), when bit casts will not be free.
>> The attached patch demonstrates that, but is missing the proper check
to know what DAG combine will do (see TODO).
>> 
>> I think you're approaching this backwards: the obvious thing to do
is to generate the insert_vector_elt sequence unconditionally, and DAGCombine
that sequence to a build_vector when appropriate.
> 
> Hi Eli,
> 
> I have started to look into the direction you gave me.
> 
> I may have miss something but I do not see how the proposed direction
solves the issue. Indeed to be able to DAGCombine a insert_vector_elt sequences
into a ARMISD::build_vector, I still need to know if it would be profitable,
i.e., if DAGCombine will remove the bitcasts that combining/lowering is about to
insert.
> 
> Since target specific DAGCombine are also done in TargetLowering I do not
have access to more DAGCombine logic (at least DAGCombineInfo is not providing
the require information).
> 
> What did I miss?
> 
> 
> Err, wait, sorry, my fault; I missed that you only insert the bitcasts on
the other side of the branch.
> 
> You should be able to do it the other way, though: generate the
build_vector unconditionally, and pull insert_vector_elts out of it in a
DAGCombine.  (At this point, you know whether DAGCombine will remove the bit
casts because if it could, it would have already done it.)Make sense.
I will try that.

Thanks!

-Quentin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130701/3d5acf26/attachment.html>

Quentin Colombet

2013-Jul-02 00:48 UTC

head link

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

On Jul 1, 2013, at 1:45 PM, Eli Friedman <eli.friedman at gmail.com>
wrote:
> On Mon, Jul 1, 2013 at 1:33 PM, Quentin Colombet <qcolombet at
apple.com> wrote:
> On Jul 1, 2013, at 11:52 AM, Eli Friedman <eli.friedman at gmail.com>
wrote:
> 
>> On Mon, Jul 1, 2013 at 11:30 AM, Quentin Colombet <qcolombet at
apple.com> wrote:
>> Hi,
>> 
>> ** Problematic **
>> I am looking for advices to share some logic between DAG combine and
target lowering.
>> 
>> Basically, I need to know if a bitcast that is about to be inserted
during target specific isel lowering will be eliminated during DAG combine.
>> 
>> Let me know if there is another, better supported, approach for this
kind of problems.
>> 
>> ** Motivating Example **
>> The motivating example comes form the lowering of vector code on armv7.
>> More specifically, the build_vector node is lowered to a target
specific ARMISD::build_vector where all the parameters are bitcasted to floating
point types.
>> 
>> This works well, unless the inserted bitcasts survive until instruction
selection. In that case, they incur moves between integer unit and floating
point unit that may result in inefficient code.
>> 
>> Attached motivating_example.ll shows such a case:
>> llc -O3 -mtriple thumbv7-apple-ios3 motivating_example.ll -o -
>> 	ldr	r0, [r1]
>> 	ldr	r1, [r2]
>> 	vmov	s1, r1
>> 	vmov	s0, r0
>> Here each ldr, vmov sequences could have been replaced by a simple
vld1.32.
>> 
>> ** Proposed Solution **
>> Lower to more vector friendly code (using a sequence of
insert_vector_elt), when bit casts will not be free.
>> The attached patch demonstrates that, but is missing the proper check
to know what DAG combine will do (see TODO).
>> 
>> I think you're approaching this backwards: the obvious thing to do
is to generate the insert_vector_elt sequence unconditionally, and DAGCombine
that sequence to a build_vector when appropriate.
> 
> Hi Eli,
> 
> I have started to look into the direction you gave me.
> 
> I may have miss something but I do not see how the proposed direction
solves the issue. Indeed to be able to DAGCombine a insert_vector_elt sequences
into a ARMISD::build_vector, I still need to know if it would be profitable,
i.e., if DAGCombine will remove the bitcasts that combining/lowering is about to
insert.
> 
> Since target specific DAGCombine are also done in TargetLowering I do not
have access to more DAGCombine logic (at least DAGCombineInfo is not providing
the require information).
> 
> What did I miss?
> 
> 
> Err, wait, sorry, my fault; I missed that you only insert the bitcasts on
the other side of the branch.
> 
> You should be able to do it the other way, though: generate the
build_vector unconditionally, and pull insert_vector_elts out of it in a
DAGCombine.  (At this point, you know whether DAGCombine will remove the bit
casts because if it could, it would have already done it.)
> Thanks again Eli for the direction.

That works great!

I have sent a proposal with that solution.

-Quentin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130701/c035c391/attachment.html>

Possibly Parallel Threads

Search for more reasonably related threads

llvm dev - Jul 2013 - [LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

Possibly Parallel Threads