thr3ads.net - llvm dev - [LLVMdev] Question about Pre-RA-schedule in LLVM3.3 [Dec 2013]

Haishan

2013-Dec-16 14:03 UTC

[LLVMdev] Question about Pre-RA-schedule in LLVM3.3

At 2013-12-15 22:43:34, "Caldarale, Charles R" <Chuck.Caldarale at unisys.com> wrote:
>> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
>> On Behalf Of Haishan
>> Subject: [LLVMdev] Question about Pre-RA-schedule in LLVM3.3
>
>> My clang version is 3.3 and debug build.
>
>> //test.c
>> int a[6] = {1, 2, 3, 4, 5, 6};
>> int main() {
>>  a[0] = a[5];
>>  a[1] = a[4];
>>  a[2] = a[5];
>> }
>> //end test.c
>> Then test.dump is generated by using the objdump tool.
>> //test.dump
>> ldr  r1, [r0, #20]
>> str  r1, [r0]
>> ldr  r1, [r0, #16]
>> str  r1, [r0, #4]
>> ldr  r1, [r0, #12]
>> str  r1, [r0, #8]
>> bx  lr
>> //end test.dump
>
>It appears you have a typo in the above, since the generated array reference
>offsets do not correspond to the code in test.c.  Presumably, the last array
>reference in test.c was really from a[3], not a[5].

I'm sorry for making a mistake in the above test.c.
And your presumption is right.
>
>> However, the 3rd and 4th instructions should be allocated a different
>> register from the second instruction.
>
>Why?
>
> - Chuck

Thank you for your answer.
If the 3rd and 4th instructions were allocated a different register from the
second instruction, the dependence on the same machine register would disappear,
and this instruction sequence would execute with fewer stalls and cycles.
However, in the latest version of LLVM, the Pre-RA-sched builds the scheduling
graph (original graph) shown below.
//original graph
----> data flow
====> control flow
load1 ----> store1 ====> load2 ----> store2 ====> load3 ----> store3
//end original graph
So Pre-RA-sched is unable to schedule the two instructions of each load/store
pair apart.
Because of the resulting live ranges, the register allocation stage then assigns
the same register to every load/store pair.

If we change the control flow in the original graph above, we get the modified
graph shown below.
//modified graph
----> data flow
====> control flow
load1 ----> store1 ====> store2 ====> store3
load2 ----> store2
load3 ----> store3
//end modified graph

I think the Pre-RA-sched would then be able to schedule the load/store pairs
apart, so that each pair uses a different register.
The scheduled instruction order for test.c might then be load1, load2, load3,
store1, store2, store3.

Best Wishes
- Haishan
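
The difference between the two graphs can be sketched with a small greedy list scheduler in C. This is only an illustration and not LLVM's Pre-RA scheduler: the node numbering, the `list_schedule` function, and the "prefer loads among ready nodes" priority are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_NODES = 6, NUM_EDGES = 5 };

/* Hypothetical node numbering for the three statements of test.c:
 * 0 = load1, 1 = store1, 2 = load2, 3 = store2, 4 = load3, 5 = store3. */
struct edge { int from; int to; };

/* Original graph: every store is chained to the next load. */
const struct edge original_edges[NUM_EDGES] = {
    {0, 1}, {2, 3}, {4, 5},   /* data flow: loadN -> storeN */
    {1, 2}, {3, 4}            /* ordering:  storeN -> load(N+1) */
};

/* Modified graph: only the stores are chained to each other. */
const struct edge modified_edges[NUM_EDGES] = {
    {0, 1}, {2, 3}, {4, 5},   /* data flow: loadN -> storeN */
    {1, 3}, {3, 5}            /* ordering:  storeN -> store(N+1) */
};

int is_load(int node) { return node % 2 == 0; }

/* Greedy list scheduling: repeatedly pick a ready node (all predecessors
 * already scheduled), preferring loads over stores and lower numbers on
 * ties. Writes the resulting order into order[]. */
void list_schedule(const struct edge *edges, size_t num_edges,
                   int order[NUM_NODES])
{
    int pending[NUM_NODES] = {0};  /* unscheduled predecessors per node */
    int done[NUM_NODES] = {0};

    for (size_t e = 0; e < num_edges; e++) {
        assert(edges[e].from >= 0 && edges[e].from < NUM_NODES);
        assert(edges[e].to >= 0 && edges[e].to < NUM_NODES);
        pending[edges[e].to]++;
    }

    for (int slot = 0; slot < NUM_NODES; slot++) {
        int pick = -1;
        for (int n = 0; n < NUM_NODES; n++) {
            if (done[n] || pending[n] != 0)
                continue;                       /* not ready */
            if (pick < 0 || (is_load(n) && !is_load(pick)))
                pick = n;
        }
        assert(pick >= 0);                      /* graph must be acyclic */
        done[pick] = 1;
        order[slot] = pick;
        for (size_t e = 0; e < num_edges; e++)  /* release successors */
            if (edges[e].from == pick)
                pending[edges[e].to]--;
    }
}
```

Under these assumptions, `original_edges` admits only one order (load1, store1, load2, store2, load3, store3) because the graph is a single chain, while `modified_edges` lets the three loads become ready at once and yields load1, load2, load3, store1, store2, store3.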

Andrew Trick

2013-Dec-21 02:41 UTC

The flag -enable-aa-sched-mi should do what you want in the
MachineScheduler pass.

If you want to do it in the selection DAG, there is a subtarget hook that might
do it:

TargetSubtargetInfo::useAA()

LLVM won’t generate the schedule you want anyway for Intel core processors, but
the alias analysis can be useful in general.

-Andy
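
Why alias analysis matters for this reordering can be illustrated with a small C sketch. The function names are hypothetical and this is not LLVM code. Hoisting all loads above the stores preserves the result only when the stored-to and loaded-from locations do not overlap, which is the fact a scheduler has to establish before it can drop the ordering between a store and a later load.

```c
#include <assert.h>
#include <stddef.h>

/* Program order of the corrected test.c: each load is followed at once
 * by its store. With dst = a and src = a + 3 this is
 * a[0] = a[5]; a[1] = a[4]; a[2] = a[3]. */
void copy_interleaved(int *dst, const int *src)
{
    assert(dst != NULL && src != NULL);
    dst[0] = src[2];
    dst[1] = src[1];
    dst[2] = src[0];
}

/* Hand-scheduled order: all three loads first, into distinct temporaries
 * (standing in for distinct registers), then the three stores. */
void copy_hoisted(int *dst, const int *src)
{
    assert(dst != NULL && src != NULL);
    int t0 = src[2];
    int t1 = src[1];
    int t2 = src[0];
    dst[0] = t0;
    dst[1] = t1;
    dst[2] = t2;
}
```

Called as `copy_interleaved(a, a + 3)` and `copy_hoisted(a, a + 3)` on `{1, 2, 3, 4, 5, 6}`, both leave `{6, 5, 4, 4, 5, 6}`, since `a[0..2]` and `a[3..5]` are disjoint. Called with overlapping ranges, for example `dst = a + 1` and `src = a`, the two versions produce different arrays, so the hoisted order is not safe without the no-alias guarantee.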


Hal Finkel

2013-Dec-21 02:54 UTC

----- Original Message -----
> From: "Andrew Trick" <atrick at apple.com>
> To: "Haishan" <hndxvon at 163.com>
> Cc: "llvmdev" <llvmdev at cs.uiuc.edu>
> Sent: Friday, December 20, 2013 8:41:40 PM
> Subject: Re: [LLVMdev] Question about Pre-RA-schedule in LLVM3.3
>
> The flag -enable-aa-sched-mi should do what you want in the
> MachineScheduler pass.
>
> If you want to do it in the selection DAG, there is a subtarget hook
> that might do it:
>
> TargetSubtargetInfo::useAA()

To be clear, returning true from useAA enables the use of AA both in the
selection DAG and also in the MachineScheduler pass.

 -Hal
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
