hameeza ahmed via llvm-dev
2017-Aug-26 18:14 UTC
[llvm-dev] Register Allocation and Scheduling Issues
Hello,

I have defined 8 registers in the registerinfo.td file, in the following order:

R_0, R_1, R_2, R_3, R_4, R_5, R_6, R_7

But the generated assembly code uses only 2 of them. How can I make it use all 8? Also, can I control the ordering, for example use R_5 right after R_0, without changing registerinfo.td? What changes are required, and do they belong in the scheduling phase or the register allocation phase?

P_2048B_LOAD_DWORD R_0, Pword ptr [rip + b]
P_2048B_LOAD_DWORD R_1, Pword ptr [rip + c]
P_2048B_VADD R_0, R_1, R_0
P_2048B_STORE_DWORD Pword ptr [rip + a], R_0
P_2048B_LOAD_DWORD R_0, Pword ptr [rip + b+2048]
P_2048B_LOAD_DWORD R_1, Pword ptr [rip + c+2048]
P_2048B_VADD R_0, R_1, R_0
P_2048B_STORE_DWORD Pword ptr [rip + a+2048], R_0
P_2048B_LOAD_DWORD R_0, Pword ptr [rip + b+4096]
P_2048B_LOAD_DWORD R_1, Pword ptr [rip + c+4096]
P_2048B_VADD R_0, R_1, R_0
P_2048B_STORE_DWORD Pword ptr [rip + a+4096], R_0
P_2048B_LOAD_DWORD R_0, Pword ptr [rip + b+6144]
P_2048B_LOAD_DWORD R_1, Pword ptr [rip + c+6144]
P_2048B_VADD R_0, R_1, R_0
P_2048B_STORE_DWORD Pword ptr [rip + a+6144], R_0

Please help. I am stuck here.

Thank you
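Editor's sketch: one plausible reading of the listing above is that at most two values are live at the same time, so an allocator that always takes the first free register in its allocation order never needs a third one. The toy model below illustrates that effect only. It is not LLVM's allocator; the interval numbers, the value names (b0, c0, sum0, ...) and the function name are assumptions made for this illustration.

```python
# Toy model only: NOT LLVM's register allocator.
# Register names follow the post; everything else is hypothetical.
ORDER = [f"R_{i}" for i in range(8)]


def allocate_first_free(intervals, order):
    """Give each (name, start, end) interval the first register in `order`
    that is not held by an interval still live at `start`."""
    active = []       # (end, register) pairs for intervals still live
    assignment = {}
    for name, start, end in sorted(intervals, key=lambda iv: iv[1]):
        # A register is released once its interval ends at or before `start`.
        active = [(e, r) for e, r in active if e > start]
        busy = {r for _, r in active}
        reg = next(r for r in order if r not in busy)
        assignment[name] = reg
        active.append((end, reg))
    return assignment


# Four load/load/add/store groups, numbered by instruction position.
# Each value is live from its defining instruction to its last use.
intervals = []
for k in range(4):
    base = 4 * k
    intervals.append((f"b{k}", base, base + 2))        # load b, used by the add
    intervals.append((f"c{k}", base + 1, base + 2))    # load c, used by the add
    intervals.append((f"sum{k}", base + 2, base + 3))  # add result, used by the store

assignment = allocate_first_free(intervals, ORDER)
print(sorted(set(assignment.values())))  # ['R_0', 'R_1']
```

In this model the remaining six registers are never touched because nothing forces the allocator past R_1; the listing in the post shows the same pattern.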
Tim Northover via llvm-dev
2017-Aug-26 19:31 UTC
[llvm-dev] Register Allocation and Scheduling Issues
On 26 August 2017 at 11:14, hameeza ahmed <hahmed2305 at gmail.com> wrote:
> Hello,
>
> I have defined 8 registers in registerinfo.td file in the following order:
> R_0, R_1, R_2, R_3, R_4, R_5, R_6, R_7
>
> But the generated assembly code only uses 2 registers. How to enable it to
> use all 8?

What are your thoughts on what might be the issue? Have you considered the advantages and disadvantages of using multiple registers for the code you're testing?

Cheers.

Tim.
hameeza ahmed via llvm-dev
2017-Aug-26 19:43 UTC
[llvm-dev] Register Allocation and Scheduling Issues
Actually, my hardware is designed with 32 lanes, each of which has 8 registers. The emitted assembly code should reflect this. I defined the registers in the .td file in the following order:

L_0_R_0, L_0_R_1, L_0_R_2, L_0_R_3, L_0_R_4, L_0_R_5, L_0_R_6, L_0_R_7,
L_1_R_0, L_1_R_1, L_1_R_2, L_1_R_3, L_1_R_4, L_1_R_5, L_1_R_6, L_1_R_7,
...................
L_31_R_0, L_31_R_1, L_31_R_2, L_31_R_3, L_31_R_4, L_31_R_5, L_31_R_6, L_31_R_7,

Now, when I generate assembly for the vec sum code with the instructions I implemented and the default x86 scheduling and register allocation, it uses only lane L_0. It should use all the lanes. How can I achieve this?

Currently it emits the following:

P_2048B_LOAD_DWORD L_0_R_0, Pword ptr [rip + b]
P_2048B_LOAD_DWORD L_0_R_1, Pword ptr [rip + c]
P_2048B_VADD L_0_R_0, L_0_R_1, L_0_R_0
P_2048B_STORE_DWORD Pword ptr [rip + a], L_0_R_0
P_2048B_LOAD_DWORD L_0_R_0, Pword ptr [rip + b+2048]
P_2048B_LOAD_DWORD L_0_R_1, Pword ptr [rip + c+2048]
P_2048B_VADD L_0_R_0, L_0_R_1, L_0_R_0
P_2048B_STORE_DWORD Pword ptr [rip + a+2048], L_0_R_0
P_2048B_LOAD_DWORD L_0_R_0, Pword ptr [rip + b+4096]
P_2048B_LOAD_DWORD L_0_R_1, Pword ptr [rip + c+4096]
P_2048B_VADD L_0_R_0, L_0_R_1, L_0_R_0
P_2048B_STORE_DWORD Pword ptr [rip + a+4096], L_0_R_0
P_2048B_LOAD_DWORD L_0_R_0, Pword ptr [rip + b+6144]
P_2048B_LOAD_DWORD L_0_R_1, Pword ptr [rip + c+6144]
P_2048B_VADD L_0_R_0, L_0_R_1, L_0_R_0
P_2048B_STORE_DWORD Pword ptr [rip + a+6144], L_0_R_0

It should emit the following:

P_2048B_LOAD_DWORD L_0_R_0, Pword ptr [rip + b]
P_2048B_LOAD_DWORD L_0_R_1, Pword ptr [rip + c]
P_2048B_VADD L_0_R_0, L_0_R_1, L_0_R_0
P_2048B_STORE_DWORD Pword ptr [rip + a], L_0_R_0
P_2048B_LOAD_DWORD L_1_R_0, Pword ptr [rip + b+2048]
P_2048B_LOAD_DWORD L_1_R_1, Pword ptr [rip + c+2048]
P_2048B_VADD L_1_R_0, L_1_R_1, L_1_R_0
P_2048B_STORE_DWORD Pword ptr [rip + a+2048], L_1_R_0
P_2048B_LOAD_DWORD L_2_R_0, Pword ptr [rip + b+4096]
P_2048B_LOAD_DWORD L_2_R_1, Pword ptr [rip + c+4096]
P_2048B_VADD L_2_R_0, L_2_R_1, L_2_R_0
P_2048B_STORE_DWORD Pword ptr [rip + a+4096], L_2_R_0
P_2048B_LOAD_DWORD L_3_R_0, Pword ptr [rip + b+6144]
P_2048B_LOAD_DWORD L_3_R_1, Pword ptr [rip + c+6144]
P_2048B_VADD L_3_R_0, L_3_R_1, L_3_R_0
P_2048B_STORE_DWORD Pword ptr [rip + a+6144], L_3_R_0

Does this involve changing the register live intervals, or the scheduling? Please help. I am trying hard but am unable to solve this.
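Editor's sketch: the desired listing amounts to a round-robin rule in which independent group k of four instructions uses lane k mod 32 and keeps the same register slot within the lane. The snippet below states that rule as a text rewrite over the lines from the post. It is an illustration of the requested mapping only, not an LLVM pass, and the thread does not settle whether the real change belongs in register allocation, scheduling, or a later rewrite. The helper names (`lane_register`, `rotate_lanes`, `group_text`) are hypothetical.

```python
# Illustration of the requested lane mapping; not an LLVM pass.
NUM_LANES = 32       # from the post: 32 lanes
REGS_PER_LANE = 8    # from the post: 8 registers per lane


def lane_register(group, slot):
    """Register name for `slot` in the lane picked round-robin for `group`."""
    if not 0 <= slot < REGS_PER_LANE:
        raise ValueError("slot out of range")
    return f"L_{group % NUM_LANES}_R_{slot}"


def rotate_lanes(groups):
    """Rewrite the L_0 registers of group k to lane k mod NUM_LANES."""
    rotated = []
    for k, lines in enumerate(groups):
        prefix = f"L_{k % NUM_LANES}_R_"
        rotated.append([line.replace("L_0_R_", prefix) for line in lines])
    return rotated


def group_text(lane, offset):
    """One load/load/add/store group as printed in the post."""
    r0, r1 = f"L_{lane}_R_0", f"L_{lane}_R_1"
    suffix = f"+{offset}" if offset else ""
    return [
        f"P_2048B_LOAD_DWORD {r0}, Pword ptr [rip + b{suffix}]",
        f"P_2048B_LOAD_DWORD {r1}, Pword ptr [rip + c{suffix}]",
        f"P_2048B_VADD {r0}, {r1}, {r0}",
        f"P_2048B_STORE_DWORD Pword ptr [rip + a{suffix}], {r0}",
    ]


offsets = (0, 2048, 4096, 6144)
current = [group_text(0, off) for off in offsets]   # what is emitted today
rotated = rotate_lanes(current)                     # what the post asks for
for lines in rotated:
    print("\n".join(lines))
```

The rule only makes sense when the groups are independent of each other, as they are in this vector sum; a value that must flow from one group to the next would have to stay in a register both groups can reach.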