Justin Holewinski
2011-May-13 11:21 UTC
[LLVMdev] [ptx] Propose a register class naming convention change
On Fri, May 13, 2011 at 5:11 AM, Dan Bailey <drb at dneg.com> wrote:

> That's fine with me. Unless there's a particular reason for it I would
> suggest perhaps changing the immediate syntax as well to swap it round, so
> it would be Immi32, Immi64, Immf32, etc. It doesn't bother me that much the
> way it currently is, but when there are lots of operations taking a register
> and an immediate, representing them in the same way might be a little more
> consistent?
>
> Personally, I think I also might prefer an underscore to make it more
> readable for new users (Reg_u32, Reg_pred, Imm_i32, Imm_f32, etc). That's
> maybe just my own preference, so feel free to do it as you've suggested!
>
> Dan

I've been considering the way registers are represented in the PTX back-end
quite a bit lately, and I think we need to reconsider how we handle them. As
is, we assume a fixed register set of typed and sized registers, which is
more-or-less what the LLVM code generation framework expects. However, PTX is
really a special-case target in that the register space is "infinite" and not
really typed (yes, PTX allows register types, but I do not believe that is
mandatory). The infinite nature of the register space gives us a few problems:

1. We are currently constrained by the number of registers we specify in
   PTXRegisterInfo.td
2. The LLVM register allocators are not really solving the right problem
3. We miss opportunities for register re-use

I'm sure there are more, but those are the ones I am thinking of now.

To solve (1) (and (3) to some degree), I propose we get rid of register types
and instead use .b{16, 32, 64} and .pred as our register classes. I cannot
think of a case where specifying a typed register class (u32, f32, etc.) is
required. In fact, manually modifying my own PTX code to always use .b*
registers has not affected anything. This would both simplify the back-end and
allow the LLVM register allocator to re-use registers across different data
types (may or may not be a win depending on how good the ptxas register
allocator is).

Solving (2) seems to be a much more difficult problem. The current
implementation of register allocation assumes a fixed register space, and
allocates registers as best as it can while introducing spill code when it has
to. For PTX, the problem is a bit different. Instead, we should assume an
infinite register space and *minimize* the number of registers required
*without* introducing spill code. It is the responsibility of ptxas to do the
final register allocation and spill code creation. I see two potential
solutions to this:

1. Keep the current fixed register space and emit spill code that really just
   adds an additional register and copies data between registers for spills
2. Implement a new register allocation strategy that ties into the existing
   infrastructure to satisfy our requirements

Solution (1) seems the easiest to implement, but I worry that ptxas may not be
able to interpret what is really happening. I believe doing PTX-level register
allocation is at least partially responsible for the speed-ups I have observed
when comparing against nvcc-generated code. That leaves (2) as the preferred
method, but I do not know enough about the inner workings of the LLVM register
allocators to properly assess how difficult this would be.

Any thoughts?

By the way, I'm perfectly okay with the name change :)

> Che-Liang Chiou wrote:
>
>> Hi,
>>
>> Current register class naming has a confusing prefix letter 'R' (it is
>> my bad), such as the first 'R' of RRegu32 (for unsigned 32-bit
>> registers).
>>
>> I propose a 'Reg' + type name naming convention for register classes,
>> such as:
>> Regu16, Regu32, Regf32, Regf64
>> With one exception for predicate registers (capitalized first letter of
>> 'pred'):
>> RegPred
>>
>> Since predicate registers are special in the way that they can't be
>> passed as arguments or loaded from/stored to memory, I think a little
>> name convention exception for it is okay.
>>
>> What do you think?
>>
>> If there are no objections, I will start making the change.
>>
>> Regards,
>> Che-Liang

--
Thanks,
Justin Holewinski
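The goal stated for problem (2), minimizing the register count over an unbounded register space without ever emitting spill code, can be illustrated with a small sketch. This is not the LLVM register allocator and not a proposal from the thread. It is a greedy interval-colouring toy in Python that assumes each virtual register has a single contiguous live range. The names `LiveInterval` and `assign_registers` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LiveInterval:
    """Half-open live range [start, end) of one virtual register (assumed contiguous)."""
    vreg: str
    start: int
    end: int


def assign_registers(intervals):
    """Greedy interval colouring over an unbounded register pool.

    A register is reused as soon as its previous interval has ended, and a
    fresh register is created when none is free, so no value is ever spilled.
    Returns (assignment, register_count).
    """
    assignment = {}
    active = []    # (end, reg) pairs for intervals that are still live
    free = []      # register numbers released by expired intervals
    next_reg = 0   # next never-used register number

    for iv in sorted(intervals, key=lambda i: (i.start, i.end)):
        # Release registers whose interval ended at or before this start.
        still_active = []
        for end, reg in active:
            if end <= iv.start:
                free.append(reg)
            else:
                still_active.append((end, reg))
        active = still_active

        if free:
            reg = min(free)          # reuse the lowest-numbered free register
            free.remove(reg)
        else:
            reg = next_reg           # pool is unbounded: grow instead of spilling
            next_reg += 1

        assignment[iv.vreg] = reg
        active.append((iv.end, reg))

    return assignment, next_reg


example = [
    LiveInterval("a", 0, 4),
    LiveInterval("b", 1, 3),
    LiveInterval("c", 3, 6),
    LiveInterval("d", 4, 7),
]
assignment, count = assign_registers(example)
print(assignment, count)
```

For the four example intervals the sketch ends up with two registers, which matches the peak number of simultaneously live values. Real live ranges can have holes and cross basic blocks, so an actual implementation inside the LLVM infrastructure would need considerably more than this.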
Dan Bailey
2011-May-13 14:40 UTC
[LLVMdev] [ptx] Propose a register class naming convention change
Justin Holewinski wrote:

> On Fri, May 13, 2011 at 5:11 AM, Dan Bailey <drb at dneg.com> wrote:
>
>> That's fine with me. Unless there's a particular reason for it I would
>> suggest perhaps changing the immediate syntax as well to swap it round, so
>> it would be Immi32, Immi64, Immf32, etc. It doesn't bother me that much the
>> way it currently is, but when there are lots of operations taking a
>> register and an immediate, representing them in the same way might be a
>> little more consistent?
>>
>> Personally, I think I also might prefer an underscore to make it more
>> readable for new users (Reg_u32, Reg_pred, Imm_i32, Imm_f32, etc). That's
>> maybe just my own preference, so feel free to do it as you've suggested!
>>
>> Dan
>
> I've been considering the way registers are represented in the PTX back-end
> quite a bit lately, and I think we need to reconsider how we handle them. As
> is, we assume a fixed register set of typed and sized registers, which is
> more-or-less what the LLVM code generation framework expects. However, PTX
> is really a special-case target in that the register space is "infinite" and
> not really typed (yes, PTX allows register types, but I do not believe that
> is mandatory). The infinite nature of the register space gives us a few
> problems:
>
> 1. We are currently constrained by the number of registers we specify in
>    PTXRegisterInfo.td
> 2. The LLVM register allocators are not really solving the right problem
> 3. We miss opportunities for register re-use
>
> I'm sure there are more, but those are the ones I am thinking of now.
>
> To solve (1) (and (3) to some degree), I propose we get rid of register
> types and instead use .b{16, 32, 64} and .pred as our register classes. I
> cannot think of a case where specifying a typed register class (u32, f32,
> etc.) is required. In fact, manually modifying my own PTX code to always use
> .b* registers has not affected anything. This would both simplify the
> back-end and allow the LLVM register allocator to re-use registers across
> different data types (may or may not be a win depending on how good the
> ptxas register allocator is).

Yep, definitely let's do this! I tried to do something similar before, but
didn't realise the operand types and register types didn't have to match.
That'll surely improve our register reuse. The only minor disadvantage I can
see is that the resulting PTX will be a little cryptic to debug, but that's not
an issue.

As for the register allocation, I'm not familiar enough to be able to comment
on the feasibility either, but the second option sounds like the preferred one.

Dan

> Solving (2) seems to be a much more difficult problem. The current
> implementation of register allocation assumes a fixed register space, and
> allocates registers as best as it can while introducing spill code when it
> has to. For PTX, the problem is a bit different. Instead, we should assume
> an infinite register space and *minimize* the number of registers required
> *without* introducing spill code. It is the responsibility of ptxas to do
> the final register allocation and spill code creation. I see two potential
> solutions to this:
>
> 1. Keep the current fixed register space and emit spill code that really
>    just adds an additional register and copies data between registers for
>    spills
> 2. Implement a new register allocation strategy that ties into the existing
>    infrastructure to satisfy our requirements
>
> Solution (1) seems the easiest to implement, but I worry that ptxas may not
> be able to interpret what is really happening. I believe doing PTX-level
> register allocation is at least partially responsible for the speed-ups I
> have observed when comparing against nvcc-generated code. That leaves (2) as
> the preferred method, but I do not know enough about the inner workings of
> the LLVM register allocators to properly assess how difficult this would be.
>
> Any thoughts?
>
> By the way, I'm perfectly okay with the name change :)
>
>> Che-Liang Chiou wrote:
>>
>>> Hi,
>>>
>>> Current register class naming has a confusing prefix letter 'R' (it is
>>> my bad), such as the first 'R' of RRegu32 (for unsigned 32-bit
>>> registers).
>>>
>>> I propose a 'Reg' + type name naming convention for register classes,
>>> such as:
>>> Regu16, Regu32, Regf32, Regf64
>>> With one exception for predicate registers (capitalized first letter of
>>> 'pred'):
>>> RegPred
>>>
>>> Since predicate registers are special in the way that they can't be
>>> passed as arguments or loaded from/stored to memory, I think a little
>>> name convention exception for it is okay.
>>>
>>> What do you think?
>>>
>>> If there are no objections, I will start making the change.
>>>
>>> Regards,
>>> Che-Liang
>
> --
> Thanks,
> Justin Holewinski
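The register-reuse benefit of untyped classes discussed above can be made concrete with a rough sketch. It compares the peak number of registers needed when values are grouped by PTX type against the number needed when they are grouped by bit-size class, under the thread's premise that a .b32 register can hold u32, s32, or f32 values. The mapping table, the helper names, and the register name prefixes (%rh, %r, %rd, %p) are assumptions made for illustration, not anything the PTX back-end defines.

```python
from collections import defaultdict

# Assumed mapping from typed PTX register types to untyped bit-size classes.
UNTYPED_CLASS = {
    "u16": "b16", "s16": "b16",
    "u32": "b32", "s32": "b32", "f32": "b32",
    "u64": "b64", "s64": "b64", "f64": "b64",
    "pred": "pred",
}

# Hypothetical register name prefixes, one per class.
PREFIX = {"b16": "%rh", "b32": "%r", "b64": "%rd", "pred": "%p"}


def peak_pressure(ranges):
    """Maximum number of simultaneously live half-open ranges [start, end)."""
    events = []
    for start, end in ranges:
        events.append((start, 1))
        events.append((end, -1))
    live = peak = 0
    # Tuples sort ends (-1) before starts (+1) at the same point, so a range
    # ending at t and one starting at t can share a register.
    for _, delta in sorted(events):
        live += delta
        peak = max(peak, live)
    return peak


def registers_needed(values, classify):
    """values: iterable of (ptx_type, start, end). Returns {class: count}."""
    by_class = defaultdict(list)
    for ptx_type, start, end in values:
        by_class[classify(ptx_type)].append((start, end))
    return {cls: peak_pressure(r) for cls, r in by_class.items()}


def reg_declarations(counts):
    """Render PTX-style parameterized .reg declarations for untyped classes."""
    return [f".reg .{cls} {PREFIX[cls]}<{n}>;" for cls, n in sorted(counts.items())]


values = [("u32", 0, 3), ("f32", 3, 6), ("s32", 1, 4), ("pred", 2, 5)]
typed = registers_needed(values, lambda t: t)
untyped = registers_needed(values, UNTYPED_CLASS.__getitem__)
print(sum(typed.values()), sum(untyped.values()))
print("\n".join(reg_declarations(untyped)))
```

With these four values, the typed grouping needs four registers in total while the untyped grouping needs three, because the u32 and f32 values never overlap and can share one .b32 register. Whether that saving survives ptxas's own allocation is the open question raised in the thread.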
Justin Holewinski
2011-May-13 20:39 UTC
[LLVMdev] [ptx] Propose a register class naming convention change
2011/5/13 Dan Bailey <drb at dneg.com>

> Justin Holewinski wrote:
>
>> On Fri, May 13, 2011 at 5:11 AM, Dan Bailey <drb at dneg.com> wrote:
>>
>>> That's fine with me. Unless there's a particular reason for it I would
>>> suggest perhaps changing the immediate syntax as well to swap it round,
>>> so it would be Immi32, Immi64, Immf32, etc. It doesn't bother me that
>>> much the way it currently is, but when there are lots of operations
>>> taking a register and an immediate, representing them in the same way
>>> might be a little more consistent?
>>>
>>> Personally, I think I also might prefer an underscore to make it more
>>> readable for new users (Reg_u32, Reg_pred, Imm_i32, Imm_f32, etc). That's
>>> maybe just my own preference, so feel free to do it as you've suggested!
>>>
>>> Dan
>>
>> I've been considering the way registers are represented in the PTX
>> back-end quite a bit lately, and I think we need to reconsider how we
>> handle them. As is, we assume a fixed register set of typed and sized
>> registers, which is more-or-less what the LLVM code generation framework
>> expects. However, PTX is really a special-case target in that the register
>> space is "infinite" and not really typed (yes, PTX allows register types,
>> but I do not believe that is mandatory). The infinite nature of the
>> register space gives us a few problems:
>>
>> 1. We are currently constrained by the number of registers we specify in
>>    PTXRegisterInfo.td
>> 2. The LLVM register allocators are not really solving the right problem
>> 3. We miss opportunities for register re-use
>>
>> I'm sure there are more, but those are the ones I am thinking of now.
>>
>> To solve (1) (and (3) to some degree), I propose we get rid of register
>> types and instead use .b{16, 32, 64} and .pred as our register classes. I
>> cannot think of a case where specifying a typed register class (u32, f32,
>> etc.) is required. In fact, manually modifying my own PTX code to always
>> use .b* registers has not affected anything. This would both simplify the
>> back-end and allow the LLVM register allocator to re-use registers across
>> different data types (may or may not be a win depending on how good the
>> ptxas register allocator is).
>
> Yep, definitely let's do this! I tried to do something similar before, but
> didn't realise the operand types and register types didn't have to match.
> That'll surely improve our register reuse. The only minor disadvantage I can
> see is that the resulting PTX will be a little cryptic to debug, but that's
> not an issue.

I'm probably going to get started on this over the weekend. Che-Liang, since I
will be re-writing most of the register code anyway, I'll go ahead and change
to the new naming convention.

> As for the register allocation, I'm not familiar enough to be able to
> comment on the feasibility either, but the second option sounds like the
> preferred one.
>
> Dan
>
>> Solving (2) seems to be a much more difficult problem. The current
>> implementation of register allocation assumes a fixed register space, and
>> allocates registers as best as it can while introducing spill code when it
>> has to. For PTX, the problem is a bit different. Instead, we should assume
>> an infinite register space and *minimize* the number of registers required
>> *without* introducing spill code. It is the responsibility of ptxas to do
>> the final register allocation and spill code creation. I see two potential
>> solutions to this:
>>
>> 1. Keep the current fixed register space and emit spill code that really
>>    just adds an additional register and copies data between registers for
>>    spills
>> 2. Implement a new register allocation strategy that ties into the
>>    existing infrastructure to satisfy our requirements
>>
>> Solution (1) seems the easiest to implement, but I worry that ptxas may
>> not be able to interpret what is really happening. I believe doing
>> PTX-level register allocation is at least partially responsible for the
>> speed-ups I have observed when comparing against nvcc-generated code. That
>> leaves (2) as the preferred method, but I do not know enough about the
>> inner workings of the LLVM register allocators to properly assess how
>> difficult this would be.
>>
>> Any thoughts?
>>
>> By the way, I'm perfectly okay with the name change :)
>>
>>> Che-Liang Chiou wrote:
>>>
>>>> Hi,
>>>>
>>>> Current register class naming has a confusing prefix letter 'R' (it is
>>>> my bad), such as the first 'R' of RRegu32 (for unsigned 32-bit
>>>> registers).
>>>>
>>>> I propose a 'Reg' + type name naming convention for register classes,
>>>> such as:
>>>> Regu16, Regu32, Regf32, Regf64
>>>> With one exception for predicate registers (capitalized first letter of
>>>> 'pred'):
>>>> RegPred
>>>>
>>>> Since predicate registers are special in the way that they can't be
>>>> passed as arguments or loaded from/stored to memory, I think a little
>>>> name convention exception for it is okay.
>>>>
>>>> What do you think?
>>>>
>>>> If there are no objections, I will start making the change.
>>>>
>>>> Regards,
>>>> Che-Liang
>>
>> --
>> Thanks,
>> Justin Holewinski

--
Thanks,
Justin Holewinski