Duncan Sands
2012-Nov-09 12:55 UTC
[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
Hi Dmitry, > I'm attaching a patch that should fix the issue mentioned above. It> simply makes the same check seen in the same file for global > variables: > > emitPTXAddressSpace(PTy->getAddressSpace(), O); > if (GVar->getAlignment() == 0) > O << " .align " << (int) TD->getPrefTypeAlignment(ETy); > else > O << " .align " << GVar->getAlignment();it's not quite the same because your patch uses the ABI alignment, while in this snippet it is the preferred alignment (which is usually the same as the ABI alignment, but may be bigger).> Could you please review and commit? Do you think it needs a test case?Yes, it needs a testcase. Ciao, Duncan.
Dmitry N. Mikushin
2012-Nov-09 13:17 UTC
[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
Hi Duncan, You're right, global variables use preferred alignment. And - yes, preferred alignment in this case is bigger: 8 instead of 4. NVIDIA's prop. compiler gives 4. However, since CUDA 5.0 ptx modules are linkable with each other, I think alignments for externally visible functions and data should all follow ABI rules. Is there a guide on making tests? I have ~5 pending patches never accepted simply because I'm not familiar with LLVM test system :-/ - D. 2012/11/9 Duncan Sands <baldrick at free.fr>:> Hi Dmitry, > > >> I'm attaching a patch that should fix the issue mentioned above. It >> >> simply makes the same check seen in the same file for global >> variables: >> >> emitPTXAddressSpace(PTy->getAddressSpace(), O); >> if (GVar->getAlignment() == 0) >> O << " .align " << (int) TD->getPrefTypeAlignment(ETy); >> else >> O << " .align " << GVar->getAlignment(); > > > it's not quite the same because your patch uses the ABI alignment, while > in this snippet it is the preferred alignment (which is usually the same > as the ABI alignment, but may be bigger). > > >> Could you please review and commit? Do you think it needs a test case? > > > Yes, it needs a testcase. > > Ciao, Duncan. > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Duncan Sands
2012-Nov-09 13:23 UTC
[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
Hi Dmitry,> You're right, global variables use preferred alignment. And - yes, > preferred alignment in this case is bigger: 8 instead of 4. NVIDIA's > prop. compiler gives 4. However, since CUDA 5.0 ptx modules are > linkable with each other, I think alignments for externally visible > functions and data should all follow ABI rules.giving it an alignment of 8 does follow ABI rules: everything that is 8 byte aligned has an address that is a multiple of 4, i.e. it is also 4 byte aligned. It would be wrong to assume that external globals have the preferred alignment: they can only be assumed to have the ABI alignment. But as we are aligning this variable we can give it the alignment we like as long as it is at least 4. In short, I think using the preferred alignment is correct in this context.> Is there a guide on making tests? I have ~5 pending patches never > accepted simply because I'm not familiar with LLVM test system :-/Your test would go in test/CodeGen/PTX/. It should use FileCheck, take a look at test/CodeGen/X86/zext-trunc.ll for an example. As there are no PTX tests at all for the moment, you need to create a lit.local.cfg file in PTX/. Imitate the X86 one. Ciao, Duncan.> > - D. > > 2012/11/9 Duncan Sands <baldrick at free.fr>: >> Hi Dmitry, >> >> >>> I'm attaching a patch that should fix the issue mentioned above. It >>> >>> simply makes the same check seen in the same file for global >>> variables: >>> >>> emitPTXAddressSpace(PTy->getAddressSpace(), O); >>> if (GVar->getAlignment() == 0) >>> O << " .align " << (int) TD->getPrefTypeAlignment(ETy); >>> else >>> O << " .align " << GVar->getAlignment(); >> >> >> it's not quite the same because your patch uses the ABI alignment, while >> in this snippet it is the preferred alignment (which is usually the same >> as the ABI alignment, but may be bigger). >> >> >>> Could you please review and commit? Do you think it needs a test case? >> >> >> Yes, it needs a testcase. >> >> Ciao, Duncan. >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Possibly Parallel Threads
- [LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
- [LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
- [LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
- [LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
- [LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params