Manman Ren
2012-Oct-24 05:00 UTC
[LLVMdev] [llvm-commits] ABI: how to let the backend know that an aggregate should be allocated on stack
Byval does not work for me: it tries to split the struct so that part of it fits into the available core registers and the rest goes on the stack.

On Oct 23, 2012, at 5:01 PM, Alex Rosenberg <alexr at leftfield.org> wrote:

> In llvm-gcc, this decision was handled near llvm-arm.cpp:2737 in llvm_arm_aggregate_partially_passed_in_regs(). Basically, available registers would be counted up and if the HA didn't fit, it went byval instead.
>
> I agree that we should unify this sort of logic in one place. I'm not sure that onstack is the best interim step toward that. Does byval work here?
>
> Alex
>
> On Oct 23, 2012, at 11:22 AM, manman ren <mren at apple.com> wrote:
>
>> Hi All,
>>
>> I am trying to handle the Homogeneous Aggregate (HA) for ARM-VFP according to the spec:
>>
>> C.1.vfp If the argument is a VFP CPRC and there are sufficient consecutive VFP registers of the appropriate type unallocated then the argument is allocated to the lowest-numbered sequence of such registers.
>>
>> C.2.vfp If the argument is a VFP CPRC then any VFP registers that are unallocated are marked as unavailable. The NSAA is adjusted upwards until it is correctly aligned for the argument and the argument is copied to the stack at the adjusted NSAA. The NSAA is further incremented by the size of the argument. The argument has now been allocated.
>>
>> We currently expand the Homogeneous Aggregate in Clang, but that does not conform to the standard when some VFP registers are available but not enough to hold the whole aggregate.
>>
>> In that case, the leading members of the HA will be allocated to VFP registers, and the rest will go on the stack.
>>
>> To fix the problem, it would be great if we could let the backend know that the HA will be on the stack and that later VFP CPRCs will be on the stack as well.
>>
>> There has been some discussion of this, at least in the comments in TargetInfo.cpp:
>>
>> // This assumption is optimistic, as there could be free registers available
>> // when we need to pass this argument in memory, and LLVM could try to pass
>> // the argument in the free register. This does not seem to happen currently,
>> // but this code would be much safer if we could mark the argument with
>> // 'onstack'. See PR12193.
>>
>> I am wondering whether it is necessary to add an onstack flag, and whether there are any issues with doing so.
>>
>> Another option, suggested by Daniel, is to convert the HA to a convenient similar type that the backend won't pass in registers.
>> I tried to pass a struct with vector types, but the backend expands the struct.
>> See llvm::ComputeValueVTs:
>> // Given a struct type, recursively traverse the elements.
>>
>> I tried to use indirect in Clang, but it does not work out as I wish.
>>
>> Any suggestion on how to fix this is highly appreciated!
>>
>> Thanks,
>> Manman
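For readers following along, the two rules quoted above (C.1.vfp and C.2.vfp) can be sketched as a small allocator model. This is only an illustration under stated assumptions, not LLVM or Clang code: the names `Location`, `VfpAllocator`, and `allocate` are hypothetical, only s0..s15 and float/double homogeneous aggregates are modelled, and the stack is treated as a byte offset starting at 0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model for illustration only; none of these names exist in
// LLVM or Clang.
struct Location {
    bool inRegs;     // true: passed in VFP registers; false: passed on the stack
    unsigned index;  // first s-register number, or byte offset from the initial NSAA
};

struct VfpAllocator {
    static constexpr unsigned kNumSRegs = 16;  // assumption: s0..s15 carry arguments
    std::uint32_t used = 0;  // bit i set: s<i> is allocated or marked unavailable
    unsigned nsaa = 0;       // next stacked argument address, as a byte offset

    // elemSRegs: 1 for float members, 2 for double members.
    // count: number of members in the homogeneous aggregate (1..4).
    Location allocate(unsigned elemSRegs, unsigned count) {
        assert(elemSRegs == 1 || elemSRegs == 2);
        assert(count >= 1 && count <= 4);
        const unsigned span = elemSRegs * count;
        const std::uint32_t mask = (1u << span) - 1;

        // C.1.vfp: take the lowest-numbered run of free, suitably aligned registers.
        for (unsigned first = 0; first + span <= kNumSRegs; first += elemSRegs) {
            if ((used & (mask << first)) == 0) {
                used |= mask << first;
                return {true, first};
            }
        }

        // C.2.vfp: no run fits, so every remaining VFP register becomes
        // unavailable and the whole aggregate is copied to the stack.
        used = (1u << kNumSRegs) - 1;
        const unsigned align = 4 * elemSRegs;       // 4 for float, 8 for double
        nsaa = (nsaa + align - 1) / align * align;  // round NSAA up to the alignment
        const unsigned offset = nsaa;
        nsaa += 4 * span;                           // advance past the argument
        return {false, offset};
    }
};
```

In this model, once s0..s14 are taken, a two-float HA is placed entirely on the stack instead of being split between s15 and the stack, and a later single float also goes on the stack even though s15 was never used. That is the behaviour the expansion in Clang fails to reproduce.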
Alex Rosenberg
2012-Oct-24 19:03 UTC
[LLVMdev] [llvm-commits] ABI: how to let the backend know that an aggregate should be allocated on stack
At the time, the ARM target didn't actually handle byval. Now it does.

You should be able to get the old struct passing capability if you don't apply an attribute at all.

Alex

On Oct 23, 2012, at 10:00 PM, Manman Ren wrote:

> Byval does not work for me, it will try to split the struct to fit into available core registers and the rest on stack.
Manman Ren
2012-Oct-24 19:07 UTC
[LLVMdev] [llvm-commits] ABI: how to let the backend know that an aggregate should be allocated on stack
On Oct 24, 2012, at 12:03 PM, Alex Rosenberg <alexr at leftfield.org> wrote:

> At the time, the ARM target didn't actually handle byval. Now it does.
>
> You should be able to get the old struct passing capability if you don't apply an attribute at all.

Indirect with byval set to false passes the whole struct on the stack, but it still occupies one core register to pass the address.

Thanks,
Manman
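The options discussed so far differ mainly in how many core registers they consume. A rough sketch of that difference, as described in this thread, follows. It is a simplified model, not backend code: `Passing`, `Placement`, and `place` are hypothetical names, alignment rules are ignored, the aggregate size is assumed to be a multiple of 4 bytes, and `OnStack` stands for the proposed attribute, which does not exist.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical model for illustration only; OnStack is the attribute being
// proposed in this thread, not an existing LLVM attribute.
enum class Passing { ByVal, Indirect, OnStack };

struct Placement {
    unsigned coreRegsUsed;  // how many of r0..r3 the argument consumes
    unsigned stackBytes;    // bytes written to the outgoing argument area
};

// freeCoreRegs: how many of r0..r3 are still unallocated (0..4).
// sizeBytes: aggregate size, assumed to be a multiple of 4.
Placement place(Passing how, unsigned freeCoreRegs, unsigned sizeBytes) {
    assert(freeCoreRegs <= 4 && sizeBytes % 4 == 0);
    const unsigned words = sizeBytes / 4;
    switch (how) {
    case Passing::ByVal: {
        // As reported above: leading words fill the free core registers,
        // the remainder goes on the stack.
        const unsigned inRegs = std::min(freeCoreRegs, words);
        return {inRegs, (words - inRegs) * 4};
    }
    case Passing::Indirect:
        // Only the address is passed; the copy lives elsewhere in the
        // caller's frame. The pointer takes a core register if one is free.
        return freeCoreRegs > 0 ? Placement{1, 0} : Placement{0, 4};
    case Passing::OnStack:
        // Desired behaviour: no core register touched, whole aggregate on stack.
        return {0, sizeBytes};
    }
    return {0, 0};  // unreachable; keeps compilers quiet
}
```

For a 16-byte aggregate with two free core registers, this model gives two registers plus 8 stack bytes for byval, one register for indirect, and no registers with 16 stack bytes for the hypothetical onstack placement. Neither existing option leaves the core registers untouched.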
Rafael Espíndola
2012-Oct-24 19:40 UTC
[LLVMdev] [llvm-commits] ABI: how to let the backend know that an aggregate should be allocated on stack
On 24 October 2012 01:00, Manman Ren <mren at apple.com> wrote:

> Byval does not work for me, it will try to split the struct to fit into
> available core registers and the rest on stack.

That is a strange byval implementation. Maybe the LLVM ARM backend should be changed to always pass byval on the stack? Clang can create regular (integer, fp) arguments for the registers.

Cheers,
Rafael
Tim Northover
2012-Oct-24 20:03 UTC
[LLVMdev] [llvm-commits] ABI: how to let the backend know that an aggregate should be allocated on stack
> That is a strange byval implementation. Maybe the llvm ARM backend
> should be changed to always pass byval on the stack? Clang can create
> regular (integer, fp) arguments for the registers.

The problem is that the ABI says the argument *should* be split between registers and stack. The relevant callbacks in Clang only get to suggest one type (plus a padding dummy going before it if they want); they can't (currently) say "put the first 4 bytes here and the rest there". Given that constraint, "byval" is probably the sanest option since it's special anyway.

That could be changed of course, but I'm not convinced Clang would be improved for it.

Tim.
Bob Wilson
2012-Oct-26 14:16 UTC
[LLVMdev] [llvm-commits] ABI: how to let the backend know that an aggregate should be allocated on stack
On Oct 24, 2012, at 12:40 PM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:

> On 24 October 2012 01:00, Manman Ren <mren at apple.com> wrote:
>>
>> Byval does not work for me, it will try to split the struct to fit into
>> available core registers and the rest on stack.
>
> That is a strange byval implementation. Maybe the llvm ARM backend
> should be changed to always pass byval on the stack? Clang can create
> regular (integer, fp) arguments for the registers.

The current definition of the byval attribute in LangRef says nothing about requiring the argument to be passed on the stack. It just says it "should really be passed by value". When discussing the alignment, it does refer to a stack slot, but it isn't at all clear that the argument is required to be on the stack.

From looking at the PowerPC backend, I got the impression that it does not interpret the byval attribute to mean that an argument must go on the stack. It could be entirely in registers or split between registers and stack. For Intel, on the other hand, there seem to be many cases where byval is intentionally used as a substitute for the "on stack" attribute that Manman was looking for.

It would be good to clarify the intention of this in the docs.