Dan Gohman
2012-Aug-22 20:15 UTC
[LLVMdev] PROPOSAL: IR representation of detailed struct assignment information
Hello,

Currently LLVM expects front-ends to lower struct assignments into either individual scalar loads and stores, or calls to @llvm.memcpy. For structs with lots of fields, it can take a lot of scalar loads and stores, so @llvm.memcpy is used instead. Unfortunately, using @llvm.memcpy does not permit full TBAA information to be preserved. Also, it unnecessarily copies any padding bytes between fields, which can lead to unnecessary copying in the case where the optimizer or codegen decides to split it back up into individual loads and stores.

Chris wrote up some ideas about the struct padding part of this problem [0]; this proposal extends that proposal and adds the capability to represent TBAA information for the fields of a struct in a struct assignment.

Here's an example showing the basic problem:

    struct bar {
      char x;
      float y;
      double z;
    };
    void copy_bar(struct bar *a, struct bar *b) {
      *a = *b;
    }

We get this IR:

    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 16, i32 8, i1 false)

This works, but it doesn't retain the information that the bytes between fields x and y don't really need to be copied, and it doesn't inform the optimizer that there are three fields with TBAA-relevant types being copied.

The solution I propose here is to have front-ends describe the copy using metadata. For example:

    %struct.foo = type { i8, float, double }
    […]
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 16, i32 8, i1 false), !struct.assignment !4
    […]
    !0 = metadata !{metadata !"Simple C/C++ TBAA"}
    !1 = metadata !{metadata !"omnipotent char", metadata !0}
    !2 = metadata !{metadata !"float", metadata !1}
    !3 = metadata !{metadata !"double", metadata !1}
    !4 = metadata !{ %struct.foo* null, metadata !5 }
    !5 = metadata !{ metadata !1, metadata !2, metadata !3 }

Metadata nodes !0 through !3 are regular TBAA nodes as are already in use.

Metadata node !4 here is a top-level description of the memcpy. Its first operand is a null pointer, which is there just for its type. It specifies a (pointer to an) IR-level struct type the memcpy can be thought of as copying. The second operand is an MDNode which describes the TBAA values for the fields. The indices of the operands in that MDNode directly correspond to the indices of the members in the IR-level struct type.

With this information, the optimizer and codegen can more aggressively optimize the memcpy. In particular, it would be possible for the optimizer to expand the memcpy into a series of loads and stores with complete TBAA information. Also, the optimizer could determine where the padding is by examining the struct layout of the IR-level struct definition.

Note that this is not a proposal for struct-access-path aware TBAA, or even full struct value TBAA. This is just a way to preserve basic scalar TBAA for individual members of structs in a struct assignment.

Comments and questions are welcome.

Dan

[0] http://nondot.org/sabre/LLVMNotes/BetterStructureCopyOptimization.txt
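As a rough illustration of what the per-field description makes possible, the following C sketch models the idea outside of LLVM: a table with one entry per struct member (offset, size, and a TBAA-style type name) drives a copy that touches only the bytes belonging to fields and never the padding. The names `field_desc`, `bar_fields`, `copy_fields`, and `padding_bytes` are hypothetical and exist only for this example; this is not how the optimizer implements the expansion, just a sketch of the information the metadata would carry.

```c
#include <stddef.h>
#include <string.h>

/* The struct from the example above. */
struct bar {
    char x;
    float y;
    double z;
};

/* Hypothetical per-field descriptor: where the field lives, how large it
 * is, and which scalar TBAA type it would be tagged with. */
struct field_desc {
    size_t offset;
    size_t size;
    const char *tbaa_name;
};

/* One entry per member, in member order, mirroring how the operands of the
 * proposed metadata node line up with the struct's member indices. */
const struct field_desc bar_fields[] = {
    { offsetof(struct bar, x), sizeof(char),   "omnipotent char" },
    { offsetof(struct bar, y), sizeof(float),  "float" },
    { offsetof(struct bar, z), sizeof(double), "double" },
};

#define BAR_FIELD_COUNT (sizeof bar_fields / sizeof bar_fields[0])

/* Copy only the bytes that belong to fields; padding is never read or
 * written by this function. */
void copy_fields(void *dst, const void *src,
                 const struct field_desc *fields, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        memcpy((char *)dst + fields[i].offset,
               (const char *)src + fields[i].offset,
               fields[i].size);
    }
}

/* Bytes of the struct that are not covered by any field. */
size_t padding_bytes(size_t total_size,
                     const struct field_desc *fields, size_t count)
{
    size_t used = 0;
    for (size_t i = 0; i < count; ++i) {
        used += fields[i].size;
    }
    return total_size - used;
}
```

Calling `copy_fields(&dst, &src, bar_fields, BAR_FIELD_COUNT)` copies x, y, and z individually. On ABIs where `struct bar` occupies 16 bytes, as in the memcpy above, `padding_bytes(sizeof(struct bar), bar_fields, BAR_FIELD_COUNT)` would report the 3 bytes between x and y that a plain memcpy copies needlessly; the exact layout is target-dependent.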
Hal Finkel
2012-Aug-25 05:50 UTC
[LLVMdev] PROPOSAL: IR representation of detailed struct assignment information
On Wed, 22 Aug 2012 13:15:30 -0700
Dan Gohman <gohman at apple.com> wrote:

> The solution I propose here is to have front-ends describe the copy
> using metadata. For example:
>
>   %struct.foo = type { i8, float, double }
>   […]
>   call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 16, i32 8,
>     i1 false), !struct.assignment !4
>   […]

I think that it would make more sense to name this !struct.tbaa -- it seems logically similar to existing TBAA metadata (in that it is attached to the relevant load/store instruction).

 -Hal

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
Dan Gohman
2012-Aug-27 18:41 UTC
[LLVMdev] PROPOSAL: IR representation of detailed struct assignment information
On Aug 24, 2012, at 10:50 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> On Wed, 22 Aug 2012 13:15:30 -0700
> Dan Gohman <gohman at apple.com> wrote:
>
>> call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 16, i32 8,
>> i1 false), !struct.assignment !4 […]
>
> I think that it would make more sense to name this !struct.tbaa -- it
> seems logically similar to existing TBAA metadata (in that it is
> attached to the relevant load/store instruction).

How about !tbaa.struct, kicking off a prefix-namespace idiom?

On the other hand, TBAA is only half the story here; the other half is describing struct padding. I don't have strong opinions here; does anyone else?

Dan
Krzysztof Parzyszek
2012-Aug-30 20:34 UTC
[LLVMdev] PROPOSAL: IR representation of detailed struct assignment information
On 8/22/2012 3:15 PM, Dan Gohman wrote:

> Here's an example showing the basic problem:
>
> struct bar {
>   char x;
>   float y;
>   double z;
> };
> void copy_bar(struct bar *a, struct bar *b) {
>   *a = *b;
> }
>
> We get this IR:
>
> call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 16, i32 8, i1 false)
>
> This works, but it doesn't retain the information that the bytes between fields
> x and y don't really need to be copied, and it doesn't inform the optimizer
> that there are three fields with TBAA-relevant types being copied.
>
> The solution I propose here is to have front-ends describe the copy using
> metadata. For example:

Why not simply have something like this?

    %1 = load bar* %b
    store %1, bar* %a

-K

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Chris Lattner
2012-Aug-31 22:13 UTC
[LLVMdev] PROPOSAL: IR representation of detailed struct assignment information
On Aug 30, 2012, at 1:34 PM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:

> On 8/22/2012 3:15 PM, Dan Gohman wrote:
>> Here's an example showing the basic problem:
>>
>> struct bar {
>>   char x;
>>   float y;
>>   double z;
>> };
>> void copy_bar(struct bar *a, struct bar *b) {
>>   *a = *b;
>> }
>>
>> We get this IR:
>>
>> call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 16, i32 8, i1 false)
>>
>> This works, but it doesn't retain the information that the bytes between fields
>> x and y don't really need to be copied, and it doesn't inform the optimizer
>> that there are three fields with TBAA-relevant types being copied.
>>
>> The solution I propose here is to have front-ends describe the copy using
>> metadata. For example:
>
> Why not simply have something like this?
>
>   %1 = load bar* %b
>   store %1, bar* %a

That preserves an IR type, but not source level types. IR types are insufficient for TBAA or "hole" information.

-Chris
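One way to see the gap between IR types and source-level types is with two C integer types that are distinct in the language but commonly share a representation. The sketch below assumes a C11 compiler (for `_Generic`); the macro `TYPE_TAG` and the function names are hypothetical and only for illustration. On LP64 targets, `long` and `long long` are both 64 bits wide and would typically be lowered to the same IR integer type, yet C treats them as different types, which is the kind of distinction type-based alias analysis relies on.

```c
#include <stddef.h>

/* Hypothetical helper: selects a tag based on the source-level type of the
 * expression. Listing both long and long long is only legal because C
 * considers them distinct, incompatible types. */
#define TYPE_TAG(expr) _Generic((expr), long: 1, long long: 2, default: 0)

/* Source-level view: the two types are told apart. */
int tag_of_long(void)
{
    long v = 0;
    return TYPE_TAG(v);
}

int tag_of_long_long(void)
{
    long long v = 0;
    return TYPE_TAG(v);
}

/* Representation-level view: nonzero when the two types have the same size,
 * which is the case on LP64 targets but not on every platform. */
int long_and_long_long_same_size(void)
{
    return sizeof(long) == sizeof(long long);
}
```

Where `long_and_long_long_same_size()` returns nonzero, a load of either type looks identical once only the IR type is kept, so any type-based distinction between the two has to be carried separately, for example as metadata.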