Chris Lattner
2012-Aug-28 05:15 UTC
[LLVMdev] PROPOSAL: IR representation of detailed struct assignment information
On Aug 27, 2012, at 12:58 PM, Hal Finkel <hfinkel at anl.gov> wrote:> On Mon, 27 Aug 2012 11:41:47 -0700 > Dan Gohman <gohman at apple.com> wrote: >> On Aug 24, 2012, at 10:50 PM, Hal Finkel <hfinkel at anl.gov> wrote: >> >>> On Wed, 22 Aug 2012 13:15:30 -0700 >>> Dan Gohman <gohman at apple.com> wrote: >>> >>>> call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 16, i32 >>>> 8, i1 false), !struct.assignment !4 […] >>> >>> I think that it would make more sense to name this !struct.tbaa -- >>> it seems logically similar to existing TBAA metadata (in that it is >>> attached to the relevant load/store instruction). >> >> How about !tbaa.struct, kicking off a prefix-namespace idiom? > > Fine by me.Another alternative would be to separate them into two different MDNodes: one for TBAA info, one for padding info.>> On the other hand, TBAA is only half the story here; the other half is >> describing struct padding. > > By this you mean that it is possible to determine which parts of the > copy are padding because we now have direct access to the pointer type > information, right? Out of curiosity, is there a reason not to change > the memcpy to take non-integer pointers? If memcpy took the > struct pointer without needing the bitcast, then we'd have this > information directly.This is an interesting idea, particularly given that you're representing struct information with a null pointer. I think that the limiting issue here is that intrinsics cannot encode/mangle general structs in their type signature. What do you think Dan? -Chris
Dan Gohman
2012-Aug-28 18:46 UTC
[LLVMdev] PROPOSAL: IR representation of detailed struct assignment information
On Aug 27, 2012, at 10:15 PM, Chris Lattner <clattner at apple.com> wrote:> On Aug 27, 2012, at 12:58 PM, Hal Finkel <hfinkel at anl.gov> wrote: >> On Mon, 27 Aug 2012 11:41:47 -0700 >> Dan Gohman <gohman at apple.com> wrote: >>> On Aug 24, 2012, at 10:50 PM, Hal Finkel <hfinkel at anl.gov> wrote: >>> >>>> On Wed, 22 Aug 2012 13:15:30 -0700 >>>> Dan Gohman <gohman at apple.com> wrote: >>>> >>>>> call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 16, i32 >>>>> 8, i1 false), !struct.assignment !4 […] >>>> >>>> I think that it would make more sense to name this !struct.tbaa -- >>>> it seems logically similar to existing TBAA metadata (in that it is >>>> attached to the relevant load/store instruction). >>> >>> How about !tbaa.struct, kicking off a prefix-namespace idiom? >> >> Fine by me. > > Another alternative would be to separate them into two different MDNodes: one for TBAA info, one for padding info. > >>> On the other hand, TBAA is only half the story here; the other half is >>> describing struct padding. >> >> By this you mean that it is possible to determine which parts of the >> copy are padding because we now have direct access to the pointer type >> information, right? Out of curiosity, is there a reason not to change >> the memcpy to take non-integer pointers? If memcpy took the >> struct pointer without needing the bitcast, then we'd have this >> information directly. > > This is an interesting idea, particularly given that you're representing struct information with a null pointer. I think that the limiting issue here is that intrinsics cannot encode/mangle general structs in their type signature. What do you think Dan?It's an interesting idea. I can see both sides. The current (proposed) approach is that @llvm.memcpy does a full memcpy, unless decorated by metadata, which can specify that some of the bytes are undefined. There is an appeal in having a base IR which is simple, with metadata for making it more aggressive in tricky ways. I can also see the appeal of overloading @llvm.memcpy with different pointee types. An interesting issue is the semantics in the case where the pointee types have holes. For example: %struct = type { i8, i32 } call void @llvm.memcpy.mangledstuff(%struct* %dst, %struct* %src, i64 1, i64 4) It should always be valid to lower @llvm.memcpy to an actual memcpy call, but we also want it to be valid to lower this to member-wise loads and stores. Consequently, bytes corresponding to padding are set to some kind of undef. That's probably fine, as long as everyone's ok with that. Another interesting question is how should the size argument work. Should it be a byte count, or should it be scaled by the size of the pointee type? Scaling it seems to be intuitive, and it avoids awkwardness in the case where the size somehow isn't a multiple of the pointee size. Similarly, how should the alignment argument work? In this case, scaling it by the alignment of the pointee seems awkward. In the example above, I showed the size scaled and the alignment left unscaled. Dan
Hal Finkel
2012-Aug-29 21:45 UTC
[LLVMdev] PROPOSAL: IR representation of detailed struct assignment information
On Tue, 28 Aug 2012 11:46:39 -0700 Dan Gohman <gohman at apple.com> wrote:> > On Aug 27, 2012, at 10:15 PM, Chris Lattner <clattner at apple.com> > wrote: > > > On Aug 27, 2012, at 12:58 PM, Hal Finkel <hfinkel at anl.gov> wrote: > >> On Mon, 27 Aug 2012 11:41:47 -0700 > >> Dan Gohman <gohman at apple.com> wrote: > >>> On Aug 24, 2012, at 10:50 PM, Hal Finkel <hfinkel at anl.gov> wrote: > >>> > >>>> On Wed, 22 Aug 2012 13:15:30 -0700 > >>>> Dan Gohman <gohman at apple.com> wrote: > >>>> > >>>>> call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 16, i32 > >>>>> 8, i1 false), !struct.assignment !4 […] > >>>> > >>>> I think that it would make more sense to name this !struct.tbaa > >>>> -- it seems logically similar to existing TBAA metadata (in that > >>>> it is attached to the relevant load/store instruction). > >>> > >>> How about !tbaa.struct, kicking off a prefix-namespace idiom? > >> > >> Fine by me. > > > > Another alternative would be to separate them into two different > > MDNodes: one for TBAA info, one for padding info. > > > >>> On the other hand, TBAA is only half the story here; the other > >>> half is describing struct padding. > >> > >> By this you mean that it is possible to determine which parts of > >> the copy are padding because we now have direct access to the > >> pointer type information, right? Out of curiosity, is there a > >> reason not to change the memcpy to take non-integer pointers? If > >> memcpy took the struct pointer without needing the bitcast, then > >> we'd have this information directly. > > > > This is an interesting idea, particularly given that you're > > representing struct information with a null pointer. I think that > > the limiting issue here is that intrinsics cannot encode/mangle > > general structs in their type signature. What do you think Dan? > > It's an interesting idea. I can see both sides. > > The current (proposed) approach is that @llvm.memcpy does a full > memcpy, unless decorated by metadata, which can specify that some of > the bytes are undefined. There is an appeal in having a base IR which > is simple, with metadata for making it more aggressive in tricky ways. > > I can also see the appeal of overloading @llvm.memcpy with different > pointee types. An interesting issue is the semantics in the case > where the pointee types have holes. For example: > > %struct = type { i8, i32 } > > call void @llvm.memcpy.mangledstuff(%struct* %dst, %struct* %src, > i64 1, i64 4) > > It should always be valid to lower @llvm.memcpy to an actual memcpy > call, but we also want it to be valid to lower this to member-wise > loads and stores. Consequently, bytes corresponding to padding are > set to some kind of undef. That's probably fine, as long as > everyone's ok with that. > > Another interesting question is how should the size argument work. > Should it be a byte count, or should it be scaled by the size of the > pointee type? Scaling it seems to be intuitive, and it avoids > awkwardness in the case where the size somehow isn't a multiple of > the pointee size.I agree that it seems intuitive, but what would we do with the current intrinsic? Having a rule like, "the size is specified in bytes if the pointer type is an integer, and is specified in elements otherwise" seems sure to yield many bugs. -Hal> > Similarly, how should the alignment argument work? In this case, > scaling it by the alignment of the pointee seems awkward. In the > example above, I showed the size scaled and the alignment left > unscaled. > > Dan >-- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory
Reasonably Related Threads
- [LLVMdev] PROPOSAL: IR representation of detailed struct assignment information
- [LLVMdev] PROPOSAL: IR representation of detailed struct assignment information
- [LLVMdev] PROPOSAL: IR representation of detailed struct assignment information
- [LLVMdev] PROPOSAL: IR representation of detailed struct assignment information (new version)
- [LLVMdev] PROPOSAL: IR representation of detailed struct assignment information (new version)