Ivan Kosarev via llvm-dev
2017-Oct-31 08:37 UTC
[llvm-dev] An ambiguity in TBAA info format
On 31/10/17 01:48, Hal Finkel wrote:> On 10/30/2017 04:57 PM, Ivan Kosarev via llvm-dev wrote: >> Hello, >> >> Consider these two TBAA access tags: >> >> !1 = !{!5, !5, i64 0} >> !3 = !{!7, !7, i64 0} >> >> !5 = !{!"A", !9} >> !7 = !{!"B", !9} > > I'd find this email less confusing if you'd write out all of your > examples using the current TBAA format (not using some forms that will > be auto-upgraded). I don't believe we generate any TBAA fields with > only two operands currently. Instead, we'd generate here: > > !5 = !{!"A", !9, i64 0} > > In the current scheme, scalar edges in the type-aliasing graph look > just like structure fields (just all at offset zero).Correct, but the point is, we still support such two-field forms in input files so the new format shouldn't rely on them.> >> >> The tag !1 describes an access to an object of type "A" and !3 >> describes an access to object of type "B". >> >> Both the type descriptors, !5 and !7, refer to node !9 as their type >> group. A definition of that node could look like this: >> >> !9 = !{"omnipotent char", ...} >> >> We know that these two accesses should be considered no-alias as >> neither of them encloses the other; the least common type group for >> them is !9. TypeBasedAAResult::Aliases() and >> MDNode::getMostGenericTBAA() respond accordingly and all is good. >> >> Then, let's change the definition for the node !9: >> >> !9 = !{"int", ...} >> >> Now it doesn't look like a type group, but rather a structure member. >> And nodes !5 and !7 now look as descriptors for structure types, with >> their offset fields added during auto-upgrade: >> >> !5 = !{!"A", !9, i64 0} >> !7 = !{!"B", !9, i64 0} >> >> We know that, being interpreted as structure accesses, they still >> should be considered no-alias. However, the least common type group >> for these types is likely to be the "omnipotent char" node, but >> certainly not the type of the field, which is "int". > > I'm not sure what "likely" means in this context. > MDNode::getMostGenericTBAA does not currently seem to have a special > case for the first member of structure types. Instead, it does not > deal with them at all. MDNode::getMostGenericTBAA looks as the access > type, which should be a scalar, and collects the paths from those > types up to the root. Then it returns a scalar access tag for the type > which is common, and most distant from the root, along those paths. > > Are you trying to extend this to do something else with the > struct/member information on the accesses?Yes, sorry I didn't mention that MDNode::getMostGenericTBAA() currently only considers final access types. This is supposed to get us closer to the support for aggregate accesses, but at this moment I'm trying to fix MDNode::getMostGenericTBAA() to handle struct-path accesses. We do this in TypeBasedAAResult::Aliases() and I see no reasons why we shouldn't do the same on merging of access tags. Another goal is to use the same function for matching couples of tags and finding most generic tags for them. Here's what it would look like: https://reviews.llvm.org/differential/diff/120871/ But for this we need to able to distinct members from parent types. Thanks,> > -Hal > >> >> The problem is that since the formats for the member-of-structure and >> member-of-type-group relationships match, >> MDNode::getMostGenericTBAA() cannot disambiguate between them and >> always treat first members of structure types as type groups. >> >> To resolve this issue I'm thinking of changing the format of type >> nodes so that all of them, except root ones, refer to their type >> groups with their first operand. The scalar types "A" and "B" >> mentioned above would then be rewritten as: >> >> !5 = !{!9, !"A"} >> !7 = !{!9, !"B"} >> >> !9 = !{..., "omnipotent char"} >> >> and their structure versions would read: >> >> !5 = !{!9, !"A", !11, i64 0} >> !7 = !{!9, !"B", !11, i64 0} >> >> !11 = !{!9, "int"} >> >> The new format can be easily recognized by considering the type of >> the first operand: a string would mean the old format and a metadata >> node would suggest the new convention. >> >> The question to the community is, are there any reasons that wouldn't >> work or not desirable? Or, are there better alternatives to the >> proposed solution? >> >> As usual, any comments are highly appreciated. >> >> Thanks, >> >
Ivan Kosarev via llvm-dev
2017-Oct-31 15:32 UTC
[llvm-dev] An ambiguity in TBAA info format
On 31/10/17 10:37, Ivan Kosarev wrote:> >>> We know that, being interpreted as structure accesses, they still >>> should be considered no-alias. However, the least common type group >>> for these types is likely to be the "omnipotent char" node, but >>> certainly not the type of the field, which is "int". >> >> I'm not sure what "likely" means in this context. >> MDNode::getMostGenericTBAA does not currently seem to have a special >> case for the first member of structure types. Instead, it does not >> deal with them at all. MDNode::getMostGenericTBAA looks as the access >> type, which should be a scalar, and collects the paths from those >> types up to the root. Then it returns a scalar access tag for the >> type which is common, and most distant from the root, along those paths. >> >> Are you trying to extend this to do something else with the >> struct/member information on the accesses? > > Yes, sorry I didn't mention that MDNode::getMostGenericTBAA() > currently only considers final access types. > > This is supposed to get us closer to the support for aggregate > accesses, but at this moment I'm trying to fix > MDNode::getMostGenericTBAA() to handle struct-path accesses.I think you are right, I can do this without changing the format. If neither of the accesses encloses the other, then we can determine the common group for the final access types and build a tag for it. This is what MDNode::getMostGenericTBAA() currently does and it seems to be the right thing to do. OK, will update the diff respectively and put it on review soon. Still, the problem remains if we want to support aggregate final access types as they have no parent types. Furthermore, if we will ever decide that we want a new format for TBAA metadata, then we probably want consider adding information about sizes as mentioned in this patch: https://reviews.llvm.org/D39455 Thanks, --
On 10/31/2017 03:37 AM, Ivan Kosarev wrote:> On 31/10/17 01:48, Hal Finkel wrote: >> On 10/30/2017 04:57 PM, Ivan Kosarev via llvm-dev wrote: >>> Hello, >>> >>> Consider these two TBAA access tags: >>> >>> !1 = !{!5, !5, i64 0} >>> !3 = !{!7, !7, i64 0} >>> >>> !5 = !{!"A", !9} >>> !7 = !{!"B", !9} >> >> I'd find this email less confusing if you'd write out all of your >> examples using the current TBAA format (not using some forms that >> will be auto-upgraded). I don't believe we generate any TBAA fields >> with only two operands currently. Instead, we'd generate here: >> >> !5 = !{!"A", !9, i64 0} >> >> In the current scheme, scalar edges in the type-aliasing graph look >> just like structure fields (just all at offset zero). > > Correct, but the point is, we still support such two-field forms in > input files so the new format shouldn't rely on them.Okay. That makes sense.> >> >>> >>> The tag !1 describes an access to an object of type "A" and !3 >>> describes an access to object of type "B". >>> >>> Both the type descriptors, !5 and !7, refer to node !9 as their type >>> group. A definition of that node could look like this: >>> >>> !9 = !{"omnipotent char", ...} >>> >>> We know that these two accesses should be considered no-alias as >>> neither of them encloses the other; the least common type group for >>> them is !9. TypeBasedAAResult::Aliases() and >>> MDNode::getMostGenericTBAA() respond accordingly and all is good. >>> >>> Then, let's change the definition for the node !9: >>> >>> !9 = !{"int", ...} >>> >>> Now it doesn't look like a type group, but rather a structure >>> member. And nodes !5 and !7 now look as descriptors for structure >>> types, with their offset fields added during auto-upgrade: >>> >>> !5 = !{!"A", !9, i64 0} >>> !7 = !{!"B", !9, i64 0} >>> >>> We know that, being interpreted as structure accesses, they still >>> should be considered no-alias. However, the least common type group >>> for these types is likely to be the "omnipotent char" node, but >>> certainly not the type of the field, which is "int". >> >> I'm not sure what "likely" means in this context. >> MDNode::getMostGenericTBAA does not currently seem to have a special >> case for the first member of structure types. Instead, it does not >> deal with them at all. MDNode::getMostGenericTBAA looks as the access >> type, which should be a scalar, and collects the paths from those >> types up to the root. Then it returns a scalar access tag for the >> type which is common, and most distant from the root, along those paths. >> >> Are you trying to extend this to do something else with the >> struct/member information on the accesses? > > Yes, sorry I didn't mention that MDNode::getMostGenericTBAA() > currently only considers final access types. > > This is supposed to get us closer to the support for aggregate > accesses, but at this moment I'm trying to fix > MDNode::getMostGenericTBAA() to handle struct-path accesses. We do > this in TypeBasedAAResult::Aliases() and I see no reasons why we > shouldn't do the same on merging of access tags.Okay. Please describe the algorithm you'd like to implement. If you have: !0 = {!"root"} !1 = {!"char", !0, i64 0 } !2 = {!"Inner", !1, i64 0 } // is this a struct with one field or a type? !3 = {!"Outer", !2 ,i64 0 } // and two access tags: !8 = {!2, !1, i64 0} !9 = {!3, !1, i64 0} Currently, if we ask for the most-generic TBAA, we look at the access types only, and so in this case trivially return: !10 - {!1, !1, i64 0} But you want to return !8, because by walking up from !9 through the base type !3, adjusting for the offset (which is trivial in this case), we arrive at !2 @ offset 0. That's the same place in the graph as !8 (which is !2 at offset 0), and so that's the common access tag. This seems to work without ambiguity, so I'm not seeing the problem. -Hal> Another goal is to use the same function for matching couples of tags > and finding most generic tags for them. Here's what it would look like: > > https://reviews.llvm.org/differential/diff/120871/ > > But for this we need to able to distinct members from parent types. > > Thanks, > > >> >> -Hal >> >>> >>> The problem is that since the formats for the member-of-structure >>> and member-of-type-group relationships match, >>> MDNode::getMostGenericTBAA() cannot disambiguate between them and >>> always treat first members of structure types as type groups. >>> >>> To resolve this issue I'm thinking of changing the format of type >>> nodes so that all of them, except root ones, refer to their type >>> groups with their first operand. The scalar types "A" and "B" >>> mentioned above would then be rewritten as: >>> >>> !5 = !{!9, !"A"} >>> !7 = !{!9, !"B"} >>> >>> !9 = !{..., "omnipotent char"} >>> >>> and their structure versions would read: >>> >>> !5 = !{!9, !"A", !11, i64 0} >>> !7 = !{!9, !"B", !11, i64 0} >>> >>> !11 = !{!9, "int"} >>> >>> The new format can be easily recognized by considering the type of >>> the first operand: a string would mean the old format and a metadata >>> node would suggest the new convention. >>> >>> The question to the community is, are there any reasons that >>> wouldn't work or not desirable? Or, are there better alternatives to >>> the proposed solution? >>> >>> As usual, any comments are highly appreciated. >>> >>> Thanks, >>> >> >-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory
Ivan Kosarev via llvm-dev
2017-Nov-02 16:12 UTC
[llvm-dev] An ambiguity in TBAA info format
On 02/11/17 05:41, Hal Finkel wrote:> > On 10/31/2017 03:37 AM, Ivan Kosarev wrote: > >> >>> >>>> >>>> The tag !1 describes an access to an object of type "A" and !3 >>>> describes an access to object of type "B". >>>> >>>> Both the type descriptors, !5 and !7, refer to node !9 as their >>>> type group. A definition of that node could look like this: >>>> >>>> !9 = !{"omnipotent char", ...} >>>> >>>> We know that these two accesses should be considered no-alias as >>>> neither of them encloses the other; the least common type group for >>>> them is !9. TypeBasedAAResult::Aliases() and >>>> MDNode::getMostGenericTBAA() respond accordingly and all is good. >>>> >>>> Then, let's change the definition for the node !9: >>>> >>>> !9 = !{"int", ...} >>>> >>>> Now it doesn't look like a type group, but rather a structure >>>> member. And nodes !5 and !7 now look as descriptors for structure >>>> types, with their offset fields added during auto-upgrade: >>>> >>>> !5 = !{!"A", !9, i64 0} >>>> !7 = !{!"B", !9, i64 0} >>>> >>>> We know that, being interpreted as structure accesses, they still >>>> should be considered no-alias. However, the least common type group >>>> for these types is likely to be the "omnipotent char" node, but >>>> certainly not the type of the field, which is "int". >>> >>> I'm not sure what "likely" means in this context. >>> MDNode::getMostGenericTBAA does not currently seem to have a special >>> case for the first member of structure types. Instead, it does not >>> deal with them at all. MDNode::getMostGenericTBAA looks as the >>> access type, which should be a scalar, and collects the paths from >>> those types up to the root. Then it returns a scalar access tag for >>> the type which is common, and most distant from the root, along >>> those paths. >>> >>> Are you trying to extend this to do something else with the >>> struct/member information on the accesses? >> >> Yes, sorry I didn't mention that MDNode::getMostGenericTBAA() >> currently only considers final access types. >> >> This is supposed to get us closer to the support for aggregate >> accesses, but at this moment I'm trying to fix >> MDNode::getMostGenericTBAA() to handle struct-path accesses. We do >> this in TypeBasedAAResult::Aliases() and I see no reasons why we >> shouldn't do the same on merging of access tags. > > Okay. Please describe the algorithm you'd like to implement. > > If you have: > > !0 = {!"root"} > !1 = {!"char", !0, i64 0 } > !2 = {!"Inner", !1, i64 0 } // is this a struct with one field or a type? > !3 = {!"Outer", !2 ,i64 0 } > > // and two access tags: > !8 = {!2, !1, i64 0} > !9 = {!3, !1, i64 0} > > Currently, if we ask for the most-generic TBAA, we look at the access > types only, and so in this case trivially return: > !10 - {!1, !1, i64 0} > > But you want to return !8, because by walking up from !9 through the > base type !3, adjusting for the offset (which is trivial in this > case), we arrive at !2 @ offset 0. That's the same place in the graph > as !8 (which is !2 at offset 0), and so that's the common access tag. > > This seems to work without ambiguity, so I'm not seeing the problem.Yes, that is what I wanted. I just uploaded a patch that (hopefully) does exactly what you described: https://reviews.llvm.org/D39557 I agree we can do that as long as all final access types are scalar, but it wouldn't work if we want to support aggregate accesses. As you suggested, I will address this issue in the parallel thread. Thanks,> > -Hal > >> Another goal is to use the same function for matching couples of tags >> and finding most generic tags for them. Here's what it would look like: >> >> https://reviews.llvm.org/differential/diff/120871/ >> >> But for this we need to able to distinct members from parent types. >> >> Thanks, >> >> >>> >>> -Hal >>> >>>> >>>> The problem is that since the formats for the member-of-structure >>>> and member-of-type-group relationships match, >>>> MDNode::getMostGenericTBAA() cannot disambiguate between them and >>>> always treat first members of structure types as type groups. >>>> >>>> To resolve this issue I'm thinking of changing the format of type >>>> nodes so that all of them, except root ones, refer to their type >>>> groups with their first operand. The scalar types "A" and "B" >>>> mentioned above would then be rewritten as: >>>> >>>> !5 = !{!9, !"A"} >>>> !7 = !{!9, !"B"} >>>> >>>> !9 = !{..., "omnipotent char"} >>>> >>>> and their structure versions would read: >>>> >>>> !5 = !{!9, !"A", !11, i64 0} >>>> !7 = !{!9, !"B", !11, i64 0} >>>> >>>> !11 = !{!9, "int"} >>>> >>>> The new format can be easily recognized by considering the type of >>>> the first operand: a string would mean the old format and a >>>> metadata node would suggest the new convention. >>>> >>>> The question to the community is, are there any reasons that >>>> wouldn't work or not desirable? Or, are there better alternatives >>>> to the proposed solution? >>>> >>>> As usual, any comments are highly appreciated. >>>> >>>> Thanks, >>>> >>> >> >--