Peter Collingbourne via llvm-dev
2016-Jun-01 22:48 UTC
[llvm-dev] RFC: a renaming/redesign for LLVM's bitset metadata
Hi all,

The bitset metadata currently used in LLVM has a few problems:

1. It has the wrong name. The name "bitset" refers to an implementation detail of one use of the metadata (i.e. its original use case, CFI). This makes it harder to understand, as the name makes no sense in the context of virtual call optimization.
2. It is represented using a global named metadata node, rather than being directly associated with a global. This makes it harder to manipulate the metadata when rebuilding global variables, summarise it as part of ThinLTO and drop unused metadata when associated globals are dropped. For this reason, CFI does not currently work correctly when both CFI and vcall opt are enabled, as vcall opt needs to rebuild vtable globals, and fails to associate metadata with the rebuilt globals. As I understand it, the same problem could also affect ASan, which rebuilds globals with a red zone.

I would like to solve both of these problems in the following way:

1. Rename the metadata to "type metadata". This new name reflects how the metadata is currently being used (i.e. to represent type information for CFI and vtable opt).
2. Attach metadata directly to the globals that it pertains to, rather than using the "llvm.bitsets" global metadata node as we are doing now. This would be done using the newly introduced capability to attach metadata to global variables (r271348 and r271358). Passes which manipulate globals can easily copy metadata between globals with the GlobalObject::copyMetadata function, which would be taught to understand type metadata.
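
The rebuild problem described in item 2 can be illustrated with a toy model. This is only a sketch and does not use the LLVM API: ToyGlobal, NamedEntry, rebuild and countNamedEntries are hypothetical names invented for the illustration. It contrasts metadata that lives in a module-level node and refers to a global from the outside with metadata stored on the global itself.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a global variable (hypothetical, not an LLVM class).
// `types` models metadata attached directly to the global.
struct ToyGlobal {
    std::string name;
    std::vector<std::string> types;
};

// Toy stand-in for one entry of a module-level named metadata node,
// which refers to a global from the outside.
struct NamedEntry {
    const ToyGlobal *global;
    std::string type;
};

// Models a pass that replaces a global with a rebuilt copy.
// Directly attached metadata is copied along with the global;
// entries in the module-level node are left untouched.
ToyGlobal rebuild(const ToyGlobal &old) {
    ToyGlobal fresh;
    fresh.name = old.name + ".rebuilt";
    fresh.types = old.types;
    return fresh;
}

// Counts the module-level entries that refer to `g`.
std::size_t countNamedEntries(const std::vector<NamedEntry> &node,
                              const ToyGlobal &g) {
    std::size_t n = 0;
    for (const NamedEntry &e : node)
        if (e.global == &g)
            ++n;
    return n;
}
```

In this model, a rebuilt global keeps its attached types, while every entry in the module-level node still points at the old global unless the pass remembers to update the node separately.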

To give an example of how this would look, suppose that we have the following declarations:

    class A {
      virtual void f() {}
    };

    class B : public A {
      virtual void f() {}
      virtual void g() {}
    };

The vtables for A and B would be represented in IR like this:

    @_ZTV1A = constant [3 x i8*] [i8* ..., i8* ..., i8* @A::f], !type !0
    @_ZTV1B = constant [4 x i8*] [i8* ..., i8* ..., i8* @B::f, i8* @B::g], !type !0, !type !1

    !0 = !{i64 16, !"A"}
    !1 = !{i64 16, !"B"}

The metadata !0 indicates that the attached global has an address point for the type A at byte offset 16, and metadata !1 indicates that the attached global has an address point for the type B at byte offset 16. We attach !0 to _ZTV1A, which indicates that the vtable for A has a valid address point for A at offset 16, and attach both !0 and !1 to _ZTV1B, which indicates that the vtable for B has a valid address point for both A and B at offset 16.

I also plan to apply this renaming to existing passes and intrinsics that use the "bitset" name.

Thanks,
--
Peter
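
As a rough illustration of what the two attachments in the example express, the following sketch models each !type attachment as an (offset, type) pair stored on a global and answers the question "does this global have an address point for type T at offset N?". It is a toy model rather than LLVM code: TypeMetadata, Global, hasAddressPoint, makeVTableA and makeVTableB are hypothetical names chosen for the illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Toy model of one !type attachment: {byte offset, type identifier}.
struct TypeMetadata {
    std::uint64_t offset;
    std::string typeId;
};

// Toy model of a global variable carrying its own !type attachments.
struct Global {
    std::string name;
    std::vector<TypeMetadata> types;
};

// Returns true if `g` has an address point for `typeId` at byte `offset`.
bool hasAddressPoint(const Global &g, const std::string &typeId,
                     std::uint64_t offset) {
    for (const TypeMetadata &md : g.types)
        if (md.typeId == typeId && md.offset == offset)
            return true;
    return false;
}

// The two vtables from the example: _ZTV1A carries !0 only,
// _ZTV1B carries both !0 and !1.
Global makeVTableA() { return Global{"_ZTV1A", {{16, "A"}}}; }
Global makeVTableB() { return Global{"_ZTV1B", {{16, "A"}, {16, "B"}}}; }
```

With this model, hasAddressPoint(makeVTableB(), "A", 16) holds because B's vtable is also a valid vtable for A, whereas hasAddressPoint(makeVTableA(), "B", 16) does not.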
Mehdi Amini via llvm-dev
2016-Jun-03 01:20 UTC
[llvm-dev] RFC: a renaming/redesign for LLVM's bitset metadata
Hi,

I don't have much comment, but still want to say that I think it is a worthwhile move.

Thanks!

--
Mehdi
Mehdi Amini via llvm-dev
2016-Jun-07 03:16 UTC
[llvm-dev] RFC: a renaming/redesign for LLVM's bitset metadata
Hi Peter,

While we're at it, do you have a plan for ThinLTO summaries? I remember you described doing full LTO for the vtables themselves, but that was for CFI, not for devirtualization.

--
Mehdi
Peter Collingbourne via llvm-dev
2016-Jun-07 03:31 UTC
[llvm-dev] RFC: a renaming/redesign for LLVM's bitset metadata
I had a writeup of the overall approach here:
http://lists.llvm.org/pipermail/llvm-dev/2016-May/099095.html

For both devirtualization and CFI, we would indeed do full LTO for the vtables.

Peter
Philip Reames via llvm-dev
2016-Jun-09 04:24 UTC
[llvm-dev] RFC: a renaming/redesign for LLVM's bitset metadata
On 06/01/2016 03:48 PM, Peter Collingbourne via llvm-dev wrote:
> The metadata !0 indicates that the attached global has an address
> point for the type A at byte offset 16, and metadata !1 indicates that
> the attached global has an address point for the type B at byte offset
> 16. We attach !0 to _ZTV1A, which indicates that the vtable for A has
> a valid address point for A at offset 16, and attach both !0 and !1 to
> _ZTV1B, which indicates that the vtable for B has a valid address
> point for both A and B at offset 16.

Can you define "address point"? I don't recognize the term and thus didn't understand the example.

FTR, I have no stance on the proposal.
Peter Collingbourne via llvm-dev
2016-Jun-09 04:31 UTC
[llvm-dev] RFC: a renaming/redesign for LLVM's bitset metadata
On Wed, Jun 8, 2016 at 9:24 PM, Philip Reames <listmail at philipreames.com> wrote:
> Can you define "address point"? I don't recognize the term and thus
> didn't understand the example.

A vtable's address point is the address within a vtable that is stored in an object's virtual pointer field. It is generally the address of the first virtual function pointer. The Itanium ABI defines it here:
https://mentorembedded.github.io/cxx-abi/abi.html#vtable-general

Peter
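
For readers wondering where the byte offset 16 in the earlier example comes from, here is a small arithmetic sketch. It assumes a 64-bit target with 8-byte pointers and assumes that the two elided leading vtable slots are the Itanium ABI's offset-to-top and RTTI entries; addressPointOffset and virtualFunctionOffset are hypothetical helper names, not part of LLVM or the ABI.

```cpp
#include <cstddef>

// Byte offset of the address point from the start of the vtable, assuming
// `leadingSlots` pointer-sized slots precede the first virtual function
// pointer (two in the example: offset-to-top and RTTI pointer).
constexpr std::size_t addressPointOffset(std::size_t leadingSlots,
                                         std::size_t pointerSize) {
    return leadingSlots * pointerSize;
}

// Byte offset, from the start of the vtable, of the virtual function
// pointer with zero-based index `index`, under the same assumptions.
constexpr std::size_t virtualFunctionOffset(std::size_t leadingSlots,
                                            std::size_t pointerSize,
                                            std::size_t index) {
    return addressPointOffset(leadingSlots, pointerSize) + index * pointerSize;
}
```

Under these assumptions, addressPointOffset(2, 8) gives 16, matching the i64 16 in the metadata, and the slot for B::g (index 1) sits at byte 24 of B's vtable.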