thr3ads.net - llvm dev - [llvm-dev] RFC: Safe Whole Program Devirtualization Enablement [Dec 2019]

Teresa Johnson via llvm-dev

2019-Dec-11 14:21 UTC

[llvm-dev] RFC: Safe Whole Program Devirtualization Enablement

Please send any comments. As mentioned at the end I will follow up with
some patches as soon as they are cleaned up and I create some test cases.

RFC: Safe Whole Program Devirtualization Enablement
==================================================
High Level Summary
------------------

The goal of the changes described in this RFC is to support aggressive
Whole Program Devirtualization without requiring -fvisibility=hidden at
compile time, by pre-enabling bitcode for whole program devirtualization,
but delaying the decision on whether to apply devirtualization until LTO
link time. This is needed both because we may not know whether the link
mode is safe for hidden LTO visibility until link time, and also to allow
bitcode objects to be shared between links of targets with differing valid
LTO visibility. This utilizes the !vcall_visibility metadata added for Dead
Virtual Function Elimination.

The required changes, in summary, are as follows (these are described in
more detail later):

1) When -fwhole-program-vtables is specified, always insert type test
assumes for virtual calls, and additionally add !vcall_visibility metadata
to vtable definitions (which will be summarized in the ThinLTO index).

2) At LTO link time, apply hidden LTO visibility to vtable definition
vcall_visibility metadata (or summary) when specified by a new link option
(-lto-whole-program-visibility).

3) During the LTO link time Whole Program Devirtualization analysis, only
allow devirtualization when the associated vtable definitions have hidden
LTO visibility, as derived from the !vcall_visibility metadata (summarized
in the index for index-only WPD).

4) Modify the Virtual Function Elimination application in GlobalDCE to
ignore vtables with !vcall_visibility when they are associated with type
tests (and not just type checked loads).

Background
----------

Whole Program Devirtualization is supported for LTO (both regular and Thin)
via the -fwhole-program-vtables option. However, it can only be safely
applied to classes for which LTO can analyze the entire class hierarchy,
and therefore is restricted to those classes with hidden LTO visibility.
See https://clang.llvm.org/docs/LTOVisibility.html for more information.

The LTO visibility of a class is derived at compile time from the class’s
symbol visibility. Generally, only classes that are internal at the source
level (e.g. declared in an anonymous namespace) receive hidden LTO
visibility. Compiling with -fvisibility=hidden tells the compiler that,
unless otherwise marked, symbols are assumed to have hidden visibility,
which also implies that all classes have hidden LTO visibility (unless
decorated with a public visibility attribute). This results in much more
aggressive devirtualization.

However, compiling with -fvisibility=hidden is only safe when we know we
are LTO linking with full view of the class hierarchy. Specifically, this
is true when a binary is being LTO linked with either all sources being
bitcode (so that the LTO unit is the same as the linkage unit), or when the
only translation units being linked as native code are known to not derive
any classes defined in the LTO unit (e.g. system libraries). Additionally,
the binary may not dlopen any libraries at runtime that contain classes
derived from those defined in the main binary.

Assuming we are building and linking a binary that satisfies the above
constraints (we are LTO linking all translation units as bitcode, except
certain (e.g. system) libraries or other native objects known to be safe by
the user or build system, and the binary will not dlopen any libraries
deriving from the binary’s classes), then it should be safe to compile with
-fvisibility=hidden, along with -fwhole-program-vtables.
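
As an illustration only (this is not part of the proposed patches), the
safety conditions above can be written as a small predicate over the link
inputs. The LinkInput type, its known_safe_native field, and the function
name are hypothetical names invented for this sketch; in practice the
knowledge they encode lives with the user or the build system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkInput:
    name: str
    is_bitcode: bool
    # Hypothetical flag: the user or build system vouches that this native
    # object derives no classes defined in the LTO unit (e.g. a system library).
    known_safe_native: bool = False


def hidden_visibility_is_safe(inputs, dlopens_deriving_libraries):
    """Return True when the link meets the constraints described above."""
    # A dlopen'ed library deriving from the binary's classes breaks the
    # assumption that LTO sees the whole class hierarchy.
    if dlopens_deriving_libraries:
        return False
    # Every input must be bitcode or a native object known to be safe.
    return all(i.is_bitcode or i.known_safe_native for i in inputs)
```

A build system that can evaluate something like this per target only knows
the answer at link time, which is the motivation for the rest of this RFC.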

However, there are cases where it is unknown until link time whether we are
building a target that meets the above constraints. Additionally, we may
want to build additional targets that do not meet the criteria for safe
application of -fvisibility=hidden during the same build invocation
(specifically, because subsets of the code will be linked into shared
libraries instead of linking all code directly into the binary). Even if it
is possible to build two sets of bitcode object files (one with default
visibility for the unsafely linked targets and one with hidden visibility
for the safely linked targets), this causes duplication in both time and
space, which is prohibitive in an environment where it is common to build
targets with tens of thousands of sources, and multiple targets with
different link modes simultaneously.

The goals of the changes described in this RFC are to essentially delay the
application of -fvisibility=hidden until LTO link time, and allow bitcode
objects to be shared between links of targets with differing link modes and
therefore differing valid LTO visibility.

Type Information for Devirtualization
-------------------------------------

LTO whole program devirtualization is driven by type information in the
IR. This includes type metadata (on vtable definitions), as well as type
test intrinsics before virtual calls. The former is safe to emit into the
IR in all cases, but the latter is currently not. The virtual call sites
are decorated with an llvm.assume(llvm.type.test(ptr, typeid)) sequence,
which drives the LTO analysis of virtual calls. This sequence is an
assertion that the given pointer is associated with the given type
identifier (https://llvm.org/docs/LangRef.html#llvm-type-test-intrinsic).
It is currently inserted only for classes with hidden LTO visibility as the
implication of this sequence is that we have full visibility of that type’s
class hierarchy, and may devirtualize the call based on that knowledge.
This assumption is not valid if the class does not have hidden LTO
visibility.

In order to drive later devirtualization, we still need the type
compatibility information provided by the llvm.type.test, but want to delay
a decision on whether it is valid to assume that we have full class
hierarchy visibility, and thus whether devirtualization of that target can
be safely applied.

Specifically, what we want to know at LTO time is whether the vtable has
hidden LTO visibility or not, and use that to guide the application of
devirtualization to the type tested virtual call sites. By default, only
those with statically guaranteed hidden LTO visibility should be marked as
such. And as described later, at LTO link time we can optionally decide to
convert vtables to hidden LTO visibility for more aggressive
devirtualization when appropriate.

There is already a mechanism in the compiler to describe the vtable
visibility, which was recently added for Dead Virtual Function Elimination
(D63932): !vcall_visibility metadata, documented at
https://llvm.org/docs/TypeMetadata.html#vcall-visibility-metadata. This
metadata is attached to vtable definitions, currently only when VFE is
enabled. As described in the documentation, because this is currently only
used for VFE, it also requires that the corresponding function pointer
loads use the llvm.type.checked.load intrinsic. This would not be required
for devirtualization (although the VFE support in GlobalDCE will need
modification to ignore the metadata when type checked loads are not used;
more on that later).

This RFC proposes adding the !vcall_visibility metadata to vtable
definitions when -fwhole-program-vtables is specified. Unlike for VFE, the
function pointer loads can still use normal loads with corresponding type
test assume sequences (better for optimization).

Additional changes to the LTO compilation steps are detailed below.

Pre-Link LTO Compile
--------------------

First, type test assume sequences will be inserted when
-fwhole-program-vtables is specified, and not just for classes with hidden
LTO visibility.

Second, as mentioned earlier, the !vcall_visibility metadata will be
inserted under -fwhole-program-vtables. For the purposes of index-only WPD,
a single-bit flag indicating whether or not the vtable def has hidden LTO
visibility is added to the GVarFlags on the GlobalVarSummary. Note that we
can collapse the 3 enum values of the metadata down to a single bit,
because for the purposes of devirtualization, both
VCallVisibilityLinkageUnit and VCallVisibilityTranslationUnit can be
treated the same (we only need to have at least VCallVisibilityLinkageUnit
to devirtualize). The ModuleSummaryIndex builder will set this new flag
from the !vcall_visibility metadata on vtable definitions.
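
A minimal sketch of that collapse, written as a toy model rather than the
actual summary builder code: the enum values below are assumed to mirror the
documented !vcall_visibility values (0 = public, 1 = linkage unit,
2 = translation unit), and summary_hidden_bit is a hypothetical helper name.
Treating a vtable without the metadata as public is also an assumption of
this sketch.

```python
from enum import IntEnum


class VCallVisibility(IntEnum):
    # Assumed to mirror the documented !vcall_visibility metadata values.
    PUBLIC = 0
    LINKAGE_UNIT = 1
    TRANSLATION_UNIT = 2


def summary_hidden_bit(visibility):
    """Collapse the three-valued metadata into one 'hidden' bit.

    A vtable with no metadata (None) is treated conservatively as public.
    """
    return visibility is not None and visibility >= VCallVisibility.LINKAGE_UNIT
```

Both LINKAGE_UNIT and TRANSLATION_UNIT map to the same bit, since either is
sufficient for devirtualization.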

Finally, the VFE support in GlobalDCE (which is enabled by default and
currently triggers automatically in the presence of this metadata) will
need to be modified to ignore !vcall_visibility metadata inserted for
devirtualization only, i.e. when there are any type test assume sequences
for that Type ID. This should be straightforward, as we can scan the type
tests and remove any vtables decorated with compatible type ids from
VFESafeVTables. Note that this change will affect the invocation of
GlobalDCE both here in the pre-link LTO compile as well as later in the LTO
Backend (where it is applied to a broader set of vtables).
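
The filtering step could look roughly like the following toy model. It uses
plain Python sets in place of the real GlobalDCE data structures;
filter_vfe_safe_vtables and its parameters are hypothetical names, and only
VFESafeVTables corresponds to something named in this RFC.

```python
def filter_vfe_safe_vtables(vfe_safe_vtables, vtable_type_ids, type_tested_ids):
    """Drop vtables whose type ids are used by any type test.

    vfe_safe_vtables: set of vtable names currently considered safe for VFE
    vtable_type_ids:  dict mapping vtable name -> set of compatible type ids
    type_tested_ids:  set of type ids that appear in type test assumes
    """
    return {
        vt
        for vt in vfe_safe_vtables
        # Keep the vtable only if none of its type ids is type tested.
        if not (vtable_type_ids.get(vt, set()) & type_tested_ids)
    }
```

With this filter a vtable stays eligible for VFE only when none of its type
ids is referenced by a type test, i.e. when its !vcall_visibility metadata
was not inserted purely for devirtualization.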

LTO Link Handling
-----------------

During Whole Program Devirtualization analysis, when looking at the vtables
corresponding to the summarized virtual calls during
tryFindVirtualCallTargets, we must consult the vcall_visibility
information. For hybrid (regular+thin) LTO, the vtable definitions are in
the regular LTO partition and so the IR can be consulted directly. For
index-only WPD, we instead consult the flag on the vtable’s
GlobalVarSummary.

If any of the vtable definitions compatible with a given virtual call have
public LTO visibility, the devirtualization must be skipped.
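
As a sketch of this rule (a simplified model, not the code in
tryFindVirtualCallTargets): can_devirtualize is a hypothetical helper that
takes, for one virtual call, a map from each compatible vtable to its hidden
LTO visibility bit. Returning False when no compatible vtable is known is an
assumption of this sketch.

```python
def can_devirtualize(compatible_vtables_hidden):
    """Decide whether one virtual call may be devirtualized.

    compatible_vtables_hidden: dict mapping vtable name -> True if the vtable
    definition has hidden LTO visibility, False if it is public.
    """
    # With no known compatible vtable there is nothing to devirtualize to.
    if not compatible_vtables_hidden:
        return False
    # A single public vtable is enough to block devirtualization.
    return all(compatible_vtables_hidden.values())
```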

By default, only classes that have statically determined hidden LTO
visibility would be allowed to devirtualize. However, as noted earlier, we
want to enable more aggressive devirtualization at LTO link time when we
know that the linking mode guarantees full LTO visibility of any code that
may derive classes from the bitcode being linked. To do so, we will add a
new linker option:

For lld, the proposed option is: -lto-whole-program-visibility.
For gold, the corresponding plugin option would be
“whole-program-visibility”.

When this option is set, LTO will convert all vtable definitions to have
hidden LTO visibility before invoking Whole Program Devirtualization. In
the hybrid LTO case this would mean changing the metadata on the IR. In the
index-only case this would be done in the summaries.
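
In the index-only case, the effect of the option on the summaries can be
modeled as below. This is a sketch under the assumption that each vtable
summary carries a single hidden bit; apply_whole_program_visibility is a
hypothetical name, not an existing LLVM function.

```python
def apply_whole_program_visibility(vtable_hidden, whole_program_visibility):
    """Return the per-vtable hidden bits to use for WPD.

    vtable_hidden: dict mapping vtable name -> hidden LTO visibility bit as
    recorded at compile time.
    whole_program_visibility: True when the new link option is given.
    """
    if not whole_program_visibility:
        # Default: keep the statically determined visibility.
        return dict(vtable_hidden)
    # Option set: every vtable definition is treated as hidden.
    return {name: True for name in vtable_hidden}
```

The input map is left unmodified, so the same compile-time summaries can
serve links with and without the option, matching the goal of sharing
bitcode objects between targets with different link modes.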

LTO Backend Handling
--------------------

No changes are required in the LTO backend’s invocation of Whole Program
Devirtualization, since any visibility constraints are enforced at LTO link
time, and the loosening of visibility under the new link option only needs
to affect the LTO WPD invocation.

As mentioned earlier when describing the pre-link LTO compile changes,
GlobalDCE will be changed to ignore vtables with !vcall_visibility metadata
corresponding to type tests (and not just type checked loads).

Status
------

These changes have been prototyped and tested with index-only WPD, with the
exception of the proposed changes to GlobalDCE (at the moment I have been
testing with -enable-vfe=false). I will be cleaning up the changes and
sending patches for review in the coming days.

-- 
Teresa Johnson | Software Engineer | tejohnson at google.com |

Evgeny Leviant via llvm-dev

2019-Dec-13 16:56 UTC

[llvm-dev] Safe Whole Program Devirtualization Enablement

> Specifically, what we want to know at LTO time is whether the vtable has
> hidden LTO visibility or not

I may be missing something, but why can't we use type metadata instead of
!vcall_visibility to identify vtable pointers? We can skip emission of !type
for vtables having [[clang::lto_visibility_public]] attribute and postpone
decision on other vtables in the way you suggested.

________________________________
From: Teresa Johnson <tejohnson at google.com>
Sent: December 11, 2019 17:21
To: llvm-dev
Cc: Peter Collingbourne; Steven Wu; Evgeny Leviant; David Li
Subject: RFC: Safe Whole Program Devirtualization Enablement


Teresa Johnson via llvm-dev

2019-Dec-13 17:06 UTC

[llvm-dev] Safe Whole Program Devirtualization Enablement

On Fri, Dec 13, 2019 at 8:56 AM Evgeny Leviant <eleviant at accesssoftek.com>
wrote:

> > Specifically, what we want to know at LTO time is whether the vtable
> > has hidden LTO visibility or not
>
> I may be missing something, but why can't we use type metadata instead of
> !vcall_visibility to identify vtable pointers? We can skip emission of
> !type for vtables having [[clang::lto_visibility_public]] attribute and
> postpone decision on other vtables in the way you suggested.

I'm not sure if you mean the vtables that have received this attribute
manually, or just the ones that by default would get public LTO visibility
(the latter is the vast bulk of the interesting case). Regardless, it is the
same reason. At LTO link time we want to optionally treat these as hidden
(i.e. delay the effects of what would have been done at compile time under
-fvisibility=hidden). If we don't emit the !type metadata, then we cannot do
this, as we lose the class hierarchy info necessary for WPD. The
!vcall_visibility metadata just tells us which vtables we must treat
conservatively as public unless the proposed new link option asserts at LTO
link time that the link mode makes it safe to treat public classes as hidden.

Teresa
>
>
> ------------------------------
> *От:* Teresa Johnson <tejohnson at google.com>
> *Отправлено:* 11 декабря 2019 г. 17:21
> *Кому:* llvm-dev
> *Копия:* Peter Collingbourne; Steven Wu; Evgeny Leviant; David Li
> *Тема:* RFC: Safe Whole Program Devirtualization Enablement
>
> CAUTION: This email originated from outside of the organization. Do not
> click links or open attachments unless you recognize the sender and know
> the content is safe.  If you suspect potential phishing or spam email,
> report it to ReportSpam at accesssoftek.com
> Please send any comments. As mentioned at the end I will follow up with
> some patches as soon as they are cleaned up and I create some test cases.
>
> RFC: Safe Whole Program Devirtualization Enablement
> ==================================================>
> High Level Summary
> ------------------
>
> The goal of the changes described in this RFC is to support aggressive
> Whole Program Devirtualization without requiring -fvisibility=hidden at
> compile time, by pre-enabling bitcode for whole program devirtualization,
> but delaying the decision on whether to apply devirtualization until LTO
> link time. This is needed both because we may not know whether the link
> mode is safe for hidden LTO visibility until link time, and also to allow
> bitcode objects to be shared between links of targets with differing valid
> LTO visibility. This utilizes the !vcall_visibility metadata added for Dead
> Virtual Function Elimination.
>
> The summary of changes required are (these are described in more detail
> later):
>
> 1) When -fwhole-program-vtables is specified, always insert type test
> assumes for virtual calls, and additionally add !vcall_visibility metadata
> to vtable definitions (which will be summarized in the ThinLTO index).
>
> 2) At LTO link time, apply hidden LTO visibility to vtable definition
> vcall_visibility metadata (or summary) when specified by a new link option
> (-lto-whole-program-visibility).
>
> 3) During the LTO link time Whole Program Devirtualization analysis, only
> allow devirtualization when the associated vtable definitions have hidden
> LTO visibility, as derived from the !vcall_visibility metadata (summarized
> in the index for index-only WPD).
>
> 4) Modify the Virtual Function Elimination application in GlobalDCE to
> ignore vtables with !vcall_visibility when they are associated with type
> tests (and not just type checked loads).
>
> Background
> ----------
>
> Whole Program Devirtualization is supported for LTO (both regular and
> Thin) via the -fwhole-program-vtables option. However, it can only be
> safely applied to classes for which LTO can analyze the entire class
> hierarchy, and therefore is restricted to those classes with hidden LTO
> visibility. See https://clang.llvm.org/docs/LTOVisibility.html for more
> information.
>
> The LTO visibility of a class is derived at compile time from the class’s
> symbol visibility. Generally, only classes that are internal at the source
> level (e.g. declared in an anonymous namespace) receive hidden LTO
> visibility. Compiling with -fvisibility=hidden tells the compiler that,
> unless otherwise marked, symbols are assumed to have hidden visibility,
> which also implies that all classes have hidden LTO visibility (unless
> decorated with a public visibility attribute). This results in much more
> aggressive devirtualization.
>
> However, compiling with -fvisibility=hidden is only safe when we know we
> are LTO linking with full view of the class hierarchy. Specifically, this
> is true when a binary is being LTO linked with either all sources being
> bitcode (so that the LTO unit is the same as the linkage unit), or when the
> only translation units being linked as native code are known to not derive
> any classes defined in the LTO unit (e.g. system libraries). Additionally,
> the binary may not dlopen any libraries at runtime that contain classes
> derived from those defined in the main binary.
>
> Assuming we are building and linking a binary that satisfies the above
> constraints (we are LTO linking all translation units as bitcode, except
> certain (e.g. system) libraries or other native objects known to be safe by
> the user or build system, and the binary will not dlopen any libraries
> deriving from the binary’s classes), then it should be safe to compile with
> -fvisibility=hidden, along with -fwhole-program-vtables.
>
> However, there are cases where it is unknown until link time whether we
> are building a target that meets the above constraints. Additionally, we
> may want to build additional targets that do not meet the criteria for safe
> application of -fvisibility=hidden during the same build invocation
> (specifically, because subsets of the code will be linked into shared
> libraries instead of linking all code directly into the binary). Even if
> possible to build two sets of bitcode object files (one with default
> visibility for the unsafely linked targets and one with hidden visibility
> for the safely linked targets), this causes duplication in both time and
> space, which is prohibitive in an environment where it is common to build
> targets with tens of thousands of sources, and multiple targets with
> different link modes simultaneously.
>
> The goals of the changes described in this RFC are to essentially delay
> the application of -fvisibility=hidden until LTO link time, and allow
> bitcode objects to be shared between links of targets with differing link
> modes and therefore differing valid LTO visibility.
>
> Type Information for Devirtualization
> -------------------------------------
>
> LTO whole program devirtualization is driven off of type information in
> the IR. This includes type metadata (on vtable definitions), as well as
> type test intrinsics before virtual calls. The former is safe to emit into
> the IR in all cases, but the latter is currently not. The virtual call
> sites are decorated with an llvm.assume(llvm.type.test(ptr, typeid))
> sequence, which drives the LTO analysis of virtual calls. This sequence is
> an assertion that the given pointer is associated with the given type
> identifier (https://llvm.org/docs/LangRef.html#llvm-type-test-intrinsic).
> It is currently inserted only for classes with hidden LTO visibility as the
> implication of this sequence is that we have full visibility of that type’s
> class hierarchy, and may devirtualize the call based on that knowledge.
> This assumption is not valid if the class does not have hidden LTO
> visibility.
>
> In order to drive later devirtualization, we still need the type
> compatibility information provided by the llvm.type.test, but want to delay
> a decision on whether it is valid to assume that we have full class
> hierarchy visibility, and thus whether devirtualization of that target can
> be safely applied.
>
> Specifically, what we want to know at LTO time is whether the vtable has
> hidden LTO visibility or not, and use that to guide the application of
> devirtualization to the type tested virtual call sites. By default, only
> those with statically guaranteed hidden LTO visibility should be marked as
> such. And as described later, at LTO link time we can optionally decide to
> convert vtables to hidden LTO visibility for more aggressive
> devirtualization when appropriate.
>
> There is already a mechanism in the compiler to describe the vtable
> visibility, which was recently added for Dead Virtual Function Elimination
> (D63932): !vcall_visibility metadata, documented at
> https://llvm.org/docs/TypeMetadata.html#vcall-visibility-metadata. This
> metadata is attached to vtable definitions, currently only when VFE is
> enabled. As described in the documentation, because this is currently only
> used for VFE, it also requires that the corresponding function pointer
> loads use the llvm.type.checked.load intrinsic. This would not be required
> for devirtualization (although the VFE support in GlobalDCE will need
> modification to ignore the metadata when type checked loads are not used;
> more on that later).
>
> This RFC proposes adding the !vcall_visibility metadata to vtable
> definitions when -fwhole-program-vtables is specified. Unlike for VFE, the
> function pointer loads can still use normal loads with corresponding type
> test assume sequences (better for optimization).
>
> Additional changes to the LTO compilation steps are detailed below.
>
> Pre-Link LTO Compile
> --------------------
>
> First, type test assume sequences will be inserted when
> -fwhole-program-vtables is specified, and not just for classes with hidden
> LTO visibility.
>
> Second, as mentioned earlier, the !vcall_visibility metadata will be
> inserted under -fwhole-program-vtables. For the purposes of index-only WPD,
> a single-bit flag indicating whether or not the vtable def has hidden LTO
> visibility is added to the GVarFlags on the GlobalVarSummary. Note that we
> can collapse the 3 enum values of the metadata down to a single bit,
> because for the purposes of devirtualization, both
> VCallVisibilityLinkageUnit and VCallVisibilityTranslationUnit can be
> treated the same (we only need to have at least VCallVisibilityLinkageUnit
> to devirtualize). The ModuleSummaryIndex builder will set this new flag
> from the !vcall_visibility metadata on vtable definitions.
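
The collapse of the three metadata values into one summary bit can be sketched as below. This is a standalone model, not LLVM's own code: the enum repeats the value names used in the RFC, and `hasHiddenLTOVisibility` is a hypothetical helper name chosen for illustration.

```cpp
#include <cassert>

// Standalone model of the three !vcall_visibility values named in the RFC.
// It is declared locally so the sketch compiles without LLVM headers.
enum VCallVisibility {
  VCallVisibilityPublic = 0,
  VCallVisibilityLinkageUnit = 1,
  VCallVisibilityTranslationUnit = 2,
};

// Hypothetical helper: the single bit kept in the summary. Anything at
// least as restrictive as linkage-unit visibility counts as hidden for
// the purposes of devirtualization.
constexpr bool hasHiddenLTOVisibility(VCallVisibility V) {
  return V != VCallVisibilityPublic;
}

static_assert(!hasHiddenLTOVisibility(VCallVisibilityPublic),
              "public vtables must not be treated as hidden");
```

Linkage-unit and translation-unit visibility both map to the same bit, which is why the summary does not need to keep the full enum.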
>
> Finally, the VFE support in GlobalDCE (which is enabled by default and
> currently triggers automatically in the presence of this metadata) will
> need to be modified to ignore !vcall_visibility metadata inserted for
> devirtualization only, i.e. when there are any type test assume sequences
> for that Type ID. This should be straightforward, as we can scan the type
> tests and remove any vtables decorated with compatible type ids from
> VFESafeVTables. Note that this change will affect the invocation of
> GlobalDCE both here in the pre-link LTO compile as well as later in the LTO
> Backend (where it is applied to a broader set of vtables).
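
A minimal sketch of the filtering step described above, using plain standard-library containers in place of GlobalDCE's real data structures. `VTableInfo` and `filterVFESafeVTables` are hypothetical names, and type ids are modeled as strings purely for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a vtable definition: its name plus the type
// ids it is compatible with (modeling its type metadata).
struct VTableInfo {
  std::string Name;
  std::vector<std::string> TypeIds;
};

// Return the names of the vtables that stay eligible for VFE. A vtable is
// dropped as soon as one of its type ids is used by a type test, because
// its !vcall_visibility was then emitted for devirtualization only.
std::set<std::string>
filterVFESafeVTables(const std::vector<VTableInfo> &Candidates,
                     const std::set<std::string> &TypeTestedIds) {
  std::set<std::string> Safe;
  for (const VTableInfo &VT : Candidates) {
    bool TypeTested = false;
    for (const std::string &Id : VT.TypeIds) {
      if (TypeTestedIds.count(Id)) {
        TypeTested = true;
        break;
      }
    }
    if (!TypeTested)
      Safe.insert(VT.Name);
  }
  return Safe;
}
```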
>
> LTO Link Handling
> -----------------
>
> During Whole Program Devirtualization analysis, when looking at the
> vtables corresponding to the summarized virtual calls during
> tryFindVirtualCallTargets, we must consult the vcall_visibility
> information. For hybrid (regular+thin) LTO, the vtable definitions are in
> the regular LTO partition and so the IR can be consulted directly. For
> index-only WPD, we instead consult the flag on the vtable’s
> GlobalVarSummary.
>
> If any of the vtable definitions compatible with a given virtual call have
> public LTO visibility, the devirtualization must be skipped.
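
The rule above amounts to an all-or-nothing check over the compatible vtables. The sketch below assumes a simplified summary record; `VTableSummary` and `mayDevirtualize` are hypothetical names, not the identifiers used in the patches.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified view of a vtable's summary entry: only the
// single visibility bit matters for this check.
struct VTableSummary {
  bool HiddenLTOVisibility;
};

// Devirtualization of a call is allowed only if every compatible vtable
// has hidden LTO visibility. An empty set is rejected in this sketch,
// since there is no known target to devirtualize to.
bool mayDevirtualize(const std::vector<VTableSummary> &Compatible) {
  if (Compatible.empty())
    return false;
  return std::all_of(
      Compatible.begin(), Compatible.end(),
      [](const VTableSummary &S) { return S.HiddenLTOVisibility; });
}
```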
>
> By default, only classes that have statically determined hidden LTO
> visibility would be allowed to devirtualize. However, as noted earlier, we
> want to enable more aggressive devirtualization at LTO link time when we
> know that the linking mode guarantees full LTO visibility of any code that
> may derive classes from the bitcode being linked. To do so, we will add a
> new linker option:
>
> For lld, the proposed option is: -lto-whole-program-visibility.
> For gold, the corresponding plugin option would be
> “whole-program-visibility”.
>
> When this option is set, LTO will convert all vtable definitions to have
> hidden LTO visibility before invoking Whole Program Devirtualization. In
> the hybrid LTO case this would mean changing the metadata on the IR. In the
> index-only case this would be done in the summaries.
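
For the index-only case, the effect of the new option can be pictured as a pass over the summaries that runs before the WPD analysis. This is only a sketch under assumed names: `VTableSummary` and `applyWholeProgramVisibility` are hypothetical, and the boolean parameter stands in for the proposed linker option.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified vtable summary carrying the visibility bit.
struct VTableSummary {
  bool HiddenLTOVisibility = false;
};

// When the link was requested with whole program visibility, mark every
// vtable summary as hidden. Without the option, the statically derived
// visibility from compile time is left untouched.
void applyWholeProgramVisibility(std::vector<VTableSummary> &Summaries,
                                 bool WholeProgramVisibility) {
  if (!WholeProgramVisibility)
    return;
  for (VTableSummary &S : Summaries)
    S.HiddenLTOVisibility = true;
}
```

Because the upgrade happens at link time, the same bitcode objects can be linked once with the option and once without it.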
>
> LTO Backend Handling
> --------------------
>
> No changes are required in the LTO backend’s invocation of Whole Program
> Devirtualization, since any visibility constraints are enforced at LTO link
> time, and the loosening of visibility under the new link option only needs
> to affect the LTO WPD invocation.
>
> As mentioned earlier when describing the pre-link LTO compile changes,
> GlobalDCE will be changed to ignore vtables with !vcall_visibility metadata
> corresponding to type tests (and not just type checked loads).
>
> Status
> ------
>
> These changes have been prototyped and tested with index-only WPD, with
> the exception of the proposed changes to GlobalDCE (at the moment I have
> been testing with -enable-vfe=false). I will be cleaning up the changes and
> sending patches for review in the coming days.
>
> --
> Teresa Johnson | Software Engineer | tejohnson at google.com |
>

-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |

Iurii Gribov via llvm-dev

2019-Dec-17 15:36 UTC

[llvm-dev] RFC: Safe Whole Program Devirtualization Enablement

(cc list this time)

Hi Teresa,

Apologies if this has been discussed before but ...
> The LTO visibility of a class is derived at compile time from the class’s
> symbol visibility.
> Generally, only classes that are internal at the source level (e.g.
> declared in an anonymous namespace) receive hidden LTO visibility.
> Compiling with -fvisibility=hidden tells the compiler that, unless
> otherwise marked, symbols are assumed to have hidden visibility, which
> also implies that all classes have hidden LTO visibility (unless decorated
> with a public visibility attribute).
> This results in much more aggressive devirtualization.

Note that by default, unlike GCC, LLVM is liberal on visibility-constrained
optimizations. In particular it freely performs inlining, IPA and cloning on
them (see https://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html
which also suggested adding -fsemantic-interposition to actually respect
visibility in optimizations). It's unclear why devirtualization should
behave differently than other optimizations (at least by default).

-I

Teresa Johnson via llvm-dev

2019-Dec-17 16:32 UTC

[llvm-dev] RFC: Safe Whole Program Devirtualization Enablement

On Tue, Dec 17, 2019 at 7:36 AM Iurii Gribov <Iurii.Gribov at ceva-dsp.com>
wrote:
> (cc list this time)
>
> Hi Teresa,
>
> Apologies if this has been discussed before but ...
>
> > The LTO visibility of a class is derived at compile time from the
> > class’s symbol visibility.
> > Generally, only classes that are internal at the source level (e.g.
> > declared in an anonymous namespace) receive hidden LTO visibility.
> > Compiling with -fvisibility=hidden tells the compiler that, unless
> > otherwise marked, symbols are assumed to have hidden visibility, which
> > also implies that all classes have hidden LTO visibility (unless
> > decorated with a public visibility attribute).
> > This results in much more aggressive devirtualization.
>
> Note that by default, unlike GCC, LLVM is liberal on
> visibility-constrained optimizations. In particular it freely performs
> inlining, IPA and cloning on them (see
> https://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html which
> also suggested adding -fsemantic-interposition to actually respect
> visibility in optimizations). It's unclear why devirtualization should
> behave differently than other optimizations (at least by default).
>
Are you suggesting that we should be more aggressive by default (i.e.
without -fvisibility=hidden or any new options)? I believe that would be too
aggressive for class LTO visibility. It is common to override virtual
functions across shared library boundaries (e.g. a test may override a
virtual function from a shared library with a mock class). With what I am
proposing, we will assume devirtualization is safe only under the proposed
LTO link option, which should be applied when everything other than e.g.
system libraries is linked statically.
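
To make the mock scenario concrete, here is a small sketch. All class and function names are hypothetical, and everything sits in one file only so that it compiles on its own; the comments mark which parts would live in the shared library and which in the test binary.

```cpp
#include <cassert>
#include <string>

// Imagine this class and describe() are defined in a shared library that
// is built with LTO.
struct Service {
  virtual ~Service() = default;
  virtual std::string name() const { return "real"; }
};

// Library code that makes a virtual call through its argument.
std::string describe(const Service &S) { return S.name(); }

// Imagine this class is defined in a separately linked test binary. The
// library's LTO link never sees it, so if the library devirtualized the
// call in describe() to Service::name, the mock would be bypassed.
struct MockService : Service {
  std::string name() const override { return "mock"; }
};
```

With ordinary virtual dispatch, `describe` returns "mock" for a `MockService`, which is the behavior that default-on devirtualization across the library boundary would break.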

Thanks,
Teresa

> -I
>
>
-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |

Teresa Johnson via llvm-dev

2019-Dec-26 19:55 UTC

[llvm-dev] RFC: Safe Whole Program Devirtualization Enablement

FYI I mailed 3 patches this morning that together implement the RFC. PTAL:

D71907: [WPD/VFE] Always emit vcall_visibility metadata for
-fwhole-program-vtables
D71911: [ThinLTO] Summarize vcall_visibility metadata
D71913: [LTO/WPD] Enable aggressive WPD under LTO option

Teresa


-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |