thr3ads.net - llvm dev - [llvm-dev] [RFC] Design of a TBAA sanitizer [Apr 2017]

If this information is useful, please help other people find it:
Share via:

Hal Finkel via llvm-dev

2017-Apr-11 10:49 UTC

[llvm-dev] [RFC] Design of a TBAA sanitizer

On 04/11/2017 03:46 AM, Andrey Bokhanko wrote:> Hal,
>
> To clarify, my example meant to illustrate that for memory references 
> to structures' fields you have to keep a user-defined type, even for 
> one byte accesses. C++ allows references to "initial member
sequence"
> using pointers to structures of different types. And yes, there are 
> programs in the wild that rely on this (don't remember details... this 
> was from previous life).
>
> Another thing to consider -- what about memset / memcpy / etc that 
> inherently rely on type punning? If not inlined, they don't present 
> problems for an optimizer -- probably shouldn't be considered as 
> aliasing rules violation?
Good point. You (likely) wouldn't want to compile your memcpy / memset / 
etc. implementations with the TBAA sanitizer enabled (or we'd need to 
provide an attribute to disable it for certain functions). If they only 
use char* themselves, then it's fine, but if not, that's implementation 
magic. However, memset, etc. does need to be able to 'unset' the type of
memory and memcpy needs to be able to copy types (at least for PODs). 
The sanitizer would need to hook them for that purpose.

  -Hal
>
> Yours,
> Andrey
> ==> Compiler Architect
> NXP
>
>
> On Tue, Apr 11, 2017 at 12:05 AM, Hal Finkel <hfinkel at anl.gov 
> <mailto:hfinkel at anl.gov>> wrote:
>
>
>     On 04/10/2017 09:55 AM, Andrey Bokhanko wrote:
>>     Hi Hal,
>>
>>     I wonder how your solution will handle the following?
>>
>>     struct {
>>       int s1_f1;
>>       float s1_f2;
>>       int s1_f3;
>>       float s1_f4;
>>     } S1;
>>
>>     struct {
>>       int s2_f1;
>>       float s2_f2;
>>       int *s2_f3; // to add some interest, suppose that sizeof(int)
>>     == sizeof(int *)
>>       float s2_f4;
>>     } S2;
>>
>>     S1 *s1; S2 *s2;
>>     ...
>>     s2 = (S1*)s1;
>>     s2->s2_f1 = 0; // allowed
>>     s2->s2_f2 = 0; // allowed
>>     s2->s2_f3 = 0; // not allowed
>>     s2->s2_f4 = 0; // not allowed
>>
>>     Also, when you plan to set types for allocated memory?
>
>     The most-general thing seems to be to set the types along with a
>     store. As a result, the proposed scheme would not find a fault
>     with the code above, but would complain if anyone actually later
>     read S1.s1_f3.
>
>     If we want to catch these kinds of problems directly we'd need to
>     have the compiler insert code when the type is constructed to mark
>     the types, and then we'd need to check those types around stores.
>     This also sounds like a useful enhancement (although somewhat more
>     complicated to implement).
>
>>     What types will be set for memory allocated by a malloc call?
>
>     Memory would be untyped (or of unknown type) when allocated.
>
>     Thanks again,
>     Hal
>
>
>>
>>     Yours,
>>     Andrey
>>
>>
>>     On Tue, Apr 4, 2017 at 10:13 PM, Hal Finkel via llvm-dev
>>     <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>> wrote:
>>
>>         Hi everyone,
>>
>>         At EuroLLVM, Chandler and I chatted about the design for a
>>         potential TBAA sanitizer. Here's my attempt to summarize:
>>
>>         C/C++ have type-based aliasing rules, and LLVM's optimizer
>>         can exploit these given TBAA metadata added by Clang.
>>         Roughly, a pointer of given type cannot be used to access an
>>         object of a different type (with, of course, certain
>>         exceptions). Unfortunately, there's a lot of code in the
wild
>>         that violates these rules (e.g. for type punning), and such
>>         code often must be built with -fno-strict-aliasing.
>>         Performance is often sacrificed as a result. Part of the
>>         problem is the difficulty of finding TBAA violations. A
>>         sanitizer would help.
>>
>>         A design goal of a TBAA sanitizer is to limit the
>>         shadow-memory overhead of the implementation. ASan, for
>>         example, uses 1 bit per byte. Here we're hoping to keep the
>>         overhead down to 2 bits per byte for the TBAA sanitizing. We
>>         might be able to do this, while handling all common types on
>>         the fast path, if we use both alignment and type information.
>>         When accessing data of B bytes, 2*B bits of shadow memory can
>>         be used. Thus, we'll get 2 bits for a one-byte type, 4 bits
>>         for a two-byte type, etc. Moreover, we need appropriate holes
>>         in the encoding space so that no type has a shadow encoding
>>         that overlaps with an aligned part of a larger type's
encoding.
>>         For example, we need to detect:
>>
>>           double f = ...; return *(int*) &f; // We should catch
this.
>>
>>         We might use the following encoding. The idea is that the
>>         common case, for which we need a reasonable fast path, is
>>         that type types are exactly equal. For this case, we want a
>>         simple comparison of the shadow type encodings to be
>>         sufficient to validate the access. For cases where the
>>         encodings don't match (and isn't zero to indicate an
unknown
>>         type), or for which there is no direct encoding for the
>>         access type, a slow path must be used. All of this assumes
>>         that we're validating the the pointer alignment first, and
>>         then checking the type encodings.
>>
>>         1 Byte:
>>         00 = 0  = unknown type
>>         01 = 1 = hole
>>         10 = 2 = hole
>>         11 = 3  = all one-byte types (slow path, see note later on
this)
>>
>>         2 Bytes:
>>         0000 = 0 = unknown type
>>         0101 = 5 = short
>>         0110 = 6 = hole (A)
>>         0111 = 7 = wchar_t (under some ABIs)
>>         1001 = 9 = hole (B)
>>         1010 = 10 = hole (C)
>>         1011 = 11 = char16_t
>>         1111 = 15 = all other types (slow path)
>>
>>         It is important here to have wchar_t have a direct encoding
>>         here because wchar_t is two bytes on Windows, and moreover,
>>         wchar_t is very commonly used on Windows. The partial
>>         encoding overlap of wchar_t (i.e. 0111) with the 11
>>         one-byte-type encoding works because 11 always indicates a
>>         slow-path check.
>>
>>         4 Bytes:
>>         0000 0000 = 0 = unknown type
>>         A A = int
>>         A B = float
>>         B A  = pointer (under some ABIs)
>>         B B = long (under some ABIs)
>>         A 1111 = wchar_t (under some ABIs)
>>         B 1111 = char32_t
>>         A C = hole (D)
>>         C A = hole (E)
>>         B C = hole (F)
>>         C B = hole (G)
>>         C C = hole (H)
>>         1111 1111 = 255 = all other types (slow path)
>>
>>         8 Bytes:
>>         0000 0000 0000 0000 = 0 = unknown type
>>         D D = double
>>         D E = long (under some ABIs)
>>         E D = long long (under some ABIs)
>>         E E = long double (under some ABIs)
>>         D F = pointer (under some ABIs)
>>         F D = hole (I)
>>         E F = hole (J)
>>         F E = hole
>>         F F = hole
>>         ...
>>         1111 1111 1111 1111 = 65535 = all other types
>>
>>         16 Bytes:
>>         0 = unknown type
>>         | | = __int128_t
>>         I J = long long (under some ABIs)
>>         J I = long double (under some ABIs)
>>         J J = hole
>>         ...
>>         -1 = all other types
>>
>>         For pointers, this scheme would consider all pointers to be
>>         the same (regardless of pointee type). Doing otherwise would
>>         mostly requiring putting pointer-type checking on the slow
>>         path (i.e. access via a pointer pointer), and that could add
>>         considerable overhead. We might, however, split out function
>>         pointers from other pointers. We could provide a compile-time
>>         option to control the granularity of pointer-type checks.
>>
>>         Builtin vector types for which the vector element type has a
>>         direct encoding also naturally have a direct encoding (the
>>         concatenation of the encoding for the element type).
>>
>>         Obviously the fact that we have no fast-path encodings for
>>         one-byte types could be problematic. Note however that:
>>
>>          1. If a larger type is being used to access a smaller type
>>         (plus more), the encodings won't match, so we always end up
>>         on the slow path.
>>
>>          2. If the access type is a one-byte type, we would want to
>>         validate quickly. However, most common one-byte types are
>>         universally aliasing (i.e. not subject to TBAA violations).
>>         Specifically, for C/C++, pointers to char, unsigned char,
>>         signed char (C only), and std::byte, can be used to access
>>         any part of any type. That leaves signed char (C++ only),
>>         bool/_Bool, and enums with a [signed/unsigned] char base type
>>         (C++ only, std::byte exempted) as pointee types we might wish
>>         to validate. We'd always need to fall back to the slow path
>>         to validate these. We could provide a compile-time option to
>>         disable such one-byte access checking if necessary.
>>
>>         How would the slow path work? First, the code needs to find
>>         the beginning of the allocation. It can do this by scanning
>>         backwards in the ASan shadow memory. Once located, we'll
read
>>         a pointer to a type-description structure from that "red
>>         zone" location. For dynamic allocations, ASan's
allocator
>>         will ensure such a space for the pointer exists. For static
>>         allocations and globals, the compiler will ensure it exists.
>>         The compiler will make sure that all constructors locate this
>>         field and fill it in. Destructors can clear it. If two of
>>         these type-description-structure pointers are equal, then we
>>         can conclude that the types are equal. If not, then we need
>>         to interpret the structure. The pointer itself might be to an
>>         interval map (to deal with arrays, placement new, etc. - we
>>         can use the low bit of the pointer to differentiate between
>>         an actual type-description structure and an interval map),
>>         and the leaves of the interval map point to actual
>>         type-description structures. The type-description structure
>>         is an array of (offset, type) pairs, where the type field is
>>         also a type-description-structure pointer. The
>>         type-description structures themselves are comdat globals
>>         emitted in each relevant translation unit, where the comdat
>>         key is formed using the mangled type name (and size, etc.),
>>         and pointers to these symbols are then used to identify the
>>         types.
>>
>>         Thoughts?
>>
>>         Thanks again,
>>         Hal
>>
>>         -- 
>>         Hal Finkel
>>         Lead, Compiler Technology and Programming Languages
>>         Leadership Computing Facility
>>         Argonne National Laboratory
>>
>>         _______________________________________________
>>         LLVM Developers mailing list
>>         llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
>>         http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>         <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>>
>>
>
>     -- 
>     Hal Finkel
>     Lead, Compiler Technology and Programming Languages
>     Leadership Computing Facility
>     Argonne National Laboratory
>
>
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170411/ea656691/attachment.html>

Kostya Serebryany via llvm-dev

2017-Apr-11 18:54 UTC

head link

[llvm-dev] [RFC] Design of a TBAA sanitizer

Evgeniy and I recently discussed something similar for detecting bad casts
(code named: TypeSanitizer).
The approach with the shadow memory looked attractive at the first glance,
but then we've drowned in details.

Specifically for TBAA, I had another idea, not involving shadow memory.
Consider LLVM queries MayAlias(P1, P2) and the result is false, i.e. we
think that P1 and P2 point to disjoint memory regions.
Then we insert a run-time check that intervals [P1,sizeof(*P1)) and
[P2,sizeof(*P2)) don't intersect.

For functions with a reasonable number of pointer pairs where MayAlias(P1,
P2)==false we could insert checks for all such pairs.
For larger functions -- only for those pairs where the optimizer actually
queried MayAlias(P1, P2).

--kcc


On Tue, Apr 11, 2017 at 3:49 AM, Hal Finkel via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
>
> On 04/11/2017 03:46 AM, Andrey Bokhanko wrote:
>
> Hal,
>
> To clarify, my example meant to illustrate that for memory references to
> structures' fields you have to keep a user-defined type, even for one
byte
> accesses. C++ allows references to "initial member sequence"
using pointers
> to structures of different types. And yes, there are programs in the wild
> that rely on this (don't remember details... this was from previous
life).
>
> Another thing to consider -- what about memset / memcpy / etc that
> inherently rely on type punning? If not inlined, they don't present
> problems for an optimizer -- probably shouldn't be considered as
aliasing
> rules violation?
>
>
> Good point. You (likely) wouldn't want to compile your memcpy / memset
/
> etc. implementations with the TBAA sanitizer enabled (or we'd need to
> provide an attribute to disable it for certain functions). If they only use
> char* themselves, then it's fine, but if not, that's implementation
magic.
> However, memset, etc. does need to be able to 'unset' the type of
memory
> and memcpy needs to be able to copy types (at least for PODs). The
> sanitizer would need to hook them for that purpose.
>
>  -Hal
>
>
>
> Yours,
> Andrey
> ==> Compiler Architect
> NXP
>
>
> On Tue, Apr 11, 2017 at 12:05 AM, Hal Finkel <hfinkel at anl.gov>
wrote:
>
>>
>> On 04/10/2017 09:55 AM, Andrey Bokhanko wrote:
>>
>> Hi Hal,
>>
>> I wonder how your solution will handle the following?
>>
>> struct {
>>   int s1_f1;
>>   float s1_f2;
>>   int s1_f3;
>>   float s1_f4;
>> } S1;
>>
>> struct {
>>   int s2_f1;
>>   float s2_f2;
>>   int *s2_f3; // to add some interest, suppose that sizeof(int)
=>> sizeof(int *)
>>   float s2_f4;
>> } S2;
>>
>> S1 *s1; S2 *s2;
>> ...
>> s2 = (S1*)s1;
>> s2->s2_f1 = 0; // allowed
>> s2->s2_f2 = 0; // allowed
>> s2->s2_f3 = 0; // not allowed
>> s2->s2_f4 = 0; // not allowed
>>
>> Also, when you plan to set types for allocated memory?
>>
>>
>> The most-general thing seems to be to set the types along with a store.
>> As a result, the proposed scheme would not find a fault with the code
>> above, but would complain if anyone actually later read S1.s1_f3.
>>
>> If we want to catch these kinds of problems directly we'd need to
have
>> the compiler insert code when the type is constructed to mark the
types,
>> and then we'd need to check those types around stores. This also
sounds
>> like a useful enhancement (although somewhat more complicated to
implement).
>>
>> What types will be set for memory allocated by a malloc call?
>>
>>
>> Memory would be untyped (or of unknown type) when allocated.
>>
>> Thanks again,
>> Hal
>>
>>
>>
>> Yours,
>> Andrey
>>
>>
>> On Tue, Apr 4, 2017 at 10:13 PM, Hal Finkel via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Hi everyone,
>>>
>>> At EuroLLVM, Chandler and I chatted about the design for a
potential
>>> TBAA sanitizer. Here's my attempt to summarize:
>>>
>>> C/C++ have type-based aliasing rules, and LLVM's optimizer can
exploit
>>> these given TBAA metadata added by Clang. Roughly, a pointer of
given type
>>> cannot be used to access an object of a different type (with, of
course,
>>> certain exceptions). Unfortunately, there's a lot of code in
the wild that
>>> violates these rules (e.g. for type punning), and such code often
must be
>>> built with -fno-strict-aliasing. Performance is often sacrificed as
a
>>> result. Part of the problem is the difficulty of finding TBAA
violations. A
>>> sanitizer would help.
>>>
>>> A design goal of a TBAA sanitizer is to limit the shadow-memory
overhead
>>> of the implementation. ASan, for example, uses 1 bit per byte. Here
we're
>>> hoping to keep the overhead down to 2 bits per byte for the TBAA
>>> sanitizing. We might be able to do this, while handling all common
types on
>>> the fast path, if we use both alignment and type information. When
>>> accessing data of B bytes, 2*B bits of shadow memory can be used.
Thus,
>>> we'll get 2 bits for a one-byte type, 4 bits for a two-byte
type, etc.
>>> Moreover, we need appropriate holes in the encoding space so that
no type
>>> has a shadow encoding that overlaps with an aligned part of a
larger type's
>>> encoding.
>>> For example, we need to detect:
>>>
>>>   double f = ...; return *(int*) &f; // We should catch this.
>>>
>>> We might use the following encoding. The idea is that the common
case,
>>> for which we need a reasonable fast path, is that type types are
exactly
>>> equal. For this case, we want a simple comparison of the shadow
type
>>> encodings to be sufficient to validate the access. For cases where
the
>>> encodings don't match (and isn't zero to indicate an
unknown type), or for
>>> which there is no direct encoding for the access type, a slow path
must be
>>> used. All of this assumes that we're validating the the pointer
alignment
>>> first, and then checking the type encodings.
>>>
>>> 1 Byte:
>>> 00 = 0  = unknown type
>>> 01 = 1 = hole
>>> 10 = 2 = hole
>>> 11 = 3  = all one-byte types (slow path, see note later on this)
>>>
>>> 2 Bytes:
>>> 0000 = 0 = unknown type
>>> 0101 = 5 = short
>>> 0110 = 6 = hole (A)
>>> 0111 = 7 = wchar_t (under some ABIs)
>>> 1001 = 9 = hole (B)
>>> 1010 = 10 = hole (C)
>>> 1011 = 11 = char16_t
>>> 1111 = 15 = all other types (slow path)
>>>
>>> It is important here to have wchar_t have a direct encoding here
because
>>> wchar_t is two bytes on Windows, and moreover, wchar_t is very
commonly
>>> used on Windows. The partial encoding overlap of wchar_t (i.e.
0111) with
>>> the 11 one-byte-type encoding works because 11 always indicates a
slow-path
>>> check.
>>>
>>> 4 Bytes:
>>> 0000 0000 = 0 = unknown type
>>> A A = int
>>> A B = float
>>> B A  = pointer (under some ABIs)
>>> B B = long (under some ABIs)
>>> A 1111 = wchar_t (under some ABIs)
>>> B 1111 = char32_t
>>> A C = hole (D)
>>> C A = hole (E)
>>> B C = hole (F)
>>> C B = hole (G)
>>> C C = hole (H)
>>> 1111 1111 = 255 = all other types (slow path)
>>>
>>> 8 Bytes:
>>> 0000 0000 0000 0000 = 0 = unknown type
>>> D D = double
>>> D E = long (under some ABIs)
>>> E D = long long (under some ABIs)
>>> E E = long double (under some ABIs)
>>> D F = pointer (under some ABIs)
>>> F D = hole (I)
>>> E F = hole (J)
>>> F E = hole
>>> F F = hole
>>> ...
>>> 1111 1111 1111 1111 = 65535 = all other types
>>>
>>> 16 Bytes:
>>> 0 = unknown type
>>> | | = __int128_t
>>> I J = long long (under some ABIs)
>>> J I = long double (under some ABIs)
>>> J J = hole
>>> ...
>>> -1 = all other types
>>>
>>> For pointers, this scheme would consider all pointers to be the
same
>>> (regardless of pointee type). Doing otherwise would mostly
requiring
>>> putting pointer-type checking on the slow path (i.e. access via a
pointer
>>> pointer), and that could add considerable overhead. We might,
however,
>>> split out function pointers from other pointers. We could provide a
>>> compile-time option to control the granularity of pointer-type
checks.
>>>
>>> Builtin vector types for which the vector element type has a direct
>>> encoding also naturally have a direct encoding (the concatenation
of the
>>> encoding for the element type).
>>>
>>> Obviously the fact that we have no fast-path encodings for one-byte
>>> types could be problematic. Note however that:
>>>
>>>  1. If a larger type is being used to access a smaller type (plus
more),
>>> the encodings won't match, so we always end up on the slow
path.
>>>
>>>  2. If the access type is a one-byte type, we would want to
validate
>>> quickly. However, most common one-byte types are universally
aliasing (i.e.
>>> not subject to TBAA violations). Specifically, for C/C++, pointers
to char,
>>> unsigned char, signed char (C only), and std::byte, can be used to
access
>>> any part of any type. That leaves signed char (C++ only),
bool/_Bool, and
>>> enums with a [signed/unsigned] char base type (C++ only, std::byte
>>> exempted) as pointee types we might wish to validate. We'd
always need to
>>> fall back to the slow path to validate these. We could provide a
>>> compile-time option to disable such one-byte access checking if
necessary.
>>>
>>> How would the slow path work? First, the code needs to find the
>>> beginning of the allocation. It can do this by scanning backwards
in the
>>> ASan shadow memory. Once located, we'll read a pointer to a
>>> type-description structure from that "red zone" location.
For dynamic
>>> allocations, ASan's allocator will ensure such a space for the
pointer
>>> exists. For static allocations and globals, the compiler will
ensure it
>>> exists. The compiler will make sure that all constructors locate
this field
>>> and fill it in. Destructors can clear it. If two of these
>>> type-description-structure pointers are equal, then we can conclude
that
>>> the types are equal. If not, then we need to interpret the
structure. The
>>> pointer itself might be to an interval map (to deal with arrays,
placement
>>> new, etc. - we can use the low bit of the pointer to differentiate
between
>>> an actual type-description structure and an interval map), and the
leaves
>>> of the interval map point to actual type-description structures.
The
>>> type-description structure is an array of (offset, type) pairs,
where the
>>> type field is also a type-description-structure pointer. The
>>> type-description structures themselves are comdat globals emitted
in each
>>> relevant translation unit, where the comdat key is formed using the
mangled
>>> type name (and size, etc.), and pointers to these symbols are then
used to
>>> identify the types.
>>>
>>> Thoughts?
>>>
>>> Thanks again,
>>> Hal
>>>
>>> --
>>> Hal Finkel
>>> Lead, Compiler Technology and Programming Languages
>>> Leadership Computing Facility
>>> Argonne National Laboratory
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>
>>
>> --
>> Hal Finkel
>> Lead, Compiler Technology and Programming Languages
>> Leadership Computing Facility
>> Argonne National Laboratory
>>
>>
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170411/7fbeeeae/attachment.html>

Peter Collingbourne via llvm-dev

2017-Apr-11 20:05 UTC

head link

[llvm-dev] [RFC] Design of a TBAA sanitizer

I like this idea! It seems very practical, and sounds like it could be
extended to verify other AAs.

Peter

On Tue, Apr 11, 2017 at 11:54 AM, Kostya Serebryany via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Evgeniy and I recently discussed something similar for detecting bad casts
> (code named: TypeSanitizer).
> The approach with the shadow memory looked attractive at the first glance,
> but then we've drowned in details.
>
> Specifically for TBAA, I had another idea, not involving shadow memory.
> Consider LLVM queries MayAlias(P1, P2) and the result is false, i.e. we
> think that P1 and P2 point to disjoint memory regions.
> Then we insert a run-time check that intervals [P1,sizeof(*P1)) and
> [P2,sizeof(*P2)) don't intersect.
>
> For functions with a reasonable number of pointer pairs where MayAlias(P1,
> P2)==false we could insert checks for all such pairs.
> For larger functions -- only for those pairs where the optimizer actually
> queried MayAlias(P1, P2).
>
> --kcc
>
>
> On Tue, Apr 11, 2017 at 3:49 AM, Hal Finkel via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>>
>> On 04/11/2017 03:46 AM, Andrey Bokhanko wrote:
>>
>> Hal,
>>
>> To clarify, my example meant to illustrate that for memory references
to
>> structures' fields you have to keep a user-defined type, even for
one byte
>> accesses. C++ allows references to "initial member sequence"
using pointers
>> to structures of different types. And yes, there are programs in the
wild
>> that rely on this (don't remember details... this was from previous
life).
>>
>> Another thing to consider -- what about memset / memcpy / etc that
>> inherently rely on type punning? If not inlined, they don't present
>> problems for an optimizer -- probably shouldn't be considered as
aliasing
>> rules violation?
>>
>>
>> Good point. You (likely) wouldn't want to compile your memcpy /
memset /
>> etc. implementations with the TBAA sanitizer enabled (or we'd need
to
>> provide an attribute to disable it for certain functions). If they only
use
>> char* themselves, then it's fine, but if not, that's
implementation magic.
>> However, memset, etc. does need to be able to 'unset' the type
of memory
>> and memcpy needs to be able to copy types (at least for PODs). The
>> sanitizer would need to hook them for that purpose.
>>
>>  -Hal
>>
>>
>>
>> Yours,
>> Andrey
>> ==>> Compiler Architect
>> NXP
>>
>>
>> On Tue, Apr 11, 2017 at 12:05 AM, Hal Finkel <hfinkel at anl.gov>
wrote:
>>
>>>
>>> On 04/10/2017 09:55 AM, Andrey Bokhanko wrote:
>>>
>>> Hi Hal,
>>>
>>> I wonder how your solution will handle the following?
>>>
>>> struct {
>>>   int s1_f1;
>>>   float s1_f2;
>>>   int s1_f3;
>>>   float s1_f4;
>>> } S1;
>>>
>>> struct {
>>>   int s2_f1;
>>>   float s2_f2;
>>>   int *s2_f3; // to add some interest, suppose that sizeof(int)
=>>> sizeof(int *)
>>>   float s2_f4;
>>> } S2;
>>>
>>> S1 *s1; S2 *s2;
>>> ...
>>> s2 = (S1*)s1;
>>> s2->s2_f1 = 0; // allowed
>>> s2->s2_f2 = 0; // allowed
>>> s2->s2_f3 = 0; // not allowed
>>> s2->s2_f4 = 0; // not allowed
>>>
>>> Also, when you plan to set types for allocated memory?
>>>
>>>
>>> The most-general thing seems to be to set the types along with a
store.
>>> As a result, the proposed scheme would not find a fault with the
code
>>> above, but would complain if anyone actually later read S1.s1_f3.
>>>
>>> If we want to catch these kinds of problems directly we'd need
to have
>>> the compiler insert code when the type is constructed to mark the
types,
>>> and then we'd need to check those types around stores. This
also sounds
>>> like a useful enhancement (although somewhat more complicated to
implement).
>>>
>>> What types will be set for memory allocated by a malloc call?
>>>
>>>
>>> Memory would be untyped (or of unknown type) when allocated.
>>>
>>> Thanks again,
>>> Hal
>>>
>>>
>>>
>>> Yours,
>>> Andrey
>>>
>>>
>>> On Tue, Apr 4, 2017 at 10:13 PM, Hal Finkel via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> At EuroLLVM, Chandler and I chatted about the design for a
potential
>>>> TBAA sanitizer. Here's my attempt to summarize:
>>>>
>>>> C/C++ have type-based aliasing rules, and LLVM's optimizer
can exploit
>>>> these given TBAA metadata added by Clang. Roughly, a pointer of
given type
>>>> cannot be used to access an object of a different type (with,
of course,
>>>> certain exceptions). Unfortunately, there's a lot of code
in the wild that
>>>> violates these rules (e.g. for type punning), and such code
often must be
>>>> built with -fno-strict-aliasing. Performance is often
sacrificed as a
>>>> result. Part of the problem is the difficulty of finding TBAA
violations. A
>>>> sanitizer would help.
>>>>
>>>> A design goal of a TBAA sanitizer is to limit the shadow-memory
>>>> overhead of the implementation. ASan, for example, uses 1 bit
per byte.
>>>> Here we're hoping to keep the overhead down to 2 bits per
byte for the TBAA
>>>> sanitizing. We might be able to do this, while handling all
common types on
>>>> the fast path, if we use both alignment and type information.
When
>>>> accessing data of B bytes, 2*B bits of shadow memory can be
used. Thus,
>>>> we'll get 2 bits for a one-byte type, 4 bits for a two-byte
type, etc.
>>>> Moreover, we need appropriate holes in the encoding space so
that no type
>>>> has a shadow encoding that overlaps with an aligned part of a
larger type's
>>>> encoding.
>>>> For example, we need to detect:
>>>>
>>>>   double f = ...; return *(int*) &f; // We should catch
this.
>>>>
>>>> We might use the following encoding. The idea is that the
common case,
>>>> for which we need a reasonable fast path, is that type types
are exactly
>>>> equal. For this case, we want a simple comparison of the shadow
type
>>>> encodings to be sufficient to validate the access. For cases
where the
>>>> encodings don't match (and isn't zero to indicate an
unknown type), or for
>>>> which there is no direct encoding for the access type, a slow
path must be
>>>> used. All of this assumes that we're validating the the
pointer alignment
>>>> first, and then checking the type encodings.
>>>>
>>>> 1 Byte:
>>>> 00 = 0  = unknown type
>>>> 01 = 1 = hole
>>>> 10 = 2 = hole
>>>> 11 = 3  = all one-byte types (slow path, see note later on
this)
>>>>
>>>> 2 Bytes:
>>>> 0000 = 0 = unknown type
>>>> 0101 = 5 = short
>>>> 0110 = 6 = hole (A)
>>>> 0111 = 7 = wchar_t (under some ABIs)
>>>> 1001 = 9 = hole (B)
>>>> 1010 = 10 = hole (C)
>>>> 1011 = 11 = char16_t
>>>> 1111 = 15 = all other types (slow path)
>>>>
>>>> It is important here to have wchar_t have a direct encoding
here
>>>> because wchar_t is two bytes on Windows, and moreover, wchar_t
is very
>>>> commonly used on Windows. The partial encoding overlap of
wchar_t (i.e.
>>>> 0111) with the 11 one-byte-type encoding works because 11
always indicates
>>>> a slow-path check.
>>>>
>>>> 4 Bytes:
>>>> 0000 0000 = 0 = unknown type
>>>> A A = int
>>>> A B = float
>>>> B A  = pointer (under some ABIs)
>>>> B B = long (under some ABIs)
>>>> A 1111 = wchar_t (under some ABIs)
>>>> B 1111 = char32_t
>>>> A C = hole (D)
>>>> C A = hole (E)
>>>> B C = hole (F)
>>>> C B = hole (G)
>>>> C C = hole (H)
>>>> 1111 1111 = 255 = all other types (slow path)
>>>>
>>>> 8 Bytes:
>>>> 0000 0000 0000 0000 = 0 = unknown type
>>>> D D = double
>>>> D E = long (under some ABIs)
>>>> E D = long long (under some ABIs)
>>>> E E = long double (under some ABIs)
>>>> D F = pointer (under some ABIs)
>>>> F D = hole (I)
>>>> E F = hole (J)
>>>> F E = hole
>>>> F F = hole
>>>> ...
>>>> 1111 1111 1111 1111 = 65535 = all other types
>>>>
>>>> 16 Bytes:
>>>> 0 = unknown type
>>>> | | = __int128_t
>>>> I J = long long (under some ABIs)
>>>> J I = long double (under some ABIs)
>>>> J J = hole
>>>> ...
>>>> -1 = all other types
>>>>
>>>> For pointers, this scheme would consider all pointers to be the
same
>>>> (regardless of pointee type). Doing otherwise would mostly
requiring
>>>> putting pointer-type checking on the slow path (i.e. access via
a pointer
>>>> pointer), and that could add considerable overhead. We might,
however,
>>>> split out function pointers from other pointers. We could
provide a
>>>> compile-time option to control the granularity of pointer-type
checks.
>>>>
>>>> Builtin vector types for which the vector element type has a
direct
>>>> encoding also naturally have a direct encoding (the
concatenation of the
>>>> encoding for the element type).
>>>>
>>>> Obviously the fact that we have no fast-path encodings for
one-byte
>>>> types could be problematic. Note however that:
>>>>
>>>>  1. If a larger type is being used to access a smaller type
(plus
>>>> more), the encodings won't match, so we always end up on
the slow path.
>>>>
>>>>  2. If the access type is a one-byte type, we would want to
validate
>>>> quickly. However, most common one-byte types are universally
aliasing (i.e.
>>>> not subject to TBAA violations). Specifically, for C/C++,
pointers to char,
>>>> unsigned char, signed char (C only), and std::byte, can be used
to access
>>>> any part of any type. That leaves signed char (C++ only),
bool/_Bool, and
>>>> enums with a [signed/unsigned] char base type (C++ only,
std::byte
>>>> exempted) as pointee types we might wish to validate. We'd
always need to
>>>> fall back to the slow path to validate these. We could provide
a
>>>> compile-time option to disable such one-byte access checking if
necessary.
>>>>
>>>> How would the slow path work? First, the code needs to find the
>>>> beginning of the allocation. It can do this by scanning
backwards in the
>>>> ASan shadow memory. Once located, we'll read a pointer to a
>>>> type-description structure from that "red zone"
location. For dynamic
>>>> allocations, ASan's allocator will ensure such a space for
the pointer
>>>> exists. For static allocations and globals, the compiler will
ensure it
>>>> exists. The compiler will make sure that all constructors
locate this field
>>>> and fill it in. Destructors can clear it. If two of these
>>>> type-description-structure pointers are equal, then we can
conclude that
>>>> the types are equal. If not, then we need to interpret the
structure. The
>>>> pointer itself might be to an interval map (to deal with
arrays, placement
>>>> new, etc. - we can use the low bit of the pointer to
differentiate between
>>>> an actual type-description structure and an interval map), and
the leaves
>>>> of the interval map point to actual type-description
structures. The
>>>> type-description structure is an array of (offset, type) pairs,
where the
>>>> type field is also a type-description-structure pointer. The
>>>> type-description structures themselves are comdat globals
emitted in each
>>>> relevant translation unit, where the comdat key is formed using
the mangled
>>>> type name (and size, etc.), and pointers to these symbols are
then used to
>>>> identify the types.
>>>>
>>>> Thoughts?
>>>>
>>>> Thanks again,
>>>> Hal
>>>>
>>>> --
>>>> Hal Finkel
>>>> Lead, Compiler Technology and Programming Languages
>>>> Leadership Computing Facility
>>>> Argonne National Laboratory
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>
>>>
>>> --
>>> Hal Finkel
>>> Lead, Compiler Technology and Programming Languages
>>> Leadership Computing Facility
>>> Argonne National Laboratory
>>>
>>>
>>
>> --
>> Hal Finkel
>> Lead, Compiler Technology and Programming Languages
>> Leadership Computing Facility
>> Argonne National Laboratory
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>

-- 
-- 
Peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170411/6fea91d8/attachment.html>

Sanjoy Das via llvm-dev

2017-Apr-11 20:25 UTC

head link

[llvm-dev] [RFC] Design of a TBAA sanitizer

Hi,

On April 11, 2017 at 11:55:12 AM, Kostya Serebryany via llvm-dev
(llvm-dev at lists.llvm.org) wrote:> Evgeniy and I recently discussed something similar for detecting bad casts
> (code named: TypeSanitizer).
> The approach with the shadow memory looked attractive at the first glance,
> but then we've drowned in details.
>
> Specifically for TBAA, I had another idea, not involving shadow memory.
> Consider LLVM queries MayAlias(P1, P2) and the result is false, i.e. we
> think that P1 and P2 point to disjoint memory regions.
> Then we insert a run-time check that intervals [P1,sizeof(*P1)) and
> [P2,sizeof(*P2)) don't intersect.
>
> For functions with a reasonable number of pointer pairs where MayAlias(P1,
> P2)==false we could insert checks for all such pairs.
I'm not very clear on how this will fit into TBAA -- I don't think
TBAA decides aliasing relation between pointers, but it decides
aliasing relation between accesses (!tbaa metadata only exists on
accesses).

This means, at least in LLVM IR, you have to account for cases like:

if (A)
  *(int *)ptr = 20;
if (B)
  float f = *(float *)ptr;

where ptr does not have a type (according to TBAA), but you want to
crash the program if both A and B are true.

-- Sanjoy
> For larger functions -- only for those pairs where the optimizer actually
> queried MayAlias(P1, P2).
>
> --kcc
>
>
> On Tue, Apr 11, 2017 at 3:49 AM, Hal Finkel via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> >
> > On 04/11/2017 03:46 AM, Andrey Bokhanko wrote:
> >
> > Hal,
> >
> > To clarify, my example meant to illustrate that for memory references
to
> > structures' fields you have to keep a user-defined type, even for
one byte
> > accesses. C++ allows references to "initial member sequence"
using pointers
> > to structures of different types. And yes, there are programs in the
wild
> > that rely on this (don't remember details... this was from
previous life).
> >
> > Another thing to consider -- what about memset / memcpy / etc that
> > inherently rely on type punning? If not inlined, they don't
present
> > problems for an optimizer -- probably shouldn't be considered as
aliasing
> > rules violation?
> >
> >
> > Good point. You (likely) wouldn't want to compile your memcpy /
memset /
> > etc. implementations with the TBAA sanitizer enabled (or we'd need
to
> > provide an attribute to disable it for certain functions). If they
only use
> > char* themselves, then it's fine, but if not, that's
implementation magic.
> > However, memset, etc. does need to be able to 'unset' the type
of memory
> > and memcpy needs to be able to copy types (at least for PODs). The
> > sanitizer would need to hook them for that purpose.
> >
> > -Hal
> >
> >
> >
> > Yours,
> > Andrey
> > ==> > Compiler Architect
> > NXP
> >
> >
> > On Tue, Apr 11, 2017 at 12:05 AM, Hal Finkel wrote:
> >
> >>
> >> On 04/10/2017 09:55 AM, Andrey Bokhanko wrote:
> >>
> >> Hi Hal,
> >>
> >> I wonder how your solution will handle the following?
> >>
> >> struct {
> >> int s1_f1;
> >> float s1_f2;
> >> int s1_f3;
> >> float s1_f4;
> >> } S1;
> >>
> >> struct {
> >> int s2_f1;
> >> float s2_f2;
> >> int *s2_f3; // to add some interest, suppose that sizeof(int)
=> >> sizeof(int *)
> >> float s2_f4;
> >> } S2;
> >>
> >> S1 *s1; S2 *s2;
> >> ...
> >> s2 = (S1*)s1;
> >> s2->s2_f1 = 0; // allowed
> >> s2->s2_f2 = 0; // allowed
> >> s2->s2_f3 = 0; // not allowed
> >> s2->s2_f4 = 0; // not allowed
> >>
> >> Also, when you plan to set types for allocated memory?
> >>
> >>
> >> The most-general thing seems to be to set the types along with a
store.
> >> As a result, the proposed scheme would not find a fault with the
code
> >> above, but would complain if anyone actually later read S1.s1_f3.
> >>
> >> If we want to catch these kinds of problems directly we'd need
to have
> >> the compiler insert code when the type is constructed to mark the
types,
> >> and then we'd need to check those types around stores. This
also sounds
> >> like a useful enhancement (although somewhat more complicated to
implement).
> >>
> >> What types will be set for memory allocated by a malloc call?
> >>
> >>
> >> Memory would be untyped (or of unknown type) when allocated.
> >>
> >> Thanks again,
> >> Hal
> >>
> >>
> >>
> >> Yours,
> >> Andrey
> >>
> >>
> >> On Tue, Apr 4, 2017 at 10:13 PM, Hal Finkel via llvm-dev <
> >> llvm-dev at lists.llvm.org> wrote:
> >>
> >>> Hi everyone,
> >>>
> >>> At EuroLLVM, Chandler and I chatted about the design for a
potential
> >>> TBAA sanitizer. Here's my attempt to summarize:
> >>>
> >>> C/C++ have type-based aliasing rules, and LLVM's optimizer
can exploit
> >>> these given TBAA metadata added by Clang. Roughly, a pointer
of given type
> >>> cannot be used to access an object of a different type (with,
of course,
> >>> certain exceptions). Unfortunately, there's a lot of code
in the wild that
> >>> violates these rules (e.g. for type punning), and such code
often must be
> >>> built with -fno-strict-aliasing. Performance is often
sacrificed as a
> >>> result. Part of the problem is the difficulty of finding TBAA
violations. A
> >>> sanitizer would help.
> >>>
> >>> A design goal of a TBAA sanitizer is to limit the
shadow-memory overhead
> >>> of the implementation. ASan, for example, uses 1 bit per byte.
Here we're
> >>> hoping to keep the overhead down to 2 bits per byte for the
TBAA
> >>> sanitizing. We might be able to do this, while handling all
common types on
> >>> the fast path, if we use both alignment and type information.
When
> >>> accessing data of B bytes, 2*B bits of shadow memory can be
used. Thus,
> >>> we'll get 2 bits for a one-byte type, 4 bits for a
two-byte type, etc.
> >>> Moreover, we need appropriate holes in the encoding space so
that no type
> >>> has a shadow encoding that overlaps with an aligned part of a
larger type's
> >>> encoding.
> >>> For example, we need to detect:
> >>>
> >>> double f = ...; return *(int*) &f; // We should catch
this.
> >>>
> >>> We might use the following encoding. The idea is that the
common case,
> >>> for which we need a reasonable fast path, is that type types
are exactly
> >>> equal. For this case, we want a simple comparison of the
shadow type
> >>> encodings to be sufficient to validate the access. For cases
where the
> >>> encodings don't match (and isn't zero to indicate an
unknown type), or for
> >>> which there is no direct encoding for the access type, a slow
path must be
> >>> used. All of this assumes that we're validating the the
pointer alignment
> >>> first, and then checking the type encodings.
> >>>
> >>> 1 Byte:
> >>> 00 = 0 = unknown type
> >>> 01 = 1 = hole
> >>> 10 = 2 = hole
> >>> 11 = 3 = all one-byte types (slow path, see note later on
this)
> >>>
> >>> 2 Bytes:
> >>> 0000 = 0 = unknown type
> >>> 0101 = 5 = short
> >>> 0110 = 6 = hole (A)
> >>> 0111 = 7 = wchar_t (under some ABIs)
> >>> 1001 = 9 = hole (B)
> >>> 1010 = 10 = hole (C)
> >>> 1011 = 11 = char16_t
> >>> 1111 = 15 = all other types (slow path)
> >>>
> >>> It is important here to have wchar_t have a direct encoding
here because
> >>> wchar_t is two bytes on Windows, and moreover, wchar_t is very
commonly
> >>> used on Windows. The partial encoding overlap of wchar_t (i.e.
0111) with
> >>> the 11 one-byte-type encoding works because 11 always
indicates a slow-path
> >>> check.
> >>>
> >>> 4 Bytes:
> >>> 0000 0000 = 0 = unknown type
> >>> A A = int
> >>> A B = float
> >>> B A = pointer (under some ABIs)
> >>> B B = long (under some ABIs)
> >>> A 1111 = wchar_t (under some ABIs)
> >>> B 1111 = char32_t
> >>> A C = hole (D)
> >>> C A = hole (E)
> >>> B C = hole (F)
> >>> C B = hole (G)
> >>> C C = hole (H)
> >>> 1111 1111 = 255 = all other types (slow path)
> >>>
> >>> 8 Bytes:
> >>> 0000 0000 0000 0000 = 0 = unknown type
> >>> D D = double
> >>> D E = long (under some ABIs)
> >>> E D = long long (under some ABIs)
> >>> E E = long double (under some ABIs)
> >>> D F = pointer (under some ABIs)
> >>> F D = hole (I)
> >>> E F = hole (J)
> >>> F E = hole
> >>> F F = hole
> >>> ...
> >>> 1111 1111 1111 1111 = 65535 = all other types
> >>>
> >>> 16 Bytes:
> >>> 0 = unknown type
> >>> | | = __int128_t
> >>> I J = long long (under some ABIs)
> >>> J I = long double (under some ABIs)
> >>> J J = hole
> >>> ...
> >>> -1 = all other types
> >>>
> >>> For pointers, this scheme would consider all pointers to be
the same
> >>> (regardless of pointee type). Doing otherwise would mostly
requiring
> >>> putting pointer-type checking on the slow path (i.e. access
via a pointer
> >>> pointer), and that could add considerable overhead. We might,
however,
> >>> split out function pointers from other pointers. We could
provide a
> >>> compile-time option to control the granularity of pointer-type
checks.
> >>>
> >>> Builtin vector types for which the vector element type has a
direct
> >>> encoding also naturally have a direct encoding (the
concatenation of the
> >>> encoding for the element type).
> >>>
> >>> Obviously the fact that we have no fast-path encodings for
one-byte
> >>> types could be problematic. Note however that:
> >>>
> >>> 1. If a larger type is being used to access a smaller type
(plus more),
> >>> the encodings won't match, so we always end up on the slow
path.
> >>>
> >>> 2. If the access type is a one-byte type, we would want to
validate
> >>> quickly. However, most common one-byte types are universally
aliasing (i.e.
> >>> not subject to TBAA violations). Specifically, for C/C++,
pointers to char,
> >>> unsigned char, signed char (C only), and std::byte, can be
used to access
> >>> any part of any type. That leaves signed char (C++ only),
bool/_Bool, and
> >>> enums with a [signed/unsigned] char base type (C++ only,
std::byte
> >>> exempted) as pointee types we might wish to validate. We'd
always need to
> >>> fall back to the slow path to validate these. We could provide
a
> >>> compile-time option to disable such one-byte access checking
if necessary.
> >>>
> >>> How would the slow path work? First, the code needs to find
the
> >>> beginning of the allocation. It can do this by scanning
backwards in the
> >>> ASan shadow memory. Once located, we'll read a pointer to
a
> >>> type-description structure from that "red zone"
location. For dynamic
> >>> allocations, ASan's allocator will ensure such a space for
the pointer
> >>> exists. For static allocations and globals, the compiler will
ensure it
> >>> exists. The compiler will make sure that all constructors
locate this field
> >>> and fill it in. Destructors can clear it. If two of these
> >>> type-description-structure pointers are equal, then we can
conclude that
> >>> the types are equal. If not, then we need to interpret the
structure. The
> >>> pointer itself might be to an interval map (to deal with
arrays, placement
> >>> new, etc. - we can use the low bit of the pointer to
differentiate between
> >>> an actual type-description structure and an interval map), and
the leaves
> >>> of the interval map point to actual type-description
structures. The
> >>> type-description structure is an array of (offset, type)
pairs, where the
> >>> type field is also a type-description-structure pointer. The
> >>> type-description structures themselves are comdat globals
emitted in each
> >>> relevant translation unit, where the comdat key is formed
using the mangled
> >>> type name (and size, etc.), and pointers to these symbols are
then used to
> >>> identify the types.
> >>>
> >>> Thoughts?
> >>>
> >>> Thanks again,
> >>> Hal
> >>>
> >>> --
> >>> Hal Finkel
> >>> Lead, Compiler Technology and Programming Languages
> >>> Leadership Computing Facility
> >>> Argonne National Laboratory
> >>>
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> llvm-dev at lists.llvm.org
> >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>>
> >>
> >>
> >> --
> >> Hal Finkel
> >> Lead, Compiler Technology and Programming Languages
> >> Leadership Computing Facility
> >> Argonne National Laboratory
> >>
> >>
> >
> > --
> > Hal Finkel
> > Lead, Compiler Technology and Programming Languages
> > Leadership Computing Facility
> > Argonne National Laboratory
> >
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >
> >
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

Hal Finkel via llvm-dev

2017-Apr-12 22:24 UTC

head link

[llvm-dev] [RFC] Design of a TBAA sanitizer

On 04/11/2017 01:54 PM, Kostya Serebryany wrote:> Evgeniy and I recently discussed something similar for detecting bad 
> casts (code named: TypeSanitizer).
> The approach with the shadow memory looked attractive at the first 
> glance, but then we've drowned in details.
>
> Specifically for TBAA, I had another idea, not involving shadow memory.
> Consider LLVM queries MayAlias(P1, P2) and the result is false, i.e. 
> we think that P1 and P2 point to disjoint memory regions.
> Then we insert a run-time check that intervals [P1,sizeof(*P1)) and 
> [P2,sizeof(*P2)) don't intersect.
>
> For functions with a reasonable number of pointer pairs where 
> MayAlias(P1, P2)==false we could insert checks for all such pairs.
> For larger functions -- only for those pairs where the optimizer 
> actually queried MayAlias(P1, P2).
Interestingly, I tried something very close to this a couple of years 
ago: https://reviews.llvm.org/D4446 - I never committed it, maybe I 
should rebase it and we can look at it again. For the mode where we 
record AA queries, using a target that enables AA in the backend, I 
recorded the AA queries made during instruction scheduling. Then, during 
a second run, the instrumentation was inserted in the backend just 
before the IR was "frozen" for instruction selection. In this way the
IR
used to record the queries was the same as the IR as seen by the 
instrumenting pass.

The overlap checks could certainly suffer from the lifetime/memory-reuse 
issue that's been discussed, although that never same up in practice in 
my tests. I should add, however, that the compile-time overhead of 
inserting the checks, even just using the pre-recorded checks, was 
large. There are a lot of AA checks, even restricting to the intra-BB 
checks as done by the instruction scheduler; including other passes 
would only make that grow.

  -Hal
>
> --kcc
>
>
> On Tue, Apr 11, 2017 at 3:49 AM, Hal Finkel via llvm-dev 
> <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>> wrote:
>
>
>     On 04/11/2017 03:46 AM, Andrey Bokhanko wrote:
>>     Hal,
>>
>>     To clarify, my example meant to illustrate that for memory
>>     references to structures' fields you have to keep a
user-defined
>>     type, even for one byte accesses. C++ allows references to
>>     "initial member sequence" using pointers to structures of
>>     different types. And yes, there are programs in the wild that
>>     rely on this (don't remember details... this was from previous
life).
>>
>>     Another thing to consider -- what about memset / memcpy / etc
>>     that inherently rely on type punning? If not inlined, they
don't
>>     present problems for an optimizer -- probably shouldn't be
>>     considered as aliasing rules violation?
>
>     Good point. You (likely) wouldn't want to compile your memcpy /
>     memset / etc. implementations with the TBAA sanitizer enabled (or
>     we'd need to provide an attribute to disable it for certain
>     functions). If they only use char* themselves, then it's fine, but
>     if not, that's implementation magic. However, memset, etc. does
>     need to be able to 'unset' the type of memory and memcpy needs
to
>     be able to copy types (at least for PODs). The sanitizer would
>     need to hook them for that purpose.
>
>      -Hal
>
>
>>
>>     Yours,
>>     Andrey
>>     ==>>     Compiler Architect
>>     NXP
>>
>>
>>     On Tue, Apr 11, 2017 at 12:05 AM, Hal Finkel <hfinkel at anl.gov
>>     <mailto:hfinkel at anl.gov>> wrote:
>>
>>
>>         On 04/10/2017 09:55 AM, Andrey Bokhanko wrote:
>>>         Hi Hal,
>>>
>>>         I wonder how your solution will handle the following?
>>>
>>>         struct {
>>>           int s1_f1;
>>>           float s1_f2;
>>>           int s1_f3;
>>>           float s1_f4;
>>>         } S1;
>>>
>>>         struct {
>>>           int s2_f1;
>>>           float s2_f2;
>>>           int *s2_f3; // to add some interest, suppose that
>>>         sizeof(int) == sizeof(int *)
>>>           float s2_f4;
>>>         } S2;
>>>
>>>         S1 *s1; S2 *s2;
>>>         ...
>>>         s2 = (S1*)s1;
>>>         s2->s2_f1 = 0; // allowed
>>>         s2->s2_f2 = 0; // allowed
>>>         s2->s2_f3 = 0; // not allowed
>>>         s2->s2_f4 = 0; // not allowed
>>>
>>>         Also, when you plan to set types for allocated memory?
>>
>>         The most-general thing seems to be to set the types along
>>         with a store. As a result, the proposed scheme would not find
>>         a fault with the code above, but would complain if anyone
>>         actually later read S1.s1_f3.
>>
>>         If we want to catch these kinds of problems directly we'd
>>         need to have the compiler insert code when the type is
>>         constructed to mark the types, and then we'd need to check
>>         those types around stores. This also sounds like a useful
>>         enhancement (although somewhat more complicated to implement).
>>
>>>         What types will be set for memory allocated by a malloc
call?
>>
>>         Memory would be untyped (or of unknown type) when allocated.
>>
>>         Thanks again,
>>         Hal
>>
>>
>>>
>>>         Yours,
>>>         Andrey
>>>
>>>
>>>         On Tue, Apr 4, 2017 at 10:13 PM, Hal Finkel via llvm-dev
>>>         <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>>
>>>         wrote:
>>>
>>>             Hi everyone,
>>>
>>>             At EuroLLVM, Chandler and I chatted about the design
for
>>>             a potential TBAA sanitizer. Here's my attempt to
summarize:
>>>
>>>             C/C++ have type-based aliasing rules, and LLVM's
>>>             optimizer can exploit these given TBAA metadata added
by
>>>             Clang. Roughly, a pointer of given type cannot be used
>>>             to access an object of a different type (with, of
>>>             course, certain exceptions). Unfortunately, there's
a
>>>             lot of code in the wild that violates these rules (e.g.
>>>             for type punning), and such code often must be built
>>>             with -fno-strict-aliasing. Performance is often
>>>             sacrificed as a result. Part of the problem is the
>>>             difficulty of finding TBAA violations. A sanitizer
would
>>>             help.
>>>
>>>             A design goal of a TBAA sanitizer is to limit the
>>>             shadow-memory overhead of the implementation. ASan, for
>>>             example, uses 1 bit per byte. Here we're hoping to
keep
>>>             the overhead down to 2 bits per byte for the TBAA
>>>             sanitizing. We might be able to do this, while handling
>>>             all common types on the fast path, if we use both
>>>             alignment and type information. When accessing data of
B
>>>             bytes, 2*B bits of shadow memory can be used. Thus,
>>>             we'll get 2 bits for a one-byte type, 4 bits for a
>>>             two-byte type, etc. Moreover, we need appropriate holes
>>>             in the encoding space so that no type has a shadow
>>>             encoding that overlaps with an aligned part of a larger
>>>             type's encoding.
>>>             For example, we need to detect:
>>>
>>>               double f = ...; return *(int*) &f; // We should
catch
>>>             this.
>>>
>>>             We might use the following encoding. The idea is that
>>>             the common case, for which we need a reasonable fast
>>>             path, is that type types are exactly equal. For this
>>>             case, we want a simple comparison of the shadow type
>>>             encodings to be sufficient to validate the access. For
>>>             cases where the encodings don't match (and
isn't zero to
>>>             indicate an unknown type), or for which there is no
>>>             direct encoding for the access type, a slow path must
be
>>>             used. All of this assumes that we're validating the
the
>>>             pointer alignment first, and then checking the type
>>>             encodings.
>>>
>>>             1 Byte:
>>>             00 = 0  = unknown type
>>>             01 = 1 = hole
>>>             10 = 2 = hole
>>>             11 = 3  = all one-byte types (slow path, see note later
>>>             on this)
>>>
>>>             2 Bytes:
>>>             0000 = 0 = unknown type
>>>             0101 = 5 = short
>>>             0110 = 6 = hole (A)
>>>             0111 = 7 = wchar_t (under some ABIs)
>>>             1001 = 9 = hole (B)
>>>             1010 = 10 = hole (C)
>>>             1011 = 11 = char16_t
>>>             1111 = 15 = all other types (slow path)
>>>
>>>             It is important here to have wchar_t have a direct
>>>             encoding here because wchar_t is two bytes on Windows,
>>>             and moreover, wchar_t is very commonly used on Windows.
>>>             The partial encoding overlap of wchar_t (i.e. 0111)
with
>>>             the 11 one-byte-type encoding works because 11 always
>>>             indicates a slow-path check.
>>>
>>>             4 Bytes:
>>>             0000 0000 = 0 = unknown type
>>>             A A = int
>>>             A B = float
>>>             B A  = pointer (under some ABIs)
>>>             B B = long (under some ABIs)
>>>             A 1111 = wchar_t (under some ABIs)
>>>             B 1111 = char32_t
>>>             A C = hole (D)
>>>             C A = hole (E)
>>>             B C = hole (F)
>>>             C B = hole (G)
>>>             C C = hole (H)
>>>             1111 1111 = 255 = all other types (slow path)
>>>
>>>             8 Bytes:
>>>             0000 0000 0000 0000 = 0 = unknown type
>>>             D D = double
>>>             D E = long (under some ABIs)
>>>             E D = long long (under some ABIs)
>>>             E E = long double (under some ABIs)
>>>             D F = pointer (under some ABIs)
>>>             F D = hole (I)
>>>             E F = hole (J)
>>>             F E = hole
>>>             F F = hole
>>>             ...
>>>             1111 1111 1111 1111 = 65535 = all other types
>>>
>>>             16 Bytes:
>>>             0 = unknown type
>>>             | | = __int128_t
>>>             I J = long long (under some ABIs)
>>>             J I = long double (under some ABIs)
>>>             J J = hole
>>>             ...
>>>             -1 = all other types
>>>
>>>             For pointers, this scheme would consider all pointers
to
>>>             be the same (regardless of pointee type). Doing
>>>             otherwise would mostly requiring putting pointer-type
>>>             checking on the slow path (i.e. access via a pointer
>>>             pointer), and that could add considerable overhead. We
>>>             might, however, split out function pointers from other
>>>             pointers. We could provide a compile-time option to
>>>             control the granularity of pointer-type checks.
>>>
>>>             Builtin vector types for which the vector element type
>>>             has a direct encoding also naturally have a direct
>>>             encoding (the concatenation of the encoding for the
>>>             element type).
>>>
>>>             Obviously the fact that we have no fast-path encodings
>>>             for one-byte types could be problematic. Note however
that:
>>>
>>>              1. If a larger type is being used to access a smaller
>>>             type (plus more), the encodings won't match, so we
>>>             always end up on the slow path.
>>>
>>>              2. If the access type is a one-byte type, we would
want
>>>             to validate quickly. However, most common one-byte
types
>>>             are universally aliasing (i.e. not subject to TBAA
>>>             violations). Specifically, for C/C++, pointers to char,
>>>             unsigned char, signed char (C only), and std::byte, can
>>>             be used to access any part of any type. That leaves
>>>             signed char (C++ only), bool/_Bool, and enums with a
>>>             [signed/unsigned] char base type (C++ only, std::byte
>>>             exempted) as pointee types we might wish to validate.
>>>             We'd always need to fall back to the slow path to
>>>             validate these. We could provide a compile-time option
>>>             to disable such one-byte access checking if necessary.
>>>
>>>             How would the slow path work? First, the code needs to
>>>             find the beginning of the allocation. It can do this by
>>>             scanning backwards in the ASan shadow memory. Once
>>>             located, we'll read a pointer to a type-description
>>>             structure from that "red zone" location. For
dynamic
>>>             allocations, ASan's allocator will ensure such a
space
>>>             for the pointer exists. For static allocations and
>>>             globals, the compiler will ensure it exists. The
>>>             compiler will make sure that all constructors locate
>>>             this field and fill it in. Destructors can clear it. If
>>>             two of these type-description-structure pointers are
>>>             equal, then we can conclude that the types are equal.
If
>>>             not, then we need to interpret the structure. The
>>>             pointer itself might be to an interval map (to deal
with
>>>             arrays, placement new, etc. - we can use the low bit of
>>>             the pointer to differentiate between an actual
>>>             type-description structure and an interval map), and
the
>>>             leaves of the interval map point to actual
>>>             type-description structures. The type-description
>>>             structure is an array of (offset, type) pairs, where
the
>>>             type field is also a type-description-structure
pointer.
>>>             The type-description structures themselves are comdat
>>>             globals emitted in each relevant translation unit,
where
>>>             the comdat key is formed using the mangled type name
>>>             (and size, etc.), and pointers to these symbols are
then
>>>             used to identify the types.
>>>
>>>             Thoughts?
>>>
>>>             Thanks again,
>>>             Hal
>>>
>>>             -- 
>>>             Hal Finkel
>>>             Lead, Compiler Technology and Programming Languages
>>>             Leadership Computing Facility
>>>             Argonne National Laboratory
>>>
>>>             _______________________________________________
>>>             LLVM Developers mailing list
>>>             llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
>>>             http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>            
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>>>
>>>
>>
>>         -- 
>>         Hal Finkel
>>         Lead, Compiler Technology and Programming Languages
>>         Leadership Computing Facility
>>         Argonne National Laboratory
>>
>>
>
>     -- 
>     Hal Finkel
>     Lead, Compiler Technology and Programming Languages
>     Leadership Computing Facility
>     Argonne National Laboratory
>
>
>     _______________________________________________
>     LLVM Developers mailing list
>     llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>     <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>
>
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170412/662db838/attachment-0001.html>

Maybe Matching Threads

Search for more apparently analagous threads

llvm dev - Apr 2017 - [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

Maybe Matching Threads