thr3ads.net - llvm dev - [llvm-dev] [RFC] Design of a TBAA sanitizer [Apr 2017]

If this information is useful, please help other people find it:
Share via:

Hal Finkel via llvm-dev

2017-Apr-04 20:13 UTC

[llvm-dev] [RFC] Design of a TBAA sanitizer

Hi everyone,

At EuroLLVM, Chandler and I chatted about the design for a potential 
TBAA sanitizer. Here's my attempt to summarize:

C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit 
these given TBAA metadata added by Clang. Roughly, a pointer of given 
type cannot be used to access an object of a different type (with, of 
course, certain exceptions). Unfortunately, there's a lot of code in the 
wild that violates these rules (e.g. for type punning), and such code 
often must be built with -fno-strict-aliasing. Performance is often 
sacrificed as a result. Part of the problem is the difficulty of finding 
TBAA violations. A sanitizer would help.

A design goal of a TBAA sanitizer is to limit the shadow-memory overhead 
of the implementation. ASan, for example, uses 1 bit per byte. Here 
we're hoping to keep the overhead down to 2 bits per byte for the TBAA 
sanitizing. We might be able to do this, while handling all common types 
on the fast path, if we use both alignment and type information. When 
accessing data of B bytes, 2*B bits of shadow memory can be used. Thus, 
we'll get 2 bits for a one-byte type, 4 bits for a two-byte type, etc. 
Moreover, we need appropriate holes in the encoding space so that no 
type has a shadow encoding that overlaps with an aligned part of a 
larger type's encoding.
For example, we need to detect:

   double f = ...; return *(int*) &f; // We should catch this.

We might use the following encoding. The idea is that the common case, 
for which we need a reasonable fast path, is that type types are exactly 
equal. For this case, we want a simple comparison of the shadow type 
encodings to be sufficient to validate the access. For cases where the 
encodings don't match (and isn't zero to indicate an unknown type), or 
for which there is no direct encoding for the access type, a slow path 
must be used. All of this assumes that we're validating the the pointer 
alignment first, and then checking the type encodings.

1 Byte:
00 = 0  = unknown type
01 = 1 = hole
10 = 2 = hole
11 = 3  = all one-byte types (slow path, see note later on this)

2 Bytes:
0000 = 0 = unknown type
0101 = 5 = short
0110 = 6 = hole (A)
0111 = 7 = wchar_t (under some ABIs)
1001 = 9 = hole (B)
1010 = 10 = hole (C)
1011 = 11 = char16_t
1111 = 15 = all other types (slow path)

It is important here to have wchar_t have a direct encoding here because 
wchar_t is two bytes on Windows, and moreover, wchar_t is very commonly 
used on Windows. The partial encoding overlap of wchar_t (i.e. 0111) 
with the 11 one-byte-type encoding works because 11 always indicates a 
slow-path check.

4 Bytes:
0000 0000 = 0 = unknown type
A A = int
A B = float
B A  = pointer (under some ABIs)
B B = long (under some ABIs)
A 1111 = wchar_t (under some ABIs)
B 1111 = char32_t
A C = hole (D)
C A = hole (E)
B C = hole (F)
C B = hole (G)
C C = hole (H)
1111 1111 = 255 = all other types (slow path)

8 Bytes:
0000 0000 0000 0000 = 0 = unknown type
D D = double
D E = long (under some ABIs)
E D = long long (under some ABIs)
E E = long double (under some ABIs)
D F = pointer (under some ABIs)
F D = hole (I)
E F = hole (J)
F E = hole
F F = hole
...
1111 1111 1111 1111 = 65535 = all other types

16 Bytes:
0 = unknown type
| | = __int128_t
I J = long long (under some ABIs)
J I = long double (under some ABIs)
J J = hole
...
-1 = all other types

For pointers, this scheme would consider all pointers to be the same 
(regardless of pointee type). Doing otherwise would mostly requiring 
putting pointer-type checking on the slow path (i.e. access via a 
pointer pointer), and that could add considerable overhead. We might, 
however, split out function pointers from other pointers. We could 
provide a compile-time option to control the granularity of pointer-type 
checks.

Builtin vector types for which the vector element type has a direct 
encoding also naturally have a direct encoding (the concatenation of the 
encoding for the element type).

Obviously the fact that we have no fast-path encodings for one-byte 
types could be problematic. Note however that:

  1. If a larger type is being used to access a smaller type (plus 
more), the encodings won't match, so we always end up on the slow path.

  2. If the access type is a one-byte type, we would want to validate 
quickly. However, most common one-byte types are universally aliasing 
(i.e. not subject to TBAA violations). Specifically, for C/C++, pointers 
to char, unsigned char, signed char (C only), and std::byte, can be used 
to access any part of any type. That leaves signed char (C++ only), 
bool/_Bool, and enums with a [signed/unsigned] char base type (C++ only, 
std::byte exempted) as pointee types we might wish to validate. We'd 
always need to fall back to the slow path to validate these. We could 
provide a compile-time option to disable such one-byte access checking 
if necessary.

How would the slow path work? First, the code needs to find the 
beginning of the allocation. It can do this by scanning backwards in the 
ASan shadow memory. Once located, we'll read a pointer to a 
type-description structure from that "red zone" location. For dynamic 
allocations, ASan's allocator will ensure such a space for the pointer 
exists. For static allocations and globals, the compiler will ensure it 
exists. The compiler will make sure that all constructors locate this 
field and fill it in. Destructors can clear it. If two of these 
type-description-structure pointers are equal, then we can conclude that 
the types are equal. If not, then we need to interpret the structure. 
The pointer itself might be to an interval map (to deal with arrays, 
placement new, etc. - we can use the low bit of the pointer to 
differentiate between an actual type-description structure and an 
interval map), and the leaves of the interval map point to actual 
type-description structures. The type-description structure is an array 
of (offset, type) pairs, where the type field is also a 
type-description-structure pointer. The type-description structures 
themselves are comdat globals emitted in each relevant translation unit, 
where the comdat key is formed using the mangled type name (and size, 
etc.), and pointers to these symbols are then used to identify the types.

Thoughts?

Thanks again,
Hal

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Dean Michael Berris via llvm-dev

2017-Apr-07 03:15 UTC

head link

[llvm-dev] [RFC] Design of a TBAA sanitizer

Hi Hal,

I'm not too knowledgable about these areas so pardon the potentially
ignorant questions.

First, does this mean the instrumentation/rewriting happens at the front-end so
we can identify places where the aliasing might happen and annotate those when
generating the IR? Say, in clang, does it only annotate potentially egregious
cases or does it have to do it for all pointer operations?

Second, how do you report the errors in the sanitiser? Is the intent to run like
ASAN where it will fail on cases where it trips? Or does it only collect the
information?

Third, what would the results look like? Can it tell where the aliasing
violations happened?

Lastly, how do features like c++11 unions get tracked, or the upcoming
std::variant<...> implementations that might do some trickery? I suspect
this is also dependent on things like alignment and padding, and even with
packed representations of structs that get union'ed with character arrays,
etc.

/me quickly Googles for TBAA's definition.

Cheers
> On 5 Apr 2017, at 06:13, Hal Finkel via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> Hi everyone,
> 
> At EuroLLVM, Chandler and I chatted about the design for a potential TBAA
sanitizer. Here's my attempt to summarize:
> 
> C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit
these given TBAA metadata added by Clang. Roughly, a pointer of given type
cannot be used to access an object of a different type (with, of course, certain
exceptions). Unfortunately, there's a lot of code in the wild that violates
these rules (e.g. for type punning), and such code often must be built with
-fno-strict-aliasing. Performance is often sacrificed as a result. Part of the
problem is the difficulty of finding TBAA violations. A sanitizer would help.
> 
> A design goal of a TBAA sanitizer is to limit the shadow-memory overhead of
the implementation. ASan, for example, uses 1 bit per byte. Here we're
hoping to keep the overhead down to 2 bits per byte for the TBAA sanitizing. We
might be able to do this, while handling all common types on the fast path, if
we use both alignment and type information. When accessing data of B bytes, 2*B
bits of shadow memory can be used. Thus, we'll get 2 bits for a one-byte
type, 4 bits for a two-byte type, etc. Moreover, we need appropriate holes in
the encoding space so that no type has a shadow encoding that overlaps with an
aligned part of a larger type's encoding.
> For example, we need to detect:
> 
>  double f = ...; return *(int*) &f; // We should catch this.
> 
> We might use the following encoding. The idea is that the common case, for
which we need a reasonable fast path, is that type types are exactly equal. For
this case, we want a simple comparison of the shadow type encodings to be
sufficient to validate the access. For cases where the encodings don't match
(and isn't zero to indicate an unknown type), or for which there is no
direct encoding for the access type, a slow path must be used. All of this
assumes that we're validating the the pointer alignment first, and then
checking the type encodings.
> 
> 1 Byte:
> 00 = 0  = unknown type
> 01 = 1 = hole
> 10 = 2 = hole
> 11 = 3  = all one-byte types (slow path, see note later on this)
> 
> 2 Bytes:
> 0000 = 0 = unknown type
> 0101 = 5 = short
> 0110 = 6 = hole (A)
> 0111 = 7 = wchar_t (under some ABIs)
> 1001 = 9 = hole (B)
> 1010 = 10 = hole (C)
> 1011 = 11 = char16_t
> 1111 = 15 = all other types (slow path)
> 
> It is important here to have wchar_t have a direct encoding here because
wchar_t is two bytes on Windows, and moreover, wchar_t is very commonly used on
Windows. The partial encoding overlap of wchar_t (i.e. 0111) with the 11
one-byte-type encoding works because 11 always indicates a slow-path check.
> 
> 4 Bytes:
> 0000 0000 = 0 = unknown type
> A A = int
> A B = float
> B A  = pointer (under some ABIs)
> B B = long (under some ABIs)
> A 1111 = wchar_t (under some ABIs)
> B 1111 = char32_t
> A C = hole (D)
> C A = hole (E)
> B C = hole (F)
> C B = hole (G)
> C C = hole (H)
> 1111 1111 = 255 = all other types (slow path)
> 
> 8 Bytes:
> 0000 0000 0000 0000 = 0 = unknown type
> D D = double
> D E = long (under some ABIs)
> E D = long long (under some ABIs)
> E E = long double (under some ABIs)
> D F = pointer (under some ABIs)
> F D = hole (I)
> E F = hole (J)
> F E = hole
> F F = hole
> ...
> 1111 1111 1111 1111 = 65535 = all other types
> 
> 16 Bytes:
> 0 = unknown type
> | | = __int128_t
> I J = long long (under some ABIs)
> J I = long double (under some ABIs)
> J J = hole
> ...
> -1 = all other types
> 
> For pointers, this scheme would consider all pointers to be the same
(regardless of pointee type). Doing otherwise would mostly requiring putting
pointer-type checking on the slow path (i.e. access via a pointer pointer), and
that could add considerable overhead. We might, however, split out function
pointers from other pointers. We could provide a compile-time option to control
the granularity of pointer-type checks.
> 
> Builtin vector types for which the vector element type has a direct
encoding also naturally have a direct encoding (the concatenation of the
encoding for the element type).
> 
> Obviously the fact that we have no fast-path encodings for one-byte types
could be problematic. Note however that:
> 
> 1. If a larger type is being used to access a smaller type (plus more), the
encodings won't match, so we always end up on the slow path.
> 
> 2. If the access type is a one-byte type, we would want to validate
quickly. However, most common one-byte types are universally aliasing (i.e. not
subject to TBAA violations). Specifically, for C/C++, pointers to char, unsigned
char, signed char (C only), and std::byte, can be used to access any part of any
type. That leaves signed char (C++ only), bool/_Bool, and enums with a
[signed/unsigned] char base type (C++ only, std::byte exempted) as pointee types
we might wish to validate. We'd always need to fall back to the slow path to
validate these. We could provide a compile-time option to disable such one-byte
access checking if necessary.
> 
> How would the slow path work? First, the code needs to find the beginning
of the allocation. It can do this by scanning backwards in the ASan shadow
memory. Once located, we'll read a pointer to a type-description structure
from that "red zone" location. For dynamic allocations, ASan's
allocator will ensure such a space for the pointer exists. For static
allocations and globals, the compiler will ensure it exists. The compiler will
make sure that all constructors locate this field and fill it in. Destructors
can clear it. If two of these type-description-structure pointers are equal,
then we can conclude that the types are equal. If not, then we need to interpret
the structure. The pointer itself might be to an interval map (to deal with
arrays, placement new, etc. - we can use the low bit of the pointer to
differentiate between an actual type-description structure and an interval map),
and the leaves of the interval map point to actual type-description structures.
The type-description structure is an array of (offset, type) pairs, where the
type field is also a type-description-structure pointer. The type-description
structures themselves are comdat globals emitted in each relevant translation
unit, where the comdat key is formed using the mangled type name (and size,
etc.), and pointers to these symbols are then used to identify the types.
> 
> Thoughts?
> 
> Thanks again,
> Hal
> 
> -- 
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-- Dean

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170407/153a7552/attachment.html>

Stephen Kell via llvm-dev

2017-Apr-07 14:26 UTC

head link

[llvm-dev] [RFC] Design of a TBAA sanitizer

> At EuroLLVM, Chandler and I chatted about the design for a potential
> TBAA sanitizer. Here's my attempt to summarize:
> 
> C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit
> these given TBAA metadata added by Clang. Roughly, a pointer of given
> type cannot be used to access an object of a different type (with, of
> course, certain exceptions). Unfortunately, there's a lot of code in
the
> wild that violates these rules (e.g. for type punning), and such code
> often must be built with -fno-strict-aliasing. Performance is often
> sacrificed as a result. Part of the problem is the difficulty of finding
> TBAA violations. A sanitizer would help.
It's not quite the same thing, but at last year's EuroLLVM, Chris 
Diamand and I presented a Clang-based tool for detecting bad pointer 
casts. This seems to be fairly good at identifying code which needs 
-fno-strict-aliasing... although it is not designed for checking of 
exactly that. <http://www.llvm.org/devmtg/2016-03/#presentation9>

Funnily enough, I was about to e-mail this list with some notes about 
that system, since I currently have some spare cycles which could 
perhaps help get the Clang/LLVM parts contributed. I'll mostly save 
those thoughts for a separate thread (appearing soon!), but continuing 
on-topic below....
> A design goal of a TBAA sanitizer is to limit the shadow-memory overhead
> of the implementation. ASan, for example, uses 1 bit per byte. Here
> we're hoping to keep the overhead down to 2 bits per byte for the TBAA
> sanitizing. We might be able to do this, while handling all common types
> on the fast path, if we use both alignment and type information. 
Slightly provocative question, but are you sure that byte-scale shadow 
memory is a good fit? It might be, depending on the rest of the design 
(more below). A possible alternative is to maintain metadata only at 
allocation granularity... my approach does this, and it helps keep the 
overhead pretty low in time and space, with two caveats. One is about 
checking casts, not dereferences (more below); the other is needing more 
complex run-time machinery to understand what "allocations" are. 
(However, that has a ton of other applications, so I believe it's very 
good value.)
> When
> accessing data of B bytes, 2*B bits of shadow memory can be used. Thus,
> we'll get 2 bits for a one-byte type, 4 bits for a two-byte type, etc.
> Moreover, we need appropriate holes in the encoding space so that no
> type has a shadow encoding that overlaps with an aligned part of a
> larger type's encoding. For example, we need to detect:
> 
>   double f = ...; return *(int*) &f; // We should catch this.
> 
> We might use the following encoding. The idea is that the common case,
> for which we need a reasonable fast path, is that type types are exactly
> equal. For this case, we want a simple comparison of the shadow type
> encodings to be sufficient to validate the access. For cases where the
> encodings don't match (and isn't zero to indicate an unknown type),
or
> for which there is no direct encoding for the access type, a slow path
> must be used. All of this assumes that we're validating the the pointer
> alignment first, and then checking the type encodings.
> 
> 1 Byte:
> 00 = 0  = unknown type
> 01 = 1 = hole
> 10 = 2 = hole
> 11 = 3  = all one-byte types (slow path, see note later on this)
> 
> 2 Bytes:
> 0000 = 0 = unknown type
> 0101 = 5 = short
> 0110 = 6 = hole (A)
> 0111 = 7 = wchar_t (under some ABIs)
> 1001 = 9 = hole (B)
> 1010 = 10 = hole (C)
> 1011 = 11 = char16_t
> 1111 = 15 = all other types (slow path)
> 
> It is important here to have wchar_t have a direct encoding here because
> wchar_t is two bytes on Windows, and moreover, wchar_t is very commonly
> used on Windows. The partial encoding overlap of wchar_t (i.e. 0111)
> with the 11 one-byte-type encoding works because 11 always indicates a
> slow-path check.
> 
> 4 Bytes:
> 0000 0000 = 0 = unknown type
> A A = int
> A B = float
> B A  = pointer (under some ABIs)
> B B = long (under some ABIs)
> A 1111 = wchar_t (under some ABIs)
> B 1111 = char32_t
> A C = hole (D)
> C A = hole (E)
> B C = hole (F)
> C B = hole (G)
> C C = hole (H)
> 1111 1111 = 255 = all other types (slow path)
> 
> 8 Bytes:
> 0000 0000 0000 0000 = 0 = unknown type
> D D = double
> D E = long (under some ABIs)
> E D = long long (under some ABIs)
> E E = long double (under some ABIs)
> D F = pointer (under some ABIs)
> F D = hole (I)
> E F = hole (J)
> F E = hole
> F F = hole
> ...
> 1111 1111 1111 1111 = 65535 = all other types
> 
> 16 Bytes:
> 0 = unknown type
> | | = __int128_t
> I J = long long (under some ABIs)
> J I = long double (under some ABIs)
> J J = hole
> ...
> -1 = all other types
Just so I'm understanding: is this talking about shadowing memory with 
its "leaf" type only, or with its "top-level" type? So if I
have, say,
the following usually-64-bit type:

struct compound { int x; float y; }

... would we paint the shadow area with an alternating pattern of int 
(0110 0110) and floats (0110 1001)? Or would we be using the "all other 
types" thing since the memory is actually "compound"?
> For pointers, this scheme would consider all pointers to be the same
> (regardless of pointee type). Doing otherwise would mostly requiring
> putting pointer-type checking on the slow path (i.e. access via a
> pointer pointer), and that could add considerable overhead. We might,
> however, split out function pointers from other pointers. We could
> provide a compile-time option to control the granularity of pointer-type
> checks.
Aha, the pointers-to-pointers problem. Treating all pointers the same 
feels like a mistake because, say, aliasing a T* through a void** is 
frequently done and really quite useful (although technically UB, if I 
understand correctly; is this exploited in LLVM?), whereas aliasing a T* 
through an arbitrary S** is probably a bug.

Just for contrast: I'm familiar with this problem, and my way around 
this comes in a few parts. I check pointer creation (casts) only; checks 
on pointer use (dereference) are unnecessary. Usually it takes a while 
before people believe me on this, but the short story is that 
pointer-creation checks enforce an invariant that no bad-type pointer 
gets into circulation. So dereferences are always okay (until the first 
warning, at least; execution can continue after a warning, in a 
best-effort no-guarantees fashion). This also greatly helps performance. 
I precisely record the type of pointer-typed storage, e.g. so an array 
of T** really is recorded as such. There are then some special 
allowances for "generic pointers to pointers", including void** ,
which
need checking *writes* through them. These checks ignore the 
dereferenced pointer's type, and check directly for a match between the 
type of the written-to storage (the pointer in target memory) and the 
type of the pointed-to object (i.e. what's on the end of the pointer 
being written). Caching can part-amortise repeated similar checks.
> Builtin vector types for which the vector element type has a direct
> encoding also naturally have a direct encoding (the concatenation of the
> encoding for the element type).
> 
> Obviously the fact that we have no fast-path encodings for one-byte
> types could be problematic. Note however that:
> 
>  1. If a larger type is being used to access a smaller type (plus more),
> the encodings won't match, so we always end up on the slow path.
> 
>  2. If the access type is a one-byte type, we would want to validate
> quickly. 
> However, most common one-byte types are universally aliasing (i.e. not
> subject to TBAA violations). Specifically, for C/C++, pointers to char,
> unsigned char, signed char (C only), and std::byte, can be used to
> access any part of any type. That leaves signed char (C++ only),
> bool/_Bool, and enums with a [signed/unsigned] char base type (C++ only,
> std::byte exempted) as pointee types we might wish to validate. We'd
> always need to fall back to the slow path to validate these. We could
> provide a compile-time option to disable such one-byte access checking
> if necessary.
> 
> How would the slow path work? First, the code needs to find the
> beginning of the allocation. It can do this by scanning backwards in the
> ASan shadow memory. Once located, we'll read a pointer to a
> type-description structure from that "red zone" location. For
dynamic
> allocations, ASan's allocator will ensure such a space for the pointer
> exists. For static allocations and globals, the compiler will ensure it
> exists. The compiler will make sure that all constructors locate this
> field and fill it in. Destructors can clear it. If two of these
> type-description-structure pointers are equal, then we can conclude that
> the types are equal. If not, then we need to interpret the structure.
> The pointer itself might be to an interval map (to deal with arrays,
> placement new, etc. - we can use the low bit of the pointer to
> differentiate between an actual type-description structure and an
> interval map), and the leaves of the interval map point to actual
> type-description structures. The type-description structure is an array
> of (offset, type) pairs, where the type field is also a
> type-description-structure pointer. The type-description structures
> themselves are comdat globals emitted in each relevant translation unit,
> where the comdat key is formed using the mangled type name (and size,
> etc.), and pointers to these symbols are then used to identify the
> types.
Just as a note, my infrastructure uses a similar comdat trick for the
type descriptors, although the internal structure is quite a bit more
elaborate.
> Thoughts?
My high-level thought is the following question. Is the plan fixed on 
developing a tool specifically for C/C++ effective-type rules, i.e. 
focused on catching/fixing those particular cases of UB and with the 
goal of improving user code performance? Or is there any interest in the 
overlapping but distinct problem of helping users detect and debug 
bad-pointer problems of a type flavour (i.e. those that are not bounds 
errors or temporal errors)?

Plenty of pointer bugs are not UB... e.g. for any heap allocation 
("object without a declared type", in C11-speak) I'm allowed to
scribble
over it, thereby changing its effective type. It's only UB if I read it 
with its previous effective type. But the scribbling itself is fairly 
likely to be the bug, because programmers rarely want to change the type 
of an allocation mid-lifetime. Meanwhile, many pointer-type problems are 
too indirect or complex for the compiler to actually do TBAA-derived 
optimisation on, leaving little to be gained from the UB/TBAA point of 
view... but again, plenty from the debugging perspective. I suspect the 
biggest value of any tool, even if geared specifically towards TBAA, 
will turn out to be largely in debugging scenarios more general than 
effective-type UB bugs.

I realise all the above might come across as "here's the problem
I'd
solve instead" which may not be helpful... I'm interested by any tool
in
this general space, so looking forward to hearing more, and happy to 
follow up on any of the above,

Stephen.

PS: on top of the EuroLLVM '16 talk, in case you're wondering about how 
my system works, the following research papers might help. Or feel free 
just to ask questions....

- "Dynamically diagnosing run-time type errors in unsafe code" (OOPSLA
'16)
<http://www.cl.cam.ac.uk/~srk31/#oopsla16a>

- "Towards a dynamic object model within Unix processes" (Onward
'15)
<http://www.cl.cam.ac.uk/~srk31/#onward15>

... or code if you prefer: 
<https://github.com/stephenrkell/liballocs>
<https://github.com/stephenrkell/libcrunch>
<https://github.com/chrisdiamand/clangcrunch>.

Stephen Kell via llvm-dev

2017-Apr-10 10:15 UTC

head link

[llvm-dev] [RFC] Design of a TBAA sanitizer

> At EuroLLVM, Chandler and I chatted about the design for a potential
> TBAA sanitizer. Here's my attempt to summarize:
>
> C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit
> these given TBAA metadata added by Clang. Roughly, a pointer of given
> type cannot be used to access an object of a different type (with, of
> course, certain exceptions). Unfortunately, there's a lot of code in
the
> wild that violates these rules (e.g. for type punning), and such code
> often must be built with -fno-strict-aliasing. Performance is often
> sacrificed as a result. Part of the problem is the difficulty of finding
> TBAA violations. A sanitizer would help.
It's not quite the same thing, but at last year's EuroLLVM, Chris
Diamand and I presented a Clang-based tool for detecting bad pointer
casts. This seems to be fairly good at identifying code which needs
-fno-strict-aliasing... although it is not designed for checking of
exactly that. <http://www.llvm.org/devmtg/2016-03/#presentation9>

Funnily enough, I was about to e-mail this list with some notes about
that system, since I currently have some spare cycles which could
perhaps help get the Clang/LLVM parts contributed. I'll mostly save
those thoughts for a separate thread (appearing soon!), but continuing
on-topic below....
> A design goal of a TBAA sanitizer is to limit the shadow-memory overhead
> of the implementation. ASan, for example, uses 1 bit per byte. Here
> we're hoping to keep the overhead down to 2 bits per byte for the TBAA
> sanitizing. We might be able to do this, while handling all common types
> on the fast path, if we use both alignment and type information.
Slightly provocative question, but are you sure that byte-scale shadow
memory is a good fit? It might be, depending on the rest of the design
(more below). A possible alternative is to maintain metadata only at
allocation granularity... my approach does this, and it helps keep the
overhead pretty low in time and space, with two caveats. One is about
checking casts, not dereferences (more below); the other is needing more
complex run-time machinery to understand what "allocations" are.
(However, that has a ton of other applications, so I believe it's very
good value.)
> When
> accessing data of B bytes, 2*B bits of shadow memory can be used. Thus,
> we'll get 2 bits for a one-byte type, 4 bits for a two-byte type, etc.
> Moreover, we need appropriate holes in the encoding space so that no
> type has a shadow encoding that overlaps with an aligned part of a
> larger type's encoding. For example, we need to detect:
>
>   double f = ...; return *(int*) &f; // We should catch this.
>
> We might use the following encoding. The idea is that the common case,
> for which we need a reasonable fast path, is that type types are exactly
> equal. For this case, we want a simple comparison of the shadow type
> encodings to be sufficient to validate the access. For cases where the
> encodings don't match (and isn't zero to indicate an unknown type),
or
> for which there is no direct encoding for the access type, a slow path
> must be used. All of this assumes that we're validating the the pointer
> alignment first, and then checking the type encodings.
> 
> 1 Byte:
> 00 = 0  = unknown type
> 01 = 1 = hole
> 10 = 2 = hole
> 11 = 3  = all one-byte types (slow path, see note later on this)
> 
> 2 Bytes:
> 0000 = 0 = unknown type
> 0101 = 5 = short
> 0110 = 6 = hole (A)
> 0111 = 7 = wchar_t (under some ABIs)
> 1001 = 9 = hole (B)
> 1010 = 10 = hole (C)
> 1011 = 11 = char16_t
> 1111 = 15 = all other types (slow path)
>
> It is important here to have wchar_t have a direct encoding here because
> wchar_t is two bytes on Windows, and moreover, wchar_t is very commonly
> used on Windows. The partial encoding overlap of wchar_t (i.e. 0111)
> with the 11 one-byte-type encoding works because 11 always indicates a
> slow-path check.
> 
> 4 Bytes:
> 0000 0000 = 0 = unknown type
> A A = int
> A B = float
> B A  = pointer (under some ABIs)
> B B = long (under some ABIs)
> A 1111 = wchar_t (under some ABIs)
> B 1111 = char32_t
> A C = hole (D)
> C A = hole (E)
> B C = hole (F)
> C B = hole (G)
> C C = hole (H)
> 1111 1111 = 255 = all other types (slow path)
> 
> 8 Bytes:
> 0000 0000 0000 0000 = 0 = unknown type
> D D = double
> D E = long (under some ABIs)
> E D = long long (under some ABIs)
> E E = long double (under some ABIs)
> D F = pointer (under some ABIs)
> F D = hole (I)
> E F = hole (J)
> F E = hole
> F F = hole
> ...
> 1111 1111 1111 1111 = 65535 = all other types
> 
> 16 Bytes:
> 0 = unknown type
> | | = __int128_t
> I J = long long (under some ABIs)
> J I = long double (under some ABIs)
> J J = hole
> ...
> -1 = all other types
Just so I'm understanding: is this talking about shadowing memory with
its "leaf" type only, or with its "top-level" type? So if I
have, say,
the following usually-64-bit type:

struct compound { int x; float y; }

... would we paint the shadow area with an alternating pattern of int
(0110 0110) and floats (0110 1001)? Or would we be using the "all other
types" thing since the memory is actually "compound"?
> For pointers, this scheme would consider all pointers to be the same
> (regardless of pointee type). Doing otherwise would mostly requiring
> putting pointer-type checking on the slow path (i.e. access via a
> pointer pointer), and that could add considerable overhead. We might,
> however, split out function pointers from other pointers. We could
> provide a compile-time option to control the granularity of pointer-type
> checks.
Aha, the pointers-to-pointers problem. Treating all pointers the same 
feels like a mistake because, say, aliasing a T* through a void** is 
often done and really quite useful, whereas aliasing a T* through an 
arbitrary S** is probably a bug. That said, maybe the former only 
belongs in -fno-strict-aliasing code (it's technically UB), in which 
case better to catch it. (Does LLVM exploit this UB?)

Just for contrast: I'm familiar with this problem, and my way around
this comes in a few parts. I check pointer creation (casts) only; checks
on pointer use (dereference) are unnecessary. Usually it takes a while
before people believe me on this, but the short story is that
pointer-creation checks enforce an invariant that no bad-type pointer
gets into circulation. So dereferences are always okay (until the first
warning, at least; execution can continue after a warning, in a
best-effort no-guarantees fashion). This also greatly helps performance.
I precisely record the type of pointer-typed storage, e.g. so an array
of T** really is recorded as such. There are then some special
allowances for "generic pointers to pointers", including void** ,
which
need checking *writes* through them. These checks ignore the
dereferenced pointer's type, and check directly for a match between the
type of the written-to storage (the pointer in target memory) and the
type of the pointed-to object (i.e. what's on the end of the pointer
being written). Caching can part-amortise repeated similar checks.
> Builtin vector types for which the vector element type has a direct
> encoding also naturally have a direct encoding (the concatenation of the
> encoding for the element type).
> 
> Obviously the fact that we have no fast-path encodings for one-byte
> types could be problematic. Note however that:
> 
>  1. If a larger type is being used to access a smaller type (plus more),
> the encodings won't match, so we always end up on the slow path.
> 
>  2. If the access type is a one-byte type, we would want to validate
> quickly.
> However, most common one-byte types are universally aliasing (i.e. not
> subject to TBAA violations). Specifically, for C/C++, pointers to char,
> unsigned char, signed char (C only), and std::byte, can be used to
> access any part of any type. That leaves signed char (C++ only),
> bool/_Bool, and enums with a [signed/unsigned] char base type (C++ only,
> std::byte exempted) as pointee types we might wish to validate. We'd
> always need to fall back to the slow path to validate these. We could
> provide a compile-time option to disable such one-byte access checking
> if necessary.
>
> How would the slow path work? First, the code needs to find the
> beginning of the allocation. It can do this by scanning backwards in the
> ASan shadow memory. Once located, we'll read a pointer to a
> type-description structure from that "red zone" location. For
dynamic
> allocations, ASan's allocator will ensure such a space for the pointer
> exists. For static allocations and globals, the compiler will ensure it
> exists. The compiler will make sure that all constructors locate this
> field and fill it in. Destructors can clear it. If two of these
> type-description-structure pointers are equal, then we can conclude that
> the types are equal. If not, then we need to interpret the structure.
> The pointer itself might be to an interval map (to deal with arrays,
> placement new, etc. - we can use the low bit of the pointer to
> differentiate between an actual type-description structure and an
> interval map), and the leaves of the interval map point to actual
> type-description structures. The type-description structure is an array
> of (offset, type) pairs, where the type field is also a
> type-description-structure pointer. The type-description structures
> themselves are comdat globals emitted in each relevant translation unit,
> where the comdat key is formed using the mangled type name (and size,
> etc.), and pointers to these symbols are then used to identify the
> types.
Just as a note, my infrastructure uses a similar comdat trick for the
type descriptors, although the internal structure is quite a bit more
elaborate.
> Thoughts?
My high-level thought is the following question. Is the plan fixed on
developing a tool specifically for C/C++ effective-type rules, i.e.
focused on catching/fixing those particular cases of UB and with the
goal of improving user code performance? Or is there any interest in the
overlapping but distinct problem of helping users detect and debug
bad-pointer problems of a type flavour (i.e. those that are not bounds
errors or temporal errors)?

Plenty of pointer bugs are not UB... e.g. for any heap allocation
("object without a declared type", in C11-speak) I'm allowed to
scribble
over it, thereby changing its effective type. It's only UB if I read it
with its previous effective type. But the scribbling itself is fairly
likely to be the bug, because programmers rarely want to change the type
of an allocation mid-lifetime. Meanwhile, many pointer-type problems are
too indirect or complex for the compiler to actually do TBAA-derived
optimisation on, leaving little to be gained from the UB/TBAA point of
view... but again, plenty from the debugging perspective. I suspect the
biggest value of any tool, even if geared specifically towards TBAA,
will turn out to be largely in debugging scenarios more general than
effective-type UB bugs.

I realise all the above might come across as "here's the problem
I'd
solve instead" which may not be helpful... I'm interested by any tool
in
this general space, so looking forward to hearing more, and happy to
follow up on any of the above,

Stephen.

PS: on top of the EuroLLVM '16 talk, in case you're wondering about how
my system works, the following research papers might help. Or feel free
just to ask questions....

- "Dynamically diagnosing run-time type errors in unsafe code" (OOPSLA
'16)
<http://www.cl.cam.ac.uk/~srk31/#oopsla16a>

- "Towards a dynamic object model within Unix processes" (Onward
'15)
<http://www.cl.cam.ac.uk/~srk31/#onward15>

... or code if you prefer:
<https://github.com/stephenrkell/liballocs>
<https://github.com/stephenrkell/libcrunch>
<https://github.com/chrisdiamand/clangcrunch>.

Andrey Bokhanko via llvm-dev

2017-Apr-10 14:55 UTC

head link

[llvm-dev] [RFC] Design of a TBAA sanitizer

Hi Hal,

I wonder how your solution will handle the following?

struct {
  int s1_f1;
  float s1_f2;
  int s1_f3;
  float s1_f4;
} S1;

struct {
  int s2_f1;
  float s2_f2;
  int *s2_f3; // to add some interest, suppose that sizeof(int) =sizeof(int *)
  float s2_f4;
} S2;

S1 *s1; S2 *s2;
...
s2 = (S1*)s1;
s2->s2_f1 = 0; // allowed
s2->s2_f2 = 0; // allowed
s2->s2_f3 = 0; // not allowed
s2->s2_f4 = 0; // not allowed

Also, when you plan to set types for allocated memory? What types will be
set for memory allocated by a malloc call?

Yours,
Andrey


On Tue, Apr 4, 2017 at 10:13 PM, Hal Finkel via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi everyone,
>
> At EuroLLVM, Chandler and I chatted about the design for a potential TBAA
> sanitizer. Here's my attempt to summarize:
>
> C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit
> these given TBAA metadata added by Clang. Roughly, a pointer of given type
> cannot be used to access an object of a different type (with, of course,
> certain exceptions). Unfortunately, there's a lot of code in the wild
that
> violates these rules (e.g. for type punning), and such code often must be
> built with -fno-strict-aliasing. Performance is often sacrificed as a
> result. Part of the problem is the difficulty of finding TBAA violations. A
> sanitizer would help.
>
> A design goal of a TBAA sanitizer is to limit the shadow-memory overhead
> of the implementation. ASan, for example, uses 1 bit per byte. Here
we're
> hoping to keep the overhead down to 2 bits per byte for the TBAA
> sanitizing. We might be able to do this, while handling all common types on
> the fast path, if we use both alignment and type information. When
> accessing data of B bytes, 2*B bits of shadow memory can be used. Thus,
> we'll get 2 bits for a one-byte type, 4 bits for a two-byte type, etc.
> Moreover, we need appropriate holes in the encoding space so that no type
> has a shadow encoding that overlaps with an aligned part of a larger
type's
> encoding.
> For example, we need to detect:
>
>   double f = ...; return *(int*) &f; // We should catch this.
>
> We might use the following encoding. The idea is that the common case, for
> which we need a reasonable fast path, is that type types are exactly equal.
> For this case, we want a simple comparison of the shadow type encodings to
> be sufficient to validate the access. For cases where the encodings
don't
> match (and isn't zero to indicate an unknown type), or for which there
is
> no direct encoding for the access type, a slow path must be used. All of
> this assumes that we're validating the the pointer alignment first, and
> then checking the type encodings.
>
> 1 Byte:
> 00 = 0  = unknown type
> 01 = 1 = hole
> 10 = 2 = hole
> 11 = 3  = all one-byte types (slow path, see note later on this)
>
> 2 Bytes:
> 0000 = 0 = unknown type
> 0101 = 5 = short
> 0110 = 6 = hole (A)
> 0111 = 7 = wchar_t (under some ABIs)
> 1001 = 9 = hole (B)
> 1010 = 10 = hole (C)
> 1011 = 11 = char16_t
> 1111 = 15 = all other types (slow path)
>
> It is important here to have wchar_t have a direct encoding here because
> wchar_t is two bytes on Windows, and moreover, wchar_t is very commonly
> used on Windows. The partial encoding overlap of wchar_t (i.e. 0111) with
> the 11 one-byte-type encoding works because 11 always indicates a slow-path
> check.
>
> 4 Bytes:
> 0000 0000 = 0 = unknown type
> A A = int
> A B = float
> B A  = pointer (under some ABIs)
> B B = long (under some ABIs)
> A 1111 = wchar_t (under some ABIs)
> B 1111 = char32_t
> A C = hole (D)
> C A = hole (E)
> B C = hole (F)
> C B = hole (G)
> C C = hole (H)
> 1111 1111 = 255 = all other types (slow path)
>
> 8 Bytes:
> 0000 0000 0000 0000 = 0 = unknown type
> D D = double
> D E = long (under some ABIs)
> E D = long long (under some ABIs)
> E E = long double (under some ABIs)
> D F = pointer (under some ABIs)
> F D = hole (I)
> E F = hole (J)
> F E = hole
> F F = hole
> ...
> 1111 1111 1111 1111 = 65535 = all other types
>
> 16 Bytes:
> 0 = unknown type
> | | = __int128_t
> I J = long long (under some ABIs)
> J I = long double (under some ABIs)
> J J = hole
> ...
> -1 = all other types
>
> For pointers, this scheme would consider all pointers to be the same
> (regardless of pointee type). Doing otherwise would mostly requiring
> putting pointer-type checking on the slow path (i.e. access via a pointer
> pointer), and that could add considerable overhead. We might, however,
> split out function pointers from other pointers. We could provide a
> compile-time option to control the granularity of pointer-type checks.
>
> Builtin vector types for which the vector element type has a direct
> encoding also naturally have a direct encoding (the concatenation of the
> encoding for the element type).
>
> Obviously the fact that we have no fast-path encodings for one-byte types
> could be problematic. Note however that:
>
>  1. If a larger type is being used to access a smaller type (plus more),
> the encodings won't match, so we always end up on the slow path.
>
>  2. If the access type is a one-byte type, we would want to validate
> quickly. However, most common one-byte types are universally aliasing (i.e.
> not subject to TBAA violations). Specifically, for C/C++, pointers to char,
> unsigned char, signed char (C only), and std::byte, can be used to access
> any part of any type. That leaves signed char (C++ only), bool/_Bool, and
> enums with a [signed/unsigned] char base type (C++ only, std::byte
> exempted) as pointee types we might wish to validate. We'd always need
to
> fall back to the slow path to validate these. We could provide a
> compile-time option to disable such one-byte access checking if necessary.
>
> How would the slow path work? First, the code needs to find the beginning
> of the allocation. It can do this by scanning backwards in the ASan shadow
> memory. Once located, we'll read a pointer to a type-description
structure
> from that "red zone" location. For dynamic allocations,
ASan's allocator
> will ensure such a space for the pointer exists. For static allocations and
> globals, the compiler will ensure it exists. The compiler will make sure
> that all constructors locate this field and fill it in. Destructors can
> clear it. If two of these type-description-structure pointers are equal,
> then we can conclude that the types are equal. If not, then we need to
> interpret the structure. The pointer itself might be to an interval map (to
> deal with arrays, placement new, etc. - we can use the low bit of the
> pointer to differentiate between an actual type-description structure and
> an interval map), and the leaves of the interval map point to actual
> type-description structures. The type-description structure is an array of
> (offset, type) pairs, where the type field is also a
> type-description-structure pointer. The type-description structures
> themselves are comdat globals emitted in each relevant translation unit,
> where the comdat key is formed using the mangled type name (and size,
> etc.), and pointers to these symbols are then used to identify the types.
>
> Thoughts?
>
> Thanks again,
> Hal
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170410/ebd4806f/attachment.html>

Hal Finkel via llvm-dev

2017-Apr-10 18:46 UTC

head link

[llvm-dev] [RFC] Design of a TBAA sanitizer

On 04/06/2017 10:15 PM, Dean Michael Berris wrote:> Hi Hal,
>
> I'm not too knowledgable about these areas so pardon the potentially 
> ignorant questions.
No problem.
>
> First, does this mean the instrumentation/rewriting happens at the 
> front-end so we can identify places where the aliasing might happen 
> and annotate those when generating the IR? Say, in clang, does it only 
> annotate potentially egregious cases or does it have to do it for all 
> pointer operations?
Clang already generates TBAA metadata on relevant memory accesses, and I 
envision an IR-level instrumentation pass looking for memory accesses 
with TBAA metadata and generating TBAA-sanitizer checks prior to such 
accesses. The simplest way to do this is to set the types on stores and 
verify them on loads (along with atomicrmw and cmpxchg). We can also 
clear out type information upon encountering a lifetime.end intrinsic.
>
> Second, how do you report the errors in the sanitiser? Is the intent 
> to run like ASAN where it will fail on cases where it trips? Or does 
> it only collect the information?
I propose that it run along with ASAN, specifically as an enhancement to 
ASAN, using the existing ASAN shadow data to find the beginning of 
allocations for the slow-path check.
>
> Third, what would the results look like? Can it tell where the 
> aliasing violations happened?
This should happen very much like ASAN. The difference being that the 
resulting report will name the type being loaded and the type actually 
stored at the relevant location.
>
> Lastly, how do features like c++11 unions get tracked, or the upcoming 
> std::variant<...> implementations that might do some trickery? I 
> suspect this is also dependent on things like alignment and padding, 
> and even with packed representations of structs that get union'ed with 
> character arrays, etc.
I think that all of those things should just work; they all follow the 
rule that to read a type you need to write that type first, and unions, 
variant, etc. ensure that types are stored at properly-aligned addresses 
for each type.

Thanks,
Hal
>
> /me quickly Googles for TBAA's definition.
>
> Cheers
>
>> On 5 Apr 2017, at 06:13, Hal Finkel via llvm-dev 
>> <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>> wrote:
>>
>> Hi everyone,
>>
>> At EuroLLVM, Chandler and I chatted about the design for a potential 
>> TBAA sanitizer. Here's my attempt to summarize:
>>
>> C/C++ have type-based aliasing rules, and LLVM's optimizer can 
>> exploit these given TBAA metadata added by Clang. Roughly, a pointer 
>> of given type cannot be used to access an object of a different type 
>> (with, of course, certain exceptions). Unfortunately, there's a lot
>> of code in the wild that violates these rules (e.g. for type 
>> punning), and such code often must be built with 
>> -fno-strict-aliasing. Performance is often sacrificed as a result. 
>> Part of the problem is the difficulty of finding TBAA violations. A 
>> sanitizer would help.
>>
>> A design goal of a TBAA sanitizer is to limit the shadow-memory 
>> overhead of the implementation. ASan, for example, uses 1 bit per 
>> byte. Here we're hoping to keep the overhead down to 2 bits per
byte
>> for the TBAA sanitizing. We might be able to do this, while handling 
>> all common types on the fast path, if we use both alignment and type 
>> information. When accessing data of B bytes, 2*B bits of shadow 
>> memory can be used. Thus, we'll get 2 bits for a one-byte type, 4 
>> bits for a two-byte type, etc. Moreover, we need appropriate holes in 
>> the encoding space so that no type has a shadow encoding that 
>> overlaps with an aligned part of a larger type's encoding.
>> For example, we need to detect:
>>
>>  double f = ...; return *(int*) &f; // We should catch this.
>>
>> We might use the following encoding. The idea is that the common 
>> case, for which we need a reasonable fast path, is that type types 
>> are exactly equal. For this case, we want a simple comparison of the 
>> shadow type encodings to be sufficient to validate the access. For 
>> cases where the encodings don't match (and isn't zero to
indicate an
>> unknown type), or for which there is no direct encoding for the 
>> access type, a slow path must be used. All of this assumes that
we're
>> validating the the pointer alignment first, and then checking the 
>> type encodings.
>>
>> 1 Byte:
>> 00 = 0  = unknown type
>> 01 = 1 = hole
>> 10 = 2 = hole
>> 11 = 3  = all one-byte types (slow path, see note later on this)
>>
>> 2 Bytes:
>> 0000 = 0 = unknown type
>> 0101 = 5 = short
>> 0110 = 6 = hole (A)
>> 0111 = 7 = wchar_t (under some ABIs)
>> 1001 = 9 = hole (B)
>> 1010 = 10 = hole (C)
>> 1011 = 11 = char16_t
>> 1111 = 15 = all other types (slow path)
>>
>> It is important here to have wchar_t have a direct encoding here 
>> because wchar_t is two bytes on Windows, and moreover, wchar_t is 
>> very commonly used on Windows. The partial encoding overlap of 
>> wchar_t (i.e. 0111) with the 11 one-byte-type encoding works because 
>> 11 always indicates a slow-path check.
>>
>> 4 Bytes:
>> 0000 0000 = 0 = unknown type
>> A A = int
>> A B = float
>> B A  = pointer (under some ABIs)
>> B B = long (under some ABIs)
>> A 1111 = wchar_t (under some ABIs)
>> B 1111 = char32_t
>> A C = hole (D)
>> C A = hole (E)
>> B C = hole (F)
>> C B = hole (G)
>> C C = hole (H)
>> 1111 1111 = 255 = all other types (slow path)
>>
>> 8 Bytes:
>> 0000 0000 0000 0000 = 0 = unknown type
>> D D = double
>> D E = long (under some ABIs)
>> E D = long long (under some ABIs)
>> E E = long double (under some ABIs)
>> D F = pointer (under some ABIs)
>> F D = hole (I)
>> E F = hole (J)
>> F E = hole
>> F F = hole
>> ...
>> 1111 1111 1111 1111 = 65535 = all other types
>>
>> 16 Bytes:
>> 0 = unknown type
>> | | = __int128_t
>> I J = long long (under some ABIs)
>> J I = long double (under some ABIs)
>> J J = hole
>> ...
>> -1 = all other types
>>
>> For pointers, this scheme would consider all pointers to be the same 
>> (regardless of pointee type). Doing otherwise would mostly requiring 
>> putting pointer-type checking on the slow path (i.e. access via a 
>> pointer pointer), and that could add considerable overhead. We might, 
>> however, split out function pointers from other pointers. We could 
>> provide a compile-time option to control the granularity of 
>> pointer-type checks.
>>
>> Builtin vector types for which the vector element type has a direct 
>> encoding also naturally have a direct encoding (the concatenation of 
>> the encoding for the element type).
>>
>> Obviously the fact that we have no fast-path encodings for one-byte 
>> types could be problematic. Note however that:
>>
>> 1. If a larger type is being used to access a smaller type (plus 
>> more), the encodings won't match, so we always end up on the slow
path.
>>
>> 2. If the access type is a one-byte type, we would want to validate 
>> quickly. However, most common one-byte types are universally aliasing 
>> (i.e. not subject to TBAA violations). Specifically, for C/C++, 
>> pointers to char, unsigned char, signed char (C only), and std::byte, 
>> can be used to access any part of any type. That leaves signed char 
>> (C++ only), bool/_Bool, and enums with a [signed/unsigned] char base 
>> type (C++ only, std::byte exempted) as pointee types we might wish to 
>> validate. We'd always need to fall back to the slow path to
validate
>> these. We could provide a compile-time option to disable such 
>> one-byte access checking if necessary.
>>
>> How would the slow path work? First, the code needs to find the 
>> beginning of the allocation. It can do this by scanning backwards in 
>> the ASan shadow memory. Once located, we'll read a pointer to a 
>> type-description structure from that "red zone" location. For
dynamic
>> allocations, ASan's allocator will ensure such a space for the 
>> pointer exists. For static allocations and globals, the compiler will 
>> ensure it exists. The compiler will make sure that all constructors 
>> locate this field and fill it in. Destructors can clear it. If two of 
>> these type-description-structure pointers are equal, then we can 
>> conclude that the types are equal. If not, then we need to interpret 
>> the structure. The pointer itself might be to an interval map (to 
>> deal with arrays, placement new, etc. - we can use the low bit of the 
>> pointer to differentiate between an actual type-description structure 
>> and an interval map), and the leaves of the interval map point to 
>> actual type-description structures. The type-description structure is 
>> an array of (offset, type) pairs, where the type field is also a 
>> type-description-structure pointer. The type-description structures 
>> themselves are comdat globals emitted in each relevant translation 
>> unit, where the comdat key is formed using the mangled type name (and 
>> size, etc.), and pointers to these symbols are then used to identify 
>> the types.
>>
>> Thoughts?
>>
>> Thanks again,
>> Hal
>>
>> -- 
>> Hal Finkel
>> Lead, Compiler Technology and Programming Languages
>> Leadership Computing Facility
>> Argonne National Laboratory
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> -- Dean
>
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170410/27ea34a6/attachment-0001.html>

Hal Finkel via llvm-dev

2017-Apr-10 21:42 UTC

head link

[llvm-dev] [RFC] Design of a TBAA sanitizer

On 04/07/2017 09:26 AM, Stephen Kell wrote:>> At EuroLLVM, Chandler and I chatted about the design for a potential
>> TBAA sanitizer. Here's my attempt to summarize:
>>
>> C/C++ have type-based aliasing rules, and LLVM's optimizer can
exploit
>> these given TBAA metadata added by Clang. Roughly, a pointer of given
>> type cannot be used to access an object of a different type (with, of
>> course, certain exceptions). Unfortunately, there's a lot of code
in the
>> wild that violates these rules (e.g. for type punning), and such code
>> often must be built with -fno-strict-aliasing. Performance is often
>> sacrificed as a result. Part of the problem is the difficulty of
finding
>> TBAA violations. A sanitizer would help.
> It's not quite the same thing, but at last year's EuroLLVM, Chris
> Diamand and I presented a Clang-based tool for detecting bad pointer
> casts. This seems to be fairly good at identifying code which needs
> -fno-strict-aliasing... although it is not designed for checking of
> exactly that. <http://www.llvm.org/devmtg/2016-03/#presentation9>
>
> Funnily enough, I was about to e-mail this list with some notes about
> that system, since I currently have some spare cycles which could
> perhaps help get the Clang/LLVM parts contributed. I'll mostly save
> those thoughts for a separate thread (appearing soon!), but continuing
> on-topic below....
Sounds neat.
>
>> A design goal of a TBAA sanitizer is to limit the shadow-memory
overhead
>> of the implementation. ASan, for example, uses 1 bit per byte. Here
>> we're hoping to keep the overhead down to 2 bits per byte for the
TBAA
>> sanitizing. We might be able to do this, while handling all common
types
>> on the fast path, if we use both alignment and type information.
> Slightly provocative question, but are you sure that byte-scale shadow
> memory is a good fit?
Am I sure? No. But I *think* that is because I'd like to avoid false 
positives, and that means dealing with places where we dynamically 
change the type of part of an allocation (via placement new or 
whatever). This certainly seems necessary to deal with some C++ 
containers at least.
>   It might be, depending on the rest of the design
> (more below). A possible alternative is to maintain metadata only at
> allocation granularity... my approach does this, and it helps keep the
> overhead pretty low in time and space, with two caveats. One is about
> checking casts, not dereferences (more below); the other is needing more
> complex run-time machinery to understand what "allocations" are.
> (However, that has a ton of other applications, so I believe it's very
> good value.)
>
>> When
>> accessing data of B bytes, 2*B bits of shadow memory can be used. Thus,
>> we'll get 2 bits for a one-byte type, 4 bits for a two-byte type,
etc.
>> Moreover, we need appropriate holes in the encoding space so that no
>> type has a shadow encoding that overlaps with an aligned part of a
>> larger type's encoding. For example, we need to detect:
>>
>>    double f = ...; return *(int*) &f; // We should catch this.
>>
>> We might use the following encoding. The idea is that the common case,
>> for which we need a reasonable fast path, is that type types are
exactly
>> equal. For this case, we want a simple comparison of the shadow type
>> encodings to be sufficient to validate the access. For cases where the
>> encodings don't match (and isn't zero to indicate an unknown
type), or
>> for which there is no direct encoding for the access type, a slow path
>> must be used. All of this assumes that we're validating the the
pointer
>> alignment first, and then checking the type encodings.
>>
>> 1 Byte:
>> 00 = 0  = unknown type
>> 01 = 1 = hole
>> 10 = 2 = hole
>> 11 = 3  = all one-byte types (slow path, see note later on this)
>>
>> 2 Bytes:
>> 0000 = 0 = unknown type
>> 0101 = 5 = short
>> 0110 = 6 = hole (A)
>> 0111 = 7 = wchar_t (under some ABIs)
>> 1001 = 9 = hole (B)
>> 1010 = 10 = hole (C)
>> 1011 = 11 = char16_t
>> 1111 = 15 = all other types (slow path)
>>
>> It is important here to have wchar_t have a direct encoding here
because
>> wchar_t is two bytes on Windows, and moreover, wchar_t is very commonly
>> used on Windows. The partial encoding overlap of wchar_t (i.e. 0111)
>> with the 11 one-byte-type encoding works because 11 always indicates a
>> slow-path check.
>>
>> 4 Bytes:
>> 0000 0000 = 0 = unknown type
>> A A = int
>> A B = float
>> B A  = pointer (under some ABIs)
>> B B = long (under some ABIs)
>> A 1111 = wchar_t (under some ABIs)
>> B 1111 = char32_t
>> A C = hole (D)
>> C A = hole (E)
>> B C = hole (F)
>> C B = hole (G)
>> C C = hole (H)
>> 1111 1111 = 255 = all other types (slow path)
>>
>> 8 Bytes:
>> 0000 0000 0000 0000 = 0 = unknown type
>> D D = double
>> D E = long (under some ABIs)
>> E D = long long (under some ABIs)
>> E E = long double (under some ABIs)
>> D F = pointer (under some ABIs)
>> F D = hole (I)
>> E F = hole (J)
>> F E = hole
>> F F = hole
>> ...
>> 1111 1111 1111 1111 = 65535 = all other types
>>
>> 16 Bytes:
>> 0 = unknown type
>> | | = __int128_t
>> I J = long long (under some ABIs)
>> J I = long double (under some ABIs)
>> J J = hole
>> ...
>> -1 = all other types
> Just so I'm understanding: is this talking about shadowing memory with
> its "leaf" type only, or with its "top-level" type? So
if I have, say,
> the following usually-64-bit type:
>
> struct compound { int x; float y; }
>
> ... would we paint the shadow area with an alternating pattern of int
> (0110 0110) and floats (0110 1001)? Or would we be using the "all
other
> types" thing since the memory is actually "compound"?
I intended to imply that we'd fill with the alternating pattern 
indicating int and float.
>
>> For pointers, this scheme would consider all pointers to be the same
>> (regardless of pointee type). Doing otherwise would mostly requiring
>> putting pointer-type checking on the slow path (i.e. access via a
>> pointer pointer), and that could add considerable overhead. We might,
>> however, split out function pointers from other pointers. We could
>> provide a compile-time option to control the granularity of
pointer-type
>> checks.
> Aha, the pointers-to-pointers problem. Treating all pointers the same
> feels like a mistake because, say, aliasing a T* through a void** is
> frequently done and really quite useful (although technically UB, if I
> understand correctly; is this exploited in LLVM?
I don't know if it is exploited by LLVM, although as I recall, all 
implementations of posix_memalign and similar functions potentially run 
afoul of the rules (because they need to store the new pointer as a 
void* regardless of what it actually is). I'm not sure how strict we can 
be here in practice. Currently, AFAIKT, Clang emits TBAA metadata which 
does not differentiate between pointer types (it calls them all "any 
pointer").
> ), whereas aliasing a T*
> through an arbitrary S** is probably a bug.
>
> Just for contrast: I'm familiar with this problem, and my way around
> this comes in a few parts. I check pointer creation (casts) only; checks
> on pointer use (dereference) are unnecessary. Usually it takes a while
> before people believe me on this, but the short story is that
> pointer-creation checks enforce an invariant that no bad-type pointer
> gets into circulation. So dereferences are always okay (until the first
> warning, at least; execution can continue after a warning, in a
> best-effort no-guarantees fashion). This also greatly helps performance.
I'm sure this is true, but the problem is that C/C++ don't actually 
outlaw such pointer casts. As you also mention below, it only becomes a 
problem if such pointers are used to access an object of some 
incompatible type. If the sanitizer has false positives then it is not 
nearly as useful to me, even though the technique you describe might 
actually find more bugs. They seems like complementary techniques.
>   
> I precisely record the type of pointer-typed storage, e.g. so an array
> of T** really is recorded as such. There are then some special
> allowances for "generic pointers to pointers", including void** ,
which
> need checking *writes* through them. These checks ignore the
> dereferenced pointer's type, and check directly for a match between the
> type of the written-to storage (the pointer in target memory) and the
> type of the pointed-to object (i.e. what's on the end of the pointer
> being written). Caching can part-amortise repeated similar checks.
>
>> Builtin vector types for which the vector element type has a direct
>> encoding also naturally have a direct encoding (the concatenation of
the
>> encoding for the element type).
>>
>> Obviously the fact that we have no fast-path encodings for one-byte
>> types could be problematic. Note however that:
>>
>>   1. If a larger type is being used to access a smaller type (plus
more),
>> the encodings won't match, so we always end up on the slow path.
>>
>>   2. If the access type is a one-byte type, we would want to validate
>> quickly.
>> However, most common one-byte types are universally aliasing (i.e. not
>> subject to TBAA violations). Specifically, for C/C++, pointers to char,
>> unsigned char, signed char (C only), and std::byte, can be used to
>> access any part of any type. That leaves signed char (C++ only),
>> bool/_Bool, and enums with a [signed/unsigned] char base type (C++
only,
>> std::byte exempted) as pointee types we might wish to validate.
We'd
>> always need to fall back to the slow path to validate these. We could
>> provide a compile-time option to disable such one-byte access checking
>> if necessary.
>>
>> How would the slow path work? First, the code needs to find the
>> beginning of the allocation. It can do this by scanning backwards in
the
>> ASan shadow memory. Once located, we'll read a pointer to a
>> type-description structure from that "red zone" location. For
dynamic
>> allocations, ASan's allocator will ensure such a space for the
pointer
>> exists. For static allocations and globals, the compiler will ensure it
>> exists. The compiler will make sure that all constructors locate this
>> field and fill it in. Destructors can clear it. If two of these
>> type-description-structure pointers are equal, then we can conclude
that
>> the types are equal. If not, then we need to interpret the structure.
>> The pointer itself might be to an interval map (to deal with arrays,
>> placement new, etc. - we can use the low bit of the pointer to
>> differentiate between an actual type-description structure and an
>> interval map), and the leaves of the interval map point to actual
>> type-description structures. The type-description structure is an array
>> of (offset, type) pairs, where the type field is also a
>> type-description-structure pointer. The type-description structures
>> themselves are comdat globals emitted in each relevant translation
unit,
>> where the comdat key is formed using the mangled type name (and size,
>> etc.), and pointers to these symbols are then used to identify the
>> types.
> Just as a note, my infrastructure uses a similar comdat trick for the
> type descriptors, although the internal structure is quite a bit more
> elaborate.
>
>> Thoughts?
> My high-level thought is the following question. Is the plan fixed on
> developing a tool specifically for C/C++ effective-type rules, i.e.
> focused on catching/fixing those particular cases of UB and with the
> goal of improving user code performance? Or is there any interest in the
> overlapping but distinct problem of helping users detect and debug
> bad-pointer problems of a type flavour (i.e. those that are not bounds
> errors or temporal errors)?
My goal is the former, although I certainly believe there is also value 
in the latter.
>
> Plenty of pointer bugs are not UB... e.g. for any heap allocation
> ("object without a declared type", in C11-speak) I'm allowed
to scribble
> over it, thereby changing its effective type. It's only UB if I read it
> with its previous effective type.
Yes.
>   But the scribbling itself is fairly
> likely to be the bug,
Agreed. However, there are also legitimate use cases, and 
implementations of standard types (variant, any, etc.) may actually do this.
>   because programmers rarely want to change the type
> of an allocation mid-lifetime. Meanwhile, many pointer-type problems are
> too indirect or complex for the compiler to actually do TBAA-derived
> optimisation on, leaving little to be gained from the UB/TBAA point of
> view... but again, plenty from the debugging perspective. I suspect the
> biggest value of any tool, even if geared specifically towards TBAA,
> will turn out to be largely in debugging scenarios more general than
> effective-type UB bugs.
My motivation is to help people who are currently forced to build their 
code with -fno-strict-aliasing figure out what needs to be fixed in 
their code so they don't need to do that.
>
> I realise all the above might come across as "here's the problem
I'd
> solve instead" which may not be helpful... I'm interested by any
tool in
> this general space, so looking forward to hearing more, and happy to
> follow up on any of the above,
This was good feedback. Your work is indeed focused on a slightly 
different problem. It could be interesting and useful to have both.

Thanks,
Hal
>
> Stephen.
>
> PS: on top of the EuroLLVM '16 talk, in case you're wondering about
how
> my system works, the following research papers might help. Or feel free
> just to ask questions....
>
> - "Dynamically diagnosing run-time type errors in unsafe code"
(OOPSLA '16)
> <http://www.cl.cam.ac.uk/~srk31/#oopsla16a>
>
> - "Towards a dynamic object model within Unix processes" (Onward
'15)
> <http://www.cl.cam.ac.uk/~srk31/#onward15>
>
> ... or code if you prefer:
> <https://github.com/stephenrkell/liballocs>
> <https://github.com/stephenrkell/libcrunch>
> <https://github.com/chrisdiamand/clangcrunch>.
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Hal Finkel via llvm-dev

2017-Apr-10 22:05 UTC

head link

[llvm-dev] [RFC] Design of a TBAA sanitizer

On 04/10/2017 09:55 AM, Andrey Bokhanko wrote:> Hi Hal,
>
> I wonder how your solution will handle the following?
>
> struct {
>   int s1_f1;
>   float s1_f2;
>   int s1_f3;
>   float s1_f4;
> } S1;
>
> struct {
>   int s2_f1;
>   float s2_f2;
>   int *s2_f3; // to add some interest, suppose that sizeof(int) == 
> sizeof(int *)
>   float s2_f4;
> } S2;
>
> S1 *s1; S2 *s2;
> ...
> s2 = (S1*)s1;
> s2->s2_f1 = 0; // allowed
> s2->s2_f2 = 0; // allowed
> s2->s2_f3 = 0; // not allowed
> s2->s2_f4 = 0; // not allowed
>
> Also, when you plan to set types for allocated memory?
The most-general thing seems to be to set the types along with a store. 
As a result, the proposed scheme would not find a fault with the code 
above, but would complain if anyone actually later read S1.s1_f3.

If we want to catch these kinds of problems directly we'd need to have 
the compiler insert code when the type is constructed to mark the types, 
and then we'd need to check those types around stores. This also sounds 
like a useful enhancement (although somewhat more complicated to implement).
> What types will be set for memory allocated by a malloc call?
Memory would be untyped (or of unknown type) when allocated.

Thanks again,
Hal
>
> Yours,
> Andrey
>
>
> On Tue, Apr 4, 2017 at 10:13 PM, Hal Finkel via llvm-dev 
> <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>> wrote:
>
>     Hi everyone,
>
>     At EuroLLVM, Chandler and I chatted about the design for a
>     potential TBAA sanitizer. Here's my attempt to summarize:
>
>     C/C++ have type-based aliasing rules, and LLVM's optimizer can
>     exploit these given TBAA metadata added by Clang. Roughly, a
>     pointer of given type cannot be used to access an object of a
>     different type (with, of course, certain exceptions).
>     Unfortunately, there's a lot of code in the wild that violates
>     these rules (e.g. for type punning), and such code often must be
>     built with -fno-strict-aliasing. Performance is often sacrificed
>     as a result. Part of the problem is the difficulty of finding TBAA
>     violations. A sanitizer would help.
>
>     A design goal of a TBAA sanitizer is to limit the shadow-memory
>     overhead of the implementation. ASan, for example, uses 1 bit per
>     byte. Here we're hoping to keep the overhead down to 2 bits per
>     byte for the TBAA sanitizing. We might be able to do this, while
>     handling all common types on the fast path, if we use both
>     alignment and type information. When accessing data of B bytes,
>     2*B bits of shadow memory can be used. Thus, we'll get 2 bits for
>     a one-byte type, 4 bits for a two-byte type, etc. Moreover, we
>     need appropriate holes in the encoding space so that no type has a
>     shadow encoding that overlaps with an aligned part of a larger
>     type's encoding.
>     For example, we need to detect:
>
>       double f = ...; return *(int*) &f; // We should catch this.
>
>     We might use the following encoding. The idea is that the common
>     case, for which we need a reasonable fast path, is that type types
>     are exactly equal. For this case, we want a simple comparison of
>     the shadow type encodings to be sufficient to validate the access.
>     For cases where the encodings don't match (and isn't zero to
>     indicate an unknown type), or for which there is no direct
>     encoding for the access type, a slow path must be used. All of
>     this assumes that we're validating the the pointer alignment
>     first, and then checking the type encodings.
>
>     1 Byte:
>     00 = 0  = unknown type
>     01 = 1 = hole
>     10 = 2 = hole
>     11 = 3  = all one-byte types (slow path, see note later on this)
>
>     2 Bytes:
>     0000 = 0 = unknown type
>     0101 = 5 = short
>     0110 = 6 = hole (A)
>     0111 = 7 = wchar_t (under some ABIs)
>     1001 = 9 = hole (B)
>     1010 = 10 = hole (C)
>     1011 = 11 = char16_t
>     1111 = 15 = all other types (slow path)
>
>     It is important here to have wchar_t have a direct encoding here
>     because wchar_t is two bytes on Windows, and moreover, wchar_t is
>     very commonly used on Windows. The partial encoding overlap of
>     wchar_t (i.e. 0111) with the 11 one-byte-type encoding works
>     because 11 always indicates a slow-path check.
>
>     4 Bytes:
>     0000 0000 = 0 = unknown type
>     A A = int
>     A B = float
>     B A  = pointer (under some ABIs)
>     B B = long (under some ABIs)
>     A 1111 = wchar_t (under some ABIs)
>     B 1111 = char32_t
>     A C = hole (D)
>     C A = hole (E)
>     B C = hole (F)
>     C B = hole (G)
>     C C = hole (H)
>     1111 1111 = 255 = all other types (slow path)
>
>     8 Bytes:
>     0000 0000 0000 0000 = 0 = unknown type
>     D D = double
>     D E = long (under some ABIs)
>     E D = long long (under some ABIs)
>     E E = long double (under some ABIs)
>     D F = pointer (under some ABIs)
>     F D = hole (I)
>     E F = hole (J)
>     F E = hole
>     F F = hole
>     ...
>     1111 1111 1111 1111 = 65535 = all other types
>
>     16 Bytes:
>     0 = unknown type
>     | | = __int128_t
>     I J = long long (under some ABIs)
>     J I = long double (under some ABIs)
>     J J = hole
>     ...
>     -1 = all other types
>
>     For pointers, this scheme would consider all pointers to be the
>     same (regardless of pointee type). Doing otherwise would mostly
>     requiring putting pointer-type checking on the slow path (i.e.
>     access via a pointer pointer), and that could add considerable
>     overhead. We might, however, split out function pointers from
>     other pointers. We could provide a compile-time option to control
>     the granularity of pointer-type checks.
>
>     Builtin vector types for which the vector element type has a
>     direct encoding also naturally have a direct encoding (the
>     concatenation of the encoding for the element type).
>
>     Obviously the fact that we have no fast-path encodings for
>     one-byte types could be problematic. Note however that:
>
>      1. If a larger type is being used to access a smaller type (plus
>     more), the encodings won't match, so we always end up on the slow
>     path.
>
>      2. If the access type is a one-byte type, we would want to
>     validate quickly. However, most common one-byte types are
>     universally aliasing (i.e. not subject to TBAA violations).
>     Specifically, for C/C++, pointers to char, unsigned char, signed
>     char (C only), and std::byte, can be used to access any part of
>     any type. That leaves signed char (C++ only), bool/_Bool, and
>     enums with a [signed/unsigned] char base type (C++ only, std::byte
>     exempted) as pointee types we might wish to validate. We'd always
>     need to fall back to the slow path to validate these. We could
>     provide a compile-time option to disable such one-byte access
>     checking if necessary.
>
>     How would the slow path work? First, the code needs to find the
>     beginning of the allocation. It can do this by scanning backwards
>     in the ASan shadow memory. Once located, we'll read a pointer to a
>     type-description structure from that "red zone" location. For
>     dynamic allocations, ASan's allocator will ensure such a space for
>     the pointer exists. For static allocations and globals, the
>     compiler will ensure it exists. The compiler will make sure that
>     all constructors locate this field and fill it in. Destructors can
>     clear it. If two of these type-description-structure pointers are
>     equal, then we can conclude that the types are equal. If not, then
>     we need to interpret the structure. The pointer itself might be to
>     an interval map (to deal with arrays, placement new, etc. - we can
>     use the low bit of the pointer to differentiate between an actual
>     type-description structure and an interval map), and the leaves of
>     the interval map point to actual type-description structures. The
>     type-description structure is an array of (offset, type) pairs,
>     where the type field is also a type-description-structure pointer.
>     The type-description structures themselves are comdat globals
>     emitted in each relevant translation unit, where the comdat key is
>     formed using the mangled type name (and size, etc.), and pointers
>     to these symbols are then used to identify the types.
>
>     Thoughts?
>
>     Thanks again,
>     Hal
>
>     -- 
>     Hal Finkel
>     Lead, Compiler Technology and Programming Languages
>     Leadership Computing Facility
>     Argonne National Laboratory
>
>     _______________________________________________
>     LLVM Developers mailing list
>     llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>     <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>
>
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170410/1eee16d3/attachment.html>

Hal Finkel via llvm-dev

2017-Apr-18 23:24 UTC

head link

[llvm-dev] [RFC] Design of a TBAA sanitizer

Hi again,

While not using the compressed shadow representation discussed here, 
mostly because it can't handle the struct-path (offset) information in 
our TBAA, I've posted patches for an initial TBAA sanitizer implementation:

https://reviews.llvm.org/D32197 (Runtime)
https://reviews.llvm.org/D32198 (LLVM)
https://reviews.llvm.org/D32199 (Clang)

Please try it out / review / etc.

  -Hal

On 04/04/2017 03:13 PM, Hal Finkel wrote:> Hi everyone,
>
> At EuroLLVM, Chandler and I chatted about the design for a potential 
> TBAA sanitizer. Here's my attempt to summarize:
>
> C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit 
> these given TBAA metadata added by Clang. Roughly, a pointer of given 
> type cannot be used to access an object of a different type (with, of 
> course, certain exceptions). Unfortunately, there's a lot of code in 
> the wild that violates these rules (e.g. for type punning), and such 
> code often must be built with -fno-strict-aliasing. Performance is 
> often sacrificed as a result. Part of the problem is the difficulty of 
> finding TBAA violations. A sanitizer would help.
>
> A design goal of a TBAA sanitizer is to limit the shadow-memory 
> overhead of the implementation. ASan, for example, uses 1 bit per 
> byte. Here we're hoping to keep the overhead down to 2 bits per byte 
> for the TBAA sanitizing. We might be able to do this, while handling 
> all common types on the fast path, if we use both alignment and type 
> information. When accessing data of B bytes, 2*B bits of shadow memory 
> can be used. Thus, we'll get 2 bits for a one-byte type, 4 bits for a 
> two-byte type, etc. Moreover, we need appropriate holes in the 
> encoding space so that no type has a shadow encoding that overlaps 
> with an aligned part of a larger type's encoding.
> For example, we need to detect:
>
>   double f = ...; return *(int*) &f; // We should catch this.
>
> We might use the following encoding. The idea is that the common case, 
> for which we need a reasonable fast path, is that type types are 
> exactly equal. For this case, we want a simple comparison of the 
> shadow type encodings to be sufficient to validate the access. For 
> cases where the encodings don't match (and isn't zero to indicate
an
> unknown type), or for which there is no direct encoding for the access 
> type, a slow path must be used. All of this assumes that we're 
> validating the the pointer alignment first, and then checking the type 
> encodings.
>
> 1 Byte:
> 00 = 0  = unknown type
> 01 = 1 = hole
> 10 = 2 = hole
> 11 = 3  = all one-byte types (slow path, see note later on this)
>
> 2 Bytes:
> 0000 = 0 = unknown type
> 0101 = 5 = short
> 0110 = 6 = hole (A)
> 0111 = 7 = wchar_t (under some ABIs)
> 1001 = 9 = hole (B)
> 1010 = 10 = hole (C)
> 1011 = 11 = char16_t
> 1111 = 15 = all other types (slow path)
>
> It is important here to have wchar_t have a direct encoding here 
> because wchar_t is two bytes on Windows, and moreover, wchar_t is very 
> commonly used on Windows. The partial encoding overlap of wchar_t 
> (i.e. 0111) with the 11 one-byte-type encoding works because 11 always 
> indicates a slow-path check.
>
> 4 Bytes:
> 0000 0000 = 0 = unknown type
> A A = int
> A B = float
> B A  = pointer (under some ABIs)
> B B = long (under some ABIs)
> A 1111 = wchar_t (under some ABIs)
> B 1111 = char32_t
> A C = hole (D)
> C A = hole (E)
> B C = hole (F)
> C B = hole (G)
> C C = hole (H)
> 1111 1111 = 255 = all other types (slow path)
>
> 8 Bytes:
> 0000 0000 0000 0000 = 0 = unknown type
> D D = double
> D E = long (under some ABIs)
> E D = long long (under some ABIs)
> E E = long double (under some ABIs)
> D F = pointer (under some ABIs)
> F D = hole (I)
> E F = hole (J)
> F E = hole
> F F = hole
> ...
> 1111 1111 1111 1111 = 65535 = all other types
>
> 16 Bytes:
> 0 = unknown type
> | | = __int128_t
> I J = long long (under some ABIs)
> J I = long double (under some ABIs)
> J J = hole
> ...
> -1 = all other types
>
> For pointers, this scheme would consider all pointers to be the same 
> (regardless of pointee type). Doing otherwise would mostly requiring 
> putting pointer-type checking on the slow path (i.e. access via a 
> pointer pointer), and that could add considerable overhead. We might, 
> however, split out function pointers from other pointers. We could 
> provide a compile-time option to control the granularity of 
> pointer-type checks.
>
> Builtin vector types for which the vector element type has a direct 
> encoding also naturally have a direct encoding (the concatenation of 
> the encoding for the element type).
>
> Obviously the fact that we have no fast-path encodings for one-byte 
> types could be problematic. Note however that:
>
>  1. If a larger type is being used to access a smaller type (plus 
> more), the encodings won't match, so we always end up on the slow path.
>
>  2. If the access type is a one-byte type, we would want to validate 
> quickly. However, most common one-byte types are universally aliasing 
> (i.e. not subject to TBAA violations). Specifically, for C/C++, 
> pointers to char, unsigned char, signed char (C only), and std::byte, 
> can be used to access any part of any type. That leaves signed char 
> (C++ only), bool/_Bool, and enums with a [signed/unsigned] char base 
> type (C++ only, std::byte exempted) as pointee types we might wish to 
> validate. We'd always need to fall back to the slow path to validate 
> these. We could provide a compile-time option to disable such one-byte 
> access checking if necessary.
>
> How would the slow path work? First, the code needs to find the 
> beginning of the allocation. It can do this by scanning backwards in 
> the ASan shadow memory. Once located, we'll read a pointer to a 
> type-description structure from that "red zone" location. For
dynamic
> allocations, ASan's allocator will ensure such a space for the pointer 
> exists. For static allocations and globals, the compiler will ensure 
> it exists. The compiler will make sure that all constructors locate 
> this field and fill it in. Destructors can clear it. If two of these 
> type-description-structure pointers are equal, then we can conclude 
> that the types are equal. If not, then we need to interpret the 
> structure. The pointer itself might be to an interval map (to deal 
> with arrays, placement new, etc. - we can use the low bit of the 
> pointer to differentiate between an actual type-description structure 
> and an interval map), and the leaves of the interval map point to 
> actual type-description structures. The type-description structure is 
> an array of (offset, type) pairs, where the type field is also a 
> type-description-structure pointer. The type-description structures 
> themselves are comdat globals emitted in each relevant translation 
> unit, where the comdat key is formed using the mangled type name (and 
> size, etc.), and pointers to these symbols are then used to identify 
> the types.
>
> Thoughts?
>
> Thanks again,
> Hal
>
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Maybe Matching Threads

Search for more apparently analagous threads

llvm dev - Apr 2017 - [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

Maybe Matching Threads