thr3ads.net - llvm dev - [llvm-dev] [RFC] Design of a TBAA sanitizer [Apr 2017]

If this information is useful, please help other people find it:
Share via:

Kostya Serebryany via llvm-dev

2017-Apr-11 21:39 UTC

[llvm-dev] [RFC] Design of a TBAA sanitizer

On Tue, Apr 11, 2017 at 1:40 PM, Sanjoy Das <sanjoy at
playingwithpointers.com>
wrote:
> Hi,
>
> On April 11, 2017 at 1:37:20 PM, Sanjoy Das
> (sanjoy at playingwithpointers.com) wrote:
> > Hi Kostya,
> >
> > On April 11, 2017 at 1:30:10 PM, Kostya Serebryany (kcc at google.com)
> wrote:
> >
> > > of course, but accesses are done via pointers, and if TBAA
queries
> > > MayAlias(AccessViaP1, AccessViaP2)
> > > there should (??) be a point in the IR where both P1 and P2 exist
> together
> > > and can be compared.
> >
> > That may not be possible (though I'm second guessing what exactly
you
> have in mind so maybe
> > I'm missing something here):
> >
> > ptr0 = malloc();
> > *(int*)ptr0 = 20; // access0
> > free(ptr0);
> > ptr1 = calloc(); // bitwise equal to ptr0 by chance
> > float f = *(float *)ptr1; // access1
> >
> > The program above is fine (no TBAA violations), but at location
access1
> ptr1 and ptr0
> > overlap despite being NoAlias.
>
> Actually this isn't specific to TBAA.  Even without TBAA we mark
> malloc's return value as NoAlias, so in
>
> ptr0 = malloc();
> free(ptr0);
> ptr1 = malloc();
>
> ptr0 and ptr1 will be NoAlias despite overlapping (there is actually a
> real soundness issue here in LLVM's semantics, but I don't want to
> digress).  You can also recreate the pattern with realloc.
>
In both of your examples there is no place in the program where both P0 and
P1 are live simultaneously,
i.e. no analysis path is expected to query MayAlias(AccessToP0,
AccessToP1). No?

>
> The same problem exists with constant addresses.  LLVM states that
> constant locations are noalias with themselves, and you again have the
> "noalias does not imply pointer inequality" problem.
>
That won't even have to be special cased, because if we emit a check
ConstPtr != ConstPtr,
such a check will be trivially optimized away.

> -- Sanjoy
>
> >
> > -- Sanjoy
> >
> > >
> > >
> > > >
> > > > This means, at least in LLVM IR, you have to account for
cases like:
> > > >
> > > > if (A)
> > > > *(int *)ptr = 20;
> > > > if (B)
> > > > float f = *(float *)ptr;
> > > >
> > > > where ptr does not have a type (according to TBAA), but you
want to
> > > > crash the program if both A and B are true.
> > > >
> > > > -- Sanjoy
> > > >
> > > > > For larger functions -- only for those pairs where the
optimizer
> actually
> > > > > queried MayAlias(P1, P2).
> > > > >
> > > > > --kcc
> > > > >
> > > > >
> > > > > On Tue, Apr 11, 2017 at 3:49 AM, Hal Finkel via
llvm-dev <
> > > > > llvm-dev at lists.llvm.org> wrote:
> > > > >
> > > > > >
> > > > > > On 04/11/2017 03:46 AM, Andrey Bokhanko wrote:
> > > > > >
> > > > > > Hal,
> > > > > >
> > > > > > To clarify, my example meant to illustrate that
for memory
> references
> > > > to
> > > > > > structures' fields you have to keep a
user-defined type, even
> for one
> > > > byte
> > > > > > accesses. C++ allows references to "initial
member sequence"
> using
> > > > pointers
> > > > > > to structures of different types. And yes, there
are programs in
> the
> > > > wild
> > > > > > that rely on this (don't remember details...
this was from
> previous
> > > > life).
> > > > > >
> > > > > > Another thing to consider -- what about memset /
memcpy / etc
> that
> > > > > > inherently rely on type punning? If not inlined,
they don't
> present
> > > > > > problems for an optimizer -- probably
shouldn't be considered as
> > > > aliasing
> > > > > > rules violation?
> > > > > >
> > > > > >
> > > > > > Good point. You (likely) wouldn't want to
compile your memcpy /
> memset
> > > > /
> > > > > > etc. implementations with the TBAA sanitizer
enabled (or we'd
> need to
> > > > > > provide an attribute to disable it for certain
functions). If
> they
> > > > only use
> > > > > > char* themselves, then it's fine, but if not,
that's
> implementation
> > > > magic.
> > > > > > However, memset, etc. does need to be able to
'unset' the type of
> > > > memory
> > > > > > and memcpy needs to be able to copy types (at
least for PODs).
> The
> > > > > > sanitizer would need to hook them for that
purpose.
> > > > > >
> > > > > > -Hal
> > > > > >
> > > > > >
> > > > > >
> > > > > > Yours,
> > > > > > Andrey
> > > > > > ==> > > > > > Compiler Architect
> > > > > > NXP
> > > > > >
> > > > > >
> > > > > > On Tue, Apr 11, 2017 at 12:05 AM, Hal Finkel
wrote:
> > > > > >
> > > > > >>
> > > > > >> On 04/10/2017 09:55 AM, Andrey Bokhanko wrote:
> > > > > >>
> > > > > >> Hi Hal,
> > > > > >>
> > > > > >> I wonder how your solution will handle the
following?
> > > > > >>
> > > > > >> struct {
> > > > > >> int s1_f1;
> > > > > >> float s1_f2;
> > > > > >> int s1_f3;
> > > > > >> float s1_f4;
> > > > > >> } S1;
> > > > > >>
> > > > > >> struct {
> > > > > >> int s2_f1;
> > > > > >> float s2_f2;
> > > > > >> int *s2_f3; // to add some interest, suppose
that sizeof(int) => > > > > >> sizeof(int *)
> > > > > >> float s2_f4;
> > > > > >> } S2;
> > > > > >>
> > > > > >> S1 *s1; S2 *s2;
> > > > > >> ...
> > > > > >> s2 = (S1*)s1;
> > > > > >> s2->s2_f1 = 0; // allowed
> > > > > >> s2->s2_f2 = 0; // allowed
> > > > > >> s2->s2_f3 = 0; // not allowed
> > > > > >> s2->s2_f4 = 0; // not allowed
> > > > > >>
> > > > > >> Also, when you plan to set types for allocated
memory?
> > > > > >>
> > > > > >>
> > > > > >> The most-general thing seems to be to set the
types along with a
> > > > store.
> > > > > >> As a result, the proposed scheme would not
find a fault with
> the code
> > > > > >> above, but would complain if anyone actually
later read
> S1.s1_f3.
> > > > > >>
> > > > > >> If we want to catch these kinds of problems
directly we'd need
> to have
> > > > > >> the compiler insert code when the type is
constructed to mark
> the
> > > > types,
> > > > > >> and then we'd need to check those types
around stores. This also
> > > > sounds
> > > > > >> like a useful enhancement (although somewhat
more complicated to
> > > > implement).
> > > > > >>
> > > > > >> What types will be set for memory allocated by
a malloc call?
> > > > > >>
> > > > > >>
> > > > > >> Memory would be untyped (or of unknown type)
when allocated.
> > > > > >>
> > > > > >> Thanks again,
> > > > > >> Hal
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> Yours,
> > > > > >> Andrey
> > > > > >>
> > > > > >>
> > > > > >> On Tue, Apr 4, 2017 at 10:13 PM, Hal Finkel
via llvm-dev <
> > > > > >> llvm-dev at lists.llvm.org> wrote:
> > > > > >>
> > > > > >>> Hi everyone,
> > > > > >>>
> > > > > >>> At EuroLLVM, Chandler and I chatted about
the design for a
> potential
> > > > > >>> TBAA sanitizer. Here's my attempt to
summarize:
> > > > > >>>
> > > > > >>> C/C++ have type-based aliasing rules, and
LLVM's optimizer can
> > > > exploit
> > > > > >>> these given TBAA metadata added by Clang.
Roughly, a pointer of
> > > > given type
> > > > > >>> cannot be used to access an object of a
different type (with,
> of
> > > > course,
> > > > > >>> certain exceptions). Unfortunately,
there's a lot of code in
> the
> > > > wild that
> > > > > >>> violates these rules (e.g. for type
punning), and such code
> often
> > > > must be
> > > > > >>> built with -fno-strict-aliasing.
Performance is often
> sacrificed as a
> > > > > >>> result. Part of the problem is the
difficulty of finding TBAA
> > > > violations. A
> > > > > >>> sanitizer would help.
> > > > > >>>
> > > > > >>> A design goal of a TBAA sanitizer is to
limit the shadow-memory
> > > > overhead
> > > > > >>> of the implementation. ASan, for example,
uses 1 bit per byte.
> Here
> > > > we're
> > > > > >>> hoping to keep the overhead down to 2 bits
per byte for the
> TBAA
> > > > > >>> sanitizing. We might be able to do this,
while handling all
> common
> > > > types on
> > > > > >>> the fast path, if we use both alignment
and type information.
> When
> > > > > >>> accessing data of B bytes, 2*B bits of
shadow memory can be
> used.
> > > > Thus,
> > > > > >>> we'll get 2 bits for a one-byte type,
4 bits for a two-byte
> type,
> > > > etc.
> > > > > >>> Moreover, we need appropriate holes in the
encoding space so
> that no
> > > > type
> > > > > >>> has a shadow encoding that overlaps with
an aligned part of a
> larger
> > > > type's
> > > > > >>> encoding.
> > > > > >>> For example, we need to detect:
> > > > > >>>
> > > > > >>> double f = ...; return *(int*) &f; //
We should catch this.
> > > > > >>>
> > > > > >>> We might use the following encoding. The
idea is that the
> common
> > > > case,
> > > > > >>> for which we need a reasonable fast path,
is that type types
> are
> > > > exactly
> > > > > >>> equal. For this case, we want a simple
comparison of the
> shadow type
> > > > > >>> encodings to be sufficient to validate the
access. For cases
> where
> > > > the
> > > > > >>> encodings don't match (and isn't
zero to indicate an unknown
> type),
> > > > or for
> > > > > >>> which there is no direct encoding for the
access type, a slow
> path
> > > > must be
> > > > > >>> used. All of this assumes that we're
validating the the pointer
> > > > alignment
> > > > > >>> first, and then checking the type
encodings.
> > > > > >>>
> > > > > >>> 1 Byte:
> > > > > >>> 00 = 0 = unknown type
> > > > > >>> 01 = 1 = hole
> > > > > >>> 10 = 2 = hole
> > > > > >>> 11 = 3 = all one-byte types (slow path,
see note later on this)
> > > > > >>>
> > > > > >>> 2 Bytes:
> > > > > >>> 0000 = 0 = unknown type
> > > > > >>> 0101 = 5 = short
> > > > > >>> 0110 = 6 = hole (A)
> > > > > >>> 0111 = 7 = wchar_t (under some ABIs)
> > > > > >>> 1001 = 9 = hole (B)
> > > > > >>> 1010 = 10 = hole (C)
> > > > > >>> 1011 = 11 = char16_t
> > > > > >>> 1111 = 15 = all other types (slow path)
> > > > > >>>
> > > > > >>> It is important here to have wchar_t have
a direct encoding
> here
> > > > because
> > > > > >>> wchar_t is two bytes on Windows, and
moreover, wchar_t is very
> > > > commonly
> > > > > >>> used on Windows. The partial encoding
overlap of wchar_t (i.e.
> 0111)
> > > > with
> > > > > >>> the 11 one-byte-type encoding works
because 11 always
> indicates a
> > > > slow-path
> > > > > >>> check.
> > > > > >>>
> > > > > >>> 4 Bytes:
> > > > > >>> 0000 0000 = 0 = unknown type
> > > > > >>> A A = int
> > > > > >>> A B = float
> > > > > >>> B A = pointer (under some ABIs)
> > > > > >>> B B = long (under some ABIs)
> > > > > >>> A 1111 = wchar_t (under some ABIs)
> > > > > >>> B 1111 = char32_t
> > > > > >>> A C = hole (D)
> > > > > >>> C A = hole (E)
> > > > > >>> B C = hole (F)
> > > > > >>> C B = hole (G)
> > > > > >>> C C = hole (H)
> > > > > >>> 1111 1111 = 255 = all other types (slow
path)
> > > > > >>>
> > > > > >>> 8 Bytes:
> > > > > >>> 0000 0000 0000 0000 = 0 = unknown type
> > > > > >>> D D = double
> > > > > >>> D E = long (under some ABIs)
> > > > > >>> E D = long long (under some ABIs)
> > > > > >>> E E = long double (under some ABIs)
> > > > > >>> D F = pointer (under some ABIs)
> > > > > >>> F D = hole (I)
> > > > > >>> E F = hole (J)
> > > > > >>> F E = hole
> > > > > >>> F F = hole
> > > > > >>> ...
> > > > > >>> 1111 1111 1111 1111 = 65535 = all other
types
> > > > > >>>
> > > > > >>> 16 Bytes:
> > > > > >>> 0 = unknown type
> > > > > >>> | | = __int128_t
> > > > > >>> I J = long long (under some ABIs)
> > > > > >>> J I = long double (under some ABIs)
> > > > > >>> J J = hole
> > > > > >>> ...
> > > > > >>> -1 = all other types
> > > > > >>>
> > > > > >>> For pointers, this scheme would consider
all pointers to be
> the same
> > > > > >>> (regardless of pointee type). Doing
otherwise would mostly
> requiring
> > > > > >>> putting pointer-type checking on the slow
path (i.e. access
> via a
> > > > pointer
> > > > > >>> pointer), and that could add considerable
overhead. We might,
> > > > however,
> > > > > >>> split out function pointers from other
pointers. We could
> provide a
> > > > > >>> compile-time option to control the
granularity of pointer-type
> > > > checks.
> > > > > >>>
> > > > > >>> Builtin vector types for which the vector
element type has a
> direct
> > > > > >>> encoding also naturally have a direct
encoding (the
> concatenation of
> > > > the
> > > > > >>> encoding for the element type).
> > > > > >>>
> > > > > >>> Obviously the fact that we have no
fast-path encodings for
> one-byte
> > > > > >>> types could be problematic. Note however
that:
> > > > > >>>
> > > > > >>> 1. If a larger type is being used to
access a smaller type
> (plus
> > > > more),
> > > > > >>> the encodings won't match, so we
always end up on the slow
> path.
> > > > > >>>
> > > > > >>> 2. If the access type is a one-byte type,
we would want to
> validate
> > > > > >>> quickly. However, most common one-byte
types are universally
> > > > aliasing (i.e.
> > > > > >>> not subject to TBAA violations).
Specifically, for C/C++,
> pointers
> > > > to char,
> > > > > >>> unsigned char, signed char (C only), and
std::byte, can be
> used to
> > > > access
> > > > > >>> any part of any type. That leaves signed
char (C++ only),
> > > > bool/_Bool, and
> > > > > >>> enums with a [signed/unsigned] char base
type (C++ only,
> std::byte
> > > > > >>> exempted) as pointee types we might wish
to validate. We'd
> always
> > > > need to
> > > > > >>> fall back to the slow path to validate
these. We could provide
> a
> > > > > >>> compile-time option to disable such
one-byte access checking if
> > > > necessary.
> > > > > >>>
> > > > > >>> How would the slow path work? First, the
code needs to find the
> > > > > >>> beginning of the allocation. It can do
this by scanning
> backwards in
> > > > the
> > > > > >>> ASan shadow memory. Once located,
we'll read a pointer to a
> > > > > >>> type-description structure from that
"red zone" location. For
> dynamic
> > > > > >>> allocations, ASan's allocator will
ensure such a space for the
> > > > pointer
> > > > > >>> exists. For static allocations and
globals, the compiler will
> ensure
> > > > it
> > > > > >>> exists. The compiler will make sure that
all constructors
> locate
> > > > this field
> > > > > >>> and fill it in. Destructors can clear it.
If two of these
> > > > > >>> type-description-structure pointers are
equal, then we can
> conclude
> > > > that
> > > > > >>> the types are equal. If not, then we need
to interpret the
> > > > structure. The
> > > > > >>> pointer itself might be to an interval map
(to deal with
> arrays,
> > > > placement
> > > > > >>> new, etc. - we can use the low bit of the
pointer to
> differentiate
> > > > between
> > > > > >>> an actual type-description structure and
an interval map), and
> the
> > > > leaves
> > > > > >>> of the interval map point to actual
type-description
> structures. The
> > > > > >>> type-description structure is an array of
(offset, type) pairs,
> > > > where the
> > > > > >>> type field is also a
type-description-structure pointer. The
> > > > > >>> type-description structures themselves are
comdat globals
> emitted in
> > > > each
> > > > > >>> relevant translation unit, where the
comdat key is formed
> using the
> > > > mangled
> > > > > >>> type name (and size, etc.), and pointers
to these symbols are
> then
> > > > used to
> > > > > >>> identify the types.
> > > > > >>>
> > > > > >>> Thoughts?
> > > > > >>>
> > > > > >>> Thanks again,
> > > > > >>> Hal
> > > > > >>>
> > > > > >>> --
> > > > > >>> Hal Finkel
> > > > > >>> Lead, Compiler Technology and Programming
Languages
> > > > > >>> Leadership Computing Facility
> > > > > >>> Argonne National Laboratory
> > > > > >>>
> > > > > >>>
_______________________________________________
> > > > > >>> LLVM Developers mailing list
> > > > > >>> llvm-dev at lists.llvm.org
> > > > > >>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > > > > >>>
> > > > > >>
> > > > > >>
> > > > > >> --
> > > > > >> Hal Finkel
> > > > > >> Lead, Compiler Technology and Programming
Languages
> > > > > >> Leadership Computing Facility
> > > > > >> Argonne National Laboratory
> > > > > >>
> > > > > >>
> > > > > >
> > > > > > --
> > > > > > Hal Finkel
> > > > > > Lead, Compiler Technology and Programming
Languages
> > > > > > Leadership Computing Facility
> > > > > > Argonne National Laboratory
> > > > > >
> > > > > >
> > > > > > _______________________________________________
> > > > > > LLVM Developers mailing list
> > > > > > llvm-dev at lists.llvm.org
> > > > > >
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > > > > >
> > > > > >
> > > > > _______________________________________________
> > > > > LLVM Developers mailing list
> > > > > llvm-dev at lists.llvm.org
> > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > > > >
> > > >
> > >
> >
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170411/72b1e252/attachment.html>

Sanjoy Das via llvm-dev

2017-Apr-11 22:14 UTC

head link

[llvm-dev] [RFC] Design of a TBAA sanitizer

Hi Kostya,

On April 11, 2017 at 2:39:44 PM, Kostya Serebryany (kcc at google.com)
wrote:> > ptr0 = malloc();
> > free(ptr0);
> > ptr1 = malloc();
> >
> > ptr0 and ptr1 will be NoAlias despite overlapping (there is actually a
> > real soundness issue here in LLVM's semantics, but I don't
want to
> > digress). You can also recreate the pattern with realloc.
> >
>
> In both of your examples there is no place in the program where both P0 and
> P1 are live simultaneously,
> i.e. no analysis path is expected to query MayAlias(AccessToP0,
> AccessToP1). No?
I may be misunderstanding what you meant, but I don't see why not.

Say you had (all values are SSA values):

%p0 = malloc()
store i32 0, i32* %p0  // S0
free(%p0)
%p1 = malloc()
store i32 1, i32* %p1  // S1

and some pass wanted to sink S0 to after S1.  So it starts checking
"from the bottom", as

  Alias(S0, S1) = NoAlias
  Alias(S0, malloc()) = NoAlias
  Alias(S0, free(%p0)) = MayAlias

etc.  The last MayAlias will prevent it from doing the sink, but I
don't see why it can't ask the Alias(S0, S1) question.
> > The same problem exists with constant addresses. LLVM states that
> > constant locations are noalias with themselves, and you again have the
> > "noalias does not imply pointer inequality" problem.
>
> That won't even have to be special cased, because if we emit a check
> ConstPtr != ConstPtr,
> such a check will be trivially optimized away.
But won't it be constant folded to trigger the sanitizer crash /
warning?  That is, since LLVM will state the ConstPtr NoAlias
ConstPtr, you'll emit the check:

if (overlap(ConstPtr, Sz, ConstPtr, Sz))
  abort();

which will get constant folded to

if (true) abort();

If you meant that the implementation of overlap will differ based on
whether the pointers are constant pointers or not, I'm not sure if
that will work, since the fact that the values whose aliasness (I
think I invented a new word :P ) you're checking could have been
arbitrarily obscured (AA could have looked through PHIs and selects
etc.) which will prevent you from rediscovering that the values were
constant pointers in some cases.

-- Sanjoy

Kostya Serebryany via llvm-dev

2017-Apr-11 22:28 UTC

head link

[llvm-dev] [RFC] Design of a TBAA sanitizer

On Tue, Apr 11, 2017 at 3:14 PM, Sanjoy Das <sanjoy at
playingwithpointers.com>
wrote:
> Hi Kostya,
>
> On April 11, 2017 at 2:39:44 PM, Kostya Serebryany (kcc at google.com)
wrote:
> > > ptr0 = malloc();
> > > free(ptr0);
> > > ptr1 = malloc();
> > >
> > > ptr0 and ptr1 will be NoAlias despite overlapping (there is
actually a
> > > real soundness issue here in LLVM's semantics, but I
don't want to
> > > digress). You can also recreate the pattern with realloc.
> > >
> >
> > In both of your examples there is no place in the program where both
P0
> and
> > P1 are live simultaneously,
> > i.e. no analysis path is expected to query MayAlias(AccessToP0,
> > AccessToP1). No?
>
> I may be misunderstanding what you meant, but I don't see why not.
>
> Say you had (all values are SSA values):
>
> %p0 = malloc()
> store i32 0, i32* %p0  // S0
> free(%p0)
> %p1 = malloc()
> store i32 1, i32* %p1  // S1
>
> and some pass wanted to sink S0 to after S1.  So it starts checking
> "from the bottom", as
>
>   Alias(S0, S1) = NoAlias
>   Alias(S0, malloc()) = NoAlias
>   Alias(S0, free(%p0)) = MayAlias
>
> etc.  The last MayAlias will prevent it from doing the sink, but I
> don't see why it can't ask the Alias(S0, S1) question.
>
Yea, that's a bit trickier.
But we can at least add checks for pairs of pointer accesses that the
analysis claims to be ok to reorder.


>
> > > The same problem exists with constant addresses. LLVM states that
> > > constant locations are noalias with themselves, and you again
have the
> > > "noalias does not imply pointer inequality" problem.
> >
> > That won't even have to be special cased, because if we emit a
check
> > ConstPtr != ConstPtr,
> > such a check will be trivially optimized away.
>
> But won't it be constant folded to trigger the sanitizer crash /
> warning?  That is, since LLVM will state the ConstPtr NoAlias
> ConstPtr, you'll emit the check:
>
> if (overlap(ConstPtr, Sz, ConstPtr, Sz))
>   abort();
>
> which will get constant folded to
>
> if (true) abort();
>
ah, yes, you are right, then this will have to be special-cased.

>
> If you meant that the implementation of overlap will differ based on
> whether the pointers are constant pointers or not, I'm not sure if
> that will work, since the fact that the values whose aliasness (I
> think I invented a new word :P ) you're checking could have been
> arbitrarily obscured (AA could have looked through PHIs and selects
> etc.) which will prevent you from rediscovering that the values were
> constant pointers in some cases.
>
> -- Sanjoy
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170411/bd4c5056/attachment.html>

Apparently Analagous Threads

Search for more possibly parallel threads

llvm dev - Apr 2017 - [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

[llvm-dev] [RFC] Design of a TBAA sanitizer

Apparently Analagous Threads