Peter Collingbourne
2015-Jan-30 19:44 UTC
[LLVMdev] IR extension proposal: bitset constants
Hi Chris,

I wanted to start by explaining what I am trying to achieve and how I am trying to achieve it.

I am working towards introducing into LLVM a security mechanism, forward-edge control flow integrity (CFI), designed to mitigate vulnerabilities that allow attacks based on corrupting vtable or function pointers in memory in order to subvert a program's control flow. More details are in a paper I co-authored that was presented at USENIX Security last year [1]. As the paper notes, attackers are increasingly relying on such techniques to subvert control flow, which is why I feel it is particularly important that compilers contain practical defenses against such attacks.

One particular variant of the defense I am proposing, vtable verification, was implemented in GCC and is described in section 3 of the paper; however, it comes with significant performance overhead: more than 8% on certain Chrome browser-based benchmarks and up to around 20% on SPEC 2006 benchmarks (see Figure 2). This is likely because it searches lists of vtables to determine whether a given vtable is valid, which is a direct consequence of its avoidance of techniques that depend on changing how the program is linked.

The implementation I am proposing to contribute to LLVM focuses on performance. Building the needed data structures at link time allows us to reduce the cost of checking the validity of a vtable pointer to a few machine instructions and a single load from memory.

On Thu, Jan 29, 2015 at 11:04:41PM -0800, Chris Lattner wrote:
> I don't think that adding an IR construct for this is the way to go. You're
> making IR complicated for everyone to serve a very narrow use case (one that
> admittedly I don't really understand).
>
> Also, your patch is incomplete. Presumably you have to scatter checks for
> "if (!GV->isBitSet())" throughout the optimizer, codegen and other things
> that touch globals.

I'll address these points together, because I would like to explain why extensive support for the new construct is not required, why the maintenance burden is not as large as it might seem, and why my patch does not need to make such extensive changes to the optimizers.

The first point I'd like to make is that, from the point of view of optimizer passes other than bitset lowering, the llvm.bitset.test intrinsic has opaque semantics with regard to the content of the globals it references, so those passes cannot legally modify the contents of a bitset global. The second is that any use of a bitset global other than as an argument to the llvm.bitset.test intrinsic has undefined semantics. (This is something that can be documented better.) This means that any optimizer pass that looks through global initializers does not require any changes: any transformation it performs on IR that treats such a global as a regular global (for example, by taking its address or loading from it) is semantics-preserving, because the semantics of such IR would have been undefined anyway.

With these points in mind, I'm reasonably confident that very little code needs to care about the new flag. (I should also point out that the patch most likely works without any other optimizer changes: I have a work-in-progress patch that implements the Clang side of this, and I have successfully applied it to a large C++ codebase and found real bugs in that codebase.)

Regarding codegen: I haven't implemented codegen support for bitsets yet; the intrinsic is handled entirely by the lowering pass. I can't imagine the changes being very intrusive, though, and we can easily add a check that nothing bitset-related makes it through to codegen for the moment.
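To make this concrete, here is roughly the kind of IR a frontend might emit to guard a virtual call under this proposal. This is a sketch only: the name @__bitset_1A is illustrative, and I'm assuming the bitset global is passed directly as the intrinsic's second operand, consistent with the usage rule above that the global may appear only as an argument to the intrinsic.

  ; Hypothetical bitset global emitted by the frontend for class A.
  @__bitset_1A = external constant i8

  define void @call_f(i8** %obj) {
  entry:
    ; Load the object's vtable pointer.
    %vtable = load i8** %obj
    ; Check that it is a valid vtable pointer for static type A.
    %valid = call i1 @llvm.bitset.test(i8* %vtable, i8* @__bitset_1A)
    br i1 %valid, label %cont, label %trap

  trap:
    ; The vtable pointer was corrupted, or the object has the wrong type.
    call void @llvm.trap()
    unreachable

  cont:
    ; Safe to make the virtual call through %vtable.
    %vfpp = bitcast i8* %vtable to void (i8**)**
    %vfn = load void (i8**)** %vfpp
    call void %vfn(i8** %obj)
    ret void
  }

  declare i1 @llvm.bitset.test(i8*, i8*)
  declare void @llvm.trap()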
> The fact that this affects semantics and will only work with LTO and not
> native linkers is also really weird to me.

I agree, which is why I plan to add support to lld and perhaps other linkers, but we do have to start somewhere. Adding the functionality only to the compiler seems like a reasonable first step, even if we depend on LTO to begin with.

> Is there other precedent for that? The only cases I know that affect LTO
> add information that is safe to drop (e.g. TBAA etc).

There is the jumptable attribute, which has been used to implement a variant of CFI for indirect function calls (see section 4 of the USENIX paper), and which only works effectively on an LTO'd module. (We might end up adding native linker support for that, or something similar, as well.)

Thanks,
--
Peter

[1] http://www.pcc.me.uk/~peter/acad/usenix14.pdf
JF Bastien
2015-Jan-31 19:35 UTC
[LLVMdev] IR extension proposal: bitset constants
Trying to summarize all opinions expressed here: Peter is proposing an initial implementation that would only work with LTO. Folks seem put off by this implementation affecting IR without having proven itself, and by its shortcomings (as Jim pointed out). Kostya proposed going through metadata (and Chris kind of did too by mentioning TBAA), but Peter points out that this would make the implementation trickier.

It sounds like going through metadata for the LTO-only proof of concept would be preferable, even if more complex. Peter, how much more complex would that be?

As an interested user of this I'm happy with LTO, and I can help with the reviews of the code.
Peter Collingbourne
2015-Jan-31 22:07 UTC
[LLVMdev] IR extension proposal: bitset constants
On Sat, Jan 31, 2015 at 11:35:01AM -0800, JF Bastien wrote:
> Trying to summarize all opinions expressed here: Peter is proposing an
> initial implementation that would only work with LTO. Folks seem put off by
> this implementation affecting IR without having proven itself, and having
> shortcomings (as Jim pointed out). Kostya proposed going through metadata
> (and Chris kind of did too by mentioning tbaa), but Peter points out that
> this will make the implementation trickier.
>
> It sounds like going through metadata for the LTO-only proof-of-concept
> would be preferable, even if more complex. Peter, how much more complex
> would that be?

I was just thinking about how the metadata-based design for this might work. We could have a named metadata global containing the list of bitset entries. Each entry would be a pair consisting of an identifier for the bitset (in this case, the mangled name of the class) and a pointer into a global (in this case, a valid vtable pointer).

For example, this class hierarchy:

  class A { virtual void f(); };
  class B : A { virtual void f(); };
  class C : A { virtual void f(); };

would have these bitsets:

  !llvm.bitsets = !{!0, !1, !2, !3, !4}
  !0 = !{!"1A", i8* getelementptr inbounds ([3 x i8*]* @_ZTV1A, i32 0, i32 2)}
  !1 = !{!"1A", i8* getelementptr inbounds ([3 x i8*]* @_ZTV1B, i32 0, i32 2)}
  !2 = !{!"1A", i8* getelementptr inbounds ([3 x i8*]* @_ZTV1C, i32 0, i32 2)}
  !3 = !{!"1B", i8* getelementptr inbounds ([3 x i8*]* @_ZTV1B, i32 0, i32 2)}
  !4 = !{!"1C", i8* getelementptr inbounds ([3 x i8*]* @_ZTV1C, i32 0, i32 2)}

The LLVM linker can already merge the contents of named metadata globals.

The intrinsic for testing membership in a bitset would have this definition:

  declare i1 @llvm.bitset.test(i8*, metadata)

and would be called like this:

  %x = call i1 @llvm.bitset.test(i8* %p, metadata !{!"1A"})

The disadvantage of this design is that metadata strings always have global scope, so identically named classes in anonymous namespaces in different translation units would receive the same bitset. (We can't just use the vtable globals as bitset identifiers, because they may be erased by globaldce. There are probably various things we could do to force the globals to stick around, if we wanted to go that way.)

Does that seem more reasonable to people?

Thanks,
--
Peter
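P.S. To give a sense of why the lowered test should only cost a few instructions and one load, here is a rough sketch of the kind of code the lowering pass could emit for a bitset whose member vtables the LTO pass has laid out 8 bytes apart within one region. Everything here (names, sizes, layout) is illustrative; a real implementation would also need to verify that the offset is a multiple of the alignment (which can be folded into the shift), and would presumably inline this sequence at each call site rather than emit a function.

  ; Hypothetical combined vtable region (3 members, 8 bytes apart) and
  ; the corresponding bit vector with bits 0-2 set.
  @__vtables_1A = private constant [24 x i8] zeroinitializer
  @__bits_1A = private constant i8 7

  define i1 @__bitset_test_1A(i8* %ptr) {
  entry:
    ; Byte offset of %ptr from the start of the region. Out-of-range
    ; pointers (including anything below the base) wrap around to large
    ; values and fail the unsigned range check below.
    %ptrint = ptrtoint i8* %ptr to i64
    %diff = sub i64 %ptrint, ptrtoint ([24 x i8]* @__vtables_1A to i64)
    ; Members sit at 8-byte-spaced offsets, so offset / 8 is the bit index.
    %index = lshr i64 %diff, 3
    %inrange = icmp ult i64 %index, 3
    br i1 %inrange, label %test, label %out

  test:
    ; The single load: test the corresponding bit in the bit vector.
    %bits = load i8* @__bits_1A
    %idx8 = trunc i64 %index to i8
    %mask = shl i8 1, %idx8
    %masked = and i8 %bits, %mask
    %bit = icmp ne i8 %masked, 0
    br label %out

  out:
    %result = phi i1 [ false, %entry ], [ %bit, %test ]
    ret i1 %result
  }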