thr3ads.net - llvm dev - [LLVMdev] [RFC] Simple control-flow integrity [Feb 2014]

If this information is useful, please help other people find it:
Share via:

Richard Osborne

2014-Feb-11 08:28 UTC

[LLVMdev] [RFC] Simple control-flow integrity

On 11 Feb 2014, at 08:15, Eric Christopher <echristo at gmail.com> wrote:
> On Mon, Feb 10, 2014 at 11:51 PM, Reid Kleckner <rnk at google.com>
wrote:
>> 
>> 
>> 
>> IIRC this came up before, and I don't think we expose anything like
a jump
>> table at the IR level.  As an IR-to-IR transform, I think asm is the
only
>> way to do it.
> 
> I'd have to look more at what he's doing, but wouldn't a simple
switch
> statement in IR suffice? Efficiency would be up to the various
> lowering mechanisms, but it wouldn't require inline asm.
> 
> -ericAnother option might be to create an array of function pointers in the LLVM IR,
i.e generate code that looks like:

void (*jumptable[])() = {
  &a,
  &b
};

void f(int index) {
  *(jumptable[index])();
}

Eric Christopher

2014-Feb-11 08:30 UTC

head link

[LLVMdev] [RFC] Simple control-flow integrity

On Tue, Feb 11, 2014 at 12:28 AM, Richard Osborne <richard at xmos.com>
wrote:>
> On 11 Feb 2014, at 08:15, Eric Christopher <echristo at gmail.com>
wrote:
>
>> On Mon, Feb 10, 2014 at 11:51 PM, Reid Kleckner <rnk at
google.com> wrote:
>>>
>>>
>>>
>>> IIRC this came up before, and I don't think we expose anything
like a jump
>>> table at the IR level.  As an IR-to-IR transform, I think asm is
the only
>>> way to do it.
>>
>> I'd have to look more at what he's doing, but wouldn't a
simple switch
>> statement in IR suffice? Efficiency would be up to the various
>> lowering mechanisms, but it wouldn't require inline asm.
>>
>> -eric
> Another option might be to create an array of function pointers in the LLVM
IR, i.e generate code that looks like:
>
> void (*jumptable[])() = {
>   &a,
>   &b
> };
>
> void f(int index) {
>   *(jumptable[index])();
> }
*nod* That's the sort of thing I was thinking about too.

-eric

Reid Kleckner

2014-Feb-11 17:12 UTC

head link

[LLVMdev] [RFC] Simple control-flow integrity

On Tue, Feb 11, 2014 at 12:28 AM, Richard Osborne <richard at xmos.com>
wrote:
>
> On 11 Feb 2014, at 08:15, Eric Christopher <echristo at gmail.com>
wrote:
>
> > On Mon, Feb 10, 2014 at 11:51 PM, Reid Kleckner <rnk at
google.com> wrote:
> >>
> >>
> >>
> >> IIRC this came up before, and I don't think we expose anything
like a
> jump
> >> table at the IR level.  As an IR-to-IR transform, I think asm is
the
> only
> >> way to do it.
> >
> > I'd have to look more at what he's doing, but wouldn't a
simple switch
> > statement in IR suffice? Efficiency would be up to the various
> > lowering mechanisms, but it wouldn't require inline asm.
> >
> > -eric
> Another option might be to create an array of function pointers in the
> LLVM IR, i.e generate code that looks like:
>
> void (*jumptable[])() = {
>   &a,
>   &b
> };
>
> void f(int index) {
>   *(jumptable[index])();
> }
>
This isn't ABI compatible.  Now function pointers point to data (or are
switch table indices) and not code.

I can imagine abusing indirectbr across separate functions, but rolling
your own jump table in asm seems better.

On Tue, Feb 11, 2014 at 5:07 AM, Joerg Sonnenberger <joerg at
britannica.bec.de> wrote:
> On Mon, Feb 10, 2014 at 03:33:32PM -0800, Tom Roeder wrote:
> >     3. adds a fast check for pointer safety at each indirect call
site:
>
> Why not using a bloom filter for valid target addresses instead?
>
Can a bloom filter be as fast as a simple bounds check?  I'm thinking lea
base, sub, cmp, jl, and cold call.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140211/f1ebd16b/attachment.html>

Tom Roeder

2014-Feb-11 17:46 UTC

head link

[LLVMdev] [RFC] Simple control-flow integrity

On Tue, Feb 11, 2014 at 9:12 AM, Reid Kleckner <rnk at google.com>
wrote:> On Tue, Feb 11, 2014 at 12:28 AM, Richard Osborne <richard at
xmos.com> wrote:
>>
>>
>> On 11 Feb 2014, at 08:15, Eric Christopher <echristo at
gmail.com> wrote:
>>
>> > On Mon, Feb 10, 2014 at 11:51 PM, Reid Kleckner <rnk at
google.com> wrote:
>> >>
>> >>
>> >>
>> >> IIRC this came up before, and I don't think we expose
anything like a
>> >> jump
>> >> table at the IR level.  As an IR-to-IR transform, I think asm
is the
>> >> only
>> >> way to do it.
>> >
>> > I'd have to look more at what he's doing, but wouldn't
a simple switch
>> > statement in IR suffice? Efficiency would be up to the various
>> > lowering mechanisms, but it wouldn't require inline asm.
I'm not sure I follow how this would work. Could you expand on this, please?

>> >
>> > -eric
>> Another option might be to create an array of function pointers in the
>> LLVM IR, i.e generate code that looks like:
>>
>> void (*jumptable[])() = {
>>   &a,
>>   &b
>> };
>>
>> void f(int index) {
>>   *(jumptable[index])();
>> }
>
>
> This isn't ABI compatible.  Now function pointers point to data (or are
> switch table indices) and not code.
>My first attempt at a solution did it just like this, with data
instead of pointers, but that opens several large cans of worms. As
mentioned, the ABI is a major problem, and to maintain compatibility,
you have to provide a complete escape analysis for all address-taken
functions, since they have to be transformed before they are passed to
external code. And, e.g., the IR translation of vtable calls actually
use a more complicated representation of function pointers, IIRC.

My version based on data instead of pointers worked well for simple
code and toy examples but it became extremely complicated and fragile
once I tried to apply it to real code. I finally realized that any
transformation that depends for its soundness on a complete solution
to an alias analysis problem is a non-starter for large, complicated
programs; conservative solutions weaken the protections provided by
the transformation.

This is why I switched to using inline asm; I add a declaration of a
fake external function for each entry in the table and use it as the
function pointer. Since my inline asm defines these symbols, the
linker is satisfied. There doesn't seem to be a way to create a true
jump table in LLVM, so this is the best we can do.

> I can imagine abusing indirectbr across separate functions, but rolling
your
> own jump table in asm seems better.
I tried using indirect branches for a bit, but I didn't see how to
make that work. Aren't they restricted to branching to labels inside
the current function? From the LLVM Language Reference entry on the
semantics of indirectbr:

Control transfers to the block specified in the address argument. All
possible destination blocks must be listed in the label list,
otherwise this instruction has undefined behavior. This implies that
jumps to labels defined in other functions have undefined behavior as
well.

>
> On Tue, Feb 11, 2014 at 5:07 AM, Joerg Sonnenberger
> <joerg at britannica.bec.de> wrote:
>>
>> On Mon, Feb 10, 2014 at 03:33:32PM -0800, Tom Roeder wrote:
>> >     3. adds a fast check for pointer safety at each indirect call
site:
>>
>> Why not using a bloom filter for valid target addresses instead?
>
>
> Can a bloom filter be as fast as a simple bounds check?  I'm thinking
lea
> base, sub, cmp, jl, and cold call.
Exactly. The jump table turns into a very small amount of code; note
that a normal bounds check has to check both bounds (so two subs and
cmps).  With the base and mask, and in an asm pseudo-code, it does:

sub base, addr
and mask, addr
add base, addr
cmp addr, orig
je normal_call
<load info for warning call>
call warning
normal_call:
call orig

And if you can get sufficient power-of-two alignment for the table,
you can do even better, since then the base is a prefix of all valid
addrs in its table. Unfortunately, Linux only gives you alignment up
to 2^12 under PIE/ASLR, even though the ELF binary claims to have
larger alignment.

and mask addr
add base addr
cmp addr, orig
<etc...same>

My code supports this as an option, but the default is the subtraction version.

Joerg Sonnenberger

2014-Feb-11 18:04 UTC

head link

[LLVMdev] [RFC] Simple control-flow integrity

On Tue, Feb 11, 2014 at 09:12:03AM -0800, Reid Kleckner
wrote:> > On Mon, Feb 10, 2014 at 03:33:32PM -0800, Tom Roeder wrote:
> > >     3. adds a fast check for pointer safety at each indirect call
site:
> >
> > Why not using a bloom filter for valid target addresses instead?
> >
> 
> Can a bloom filter be as fast as a simple bounds check?  I'm thinking
lea
> base, sub, cmp, jl, and cold call.
Depends. The potential issue with the "jump table" approach is code
size -- as written, at least 64bit for every potential target. If you
want to include virtual functions in that list it will grow very large.
A decently working bloom filter would need in the order of 1 or 2 bits
per potential target, making the chance of fitting into cache quite a
bit larger. A basic hash function as candidate would be "and size" for
the first filter and "mul with constant; and size" for the second,
followed by a bit test for each. On modern CPUs the mul is quite cheap,
so the trade off is more or less one memory access vs two.

Joerg

llvm dev - Feb 2014 - [LLVMdev] [RFC] Simple control-flow integrity

[LLVMdev] [RFC] Simple control-flow integrity

[LLVMdev] [RFC] Simple control-flow integrity

[LLVMdev] [RFC] Simple control-flow integrity

[LLVMdev] [RFC] Simple control-flow integrity

[LLVMdev] [RFC] Simple control-flow integrity