Schoedel, Kevin P
2012-Nov-26  15:56 UTC
[LLVMdev] RFC: change BoundsChecking.cpp to use address-based tests
I am investigating changing BoundsChecking to use address-based rather
than size- & offset-based tests.
To explain, here is a short code sample cribbed from one of the tests:
  %mem = tail call i8* @calloc(i64 1, i64 %elements)
  %memobj = bitcast i8* %mem to i64*
  %ptr = getelementptr inbounds i64* %memobj, i64 %index
  %4 = load i64* %ptr, align 8
Currently, the IR for bounds checking this load looks like this:
  %size = mul i64 8, %elements
  %offset = mul i64 %index, 8
  %objsize = sub i64 %size, %offset
  %cmp2 = icmp ult i64 %size, %offset
  %cmp3 = icmp ult i64 %objsize, 8
  %cmp1 = icmp slt i64 %offset, 0
  %9 = or i1 %cmp2, %cmp3
  %11 = or i1 %cmp1, %9
  br i1 %11, label %trap, label %12
                      ┆       ┆
                      │       │
    ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┢━━━━━━━┪╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶
    ↑                 ┃       ┃              ↑      ↑
    ·                 ┇       ┇              ·      ·
    ·                 ┃       ┃              ·      ·
    ·        ╴╴╴╴╴╴╴╴╴┣━━━━━━━┫╶╶╶╶╶╶╶╶    ObjSize  ·
    ·        ↑        ┃       ┃       ↑      ·      ·
    ·     data item   ┃       ┃  NeededSize  ·    Size
    ·        ↓        ┃       ┃       ↓      ↓      ·
    ·        ╴╴╴╴Ptr╴╴┣━━━━━━━┫╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶      ·
    ·                 ┃       ┃              ↑      ·
  memory object       ┇       ┇            Offset   ·
    ↓                 ┃       ┃              ↓      ↓
    ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┡━━━━━━━┩╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶
                      │       │
                      │       │  Cmp1: Offset ≥ 0 (unless constant)
                      │       │  Cmp2: Size ≥ Offset
                      ┆       ┆  Cmp3: ObjSize ≥ NeededSize
I am looking at generating IR like this:
  %upperbound = getelementptr inbounds i64* %memobj, i64 %elements
  %end = getelementptr inbounds i64* %ptr, i64 1
  %ilowerbound = ptrtoint i64* %memobj to i64
  %iupperbound = ptrtoint i64* %upperbound to i64
  %iptr = ptrtoint i64* %ptr to i64
  %iend = ptrtoint i64* %end to i64
  %cmpl = icmp ult %iptr, %ilowerbound
  %cmpu = icmp ult %iupperbound, %iend
  %9 = or i1 %cmpl, %cmpu
  br i1 %11, label %trap, label %12
                      ┆       ┆
                      │       │
    ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┢━━━━━━━┪╶╶╶╶╶UpperBound
    ↑                 ┃       ┃        
    ·                 ┇       ┇        
    ·                 ┃       ┃        
    ·        ╴╴╴╴╴╴╴╴╴┣━━━━━━━┫╶╶╶╶╶╶╶╶╶╶╶╶End
    ·        ↑        ┃       ┃           
    ·     data item   ┃       ┃           
    ·        ↓        ┃       ┃           
    ·        ╴╴╴╴Ptr╴╴┣━━━━━━━┫╶╶╶╶╶╶╶╶╶╶╶╶Ptr
    ·                 ┃       ┃
  memory object       ┇       ┇
    ↓                 ┃       ┃
    ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┡━━━━━━━┩╶╶╶╶╶LowerBound
                      │       │
                      │       │  Cmp1: Ptr ≥ LowerBound
                      ┆       ┆  Cmp2: UpperBound ≥ End
Potential advantages that I see for address-based tests: 
- Always performs two comparisons, rather than (sometimes) three;
- No need to recompute Offset and ObjSize for varying Ptr within the
  same memory object;
- getelementptr, for UpperBound and End, is typically very cheap on
  contemporary processors (and ptrtoint typically free).
I have prototyped address-based testing (with one wart that makes it
not production grade) but not done benchmarking and analysis. Before
I do that, I'm looking for any reasons that this method of bounds
checking would be a "no go".
-- 
Kevin Schoedel kevin.p.schoedel at intel.com +1-519-772-2580
SSG-DPD-ECDL-DMP - Intel Dynamic Mobility and Parallelism
Nuno Lopes
2012-Nov-26  23:24 UTC
[LLVMdev] RFC: change BoundsChecking.cpp to use address-based tests
Hi Kevin, Thanks for your interest and for your deep analysis. Unfortunately, your approach doesn't catch all bugs and is vulnerable to an attack. Consider the following case: ...................... | ----- obj --- | | end ^ ptr ^ ^ end-of-memory The scenario is as follows: - an object is allocated in the last page of the address space - obj is byte addressable (e.g., a char buffer) - ptr points to the last few bytes of the address space (with a large offset, but starting in bounds) - the information read/written is large and therefore there's an overflow in the memory addresses that are accessed. In this case, you'll have ptr > lowerbound and end < upperbound. The bad part is that end < ptr. I thought a bit on the subject and I couldn't find any solution with just 2 branches. The exception I found is if the size of an allocated object is always smaller than half of the address space (which seems a reasonable assumption to me, but I didn't discussed it with anyone else). BTW, LLVM is supposed to be able to eliminate one of the comparisons on x86, since we can take advantage of the overflow flag that is set by the 'sub' instruction. Last time I checked (in July), LLVM was not performing this optimization, but it may have been fixed in the meantime. There are still 3 jumps, though. Nuno ----- Original Message ----- From: "Schoedel, Kevin P" <kevin.p.schoedel at intel.com> To: <llvmdev at cs.uiuc.edu> Sent: Monday, November 26, 2012 3:56 PM Subject: [LLVMdev] RFC: change BoundsChecking.cpp to use address-based tests>I am investigating changing BoundsChecking to use address-based rather > than size- & offset-based tests. > > To explain, here is a short code sample cribbed from one of the tests: > > %mem = tail call i8* @calloc(i64 1, i64 %elements) > %memobj = bitcast i8* %mem to i64* > %ptr = getelementptr inbounds i64* %memobj, i64 %index > %4 = load i64* %ptr, align 8 > > Currently, the IR for bounds checking this load looks like this: > > %size = mul i64 8, %elements > %offset = mul i64 %index, 8 > %objsize = sub i64 %size, %offset > > %cmp2 = icmp ult i64 %size, %offset > %cmp3 = icmp ult i64 %objsize, 8 > %cmp1 = icmp slt i64 %offset, 0 > > %9 = or i1 %cmp2, %cmp3 > %11 = or i1 %cmp1, %9 > br i1 %11, label %trap, label %12 > > ┆ ┆ > │ │ > ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┢━━━━━━━┪╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶ > ↑ ┃ ┃ ↑ ↑ > · ┇ ┇ · · > · ┃ ┃ · · > · ╴╴╴╴╴╴╴╴╴┣━━━━━━━┫╶╶╶╶╶╶╶╶ ObjSize · > · ↑ ┃ ┃ ↑ · · > · data item ┃ ┃ NeededSize · Size > · ↓ ┃ ┃ ↓ ↓ · > · ╴╴╴╴Ptr╴╴┣━━━━━━━┫╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶ · > · ┃ ┃ ↑ · > memory object ┇ ┇ Offset · > ↓ ┃ ┃ ↓ ↓ > ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┡━━━━━━━┩╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶ > │ │ > │ │ Cmp1: Offset ≥ 0 (unless constant) > │ │ Cmp2: Size ≥ Offset > ┆ ┆ Cmp3: ObjSize ≥ NeededSize > > > I am looking at generating IR like this: > > %upperbound = getelementptr inbounds i64* %memobj, i64 %elements > %end = getelementptr inbounds i64* %ptr, i64 1 > > %ilowerbound = ptrtoint i64* %memobj to i64 > %iupperbound = ptrtoint i64* %upperbound to i64 > %iptr = ptrtoint i64* %ptr to i64 > %iend = ptrtoint i64* %end to i64 > > %cmpl = icmp ult %iptr, %ilowerbound > %cmpu = icmp ult %iupperbound, %iend > > %9 = or i1 %cmpl, %cmpu > br i1 %11, label %trap, label %12 > > ┆ ┆ > │ │ > ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┢━━━━━━━┪╶╶╶╶╶UpperBound > ↑ ┃ ┃ > · ┇ ┇ > · ┃ ┃ > · ╴╴╴╴╴╴╴╴╴┣━━━━━━━┫╶╶╶╶╶╶╶╶╶╶╶╶End > · ↑ ┃ ┃ > · data item ┃ ┃ > · ↓ ┃ ┃ > · ╴╴╴╴Ptr╴╴┣━━━━━━━┫╶╶╶╶╶╶╶╶╶╶╶╶Ptr > · ┃ ┃ > memory object ┇ ┇ > ↓ ┃ ┃ > ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┡━━━━━━━┩╶╶╶╶╶LowerBound > │ │ > │ │ Cmp1: Ptr ≥ LowerBound > ┆ ┆ Cmp2: UpperBound ≥ End > > Potential advantages that I see for address-based tests: > - Always performs two comparisons, rather than (sometimes) three; > - No need to recompute Offset and ObjSize for varying Ptr within the > same memory object; > - getelementptr, for UpperBound and End, is typically very cheap on > contemporary processors (and ptrtoint typically free). > > I have prototyped address-based testing (with one wart that makes it > not production grade) but not done benchmarking and analysis. Before > I do that, I'm looking for any reasons that this method of bounds > checking would be a "no go". > > -- > Kevin Schoedel kevin.p.schoedel at intel.com +1-519-772-2580 > SSG-DPD-ECDL-DMP - Intel Dynamic Mobility and Parallelism
Philip Reames
2012-Dec-04  02:41 UTC
[LLVMdev] RFC: change BoundsChecking.cpp to use address-based tests
Nuno, Inspired by this email thread, I spent a bit of time today looking through the implementation of BoundsChecking::instrument(..). Based on my reading of prior work, it should be possible to do these checks in two comparisons, or possibly even one if the right assumptions could be made. Could you provide a bit of background of the expected domains of Size and Offset? In particular, are they signed or unsigned integers? A non-negative size doesn't seem to make much sense in this context, but depending on how it's calculated I could see it arising. Is a zero Size something that might arise here? I'm assuming the Offset comes from an arbitrary indexing expression and is thus a signed integer. I tried reading through some of the supporting code to track down where the various values were coming from, but had trouble tracking all the relevant pieces down. By working through the cases, I've identified combinations of two conditionals which should work if either a) Size is positive, or b) Size and Offset are both signed integers. Once I know which, if either, is appropriate, I'll write it up with a full explanation of why I think it works. Then we can discuss. If I'm right about the changed conditionals working, I'd then write up a patch for submission. Yours, Philip Reames On 11/26/2012 03:24 PM, Nuno Lopes wrote:> Hi Kevin, > > Thanks for your interest and for your deep analysis. > > Unfortunately, your approach doesn't catch all bugs and is vulnerable to > an attack. > Consider the following case: > > ...................... | ----- obj --- | | > end ^ ptr ^ ^ end-of-memory > > The scenario is as follows: > - an object is allocated in the last page of the address space > - obj is byte addressable (e.g., a char buffer) > - ptr points to the last few bytes of the address space (with a large > offset, but starting in bounds) > - the information read/written is large and therefore there's an > overflow in the memory addresses that are accessed. > > In this case, you'll have ptr > lowerbound and end < upperbound. The bad > part is that end < ptr. > > > I thought a bit on the subject and I couldn't find any solution with > just 2 branches. > The exception I found is if the size of an allocated object is always > smaller than half of the address space (which seems a reasonable > assumption to me, but I didn't discussed it with anyone else). > > BTW, LLVM is supposed to be able to eliminate one of the comparisons on > x86, since we can take advantage of the overflow flag that is set by the > 'sub' instruction. Last time I checked (in July), LLVM was not > performing this optimization, but it may have been fixed in the > meantime. There are still 3 jumps, though. > > Nuno > > > ----- Original Message ----- From: "Schoedel, Kevin P" > <kevin.p.schoedel at intel.com> > To: <llvmdev at cs.uiuc.edu> > Sent: Monday, November 26, 2012 3:56 PM > Subject: [LLVMdev] RFC: change BoundsChecking.cpp to use address-based > tests > > >> I am investigating changing BoundsChecking to use address-based rather >> than size- & offset-based tests. >> >> To explain, here is a short code sample cribbed from one of the tests: >> >> %mem = tail call i8* @calloc(i64 1, i64 %elements) >> %memobj = bitcast i8* %mem to i64* >> %ptr = getelementptr inbounds i64* %memobj, i64 %index >> %4 = load i64* %ptr, align 8 >> >> Currently, the IR for bounds checking this load looks like this: >> >> %size = mul i64 8, %elements >> %offset = mul i64 %index, 8 >> %objsize = sub i64 %size, %offset >> >> %cmp2 = icmp ult i64 %size, %offset >> %cmp3 = icmp ult i64 %objsize, 8 >> %cmp1 = icmp slt i64 %offset, 0 >> >> %9 = or i1 %cmp2, %cmp3 >> %11 = or i1 %cmp1, %9 >> br i1 %11, label %trap, label %12 >> >> ┆ ┆ >> │ │ >> ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┢━━━━━━━┪╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶ >> ↑ ┃ ┃ ↑ ↑ >> · ┇ ┇ · · >> · ┃ ┃ · · >> · ╴╴╴╴╴╴╴╴╴┣━━━━━━━┫╶╶╶╶╶╶╶╶ ObjSize · >> · ↑ ┃ ┃ ↑ · · >> · data item ┃ ┃ NeededSize · Size >> · ↓ ┃ ┃ ↓ ↓ · >> · ╴╴╴╴Ptr╴╴┣━━━━━━━┫╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶ · >> · ┃ ┃ ↑ · >> memory object ┇ ┇ Offset · >> ↓ ┃ ┃ ↓ ↓ >> ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┡━━━━━━━┩╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶ >> │ │ >> │ │ Cmp1: Offset ≥ 0 (unless constant) >> │ │ Cmp2: Size ≥ Offset >> ┆ ┆ Cmp3: ObjSize ≥ NeededSize >> >> >> I am looking at generating IR like this: >> >> %upperbound = getelementptr inbounds i64* %memobj, i64 %elements >> %end = getelementptr inbounds i64* %ptr, i64 1 >> >> %ilowerbound = ptrtoint i64* %memobj to i64 >> %iupperbound = ptrtoint i64* %upperbound to i64 >> %iptr = ptrtoint i64* %ptr to i64 >> %iend = ptrtoint i64* %end to i64 >> >> %cmpl = icmp ult %iptr, %ilowerbound >> %cmpu = icmp ult %iupperbound, %iend >> >> %9 = or i1 %cmpl, %cmpu >> br i1 %11, label %trap, label %12 >> >> ┆ ┆ >> │ │ >> ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┢━━━━━━━┪╶╶╶╶╶UpperBound >> ↑ ┃ ┃ >> · ┇ ┇ >> · ┃ ┃ >> · ╴╴╴╴╴╴╴╴╴┣━━━━━━━┫╶╶╶╶╶╶╶╶╶╶╶╶End >> · ↑ ┃ ┃ >> · data item ┃ ┃ >> · ↓ ┃ ┃ >> · ╴╴╴╴Ptr╴╴┣━━━━━━━┫╶╶╶╶╶╶╶╶╶╶╶╶Ptr >> · ┃ ┃ >> memory object ┇ ┇ >> ↓ ┃ ┃ >> ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┡━━━━━━━┩╶╶╶╶╶LowerBound >> │ │ >> │ │ Cmp1: Ptr ≥ LowerBound >> ┆ ┆ Cmp2: UpperBound ≥ End >> >> Potential advantages that I see for address-based tests: >> - Always performs two comparisons, rather than (sometimes) three; >> - No need to recompute Offset and ObjSize for varying Ptr within the >> same memory object; >> - getelementptr, for UpperBound and End, is typically very cheap on >> contemporary processors (and ptrtoint typically free). >> >> I have prototyped address-based testing (with one wart that makes it >> not production grade) but not done benchmarking and analysis. Before >> I do that, I'm looking for any reasons that this method of bounds >> checking would be a "no go". >> >> -- >> Kevin Schoedel kevin.p.schoedel at intel.com +1-519-772-2580 >> SSG-DPD-ECDL-DMP - Intel Dynamic Mobility and Parallelism > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Possibly Parallel Threads
- [LLVMdev] RFC: change BoundsChecking.cpp to use address-based tests
- [LLVMdev] RFC: change BoundsChecking.cpp to use address-based tests
- [LLVMdev] RFC: change BoundsChecking.cpp to use address-based tests
- [LLVMdev] max/min intrinsics
- [LLVMdev] max/min intrinsics