On 11 Feb 2014, at 08:15, Eric Christopher <echristo at gmail.com> wrote:> On Mon, Feb 10, 2014 at 11:51 PM, Reid Kleckner <rnk at google.com> wrote: >> >> >> >> IIRC this came up before, and I don't think we expose anything like a jump >> table at the IR level. As an IR-to-IR transform, I think asm is the only >> way to do it. > > I'd have to look more at what he's doing, but wouldn't a simple switch > statement in IR suffice? Efficiency would be up to the various > lowering mechanisms, but it wouldn't require inline asm. > > -ericAnother option might be to create an array of function pointers in the LLVM IR, i.e generate code that looks like: void (*jumptable[])() = { &a, &b }; void f(int index) { *(jumptable[index])(); }
On Tue, Feb 11, 2014 at 12:28 AM, Richard Osborne <richard at xmos.com> wrote:> > On 11 Feb 2014, at 08:15, Eric Christopher <echristo at gmail.com> wrote: > >> On Mon, Feb 10, 2014 at 11:51 PM, Reid Kleckner <rnk at google.com> wrote: >>> >>> >>> >>> IIRC this came up before, and I don't think we expose anything like a jump >>> table at the IR level. As an IR-to-IR transform, I think asm is the only >>> way to do it. >> >> I'd have to look more at what he's doing, but wouldn't a simple switch >> statement in IR suffice? Efficiency would be up to the various >> lowering mechanisms, but it wouldn't require inline asm. >> >> -eric > Another option might be to create an array of function pointers in the LLVM IR, i.e generate code that looks like: > > void (*jumptable[])() = { > &a, > &b > }; > > void f(int index) { > *(jumptable[index])(); > }*nod* That's the sort of thing I was thinking about too. -eric
On Tue, Feb 11, 2014 at 12:28 AM, Richard Osborne <richard at xmos.com> wrote:> > On 11 Feb 2014, at 08:15, Eric Christopher <echristo at gmail.com> wrote: > > > On Mon, Feb 10, 2014 at 11:51 PM, Reid Kleckner <rnk at google.com> wrote: > >> > >> > >> > >> IIRC this came up before, and I don't think we expose anything like a > jump > >> table at the IR level. As an IR-to-IR transform, I think asm is the > only > >> way to do it. > > > > I'd have to look more at what he's doing, but wouldn't a simple switch > > statement in IR suffice? Efficiency would be up to the various > > lowering mechanisms, but it wouldn't require inline asm. > > > > -eric > Another option might be to create an array of function pointers in the > LLVM IR, i.e generate code that looks like: > > void (*jumptable[])() = { > &a, > &b > }; > > void f(int index) { > *(jumptable[index])(); > } >This isn't ABI compatible. Now function pointers point to data (or are switch table indices) and not code. I can imagine abusing indirectbr across separate functions, but rolling your own jump table in asm seems better. On Tue, Feb 11, 2014 at 5:07 AM, Joerg Sonnenberger <joerg at britannica.bec.de> wrote:> On Mon, Feb 10, 2014 at 03:33:32PM -0800, Tom Roeder wrote: > > 3. adds a fast check for pointer safety at each indirect call site: > > Why not using a bloom filter for valid target addresses instead? >Can a bloom filter be as fast as a simple bounds check? I'm thinking lea base, sub, cmp, jl, and cold call. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140211/f1ebd16b/attachment.html>
On Tue, Feb 11, 2014 at 9:12 AM, Reid Kleckner <rnk at google.com> wrote:> On Tue, Feb 11, 2014 at 12:28 AM, Richard Osborne <richard at xmos.com> wrote: >> >> >> On 11 Feb 2014, at 08:15, Eric Christopher <echristo at gmail.com> wrote: >> >> > On Mon, Feb 10, 2014 at 11:51 PM, Reid Kleckner <rnk at google.com> wrote: >> >> >> >> >> >> >> >> IIRC this came up before, and I don't think we expose anything like a >> >> jump >> >> table at the IR level. As an IR-to-IR transform, I think asm is the >> >> only >> >> way to do it. >> > >> > I'd have to look more at what he's doing, but wouldn't a simple switch >> > statement in IR suffice? Efficiency would be up to the various >> > lowering mechanisms, but it wouldn't require inline asm.I'm not sure I follow how this would work. Could you expand on this, please?>> > >> > -eric >> Another option might be to create an array of function pointers in the >> LLVM IR, i.e generate code that looks like: >> >> void (*jumptable[])() = { >> &a, >> &b >> }; >> >> void f(int index) { >> *(jumptable[index])(); >> } > > > This isn't ABI compatible. Now function pointers point to data (or are > switch table indices) and not code. >My first attempt at a solution did it just like this, with data instead of pointers, but that opens several large cans of worms. As mentioned, the ABI is a major problem, and to maintain compatibility, you have to provide a complete escape analysis for all address-taken functions, since they have to be transformed before they are passed to external code. And, e.g., the IR translation of vtable calls actually use a more complicated representation of function pointers, IIRC. My version based on data instead of pointers worked well for simple code and toy examples but it became extremely complicated and fragile once I tried to apply it to real code. I finally realized that any transformation that depends for its soundness on a complete solution to an alias analysis problem is a non-starter for large, complicated programs; conservative solutions weaken the protections provided by the transformation. This is why I switched to using inline asm; I add a declaration of a fake external function for each entry in the table and use it as the function pointer. Since my inline asm defines these symbols, the linker is satisfied. There doesn't seem to be a way to create a true jump table in LLVM, so this is the best we can do.> I can imagine abusing indirectbr across separate functions, but rolling your > own jump table in asm seems better.I tried using indirect branches for a bit, but I didn't see how to make that work. Aren't they restricted to branching to labels inside the current function? From the LLVM Language Reference entry on the semantics of indirectbr: Control transfers to the block specified in the address argument. All possible destination blocks must be listed in the label list, otherwise this instruction has undefined behavior. This implies that jumps to labels defined in other functions have undefined behavior as well.> > On Tue, Feb 11, 2014 at 5:07 AM, Joerg Sonnenberger > <joerg at britannica.bec.de> wrote: >> >> On Mon, Feb 10, 2014 at 03:33:32PM -0800, Tom Roeder wrote: >> > 3. adds a fast check for pointer safety at each indirect call site: >> >> Why not using a bloom filter for valid target addresses instead? > > > Can a bloom filter be as fast as a simple bounds check? I'm thinking lea > base, sub, cmp, jl, and cold call.Exactly. The jump table turns into a very small amount of code; note that a normal bounds check has to check both bounds (so two subs and cmps). With the base and mask, and in an asm pseudo-code, it does: sub base, addr and mask, addr add base, addr cmp addr, orig je normal_call <load info for warning call> call warning normal_call: call orig And if you can get sufficient power-of-two alignment for the table, you can do even better, since then the base is a prefix of all valid addrs in its table. Unfortunately, Linux only gives you alignment up to 2^12 under PIE/ASLR, even though the ELF binary claims to have larger alignment. and mask addr add base addr cmp addr, orig <etc...same> My code supports this as an option, but the default is the subtraction version.
On Tue, Feb 11, 2014 at 09:12:03AM -0800, Reid Kleckner wrote:> > On Mon, Feb 10, 2014 at 03:33:32PM -0800, Tom Roeder wrote: > > > 3. adds a fast check for pointer safety at each indirect call site: > > > > Why not using a bloom filter for valid target addresses instead? > > > > Can a bloom filter be as fast as a simple bounds check? I'm thinking lea > base, sub, cmp, jl, and cold call.Depends. The potential issue with the "jump table" approach is code size -- as written, at least 64bit for every potential target. If you want to include virtual functions in that list it will grow very large. A decently working bloom filter would need in the order of 1 or 2 bits per potential target, making the chance of fitting into cache quite a bit larger. A basic hash function as candidate would be "and size" for the first filter and "mul with constant; and size" for the second, followed by a bit test for each. On modern CPUs the mul is quite cheap, so the trade off is more or less one memory access vs two. Joerg