Yichun Zhang via llvm-dev
2021-Nov-07 08:02 UTC
[llvm-dev] [BPF] Change the JMP instruction format in LLVM?
Hi folks!

I need some help on how to change bpf's JMP instruction format in LLVM's bpf target. Currently, the BPF branching instructions use signed 16-bit integers as the jump offsets, which is far from enough for large bpf programs (the eBPF VM in the Linux kernel already supports up to 1M instructions anyway). We're trying to extend the JMP instruction to use the currently unused imm32 operand.

My first attempt was the following patch, but it never worked as expected:

    https://gist.github.com/agentzh/edaca7c06763eea9cd02cf7d6f67888d

I use the following minimal C program to test it:

    https://gist.github.com/agentzh/01ade7ede9396e9ad1729b0c38d35835

And I used the following commands to compile and disassemble it:

    clang -g -fno-builtin -O0 -target bpf -o test.o -c test.c
    llvm-objdump -S --arch-name=bpf test.o > test.S

Before the patch, the disassembly for the 2 branching instructions in the output file test.S looks like this:

    4: 7d 21 02 00 00 00 00 00 if r1 s>= r2 goto +2 <LBB0_2>
    6: 05 00 09 00 00 00 00 00 goto +9 <LBB0_3>

We can see that the jmp offset in instruction #4 is 02 00, which is 16-bit in little endian, and the jmp offset in instruction #6 is 09 00. The expected instruction bytes should instead be

    4: 7d 21 00 00 02 00 00 00
    6: 05 00 00 00 09 00 00 00

That is, we use the last 4 bytes, the 32-bit imm number, to store the jmp offsets. But after applying my patch above, the output looks like this:

    4: 7d 21 02 00 00 00 00 00 if r1 s>= r2 goto +0 <foo+0x28>
    6: 05 00 09 00 00 00 00 00 goto +0 <LBB0_2>

Not only is the +0 offset shown in the disassembly for instruction #6 wrong (it should be +9), but the 32-bit imm numbers in both instructions are also still zero. So I must be missing something in my patch.

If I use constant numbers in my patch, then they do appear in the disassembly, that is, something like

    let Inst{47-32} = 3;
    let Inst{31-0} = 7;

Why doesn't the same work with variables like BrDst in the patch? Any hints or guidance will be greatly appreciated!
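To make the byte layout in question concrete, here is a minimal C sketch that splits one 8-byte instruction into its fields. It assumes the little-endian BPF encoding (1-byte opcode, two 4-bit register numbers, a 16-bit off, then a 32-bit imm). The struct bpf_fields and bpf_decode names are hypothetical helpers for illustration only; they are not part of LLVM or the kernel.

```c
#include <stdint.h>

/* Illustrative decoded view of one 8-byte BPF instruction.
 * Assumption: little-endian BPF target, as in the disassembly above. */
struct bpf_fields {
    uint8_t opcode;   /* byte 0 */
    uint8_t dst_reg;  /* low nibble of byte 1 */
    uint8_t src_reg;  /* high nibble of byte 1 */
    int16_t off;      /* bytes 2-3: the 16-bit jump offset used today */
    int32_t imm;      /* bytes 4-7: the 32-bit field the patch wants to use */
};

/* Hypothetical helper: split the raw bytes into the fields above. */
static struct bpf_fields bpf_decode(const uint8_t b[8])
{
    struct bpf_fields f;

    f.opcode  = b[0];
    f.dst_reg = b[1] & 0x0f;
    f.src_reg = b[1] >> 4;
    f.off = (int16_t)(uint16_t)(b[2] | (b[3] << 8));
    f.imm = (int32_t)((uint32_t)b[4] |
                      ((uint32_t)b[5] << 8) |
                      ((uint32_t)b[6] << 16) |
                      ((uint32_t)b[7] << 24));
    return f;
}
```

Feeding it the current bytes of instruction #4 (7d 21 02 00 00 00 00 00) gives off = 2 and imm = 0, while the desired encoding (7d 21 00 00 02 00 00 00) gives off = 0 and imm = 2, which is the move the patch is trying to achieve.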

Also, after making the JMP instruction work, I'd like to force LLVM to avoid using the conditional branching instructions for large offsets. Any hints and suggestions on how to make this work will also be appreciated.

Thanks in advance!

Best,
Yichun