thr3ads.net - llvm dev - [llvm-dev] llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event [Aug 2015]

Wangnan (F) via llvm-dev

2015-Aug-12 02:34 UTC

[llvm-dev] llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On 2015/8/4 3:44, Alexei Starovoitov wrote:

[SNIP]
>> I'll post 2 LLVM patches by replying this mail. Please have a look and
>> help me send them to LLVM if you think my code is correct.
>
>
[SNIP]
> patch 2:
> do we really need to hack clang?
> Can you just define a function that aliases to intrinsic,
> like we do for ld_abs/ld_ind ?
> void bpf_store_half(void *skb, u64 off, u64 val)
>     asm("llvm.bpf.store.half");
> then no extra patches necessary.

Hi Alexei,

After two weeks of research, I have to give you a sad answer: a
target-specific intrinsic does not work.

I tried a target-specific intrinsic. However, LLVM isolates the backend
from the frontend, and there is no way to pass language-level type
information to backend code.

Think about a program like this:

struct strA { int a; };
struct strB { int b; };
int func() {
   struct strA a;
   struct strB b;

   a.a = 1;
   b.b = 2;
   bpf_output(gettype(a), &a);
   bpf_output(gettype(b), &b);
   return 0;
}

In theory, the BPF backend can't (and needn't) tell the difference
between the local variables a and b. In the LLVM implementation, type
information is filtered out by ComputeValueVTs(). Please have a look at
SelectionDAGBuilder::visitIntrinsicCall in
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp and
SelectionDAGBuilder::visitTargetIntrinsic in the same file. In
visitTargetIntrinsic, ComputeValueVTs acts as a barrier that strips the
type information from the CallInst ("I") and leaves only SDValue and
SDVTList ("Ops" and "VTs") for target code. SDValue and SDVTList are
wrappers of EVT and MVT, so none of the information we care about is
passed through.

I think we now have 2 choices:

1. Hack clang to implement a target-specific builtin function. I have
   worked out an ugly but workable patch that sets up a builtin function,
   __builtin_bpf_typeid(), which accepts a local or global variable and
   returns a different constant for each type.

2. Implement an LLVM intrinsic call (llvm.typeid) and have it processed in
   visitIntrinsicCall(). I think we can get something useful if it is
   processed in that function.
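
To make "returns a different constant for each type" concrete, here is a minimal sketch that uses the standard C11 _Generic selection as a stand-in. It is neither __builtin_bpf_typeid() nor llvm.typeid: the GETTYPE macro and the TYPEID_* values are hypothetical, and every type has to be listed by hand, which is exactly the bookkeeping a builtin or intrinsic would take over.

```c
#include <stdint.h>

struct strA { int a; };
struct strB { int b; };

/* Hypothetical IDs; the thread does not specify actual values. */
#define TYPEID_UNKNOWN 0u
#define TYPEID_STRA    1u
#define TYPEID_STRB    2u

/*
 * Hypothetical stand-in for gettype(). _Generic selects a branch from
 * the static type of x at compile time, i.e. in the frontend, where
 * language-level type information is still available.
 */
#define GETTYPE(x) _Generic((x),          \
        struct strA: TYPEID_STRA,         \
        struct strB: TYPEID_STRB,         \
        default:     TYPEID_UNKNOWN)

/* Returns the ID selected for a struct strA object. */
static uint32_t typeid_of_strA(void)
{
    struct strA a = { 1 };
    return GETTYPE(a);
}

/* Returns the ID selected for a struct strB object. */
static uint32_t typeid_of_strB(void)
{
    struct strB b = { 2 };
    return GETTYPE(b);
}
```

The selection is resolved entirely at compile time, so the backend only ever sees an integer constant.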

The next step is generating debug information that maps types to the
constants issued by __builtin_bpf_typeid() or llvm.typeid. We have a
crazy idea: if we limit the structure name to 8 bytes, we can pack the
name into a u64, and then there is no need to consider type information
in DWARF. For example, in the sample code above, gettype(a) would issue
0x0000000041727473 because its type is "strA". What do you think?
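
The name-packing idea can be sketched as follows. The helper name pack_type_name is hypothetical, and the byte order (first character in the lowest byte) is an assumption inferred from the "strA" example value, not something the patch defines.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: pack a struct name of at most 8 bytes into a
 * u64, with the first character in the least significant byte.
 * Returns 0 when the name is longer than 8 bytes (an empty name also
 * packs to 0).
 */
static uint64_t pack_type_name(const char *name)
{
    uint64_t id = 0;
    size_t i;

    for (i = 0; name[i] != '\0'; i++) {
        if (i >= 8)
            return 0;   /* name does not fit into 8 bytes */
        id |= (uint64_t)(unsigned char)name[i] << (8 * i);
    }
    return id;
}
```

With this byte order, 's' (0x73), 't' (0x74), 'r' (0x72) and 'A' (0x41) land in bytes 0 to 3, which gives the constant quoted for "strA".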

Thank you.

Alexei Starovoitov via llvm-dev

2015-Aug-12 04:57 UTC

[llvm-dev] llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On Wed, Aug 12, 2015 at 10:34:43AM +0800, Wangnan (F) via llvm-dev wrote:
>
> Think about a program like this:
> 
> struct strA { int a; }
> struct strB { int b; }
> int func() {
>   struct strA a;
>   struct strB b;
> 
>   a.a = 1;
>   b.b = 2;
>   bpf_output(gettype(a), &a);
>   bpf_output(gettype(b), &b);
>   return 0;
> }
> 
> BPF backend can't (and needn't) tell the difference between local
> variables a and b in theory. In LLVM implementation, it filters type
> information out using ComputeValueVTs().  Please have a look at
> SelectionDAGBuilder::visitIntrinsicCall in
> lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp and
> SelectionDAGBuilder::visitTargetIntrinsic in the same file. in
> visitTargetIntrinsic, ComputeValueVTs acts as a barrier which strips
> type information out from CallInst ("I"), and leave SDValue and SDVTList
> ("Ops" and "VTs") to target code. SDValue and SDVTList are wrappers of
> EVT and MVT, all information we concern won't be passed here.
> 
> I think now we have 2 choices:
> 
> 1. Hacking into clang, implement target specific builtin function. Now I
>    have worked out a ugly but workable patch which setup a builtin function:
>    __builtin_bpf_typeid(), which accepts local or global variable then
>    returns different constant for different types.
>
> 2. Implementing an LLVM intrinsic call (llvm.typeid), make it be processed in
>    visitIntrinsicCall(). I think we can get something useful if it is processed
>    with that function.

Yeah. You're right about pure target intrinsics.
I think llvm.typeid might work. imo it's cleaner than
doing it at clang level.

> The next thing should be generating debug information to map type and
> constants which issued by __builtin_bpf_typeid() or llvm.typeid. Now we
> have a crazy idea that, if we limit the name of the structure to 8 bytes,
> we can insert the name into a u64, then there would be no need to consider
> type information in DWARF. For example, in the above sample code, gettype(a)
> will issue 0x0000000041727473 because its type is "strA". What do you think?

that's way too hacky.
I was thinking when compiling we can keep llvm ir along with .o
instead of dwarf and extract type info from there.
dwarf has names and other things that we don't need. We only
care about actual field layout of the structs.
But it probably won't be easy to parse llvm ir on perf side
instead of dwarf.

btw, if you haven't looked at iovisor/bcc, there we're solving
similar problem differently. There we use clang rewriter, so all
structs fields are visible at this level, then we use bpf backend
in JIT mode and push bpf instructions into the kernel on the fly
completely skipping ELF and .o
For example in:
https://github.com/iovisor/bcc/blob/master/examples/distributed_bridge/tunnel.c
when you see

struct ethernet_t {
  unsigned long long  dst:48;
  unsigned long long  src:48;
  unsigned int        type:16;
} BPF_PACKET_HEADER;
struct ethernet_t *ethernet = cursor_advance(cursor, sizeof(*ethernet));
... ethernet->src ...

the access is recognized by the clang rewriter, and ->src is converted
into different C code that is sent into clang again.
So there is no need to use dwarf or patch clang/llvm. clang rewriter
has all the info.
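
As a rough illustration of the kind of C a field access like ethernet->src could be turned into, here is a sketch under stated assumptions. The code bcc actually generates is not shown in this thread; the helper load_be, the offset constants, and the assumption that the fields sit in declaration order in network byte order are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not bcc's API: read nbytes (1..8) from p as a
 * big-endian (network byte order) unsigned integer.
 */
static uint64_t load_be(const uint8_t *p, size_t nbytes)
{
    uint64_t v = 0;
    size_t i;

    for (i = 0; i < nbytes; i++)
        v = (v << 8) | p[i];
    return v;
}

/*
 * Byte offsets implied by the field widths in struct ethernet_t
 * (dst:48, src:48, type:16), assuming declaration order.
 */
enum { ETH_DST_OFF = 0, ETH_SRC_OFF = 6, ETH_TYPE_OFF = 12 };

/* Explicit load standing in for "ethernet->src". */
static uint64_t ethernet_src(const uint8_t *pkt)
{
    return load_be(pkt + ETH_SRC_OFF, 6);
}

/* Explicit load standing in for "ethernet->type". */
static uint16_t ethernet_type(const uint8_t *pkt)
{
    return (uint16_t)load_be(pkt + ETH_TYPE_OFF, 2);
}
```

The point is only that offsets and widths are known when the source is rewritten, so no type information has to survive into the object file.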
I'm not sure you can live with clang/llvm on the host where you
want to run the tracing bits, but if you can that's an easier option.

Wangnan (F) via llvm-dev

2015-Aug-12 05:28 UTC

[llvm-dev] llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On 2015/8/12 12:57, Alexei Starovoitov wrote:
> On Wed, Aug 12, 2015 at 10:34:43AM +0800, Wangnan (F) via llvm-dev wrote:
>> Think about a program like this:
>>
>> struct strA { int a; }
>> struct strB { int b; }
>> int func() {
>>    struct strA a;
>>    struct strB b;
>>
>>    a.a = 1;
>>    b.b = 2;
>>    bpf_output(gettype(a), &a);
>>    bpf_output(gettype(b), &b);
>>    return 0;
>> }
>>
>> BPF backend can't (and needn't) tell the difference between local
>> variables a and b in theory. In LLVM implementation, it filters type
>> information out using ComputeValueVTs().  Please have a look at
>> SelectionDAGBuilder::visitIntrinsicCall in
>> lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp and
>> SelectionDAGBuilder::visitTargetIntrinsic in the same file. in
>> visitTargetIntrinsic, ComputeValueVTs acts as a barrier which strips
>> type information out from CallInst ("I"), and leave SDValue and SDVTList
>> ("Ops" and "VTs") to target code. SDValue and SDVTList are wrappers of
>> EVT and MVT, all information we concern won't be passed here.
>>
>> I think now we have 2 choices:
>>
>> 1. Hacking into clang, implement target specific builtin function. Now I
>>     have worked out a ugly but workable patch which setup a builtin function:
>>     __builtin_bpf_typeid(), which accepts local or global variable then
>>     returns different constant for different types.
>>
>> 2. Implementing an LLVM intrinsic call (llvm.typeid), make it be processed in
>>     visitIntrinsicCall(). I think we can get something useful if it is processed
>>     with that function.
> Yeah. You're right about pure target intrinsics.
> I think llvm.typeid might work. imo it's cleaner than
> doing it at clang level.
>
>> The next thing should be generating debug information to map type and
>> constants which issued by __builtin_bpf_typeid() or llvm.typeid. Now we
>> have a crazy idea that, if we limit the name of the structure to 8 bytes,
>> we can insert the name into a u64, then there would be no need to consider
>> type information in DWARF. For example, in the above sample code, gettype(a)
>> will issue 0x0000000041727473 because its type is "strA". What do you think?
> that's way too hacky.
> I was thinking when compiling we can keep llvm ir along with .o
> instead of dwarf and extract type info from there.
> dwarf has names and other things that we don't need. We only
> care about actual field layout of the structs.
> But it probably won't be easy to parse llvm ir on perf side
> instead of dwarf.

Shipping both LLVM IR and .o to perf makes it harder to use, so I'm not
sure it is a good idea. If we are unable to encode the structure using a
u64, let's keep digging into DWARF.

We have another idea that utilizes an existing DWARF feature. For
example, when __builtin_bpf_typeid() gets called, define an enumeration
type in the DWARF info, so you'll find:

  <1><2a>: Abbrev Number: 2 (DW_TAG_enumeration_type)
     <2b>   DW_AT_name        : (indirect string, offset: 0xec): TYPEINFO
     <2f>   DW_AT_byte_size   : 4
     <30>   DW_AT_decl_file   : 1
     <31>   DW_AT_decl_line   : 3
  <2><32>: Abbrev Number: 3 (DW_TAG_enumerator)
     <33>   DW_AT_name        : (indirect string, offset: 0xcc): __typeinfo_strA
     <37>   DW_AT_const_value : 2
  <2><38>: Abbrev Number: 3 (DW_TAG_enumerator)
     <39>   DW_AT_name        : (indirect string, offset: 0xdc): __typeinfo_strB
     <3d>   DW_AT_const_value : 3

or this:

  <3><54>: Abbrev Number: 4 (DW_TAG_variable)
     <55>   DW_AT_const_value : 2
     <66>   DW_AT_name        : (indirect string, offset: 0x1e): __typeinfo_strA
     <6a>   DW_AT_decl_file   : 1
     <6b>   DW_AT_decl_line   : 29
     <6c>   DW_AT_type        : <0x72>

Then we can do the mapping from DW_AT_name and DW_AT_const_value. The
drawback is that all names prefixed with __typeinfo_ become reserved.
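
On the consumer side, the mapping could look roughly like the sketch below. It assumes the (DW_AT_name, DW_AT_const_value) pairs have already been read out of the DWARF info by some library; the struct typeinfo_entry, the table sample_entries (filled with the values from the dump above), and the function type_name_for_id are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* One (DW_AT_name, DW_AT_const_value) pair taken from the DWARF info. */
struct typeinfo_entry {
    const char *dw_name;
    long        const_value;
};

/* Values copied from the enumerator dump above, hard-coded for illustration. */
static const struct typeinfo_entry sample_entries[] = {
    { "__typeinfo_strA", 2 },
    { "__typeinfo_strB", 3 },
};

/*
 * Return the type name (with the reserved "__typeinfo_" prefix
 * stripped) for the given constant, or NULL if no entry matches.
 */
static const char *type_name_for_id(const struct typeinfo_entry *tab,
                                    size_t n, long id)
{
    static const char prefix[] = "__typeinfo_";
    const size_t plen = sizeof(prefix) - 1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (tab[i].const_value == id &&
            strncmp(tab[i].dw_name, prefix, plen) == 0)
            return tab[i].dw_name + plen;
    }
    return NULL;
}
```

Given the constant carried in the output event, the consumer gets back the struct name and can then look up that struct's layout in the same DWARF info.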

> btw, if you haven't looked at iovisor/bcc, there we're solving
> similar problem differently. There we use clang rewriter, so all
> structs fields are visible at this level, then we use bpf backend
> in JIT mode and push bpf instructions into the kernel on the fly
> completely skipping ELF and .o
> For example in:
> https://github.com/iovisor/bcc/blob/master/examples/distributed_bridge/tunnel.c
> when you see
> struct ethernet_t {
>    unsigned long long  dst:48;
>    unsigned long long  src:48;
>    unsigned int        type:16;
> } BPF_PACKET_HEADER;
> struct ethernet_t *ethernet = cursor_advance(cursor, sizeof(*ethernet));
> ... ethernet->src ...
> is recognized by clang rewriter and ->src is converted to a different
> C code that is sent again into clang.
> So there is no need to use dwarf or patch clang/llvm. clang rewriter
> has all the info.

Could you please give us further information about your clang rewriter?
I guess you need a new .so when injecting that code into the kernel?

> I'm not sure you can live with clang/llvm on the host where you
> want to run the tracing bits, but if you can that's an easier option.
>
I'm not sure. Our target platforms are embedded devices such as
smartphones. Bringing a full clang/llvm environment there is not
acceptable.

Thank you.
