Y Song via llvm-dev
2021-Jun-15 02:51 UTC
[llvm-dev] [cfe-dev] put "str" in __attribute__((annotate("str"))) to dwarf
On Mon, Jun 14, 2021 at 6:44 PM David Blaikie <dblaikie at gmail.com> wrote:
>
> On Mon, Jun 14, 2021 at 4:54 PM David Rector <davrecthreads at gmail.com> wrote:
>>
>> On Jun 14, 2021, at 5:33 PM, Y Song via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>>
>> On Mon, Jun 14, 2021 at 1:25 PM David Blaikie <dblaikie at gmail.com> wrote:
>>
>> On Mon, Jun 14, 2021 at 12:25 PM Y Song <ys114321 at gmail.com> wrote:
>>
>> On Fri, Jun 11, 2021 at 9:59 AM Alexei Starovoitov
>> <alexei.starovoitov at gmail.com> wrote:
>>
>> On Fri, Jun 11, 2021 at 07:17:32AM -0400, Aaron Ballman wrote:
>>
>> On Thu, Jun 10, 2021 at 8:47 PM Alexei Starovoitov
>> <alexei.starovoitov at gmail.com> wrote:
>>
>> On Thu, Jun 10, 2021 at 12:42 PM David Blaikie <dblaikie at gmail.com> wrote:
>>
>> Any suggestions/preferences for the spelling, Aaron?
>>
>> I don't know this domain particularly well, so take these suggestions
>> with a giant grain of salt:
>>
>> If the concept is specific to DWARF and you don't think it'll need to
>> extend into other debug formats, you could go with `dwarf_annotate`.
>> If it's not really a DWARF thing but is more about B[P|T]F, then
>> `btf_annotate` or `bpf_annotate` could work, but those may be a bit
>> mysterious to folks outside of the domain. If it's a generic debug
>> info concept, probably `debug_info_annotate` or something.
>>
>> Arguably it can/could be a generic debug info or dwarf thing, but for
>> now we don't have any use for it other than to squirrel info along to
>> BTF/BPF so I'm on the fence about which prefix to use exactly
>>
>> A bit more bike shedding colors...
>>
>> The __rcu and __user annotations might be used by the clang itself eventually.
>> Currently the "sparse" tool is doing this analysis and warns users
>> when __rcu pointer is incorrectly accessed in the kernel C code.
>> If clang can do that directly that could be a huge selling point
>> for folks to switch from gcc to clang for kernel builds.
>> The front-end would treat such annotations as arbitrary string, but
>> special "building-linux-kernel-pass" would interpret the semantical context.
>>
>> Are __rcu and __user annotations notionally distinct things from bpf
>> (and perhaps each other as well)? Distinct enough that it would make
>> sense to use a different attribute name for user source for each need?
>> I suspect the answer is yes given that the existing annotations have
>> their own names which are distinct, but I don't know this domain
>> enough to be sure.
>>
>> __rcu and __user don't overlap. __rcu is not a single annotation though.
>> It's a combination of annotations in pointers, functions, macros.
>> Some functions have:
>> __acquires(rcu)
>> another function might have:
>> __acquires(rcu_bh)
>> There are several flavors of the RCU in the kernel.
>> So single __attribute__((rcu_annotate("foo"))) won't work even within RCU scope.
>> But if we do:
>> struct foo {
>> void * __attribute__((tag("ptr.rcu_bh"))) ptr;
>> };
>> int bar(int) __attribute__((tag("acquires.rcu_bh"))) { ... }
>> int baz(int) __attribute__((tag("releases.rcu_bh"))) { ... }
>> int qux(int) __attribute__((tag("acquires.rcu_sched"))) { ... }
>> ...
>> The clang pass can parse these strings and correlate one tag to another.
>> RCU flavors come and go, so clang cannot hard code the names.
>>
>> Maybe we can name it as "bpf_tag" as it is a "tag" for "bpf" use case?
>>
>> David, in one of your early emails, you mentioned:
>>
>> ==
>> Arguably it can/could be a generic debug info or dwarf thing, but for
>> now we don't have any use for it other than to squirrel info along to
>> BTF/BPF so I'm on the fence about which prefix to use exactly
>> ==
>>
>> and suggests since it might be used in the future for non-bpf things,
>> maybe the name could be a little more generic than bpf-specific.
>>
>> Do you have any suggestions on what name to pick?
>>
>> Nah, not especially. bpf_tag sounds OK-ish to me if it suits you.
>>
>> The more generic the better IMO. And, the less the need to parse string
>> literals the better.
>>
>> Why not simply `__attribute__((debuginfo("arg1", "arg2", ...)))`, e.g.:
>>
>> ```
>> #define BPF_TAG(...) __attribute__((debuginfo("bpf", __VA_ARGS__)))
>> struct foo {
>> void * BPF_TAG("ptr","rcu","bh") ptr;
>> };
>> #define BPF_RCU_TAG(PFX, ...) BPF_TAG(PFX, "rcu", __VA_ARGS__)
>> int bar(int) BPF_RCU_TAG("acquires","bh") { ... }
>> int baz(int) BPF_RCU_TAG("releases","bh") { ... }
>> int qux(int) BPF_RCU_TAG("acquires","sched") { ... }
>> ```
>
> Unless Paul & Adrian, etc chime in in agreement of a more general name,
> like 'debuginfo', I'm inclined to avoid that/go with something bpf specific
> until there's a broader use case/proposal, something we might be able
> to/want to encourage GCC to support too. Otherwise we're taking a pretty
> broad attribute name & choosing its behavior when we don't necessarily have
> a lot of leverage if GCC ends up using that name for something else.
>
> & as for separate strings - maybe, but I'm not sure what that'll look
> like in the resulting DWARF, what sort of form would you propose using to
> encode that? (same question below \/)
>
>>
>> Sounds good. I will use "bpf_tag" as the starting point now.
>> Also, it is possible "bpf_tag" may appear multiple times for the same
>> function, declaration etc.
>>
>> For example,
>> #define __bpf_tag(s) __attribute__((bpf_tag(s)))
>> int g __bpf_tag("str1") __bpf_tag("str2");
>> Let us say we introduced a LLVM vendor tag DWARF_AT_LLVM_bpf_tag.
>>
>> How do you want the above to be represented in dwarf?
>>
>> My current scheme is to put all bpf_tag's in a string, separated by ",".
>> This will make things simpler. So the final output will be
>> DWARF_AT_LLVM_bpf_tag "str1,str2"
>> I may need to do a discussion with the kernel folks to use a different
>> delimiter than ",", but we still represent all tags with ONE string.
>>
>> But alternatively, it could be represented as a list of strings like
>> DWARF_AT_LLVM_bpf_tag
>> "str1"
>> "str2"
>> is similar to DWARF_AT_location.
>
> What DWARF form were you thinking of using for this? There isn't a built
> in form that provides encoding for multiple delimited/separated strings
> that I know of.

Actually, I have not yet looked at the details of how to implement multiple
separated strings. Since you mention that no such built-in form exists and
the attribute is bpf specific, I will just go with the one-string-only
approach (e.g. "str1;str2" where ";" is the delimiter). I just checked
linux:include/linux/compiler_*.h, and it is possible that "," may appear in
some attributes, so I will use ";" as the delimiter. Thanks for the
clarification!

>> The first internal representation
>> DWARF_AT_LLVM_bpf_tag "str1,str2"
>> should be easier for IR/bitcode read/write and dwarf parsing.
>>
>> What do you think?
David Blaikie via llvm-dev
2021-Jun-15 03:29 UTC
[llvm-dev] [cfe-dev] put "str" in __attribute__((annotate("str"))) to dwarf
On Mon, Jun 14, 2021 at 7:52 PM Y Song <ys114321 at gmail.com> wrote:
> On Mon, Jun 14, 2021 at 6:44 PM David Blaikie <dblaikie at gmail.com> wrote:
> >
> > What DWARF form were you thinking of using for this? There isn't a built
> > in form that provides encoding for multiple delimited/separated strings
> > that I know of.
>
> Actually, I have not yet looked at the details of how to implement multiple
> separated strings. Since you mention that no such built-in form exists and
> the attribute is bpf specific, I will just go with the one-string-only
> approach (e.g. "str1;str2" where ";" is the delimiter). I just checked
> linux:include/linux/compiler_*.h, and it is possible that "," may appear in
> some attributes, so I will use ";" as the delimiter. Thanks for the
> clarification!

Do you need to support multiple distinct __attribute__((XXX("stuff"))) on one entity? If so, maybe it's worth considering how to encode them separately, rather than having the frontend have to concatenate them together?

One option would be to support multiple of the same attribute on the DIE in question - though that's probably still difficult to encode in the LLVM IR metadata (we don't have any repeating fields in the LLVM IR debug info metadata) - which, maybe comes back to the idea of having the frontend concatenate all the attributes together with some separator like ";".
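
To make the single-string idea concrete, here is a minimal C sketch of joining several tag strings with ";" and walking them back out on the consumer side. The helper names `join_tags` and `for_each_tag` are hypothetical and are not part of Clang, LLVM, or any BTF tooling. Rejecting a tag that already contains ";" is an assumption of this sketch, not something the thread settled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical producer-side helper: concatenate n tag strings into out,
 * separated by ';'. Returns 0 on success, or -1 if the buffer is too
 * small or a tag contains the delimiter (an assumption of this sketch:
 * such a tag could not be split back out unambiguously). */
static int join_tags(const char *const *tags, size_t n, char *out, size_t cap)
{
    size_t used = 0;

    assert(tags != NULL && out != NULL);
    if (cap == 0)
        return -1;
    out[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(tags[i]);
        size_t need = len + (i ? 1 : 0); /* tag plus optional delimiter */

        if (strchr(tags[i], ';') != NULL)
            return -1;
        if (used + need + 1 > cap)       /* +1 for the terminating NUL */
            return -1;
        if (i)
            out[used++] = ';';
        memcpy(out + used, tags[i], len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}

/* Hypothetical consumer-side helper: visit each ';'-separated tag in
 * joined without modifying it. fn may be NULL when only the count is
 * wanted. Returns the number of tags; an empty string holds no tags. */
static size_t for_each_tag(const char *joined,
                           void (*fn)(const char *tag, size_t len, void *ctx),
                           void *ctx)
{
    size_t count = 0;
    const char *p = joined;

    if (*p == '\0')
        return 0;
    for (;;) {
        const char *end = strchr(p, ';');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        if (fn)
            fn(p, len, ctx);
        count++;
        if (!end)
            break;
        p = end + 1;
    }
    return count;
}
```

For the two tags "str1" and "str2" from the example earlier in the thread, `join_tags` produces "str1;str2" and `for_each_tag` visits two entries. The price of the single-string encoding is that the delimiter becomes reserved, which is why the choice between "," and ";" matters.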