David Blaikie via llvm-dev
2021-Jun-15 01:44 UTC
[llvm-dev] [cfe-dev] put "str" in __attribute__((annotate("str"))) to dwarf
On Mon, Jun 14, 2021 at 4:54 PM David Rector <davrecthreads at gmail.com> wrote:
> On Jun 14, 2021, at 5:33 PM, Y Song via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>> On Mon, Jun 14, 2021 at 1:25 PM David Blaikie <dblaikie at gmail.com> wrote:
>>> On Mon, Jun 14, 2021 at 12:25 PM Y Song <ys114321 at gmail.com> wrote:
>>>> On Fri, Jun 11, 2021 at 9:59 AM Alexei Starovoitov
>>>> <alexei.starovoitov at gmail.com> wrote:
>>>>> On Fri, Jun 11, 2021 at 07:17:32AM -0400, Aaron Ballman wrote:
>>>>>> On Thu, Jun 10, 2021 at 8:47 PM Alexei Starovoitov
>>>>>> <alexei.starovoitov at gmail.com> wrote:
>>>>>>> On Thu, Jun 10, 2021 at 12:42 PM David Blaikie <dblaikie at gmail.com> wrote:
>>>>>>>>>> Any suggestions/preferences for the spelling, Aaron?
>>>>>>>>>
>>>>>>>>> I don't know this domain particularly well, so take these suggestions
>>>>>>>>> with a giant grain of salt:
>>>>>>>>>
>>>>>>>>> If the concept is specific to DWARF and you don't think it'll need to
>>>>>>>>> extend into other debug formats, you could go with `dwarf_annotate`.
>>>>>>>>> If it's not really a DWARF thing but is more about B[P|T]F, then
>>>>>>>>> `btf_annotate` or `bpf_annotate` could work, but those may be a bit
>>>>>>>>> mysterious to folks outside of the domain. If it's a generic debug
>>>>>>>>> info concept, probably `debug_info_annotate` or something.
>>>>>>>>
>>>>>>>> Arguably it can/could be a generic debug info or dwarf thing, but for
>>>>>>>> now we don't have any use for it other than to squirrel info along to
>>>>>>>> BTF/BPF so I'm on the fence about which prefix to use exactly
>>>>>>>
>>>>>>> A bit more bike shedding colors...
>>>>>>>
>>>>>>> The __rcu and __user annotations might be used by the clang itself
>>>>>>> eventually.
>>>>>>> Currently the "sparse" tool is doing this analysis and warns users
>>>>>>> when __rcu pointer is incorrectly accessed in the kernel C code.
>>>>>>> If clang can do that directly that could be a huge selling point
>>>>>>> for folks to switch from gcc to clang for kernel builds.
>>>>>>> The front-end would treat such annotations as arbitrary string, but
>>>>>>> special "building-linux-kernel-pass" would interpret the semantical
>>>>>>> context.
>>>>>>
>>>>>> Are __rcu and __user annotations notionally distinct things from bpf
>>>>>> (and perhaps each other as well)? Distinct enough that it would make
>>>>>> sense to use a different attribute name for user source for each need?
>>>>>> I suspect the answer is yes given that the existing annotations have
>>>>>> their own names which are distinct, but I don't know this domain
>>>>>> enough to be sure.
>>>>>
>>>>> __rcu and __user don't overlap. __rcu is not a single annotation though.
>>>>> It's a combination of annotations in pointers, functions, macros.
>>>>> Some functions have:
>>>>>   __acquires(rcu)
>>>>> another function might have:
>>>>>   __acquires(rcu_bh)
>>>>> There are several flavors of the RCU in the kernel.
>>>>> So single __attribute__((rcu_annotate("foo"))) won't work even within RCU
>>>>> scope.
>>>>> But if we do:
>>>>>   struct foo {
>>>>>     void * __attribute__((tag("ptr.rcu_bh"))) ptr;
>>>>>   };
>>>>>   int bar(int) __attribute__((tag("acquires.rcu_bh"))) { ... }
>>>>>   int baz(int) __attribute__((tag("releases.rcu_bh"))) { ... }
>>>>>   int qux(int) __attribute__((tag("acquires.rcu_sched"))) { ... }
>>>>>   ...
>>>>> The clang pass can parse these strings and correlate one tag to another.
>>>>> RCU flavors come and go, so clang cannot hard code the names.
>>>>
>>>> Maybe we can name it as "bpf_tag" as it is a "tag" for "bpf" use case?
>>>>
>>>> David, in one of your early emails, you mentioned:
>>>>
>>>> ==
>>>> Arguably it can/could be a generic debug info or dwarf thing, but for
>>>> now we don't have any use for it other than to squirrel info along to
>>>> BTF/BPF so I'm on the fence about which prefix to use exactly
>>>> ==
>>>>
>>>> and suggested that, since it might be used in the future for non-bpf
>>>> things, maybe the name could be a little more generic than bpf-specific.
>>>>
>>>> Do you have any suggestions on what name to pick?
>>>
>>> Nah, not especially. bpf_tag sounds OK-ish to me if it suits you.
>
> The more generic the better IMO. And, the less the need to parse string
> literals the better.
>
> Why not simply `__attribute__((debuginfo("arg1", "arg2", ...)))`, e.g.:
>
> ```
> #define BPF_TAG(...) __attribute__((debuginfo("bpf", __VA_ARGS__)))
> struct foo {
>   void * BPF_TAG("ptr","rcu","bh") ptr;
> };
> #define BPF_RCU_TAG(PFX, ...) BPF_TAG(PFX, "rcu", __VA_ARGS__)
> int bar(int) BPF_RCU_TAG("acquires","bh") { ... }
> int baz(int) BPF_RCU_TAG("releases","bh") { ... }
> int qux(int) BPF_RCU_TAG("acquires","sched") { ... }
> ```

Unless Paul & Adrian, etc chime in in agreement of a more general name, like
'debuginfo', I'm inclined to avoid that/go with something bpf specific until
there's a broader use case/proposal, something we might be able to/want to
encourage GCC to support too. Otherwise we're taking a pretty broad attribute
name & choosing its behavior when we don't necessarily have a lot of leverage
if GCC ends up using that name for something else.

& as for separate strings - maybe, but I'm not sure what that'll look like in
the resulting DWARF, what sort of form would you propose using to encode
that? (same question below \/)

>> Sounds good. I will use "bpf_tag" as the starting point now.
>> Also, it is possible "bpf_tag" may appear multiple times for the same
>> function, declaration etc.
>>
>> For example,
>>   #define __bpf_tag(s) __attribute__((bpf_tag(s)))
>>   int g __bpf_tag("str1") __bpf_tag("str2");
>> Let us say we introduced an LLVM vendor tag DWARF_AT_LLVM_bpf_tag.
>>
>> How do you want the above to be represented in dwarf?
>>
>> My current scheme is to put all bpf_tag's in a string, separated by ",".
>> This will make things simpler. So the final output will be
>>   DWARF_AT_LLVM_bpf_tag "str1,str2"
>> I may need to do a discussion with the kernel folks to use a different
>> delimiter than ",", but we still represent all tags with ONE string.
>>
>> But alternatively, it could be represented as a list of strings like
>>   DWARF_AT_LLVM_bpf_tag
>>     "str1"
>>     "str2"
>> which is similar to DWARF_AT_location.

What DWARF form were you thinking of using for this? There isn't a built in
form that provides encoding for multiple delimited/separated strings that I
know of.

>> The first internal representation
>>   DWARF_AT_LLVM_bpf_tag "str1,str2"
>> should be easier for IR/bitcode read/write and dwarf parsing.
>>
>> What do you think?
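For readers following the tag("acquires.rcu_bh") idea quoted above: a minimal sketch, in C, of how a consumer of such tags might split one into an action and an RCU flavor and then correlate two tags. The "action.flavor" convention is taken from Alexei's example; split_tag, same_flavor and struct tag_parts are hypothetical names made up for illustration, not Clang, LLVM, or kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types and helpers, for illustration only. */
struct tag_parts {
    char action[32];  /* e.g. "acquires", "releases", "ptr" */
    char flavor[32];  /* e.g. "rcu_bh", "rcu_sched" */
};

/* Split "action.flavor" at the first '.'. Returns 0 on success, or -1 if
 * the tag has no '.', has an empty half, or a half does not fit. */
int split_tag(const char *tag, struct tag_parts *out)
{
    assert(tag != NULL && out != NULL);

    const char *dot = strchr(tag, '.');
    if (dot == NULL)
        return -1;

    size_t alen = (size_t)(dot - tag);
    size_t flen = strlen(dot + 1);
    if (alen == 0 || flen == 0 ||
        alen >= sizeof out->action || flen >= sizeof out->flavor)
        return -1;

    memcpy(out->action, tag, alen);
    out->action[alen] = '\0';
    memcpy(out->flavor, dot + 1, flen + 1);  /* copies the terminator too */
    return 0;
}

/* Two tags correlate here when both parse and name the same flavor. */
int same_flavor(const char *a, const char *b)
{
    struct tag_parts pa, pb;

    if (split_tag(a, &pa) != 0 || split_tag(b, &pb) != 0)
        return 0;
    return strcmp(pa.flavor, pb.flavor) == 0;
}
```

With this sketch, same_flavor("acquires.rcu_bh", "releases.rcu_bh") is true while same_flavor("acquires.rcu_bh", "acquires.rcu_sched") is not. This is the kind of string parsing that the multi-argument debuginfo("bpf", "acquires", "rcu", "bh") spelling is meant to avoid.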
Y Song via llvm-dev
2021-Jun-15 02:51 UTC
[llvm-dev] [cfe-dev] put "str" in __attribute__((annotate("str"))) to dwarf
On Mon, Jun 14, 2021 at 6:44 PM David Blaikie <dblaikie at gmail.com> wrote:
> [...]
>>> Sounds good. I will use "bpf_tag" as the starting point now.
>>> Also, it is possible "bpf_tag" may appear multiple times for the same
>>> function, declaration etc.
>>>
>>> For example,
>>>   #define __bpf_tag(s) __attribute__((bpf_tag(s)))
>>>   int g __bpf_tag("str1") __bpf_tag("str2");
>>> Let us say we introduced an LLVM vendor tag DWARF_AT_LLVM_bpf_tag.
>>>
>>> How do you want the above to be represented in dwarf?
>>>
>>> My current scheme is to put all bpf_tag's in a string, separated by ",".
>>> This will make things simpler. So the final output will be
>>>   DWARF_AT_LLVM_bpf_tag "str1,str2"
>>> I may need to do a discussion with the kernel folks to use a different
>>> delimiter than ",", but we still represent all tags with ONE string.
>>>
>>> But alternatively, it could be represented as a list of strings like
>>>   DWARF_AT_LLVM_bpf_tag
>>>     "str1"
>>>     "str2"
>>> which is similar to DWARF_AT_location.
>
> What DWARF form were you thinking of using for this? There isn't a built in
> form that provides encoding for multiple delimited/separated strings that I
> know of.

Actually, I have not yet looked at the details of how to implement multiple
separated strings. Since you mention that no such built-in form exists and
the attribute is bpf specific, I will just go with the one-string approach
(e.g. "str1;str2", where ";" is the delimiter). I just checked
linux:include/linux/compiler_*.h; it is possible that "," may appear in some
attributes, so I will use ";" as the delimiter.

Thanks for the clarification!

>>> The first internal representation
>>>   DWARF_AT_LLVM_bpf_tag "str1,str2"
>>> should be easier for IR/bitcode read/write and dwarf parsing.
>>>
>>> What do you think?
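A minimal sketch, in C, of the one-string scheme settled on above: several tags are joined into a single string with ";" as the delimiter, and a tag that itself contains the delimiter is rejected because it could not be split back unambiguously. This is not the actual Clang/LLVM implementation; join_tags and count_tags are hypothetical helper names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Join n tags into out, separated by delim. Returns 0 on success, or -1 if
 * a tag contains the delimiter or the result does not fit in out_size
 * bytes. On failure the contents of out are unspecified. */
int join_tags(const char *const *tags, size_t n, char delim,
              char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0 && delim != '\0');

    size_t used = 0;
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(tags[i]);

        if (strchr(tags[i], delim) != NULL)
            return -1;  /* would be ambiguous when split again */

        size_t need = len + (i > 0 ? 1 : 0);  /* tag plus separator */
        if (used + need + 1 > out_size)       /* +1 for the terminator */
            return -1;

        if (i > 0)
            out[used++] = delim;
        memcpy(out + used, tags[i], len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}

/* Count the tags in a joined string; an empty string holds no tags. */
size_t count_tags(const char *joined, char delim)
{
    if (joined[0] == '\0')
        return 0;

    size_t count = 1;
    for (const char *p = joined; *p != '\0'; p++)
        if (*p == delim)
            count++;
    return count;
}
```

Joining "str1" and "str2" with ';' yields "str1;str2", matching the example in the mail. The delimiter check is the reason for preferring ";" over ",": any character that can legitimately occur inside a tag cannot serve as the separator without an escaping rule.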
Adrian Prantl via llvm-dev
2021-Jun-16 16:43 UTC
[llvm-dev] [cfe-dev] put "str" in __attribute__((annotate("str"))) to dwarf
> On Jun 14, 2021, at 6:44 PM, David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> [...]
>
> Unless Paul & Adrian, etc chime in in agreement of a more general name, like
> 'debuginfo', I'm inclined to avoid that/go with something bpf specific until
> there's a broader use case/proposal, something we might be able to/want to
> encourage GCC to support too. Otherwise we're taking a pretty broad attribute
> name & choosing its behavior when we don't necessarily have a lot of leverage
> if GCC ends up using that name for something else.

There are definitely use-cases for threading a general string attribute
through LLVM IR all the way to DWARF. Recently I thought about how to best
encode API Swiftification attributes (e.g.,
https://developer.apple.com/documentation/swift/objective-c_and_c_code_customization/renaming_objective-c_apis_for_swift)
in DWARF. These are Clang attributes put on (Objective-)C type declarations
that tell the Swift compiler how to map C and Objective-C types into Swift.
The attributes range from nullability of pointers to renaming types to better
fit into the Swift world. Having a generic DWARF facility to encode any Clang
__attribute__(()) declaration would make this very straightforward to
implement.

Maybe this is a good opportunity to design a generic mechanism that works for
all attributes? We probably need to add a little more structure than just
encoding a single string with the attribute contents to make the encoding
more efficient, but we could probably have something generic enough to be
useful across many use-cases.

Is there any interest in attempting this, or do you prefer to go with
targeted extensions for each kind of attribute?

-- adrian