David Blaikie via llvm-dev
2021-Jun-10 19:42 UTC
[llvm-dev] put "str" in __attribute__((annotate("str"))) to dwarf
On Thu, Jun 10, 2021 at 12:31 PM Aaron Ballman <aaron at aaronballman.com> wrote:

> On Thu, Jun 10, 2021 at 3:16 PM David Blaikie <dblaikie at gmail.com> wrote:
> >
> > On Thu, Jun 10, 2021 at 12:10 PM Aaron Ballman <aaron at aaronballman.com> wrote:
> >>
> >> On Thu, Jun 10, 2021 at 2:45 PM Y Song <ys114321 at gmail.com> wrote:
> >> >
> >> > On Thu, Jun 10, 2021 at 11:29 AM David Blaikie <dblaikie at gmail.com> wrote:
> >> > >
> >> > > On Thu, Jun 10, 2021 at 11:09 AM Y Song <ys114321 at gmail.com> wrote:
> >> > >>
> >> > >> On Thu, Jun 10, 2021 at 10:05 AM David Blaikie <dblaikie at gmail.com> wrote:
> >> > >> >
> >> > >> > (Crossposting to cfe-dev because this includes a proposal for a new C/C++ level attribute)
> >> > >> >
> >> > >> > These attributes are all effectively hand-written (with or without macros) in the input source? None of them are derived by the compiler frontend based on other characteristics?
> >> > >>
> >> > >> Yes, they are hand-written in the input source and fit into the clang
> >> > >> compiler. They are not derived inside the clang/llvm.
> >> > >
> >> > >
> >> > > Good to know/understand.
> >> > >
> >> > >> >
> >> > >> > And I'm guessing maybe we'd want the name to be a bit narrower, like bpf_annotate, perhaps - taking such a generic term as "annotate" in the global attribute namespace seems fairly bold for what's currently a fairly narrow use case. +Aaron Ballman thoughts on this?
> >> > >>
> >> > >> I am okay with something like bpf_annotate as the existing annotate
> >> > >> attribute will generate global variables or codes for annotations
> >> > >> which is unnecessary for bpf use case,
> >> > >> although the overhead should be quite small.
> >> > >
> >> > >
> >> > > Ah, there's an existing annotate attribute you're proposing leveraging/reusing that? Got a pointer to the documentation for that?
> >> > > I don't see it documented here: https://clang.llvm.org/docs/AttributeReference.html
> >> >
> >> > Looks like this attribute is not well documented.
> >>
> >> Correct -- it's an ancient attribute that predates us documenting
> >> attributes at all.
> >>
> >> > I forgot how I found it. But below is a public blog on how it could be used:
> >> >   https://blog.quarkslab.com/implementing-a-custom-directive-handler-in-clang.html
> >> > I then went to
> >> >   clang/include/clang/Basic/Attr.td
> >> > and found
> >> >
> >> > def Annotate : InheritableParamAttr {
> >> >   let Spellings = [Clang<"annotate">];
> >> >   let Args = [StringArgument<"Annotation">, VariadicExprArgument<"Args">];
> >> >   // Ensure that the annotate attribute can be used with
> >> >   // '#pragma clang attribute' even though it has no subject list.
> >> >   let AdditionalMembers = [{
> >> >   static AnnotateAttr *Create(ASTContext &Ctx, llvm::StringRef Annotation, \
> >> >               const AttributeCommonInfo &CommonInfo) {
> >> >     return AnnotateAttr::Create(Ctx, Annotation, nullptr, 0, CommonInfo);
> >> >   }
> >> >   static AnnotateAttr *CreateImplicit(ASTContext &Ctx, llvm::StringRef Annotation, \
> >> >               const AttributeCommonInfo &CommonInfo = {SourceRange{}}) {
> >> >     return AnnotateAttr::CreateImplicit(Ctx, Annotation, nullptr, 0, CommonInfo);
> >> >   }
> >> >   }];
> >> >   let PragmaAttributeSupport = 1;
> >> >   let Documentation = [Undocumented];
> >> > }
> >> >
> >> > and tried to use it for places BPF cares about and it all covers.
> >>
> >> I don't think it's a good idea to use annotate for BPF needs. The
> >> basic idea behind annotate is that it's a way to pass arbitrary string
> >> (and starting very recently, other kinds of constant expressions) from
> >> the frontend to the backend. So it's a general-purpose tool that's
> >> used for one-off situations.
> >> As an example, attribute plugins will use
> >> it because they cannot currently create their own semantic attribute
> >> easily, and I think the static analyzer may make use of the feature as
> >> well. Because the BPF needs are so specific, I think it'd be better to
> >> use an attribute dedicated to those needs rather than using a
> >> general-purpose attribute like annotate -- this will reduce the
> >> likelihood of conflicts with the other creative uses people put
> >> annotate to.
> >
> >
> > Any suggestions/preferences for the spelling, Aaron?
>
> I don't know this domain particularly well, so takes these suggestions
> with a giant grain of salt:
>
> If the concept is specific to DWARF and you don't think it'll need to
> extend into other debug formats, you could go with `dwarf_annotate`.
> If it's not really a DWARF thing but is more about B[P|T]F, then
> `btf_annotate` or `bpf_annotate` could work, but those may be a bit
> mysterious to folks outside of the domain. If it's a generic debug
> info concept, probably `debug_info_annotate` or something.

Arguably it can/could be a generic debug info or dwarf thing, but for now we don't have any use for it other than to squirrel info along to BTF/BPF so I'm on the fence about which prefix to use exactly.

> My primary concern with reusing `annotate` itself is because user
> programs are likely already using that attribute for basically
> arbitrary purposes, so I worry reusing it for this purpose may
> accidentally expose annotations in debug info that the user never
> really expected to be there (which may confuse whatever is reading the
> annotations from the debug info).

Yeah, +1 there.

> ~Aaron
>
> >> > BTW, the above attr definition does say Undocumented.
> >>
> >> Yeah, the build requires there to be some documentation for every
> >> attribute, and Undocumented is what we use for attributes that we
> >> elect not to document because they're implementation details (rarely)
> >> or have failed to document yet (much more common).
> >>
> >> HTH!
> >>
> >> ~Aaron
> >>
> >> > >> >
> >> > >> > On Wed, Jun 9, 2021 at 7:42 PM Y Song <ys114321 at gmail.com> wrote:
> >> > >> >>
> >> > >> >> Hi,
> >> > >> >>
> >> > >> >> This feature is for the BPF community. The detailed use case is
> >> > >> >> described in https://reviews.llvm.org/D103549. And I have crafted a
> >> > >> >> WIP patch https://reviews.llvm.org/D103667 which implements necessary
> >> > >> >> frontend and codegen (plus others) to show the scope of the work.
> >> > >> >>
> >> > >> >> To elaborate the use case a little bit more. Basically, we want to put
> >> > >> >> some annotations into variables (include parameters), functions,
> >> > >> >> structure/union types and structure/union members. The string
> >> > >> >> arguments in annotations will not
> >> > >> >> be interpreted inside the compiler. The compiler should just emit
> >> > >> >> these annotations into dwarf. Currently in the linux build system,
> >> > >> >> pahole will convert dwarf to BTF which will encode these annotation
> >> > >> >> strings into BTF.
> >> > >> >> The following is a C example how annotations look
> >> > >> >> like at source level:
> >> > >> >>
> >> > >> >> $ cat t1.c
> >> > >> >> /* a pointer pointing to user memory */
> >> > >> >> #define __user __attribute__((annotate("user")))
> >> > >> >> /* a pointer protected by rcu */
> >> > >> >> #define __rcu __attribute__((annotate("rcu")))
> >> > >> >> /* the struct has some special property */
> >> > >> >> #define __special_struct __attribute__((annotate("special_struct")))
> >> > >> >> /* sock_lock is held for the function */
> >> > >> >> #define __sock_lock_held __attribute((annotate("sock_lock_held")))
> >> > >> >> /* the hash table element type is socket */
> >> > >> >> #define __special_info __attribute__((annotate("elem_type:socket")))
> >> > >> >>
> >> > >> >> struct hlist_node;
> >> > >> >> struct hlist_head {
> >> > >> >>   struct hlist_node *prev;
> >> > >> >>   struct hlist_node *next;
> >> > >> >> } __special_struct;
> >> > >> >> struct hlist {
> >> > >> >>   struct hlist_head head __special_info;
> >> > >> >> };
> >> > >> >>
> >> > >> >> extern void bar(struct hlist *);
> >> > >> >> int foo(struct hlist *h, int *a __user, int *b __rcu) __sock_lock_held {
> >> > >> >>   bar(h);
> >> > >> >>   return *a + *b;
> >> > >> >> }
> >> > >> >>
> >> > >> >> In https://reviews.llvm.org/D103667, I implemented a LLVM extended attribute
> >> > >> >> DWARF_AT_LLVM_annotations. But this might not be the right thing to do
> >> > >> >> as it is not clear whether there are use cases beyond BPF.
> >> > >> >> David suggested that we discuss this in llvm-dev to get consensus on
> >> > >> >> how this feature may be supported in LLVM. Hence this email.
> >> > >> >>
> >> > >> >> Please share your comments, suggestions on how to support this feature
> >> > >> >> in LLVM. Thanks!
> >> > >> >>
> >> > >> >> Yonghong
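
For readers who want to try the starting point of this thread, here is a minimal sketch of the existing `annotate` attribute wrapped in macros, in the style of the quoted t1.c. The names `ANNOTATE` and `run_example` are made up for this sketch and do not come from the thread or from clang. The `__has_attribute` guard is an assumption added so the file also builds on compilers without `annotate`, where the macros expand to nothing.

```c
#include <assert.h>

/* Assumption: ANNOTATE and run_example are hypothetical names for this sketch. */
#ifndef __has_attribute
#define __has_attribute(x) 0 /* compiler cannot be queried: assume no attribute */
#endif

#if __has_attribute(annotate)
#define ANNOTATE(s) __attribute__((annotate(s)))
#else
#define ANNOTATE(s) /* no annotate attribute available: expand to nothing */
#endif

/* Same spellings as the t1.c example quoted in the thread. */
#define __user ANNOTATE("user")
#define __rcu ANNOTATE("rcu")
#define __sock_lock_held ANNOTATE("sock_lock_held")

/* The strings label the declarations; they do not change what foo computes. */
__sock_lock_held
static int foo(int *a __user, int *b __rcu) {
    return *a + *b;
}

/* Calls foo on two plain ints to show the annotations are semantically inert. */
int run_example(void) {
    int a = 1, b = 2;
    int sum = foo(&a, &b);
    assert(sum == 3);
    return sum;
}
```

With clang, compiling a file like this with `-S -emit-llvm` should show the function annotation string in `@llvm.global.annotations`. That is the existing behaviour discussed above and is separate from the DWARF emission proposed in D103667.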
Alexei Starovoitov via llvm-dev
2021-Jun-11 00:47 UTC
[llvm-dev] put "str" in __attribute__((annotate("str"))) to dwarf
On Thu, Jun 10, 2021 at 12:42 PM David Blaikie <dblaikie at gmail.com> wrote:
>
>> > Any suggestions/preferences for the spelling, Aaron?
>>
>> I don't know this domain particularly well, so takes these suggestions
>> with a giant grain of salt:
>>
>> If the concept is specific to DWARF and you don't think it'll need to
>> extend into other debug formats, you could go with `dwarf_annotate`.
>> If it's not really a DWARF thing but is more about B[P|T]F, then
>> `btf_annotate` or `bpf_annotate` could work, but those may be a bit
>> mysterious to folks outside of the domain. If it's a generic debug
>> info concept, probably `debug_info_annotate` or something.
>
> Arguably it can/could be a generic debug info or dwarf thing, but for now we don't have any use for it other than to squirrel info along to BTF/BPF so I'm on the fence about which prefix to use exactly

A bit more bike shedding colors...

The __rcu and __user annotations might be used by clang itself eventually. Currently the "sparse" tool does this analysis and warns users when an __rcu pointer is incorrectly accessed in the kernel C code. If clang can do that directly, that could be a huge selling point for folks to switch from gcc to clang for kernel builds. The front-end would treat such annotations as arbitrary strings, but a special "building-linux-kernel-pass" would interpret the semantic context.

Considering the above, the dwarf_annotate, btf_annotate, and debug_info_annotate names don't fit that well. The accuracy of the annotations is important, unlike debug info, which can be dropped on a whim of some optimization pass. bpf_annotate wouldn't fit either, since the kernel might use that without any bpf bits. kernel_annotate might sound like it's not applicable to user space.

How about __attribute__((note("str"))) or __attribute__((tag("str")))?
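
As background for the sparse point above, here is a rough sketch of how markers such as `__user` and `__rcu` can be visible to sparse and invisible to the compiler. It approximates the pattern in older kernel headers. The address-space numbers are an assumption based on those older headers and differ across kernel versions. `read_user_int` and `run_sparse_sketch` are hypothetical names for illustration only.

```c
#include <assert.h>

#ifdef __CHECKER__
/* Seen only by sparse, which defines __CHECKER__. The numbers follow
 * older kernel headers (assumption; newer kernels spell these differently). */
#define __user __attribute__((noderef, address_space(1)))
#define __rcu __attribute__((noderef, address_space(4)))
#else
/* Seen by gcc/clang in a normal build: the markers vanish entirely. */
#define __user
#define __rcu
#endif

/* Hypothetical helper. Real kernel code would go through an accessor such
 * as get_user() instead of dereferencing a __user pointer directly. */
static int read_user_int(const int __user *p) {
    return *p; /* sparse is expected to complain here; a plain compiler is not */
}

/* In a normal build the marker is empty, so this is an ordinary pointer read. */
int run_sparse_sketch(void) {
    int v = 42;
    int got = read_user_int(&v);
    assert(got == 42);
    return got;
}
```

In a normal build the compiler never sees the marker, which is why a check inside clang would need some attribute spelling that survives into the front-end.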