Ivan Baev
2015-Apr-15 15:11 UTC
[LLVMdev] RFC: Metadata attachments to function definitions
> Date: Tue, 14 Apr 2015 21:33:03 -0700 > From: "Duncan P. N. Exon Smith" <dexonsmith at apple.com> > To: LLVM Developers Mailing List <llvmdev at cs.uiuc.edu> > Subject: [LLVMdev] RFC: Metadata attachments to function definitions > Message-ID: <BF4002F0-06DC-4A25-AF84-7D21AD48121A at apple.com> > Content-Type: text/plain; charset=us-ascii > > > `Function` definitions should support `MDNode` attachments, with a > similar syntax to instructions: > > define void @foo() nounwind !attach !0 { > unreachable > } > !0 = !{} > > Attachments wouldn't be allowed on declarations, just definitions. > > There are two open problems this can help with: > > 1. For PGO, we need somewhere to attach the function entry count. > Attaching to the function definition is a simple solution. > > define void @foo() !prof !0 { > unreachable > } > !0 = !{i32 987}That will be great. Would it be better if the type is i64? We really want to avoid an overflow if possible. What is the type for the raw profile data for region(0)? Ivan> > 2. In debug info, we repeatedly build up a map from `Function` to the > canonical `MDSubrogram` for it. Keeping this mapping accurate takes > subtle logic in `lib/Linker` (see PR21910/PR22792) and it's > expensive to compute and maintain. Attaching it directly to the > `Function` designs away the problem. > > define void @foo() !dbg !0 { > unreachable > } > !0 = !MDSubprogram(name: "foo", function: void ()* @foo) > > Thoughts? > > Moving onto implementation, I'd provide the same generic API that > `Instruction` has, and wouldn't bother with the "fast path" API for > `!dbg`. Moreover, the generic path wouldn't be slow. Since there are > fewer functions than instructions, we can afford to store the > attachments directly on the `Function` instead of off in the > `LLVMContext`. > > It's not clear to me just how precious memory is in `Function`; IIRC > it's sitting at 168B right now for x86-64. IMO, a `SmallVector<..., 1>` > -- cost of 64B -- seems fine. I'll start with this if I don't hear any > objections; we can optimize later if necessary. > > Otherwise, I could hack together a custom vector-like object with the > "small string" optimization. Cost would be 16B per `Function`, with the > same malloc/heap costs as `SmallVector<..., 1>`. But I haven't seen a > profile that suggests this would be worth the complexity... > > Any opinions? >
Duncan P. N. Exon Smith
2015-Apr-15 16:55 UTC
[LLVMdev] RFC: Metadata attachments to function definitions
> On 2015-Apr-15, at 08:11, Ivan Baev <ibaev at codeaurora.org> wrote: > >> Date: Tue, 14 Apr 2015 21:33:03 -0700 >> From: "Duncan P. N. Exon Smith" <dexonsmith at apple.com> >> To: LLVM Developers Mailing List <llvmdev at cs.uiuc.edu> >> Subject: [LLVMdev] RFC: Metadata attachments to function definitions >> Message-ID: <BF4002F0-06DC-4A25-AF84-7D21AD48121A at apple.com> >> Content-Type: text/plain; charset=us-ascii >> >> >> `Function` definitions should support `MDNode` attachments, with a >> similar syntax to instructions: >> >> define void @foo() nounwind !attach !0 { >> unreachable >> } >> !0 = !{} >> >> Attachments wouldn't be allowed on declarations, just definitions. >> >> There are two open problems this can help with: >> >> 1. For PGO, we need somewhere to attach the function entry count. >> Attaching to the function definition is a simple solution. >> >> define void @foo() !prof !0 { >> unreachable >> } >> !0 = !{i32 987} > > > That will be great. Would it be better if the type is i64? We really want > to avoid an overflow if possible. What is the type for the raw profile > data for region(0)?Right, of course; `i64` makes more sense for entry counts. BTW, I'm not volunteering to implement this one myself, just to provide the metadata infrastructure.>> 2. In debug info, we repeatedly build up a map from `Function` to the >> canonical `MDSubrogram` for it. Keeping this mapping accurate takes >> subtle logic in `lib/Linker` (see PR21910/PR22792) and it's >> expensive to compute and maintain. Attaching it directly to the >> `Function` designs away the problem. >> >> define void @foo() !dbg !0 { >> unreachable >> } >> !0 = !MDSubprogram(name: "foo", function: void ()* @foo) >> >> Thoughts? >> >> Moving onto implementation, I'd provide the same generic API that >> `Instruction` has, and wouldn't bother with the "fast path" API for >> `!dbg`. Moreover, the generic path wouldn't be slow. Since there are >> fewer functions than instructions, we can afford to store the >> attachments directly on the `Function` instead of off in the >> `LLVMContext`. >> >> It's not clear to me just how precious memory is in `Function`; IIRC >> it's sitting at 168B right now for x86-64. IMO, a `SmallVector<..., 1>` >> -- cost of 64B -- seems fine. I'll start with this if I don't hear any >> objections; we can optimize later if necessary. >> >> Otherwise, I could hack together a custom vector-like object with the >> "small string" optimization. Cost would be 16B per `Function`, with the >> same malloc/heap costs as `SmallVector<..., 1>`. But I haven't seen a >> profile that suggests this would be worth the complexity... >> >> Any opinions? >> > >
Ivan Baev
2015-Apr-15 18:10 UTC
[LLVMdev] RFC: Metadata attachments to function definitions
> >> On 2015-Apr-15, at 08:11, Ivan Baev <ibaev at codeaurora.org> wrote: >> >>> Date: Tue, 14 Apr 2015 21:33:03 -0700 >>> From: "Duncan P. N. Exon Smith" <dexonsmith at apple.com> >>> To: LLVM Developers Mailing List <llvmdev at cs.uiuc.edu> >>> Subject: [LLVMdev] RFC: Metadata attachments to function definitions >>> Message-ID: <BF4002F0-06DC-4A25-AF84-7D21AD48121A at apple.com> >>> Content-Type: text/plain; charset=us-ascii >>> >>> >>> `Function` definitions should support `MDNode` attachments, with a >>> similar syntax to instructions: >>> >>> define void @foo() nounwind !attach !0 { >>> unreachable >>> } >>> !0 = !{} >>> >>> Attachments wouldn't be allowed on declarations, just definitions. >>> >>> There are two open problems this can help with: >>> >>> 1. For PGO, we need somewhere to attach the function entry count. >>> Attaching to the function definition is a simple solution. >>> >>> define void @foo() !prof !0 { >>> unreachable >>> } >>> !0 = !{i32 987} >> >> >> That will be great. Would it be better if the type is i64? We really >> want >> to avoid an overflow if possible. What is the type for the raw profile >> data for region(0)? > > Right, of course; `i64` makes more sense for entry counts. BTW, I'm not > volunteering to implement this one myself, just to provide the metadata > infrastructure. >Providing the metadata infrastructure - API for accessing the function entry count profile metadata - will be more than enough. Thank you for doing that. The type in clang/lib/CodeGen/CodeGenPGO.h is a 64 bit: uint64_t getRegionCount(unsigned Counter) Ivan
Diego Novillo
2015-Apr-16 19:15 UTC
[LLVMdev] RFC: Metadata attachments to function definitions
On 04/15/15 12:55, Duncan P. N. Exon Smith wrote:>> On 2015-Apr-15, at 08:11, Ivan Baev <ibaev at codeaurora.org> wrote: >> >>> Date: Tue, 14 Apr 2015 21:33:03 -0700 >>> From: "Duncan P. N. Exon Smith" <dexonsmith at apple.com> >>> To: LLVM Developers Mailing List <llvmdev at cs.uiuc.edu> >>> Subject: [LLVMdev] RFC: Metadata attachments to function definitions >>> Message-ID: <BF4002F0-06DC-4A25-AF84-7D21AD48121A at apple.com> >>> Content-Type: text/plain; charset=us-ascii >>> >>> >>> `Function` definitions should support `MDNode` attachments, with a >>> similar syntax to instructions: >>> >>> define void @foo() nounwind !attach !0 { >>> unreachable >>> } >>> !0 = !{} >>> >>> Attachments wouldn't be allowed on declarations, just definitions. >>> >>> There are two open problems this can help with: >>> >>> 1. For PGO, we need somewhere to attach the function entry count. >>> Attaching to the function definition is a simple solution. >>> >>> define void @foo() !prof !0 { >>> unreachable >>> } >>> !0 = !{i32 987} >> >> That will be great. Would it be better if the type is i64? We really want >> to avoid an overflow if possible. What is the type for the raw profile >> data for region(0)? > Right, of course; `i64` makes more sense for entry counts. BTW, I'm not > volunteering to implement this one myself, just to provide the metadata > infrastructure.Since I need to add this, I'll put it on my todo list. Diego.