(background) The CoreCLR expects a JIT to produce an MSIL bytecode offset to code offset mapping, annotated with a few extra bits denoting whether an entry is in the prolog or epilog, whether it is a call, and, in some cases, whether operands remain on the MSIL virtual stack. Our initial prototype stashes the MSIL offset in the line number field. We could stash the extra bits in the column info, but that’s starting to feel too much like a hack. We’re looking for either:

1) a way to extend the debug metadata to hold our info and get it dumped into the in-memory object (a new section would be fine if it’s not too complicated), or
2) a place to extract the data we need, where we have both the encoded offset and access to the instructions.

We’re looking for some advice. ☺

-R

From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Eric Christopher
Sent: Wednesday, May 13, 2015 2:55 PM
To: Michelle McDaniel; llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Extending AsmPrinterHandler

You could write the debug information you want into just a section in memory and have your external/alternate/other process/thread/etc pick it up on the other end? I don't see how the extra info you want to send is important here, you'd just be extending the existing debug support. Or I'm missing something, at which point I'm not sure which additional questions to ask :)

-eric

On Wed, May 13, 2015 at 2:45 PM Michelle McDaniel <michelm at microsoft.com> wrote:

I work on the LLILC team, and we are trying to send debug line info through to the CoreCLR EE without using an EventListener, because we need to send extra info (more than is available in DebugLoc), such as whether it’s a call instruction, a call site, etc. We thought extending AsmPrinterHandler would be useful since it seems to have information about debug locations, label offsets, and instruction-specific information.

From: Eric Christopher [mailto:echristo at gmail.com]
Sent: Wednesday, May 13, 2015 2:37 PM
To: Michelle McDaniel; llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Extending AsmPrinterHandler

We generally don't extend the AsmPrinter... can you be more specific about what you're trying to do?

-eric

On Wed, May 13, 2015 at 2:34 PM Michelle McDaniel <michelm at microsoft.com> wrote:

Hey everyone,

I’m looking into extending AsmPrinterHandler out of tree, to communicate information back to the CoreCLR EE, but all of the headers are in the lib directory. Is there any way to extend it out of tree, or is that not supported? The reason I’d like to do this out of tree is that we want to include clr headers as well, and we don’t want to introduce that into llvm sources.

Thanks,
Michelle

_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
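For readers following the thread, the record described in the background note above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the names `MapFlags`, `OffsetMapEntry`, `LineCol`, `encodeInDebugLoc` and `decodeFromDebugLoc` are hypothetical and come from neither CoreCLR nor LLVM, and the flag values are made up. It shows the prototype's trick of carrying the MSIL offset in the line field, with the extra bits riding in the column field, and how a consumer could turn that pair back into a mapping row once the code offset is known.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits for one mapping entry. The names and values are
// illustrative only; they are not copied from the CoreCLR headers.
enum MapFlags : uint32_t {
  MF_None       = 0,
  MF_Prolog     = 1u << 0,
  MF_Epilog     = 1u << 1,
  MF_Call       = 1u << 2,
  MF_StackEmpty = 1u << 3  // no operands remain on the MSIL virtual stack
};

// One row of the MSIL-offset to code-offset map the thread talks about.
struct OffsetMapEntry {
  uint32_t ilOffset;
  uint32_t nativeOffset;
  uint32_t flags;
};

// Stand-in for a (line, column) debug location pair.
struct LineCol {
  uint32_t line;
  uint32_t col;
};

// The "hack": MSIL offset goes in the line field, flag bits in the column.
inline LineCol encodeInDebugLoc(uint32_t ilOffset, uint32_t flags) {
  LineCol lc = {ilOffset, flags};
  return lc;
}

// Consumer side: once the code offset of the instruction is known,
// rebuild the mapping row from the smuggled fields.
inline OffsetMapEntry decodeFromDebugLoc(LineCol lc, uint32_t nativeOffset) {
  OffsetMapEntry e = {lc.line, nativeOffset, lc.col};
  return e;
}
```

A caller would encode at IR-generation time and decode wherever both the location and the final code offset are visible. The sketch also makes the discomfort in the note concrete: the column field no longer means "column", so any tool that reads it as one will be misled.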
Hi Russell,

Instead of hiding bits inside the debug metadata, why not just attach additional metadata to each instruction and look for that during emission? You'll probably want to take a look at how things like the vectorizer take advantage of metadata if you want to encode things. Then, during your front-end compilation from MSIL to LLVM IR, you can just attach random metadata to some instructions.

This does have the drawback that, theoretically at least, you can strip metadata from LLVM IR and still get a working binary. If that's not the case for CoreCLR, you might want to look into a way to overload some of the instructions or ... something. Or just require people not to delete your metadata, I guess.

Does this help?

-eric

On Wed, May 13, 2015 at 3:27 PM Russell Hadley <rhadley at microsoft.com> wrote:

> (background) The CoreCLR expects a JIT to produce a MSIL bytecode offset
> to code offset mapping annotated with a few extra bits denoting if it’s
> prolog/epilog, or it’s a call, or if there’s operands remaining on the MSIL
> virtual stack in some cases. Our initial prototype has the MSIL offset
> stashed in the line number field. We could stash the extra bits in the
> column info but that’s starting to feel too much like a hack. We’re
> looking for a way to 1) extend the debug metadata to hold our info and get
> it dumped into the in memory object – a new section would be fine if it’s
> not too complicated. Or 2) a place to extract the data we need when we have
> both encoded offset and access to the instructions. We’re looking for some
> advice. ☺
>
> -R
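The suggestion in the reply above (tag instructions in the front end, look the tag up at emission, accept that the tag may have been stripped) can be illustrated with a toy model. This is not LLVM's API: `ToyInst`, the kind name `msil.offset`, and the helper functions are all hypothetical, chosen only to show the shape of the idea and the drawback mentioned in the message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction carrying named metadata.
// This is a model for discussion, not an LLVM class.
struct ToyInst {
  std::string opcode;
  std::map<std::string, uint32_t> metadata;  // kind name -> payload
};

// Made-up metadata kind name used only in this sketch.
static const char *const kMsilOffsetKind = "msil.offset";

// Front end: tag an instruction with the MSIL offset it came from.
inline void attachMsilOffset(ToyInst &inst, uint32_t ilOffset) {
  inst.metadata[kMsilOffsetKind] = ilOffset;
}

// Emission: look the tag up. Absence is legal and must be tolerated,
// because metadata is allowed to disappear along the way.
inline bool lookupMsilOffset(const ToyInst &inst, uint32_t &ilOffset) {
  std::map<std::string, uint32_t>::const_iterator it =
      inst.metadata.find(kMsilOffsetKind);
  if (it == inst.metadata.end())
    return false;
  ilOffset = it->second;
  return true;
}

// Models a transformation that drops all metadata.
inline void stripMetadata(std::vector<ToyInst> &insts) {
  for (std::size_t i = 0; i < insts.size(); ++i)
    insts[i].metadata.clear();
}

// How many instructions would still produce a mapping row at emission.
inline std::size_t countMapped(const std::vector<ToyInst> &insts) {
  std::size_t n = 0;
  uint32_t ignored = 0;
  for (std::size_t i = 0; i < insts.size(); ++i)
    if (lookupMsilOffset(insts[i], ignored))
      ++n;
  return n;
}
```

In the model, stripping leaves the instruction list intact but empties the mapping, which is the trade-off the message points out: the program still works, the side information does not survive.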
Thanks Eric, the pointers are appreciated. ☺

I was a little put off by the documentation for metadata that said it could be dropped by LLVM at any time, as well as the extra assertion that the debug metadata was “special”. Is there a reasonable expectation that added metadata will make it to encode? (Or AsmPrinter? My LLVM jargon isn’t down yet.) Is the debug metadata handled specially such that it has priority over other metadata?

Also, if we go the separate metadata route, we’d need to extend all the debug helpers in the AsmPrinter to extract that data to a special section. Is that what you’re suggesting?

In terms of correctness, not returning the debug info to CoreCLR only impacts debugging. The executable will still run. ☺

Thanks,
-R

From: Eric Christopher [mailto:echristo at gmail.com]
Sent: Wednesday, May 13, 2015 3:41 PM
To: Russell Hadley; Michelle McDaniel; llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Extending AsmPrinterHandler

Hi Russell,

Instead of hiding bits inside the debug metadata, why not just attach additional metadata to each instruction and look for that during emission? You'll probably want to take a look at how things like the vectorizer are taking advantage of metadata if you want to encode things. Then during your front end compilation for MSIL->LLVM IR you can just attach random metadata to some instructions.

This does have the drawback that, theoretically at least, you can strip metadata from LLVM IR and get a working binary. If that's not the case for CoreCLR you might want to look into a way to overload some of the instructions or ... something. Or just require people not delete your metadata I guess.

Does this help?

-eric
On Wed, May 13, 2015 at 3:27 PM, Russell Hadley <rhadley at microsoft.com> wrote:

> (background) The CoreCLR expects a JIT to produce a MSIL bytecode offset
> to code offset mapping annotated with a few extra bits denoting if it’s
> prolog/epilog, or it’s a call, or if there’s operands remaining on the MSIL
> virtual stack in some cases. Our initial prototype has the MSIL offset
> stashed in the line number field. We could stash the extra bits in the
> column info but that’s starting to feel too much like a hack. We’re
> looking for a way to 1) extend the debug metadata to hold our info and get
> it dumped into the in memory object – a new section would be fine if it’s
> not too complicated. Or 2) a place to extract the data we need when we have
> both encoded offset and access to the instructions. We’re looking for some
> advice. ☺

This sounds like it's not really debug info so much as a description of the stack frame that is required for correctness, like CFI (call frame info that describes prologues and epilogues) and EH action tables. You probably want to subclass AsmPrinterHandler and hook that into the pipeline along with EH and debug info generation. Today this requires upstream modification, but the actual pass code can live wherever you want. Take a look at how Win64Exception.cpp and others are emitting things like the ip2state table for __CxxFrameHandler3.

Long term, if you want to 100% guarantee that the MSIL offset is preserved through LLVM optimizations, I think we need some other solution. Philip Reames was describing a similar problem, and I was thinking that we should have a way to tack semantically important data onto a function call like this. The best solution I could come up with using existing tools was to use an invoke that unwinds to an artificial landing pad that ends in unreachable and contains the preserved data in its clause operands. LLVM optimizers will only merge such calls if the landingpad destinations are the same, and they can't merge landingpads with different clauses.

Alternatively, it occurs to me that call sites support attributes, which are different from metadata in that they are semantically important. Optimizations cannot remove them. Maybe what we need is just an attribute on the call site?

Hope that helps. :)
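The first suggestion above (a handler hooked into emission that builds its own table as instructions go by) can be sketched with a self-contained toy. Everything here is an assumption made for illustration: `ToyEmitHandler`, `ToyMachineInst`, `OffsetTableHandler` and `emitFunction` are hypothetical names, the hooks are only loosely modeled on the begin/end callback style of an emission handler, and instruction sizes are supplied by hand rather than computed by an encoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy machine instruction; the fields are hypothetical.
struct ToyMachineInst {
  uint32_t size;      // encoded size in bytes (given, not computed)
  bool isCall;
  bool hasIlOffset;   // false when no MSIL offset survived to emission
  uint32_t ilOffset;
};

// Toy handler interface with begin/end style hooks (names assumed).
struct ToyEmitHandler {
  virtual ~ToyEmitHandler() {}
  virtual void beginFunction(const std::string &name) = 0;
  virtual void beginInstruction(uint32_t nativeOffset,
                                const ToyMachineInst &mi) = 0;
  virtual void endFunction(uint32_t codeSize) = 0;
};

struct MapRow {
  uint32_t nativeOffset;
  uint32_t ilOffset;
  bool isCall;
};

// Collects one table row per instruction that still carries an MSIL offset.
class OffsetTableHandler : public ToyEmitHandler {
public:
  OffsetTableHandler() : codeSize_(0) {}
  void beginFunction(const std::string &) override {
    rows_.clear();
    codeSize_ = 0;
  }
  void beginInstruction(uint32_t nativeOffset,
                        const ToyMachineInst &mi) override {
    if (mi.hasIlOffset)
      rows_.push_back({nativeOffset, mi.ilOffset, mi.isCall});
  }
  void endFunction(uint32_t codeSize) override { codeSize_ = codeSize; }
  const std::vector<MapRow> &rows() const { return rows_; }
  uint32_t codeSize() const { return codeSize_; }

private:
  std::vector<MapRow> rows_;
  uint32_t codeSize_;
};

// Toy driver: walks the instructions, tracks the running code offset,
// and notifies every registered handler.
inline void emitFunction(const std::string &name,
                         const std::vector<ToyMachineInst> &insts,
                         const std::vector<ToyEmitHandler *> &handlers) {
  for (std::size_t h = 0; h < handlers.size(); ++h)
    handlers[h]->beginFunction(name);
  uint32_t offset = 0;
  for (std::size_t i = 0; i < insts.size(); ++i) {
    for (std::size_t h = 0; h < handlers.size(); ++h)
      handlers[h]->beginInstruction(offset, insts[i]);
    offset += insts[i].size;
  }
  for (std::size_t h = 0; h < handlers.size(); ++h)
    handlers[h]->endFunction(offset);
}
```

The point of the design is that the handler sees each instruction together with its final code offset, which is the combination the original question asked for. What the toy does not address is the long-term concern in the message: whether the MSIL offset reaches emission at all after optimization.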
On 05/14/2015 03:31 PM, Reid Kleckner wrote:

> This sounds like it's not really debug info so much as a description
> of the stack frame that is required for correctness, like CFI (call
> frame info that describes prologues and epilogues) and EH action
> tables. You probably want to subclass AsmPrinterHandler and hook that
> into the pipeline along with EH and debug info generation. Today this
> requires upstream modification, but the actual pass code can live
> wherever you want. Take a look at how Win64Exception.cpp and others
> are emitting things like the ip2state table for __CxxFrameHandler3.
>
> Long term, if you want to 100% guarantee that the MSIL offset is
> preserved through LLVM optimizations, I think we need some other
> solution. Philip Reames was describing a similar problem, and I was
> thinking that we should have a way to tack semantically important data
> onto a function call like this. The best solution I could come up with
> using existing tools was to use an invoke that unwinds to an
> artificial landing pad that ends in unreachable and contains the
> preserved data in its clause operands. LLVM optimizers will only merge
> such calls if the landingpad destinations are the same, and it can't
> merge landingpads with different clauses.
>
> Alternatively, it occurs to me that call sites support attributes,
> which are different from metadata in that they are semantically
> important. Optimizations cannot remove them. Maybe what we need is
> just an attribute on the call site?
>
> Hope that helps. :)

FYI, if these are semantically important (and not just debug info), using metadata is a really bad idea.

We've got a similar problem with information required to support deoptimization and have local changes which mostly solve it. I hope to eventually get that upstreamed, but we're not particularly happy with what we've got at the moment and are in the process of a rewrite.

If you're interested, I can try to do that rewrite upstream. If I do, it'll be with the caveat that the code upstreamed will be *extremely* experimental and likely to change radically over time.

Philip