Filip Pizlo
2013-Oct-22  22:08 UTC
[LLVMdev] [RFC] Stackmap and Patchpoint Intrinsic Proposal
On Oct 22, 2013, at 1:48 PM, Philip R <listmail at philipreames.com> wrote:

> On 10/22/13 10:34 AM, Filip Pizlo wrote:
>> On Oct 22, 2013, at 9:53 AM, Philip R <listmail at philipreames.com> wrote:
>>
>>> On 10/17/13 10:39 PM, Andrew Trick wrote:
>>>> This is a proposal for adding Stackmaps and Patchpoints to LLVM. The
>>>> first client of these features is the JavaScript compiler within the
>>>> open source WebKit project.
>>>>
>>> I have a couple of comments on your proposal. None of these are major enough to prevent submission.
>>>
>>> - As others have said, I'd prefer an experimental namespace rather than a webkit namespace. (minor)
>>> - Unless I am misreading your proposal, your proposed StackMap intrinsic duplicates existing functionality already in llvm. In particular, much of the StackMap construction seems similar to the Safepoint mechanism used by the in-tree GC support. (See CodeGen/GCStrategy.cpp and CodeGen/GCMetadata.cpp). Have you examined these mechanisms to see if you can share implementations?
>>> - To my knowledge, there is nothing that prevents an LLVM optimization pass from manufacturing new pointers which point inside an existing data structure. (e.g. an interior pointer to an array when blocking a loop) Does your StackMap mechanism need to be able to inspect/modify these manufactured temporaries? If so, I don't see how you could generate an intrinsic which would include this manufactured pointer in the live variable list. Is there something I'm missing here?
>> These stackmaps have nothing to do with GC. Interior pointers are a problem unique to precise copying collectors.
> I would argue that while the use of the stack maps might be different, the mechanism is fairly similar.

It's not at all similar. These stackmaps are only useful for deoptimization, since the only way to make use of the live state information is to patch the stackmap with a jump to a deoptimization off-ramp. You won't use these for a GC.

> In general, if the expected semantics are the same, a shared implementation would be desirable. This is more a suggestion for future refactoring than anything else.

I think that these stackmaps and GC stackmaps are fairly different beasts. While it's possible to unify the two, this isn't the intent here. In particular, you can use these stackmaps for deoptimization without having to unwind the stack.

>
> I agree that interior pointers are primarily a problem for relocating collectors. (Though I disagree with the characterization of it being *uniquely* a problem for such collectors.) Since I was unaware of what you're using your stackmap mechanism for, I wanted to ask. Sounds like this is not an intended use case for you.
>>
>> In particular, the stackmaps in this proposal are likely to be used for capturing only a select subset of state and that subset may fail to include all possible GC roots. These stackmaps are meant to be used for reconstructing state-in-bytecode (where bytecode = whatever your baseline execution engine is, could be an AST) for performing a deoptimization, if LLVM was used for compiling code that had some type/value/behavior speculations.
> Thanks for the clarification. This is definitely a useful mechanism. Thank you for contributing it back.
>>
>>> - Your patchpoint mechanism appears to be one very specialized use of a patchable location. Would you mind renaming it to something like patchablecall to reflect this specialization?
>> The top use case will be heap access dispatch inline cache, which is not a call.
>> You can also use it to implement call inline caches, but that's not the only thing you can use it for.
> Er, possibly I'm misunderstanding you. To me, an inline call cache is a mechanism to optimize a dynamic call by adding a typecheck+directcall fastpath.

Inline caches don't have to be calls. For example, in JavaScript, the expression "o.f" is fully dynamic but usually does not result in a call. The inline cache - and hence patchpoint - for such an expression will not have a call in the common case.

Similar things arise in other dynamic languages. You can have inline caches for arithmetic. Or for array accesses. Or for any other dynamic operation in your language.

> (i.e. avoiding the dynamic dispatch logic in the common case) I'm assuming this is what you mean with the term "call inline cache", but I have never heard of a "heap access dispatch inline cache". I've done a google search and didn't find a definition. Could you point me to a reference or provide a brief explanation?

Every JavaScript engine does it, and usually the term "inline cache" in the context of JS engines implies dispatching on the shape of the object in order to find the offset at which a field is located, rather than dispatching on the class of an object to determine what method to call.

-Filip

>
> Philip
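To make the "dispatch on the shape of the object" idea concrete, here is a minimal sketch of a property-access inline cache for a site such as "o.f". It is only a model, not WebKit's or LLVM's implementation: the names Shape, Obj and GetByIdCache are hypothetical, and "patching" is simulated by overwriting two fields rather than rewriting machine code at a patchpoint.

```python
class Shape:
    """Hypothetical object shape: maps field names to slot offsets."""
    def __init__(self, fields):
        self.offsets = {name: i for i, name in enumerate(fields)}


class Obj:
    """An object is a shape plus a flat array of slots."""
    def __init__(self, shape, slots):
        self.shape = shape
        self.slots = list(slots)


class GetByIdCache:
    """One cache per access site; it starts out unpatched."""
    def __init__(self, name):
        self.name = name
        self.cached_shape = None
        self.cached_offset = None
        self.slow_hits = 0

    def get(self, obj):
        # Fast path: one shape check and one load, no call involved.
        if obj.shape is self.cached_shape:
            return obj.slots[self.cached_offset]
        return self._slow_path(obj)

    def _slow_path(self, obj):
        # Full dynamic lookup, then "patch" the site for the shape just seen.
        self.slow_hits += 1
        offset = obj.shape.offsets[self.name]
        self.cached_shape = obj.shape
        self.cached_offset = offset
        return obj.slots[offset]


point = Shape(["x", "f"])
site = GetByIdCache("f")
a = Obj(point, [1, 2])
b = Obj(point, [3, 4])
first = site.get(a)    # misses, takes the slow path and patches the site
second = site.get(b)   # same shape, served by the fast path
```

The point of the sketch is that the common case contains no call at all, which is why a call-specific name would not describe this use.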
Andrew Trick
2013-Oct-22  23:18 UTC
[LLVMdev] [RFC] Stackmap and Patchpoint Intrinsic Proposal
On Oct 22, 2013, at 3:08 PM, Filip Pizlo <fpizlo at apple.com> wrote:

> On Oct 22, 2013, at 1:48 PM, Philip R <listmail at philipreames.com> wrote:
>
>> I would argue that while the use of the stack maps might be different, the mechanism is fairly similar.
>
> It's not at all similar. These stackmaps are only useful for deoptimization, since the only way to make use of the live state information is to patch the stackmap with a jump to a deoptimization off-ramp. You won't use these for a GC.
>
>> In general, if the expected semantics are the same, a shared implementation would be desirable. This is more a suggestion for future refactoring than anything else.
>
> I think that these stackmaps and GC stackmaps are fairly different beasts. While it's possible to unify the two, this isn't the intent here. In particular, you can use these stackmaps for deoptimization without having to unwind the stack.

I think Philip R is asking a good question. To paraphrase: If we introduce a generically named feature, shouldn’t it be generically useful? Stack maps are used in other ways, and there are other kinds of patching. I agree and I think these are intended to be generically useful features, but not necessarily sufficient for every use.

The proposed stack maps are very different from LLVM’s gcroot because gcroot does not provide stack maps! llvm.gcroot effectively designates a stack location for each root for the duration of the current function, and forces the root to be spilled to the stack at all call sites (the client needs to disable StackColoring). This is really the opposite of a stack map and I’m not aware of any functionality that can be shared. It also requires a C++ plugin to process the roots. llvm.stackmap generates data in a section that MCJIT clients can parse.

If someone wanted to use stack maps for GC, I don’t know why they wouldn’t leverage llvm.stackmap. Maybe Filip can see a problem with this that I can't. The runtime can add GC roots to the stack map just like any other live value, and it should know how to interpret the records. The intrinsic doesn’t bake in any particular interpretation of the mapped values. That said, my proposal deliberately does not cover GC. I think that stack maps are the easy part of the problem. The hard problem is tracking interior pointers, or for that matter exterior/out-of-bounds or swizzled pointers. LLVM’s machine IR simply doesn’t have the necessary facilities for doing this. But if you don’t need a moving collector, then you don’t need to track derived pointers as long as the roots are kept live. In that case, llvm.stackmap might be a nice optimization over llvm.gcroot.

Now with regard to patching. I think llvm.patchpoint is generally useful for any type of patching I can imagine. It does look like a call site in IR, and it’s nice to be able to leverage calling conventions to inform the location of arguments. But the patchpoint does not have to be a call after patching, and you can specify zero arguments to avoid using a calling convention. In fact, we only currently emit a call out of convenience. We could splat nops in place and assume the runtime will immediately find and patch all occurrences before the code executes. In the future we may want to handle a NULL call target, bypass call emission, and allow the reserved bytes to be fewer than required to emit a call.

-Andy
On 10/22/13 3:08 PM, Filip Pizlo wrote:

> On Oct 22, 2013, at 1:48 PM, Philip R <listmail at philipreames.com> wrote:
>
>> In general, if the expected semantics are the same, a shared
>> implementation would be desirable. This is more a suggestion for
>> future refactoring than anything else.
>
> I think that these stackmaps and GC stackmaps are fairly different
> beasts. While it's possible to unify the two, this isn't the intent
> here. In particular, you can use these stackmaps for deoptimization
> without having to unwind the stack.

I'm going to respond to Andrew Trick's followup for this portion.

>> Er, possibly I'm misunderstanding you. To me, an inline call cache is
>> a mechanism to optimize a dynamic call by adding a
>> typecheck+directcall fastpath.
>
> Inline caches don't have to be calls. For example, in JavaScript, the
> expression "o.f" is fully dynamic but usually does not result in a
> call. The inline cache - and hence patchpoint - for such an
> expression will not have a call in the common case.
>
> Similar things arise in other dynamic languages. You can have inline
> caches for arithmetic. Or for array accesses. Or for any other
> dynamic operation in your language.
>
>> (i.e. avoiding the dynamic dispatch logic in the common case) I'm
>> assuming this is what you mean with the term "call inline cache", but I
>> have never heard of a "heap access dispatch inline cache". I've done
>> a google search and didn't find a definition. Could you point me to
>> a reference or provide a brief explanation?
>
> Every JavaScript engine does it, and usually the term "inline cache"
> in the context of JS engines implies dispatching on the shape of the
> object in order to find the offset at which a field is located, rather
> than dispatching on the class of an object to determine what method to
> call.

Thank you for the clarification. I am familiar with the patching optimizations performed for property access, but had not been aware of the modified usage of the term "inline cache". I was also unaware of the term "heap access dispatch inline cache". I believe I now understand your intent.

Taking a step back in the conversation, my original question was about the naming of the patchpoint intrinsic. I am now convinced that you could use your patchpoint intrinsic for a number of different inline caching schemes (method dispatch, property access, etc.). Given that, my concern about naming is diminished, but not completely eliminated.

I don't really have a suggestion for a better name, but given that a "stackmap" intrinsic can be patched, the "patchpoint" intrinsic name doesn't seem particularly descriptive. To put it another way, how are the stackmap and patchpoint intrinsics different? Can this difference be encoded in a descriptive name for one or the other?

As a secondary point, it would be good to update the proposed documentation with a brief description of the intended usage (i.e. inline caching). This might prevent a future developer from being confused on the same issues.

Yours,
Philip
Filip Pizlo
2013-Oct-23  01:23 UTC
[LLVMdev] [RFC] Stackmap and Patchpoint Intrinsic Proposal
On Oct 22, 2013, at 4:18 PM, Andrew Trick <atrick at apple.com> wrote:

> The proposed stack maps are very different from LLVM’s gcroot because gcroot does not provide stack maps! llvm.gcroot effectively designates a stack location for each root for the duration of the current function, and forces the root to be spilled to the stack at all call sites (the client needs to disable StackColoring). This is really the opposite of a stack map and I’m not aware of any functionality that can be shared. It also requires a C++ plugin to process the roots. llvm.stackmap generates data in a section that MCJIT clients can parse.
>
> If someone wanted to use stack maps for GC, I don’t know why they wouldn’t leverage llvm.stackmap. Maybe Filip can see a problem with this that I can't.

You're right, it could work. If you were happy with spilling all of your GC roots, then you could put them into allocas and then pass the allocas' addresses to a stackmap. This will give you an FP offset for each of the roots. If you were happy with an accurate GC that couldn't move objects referenced from the stack, then you could have each safepoint call use patchpoint, and then if you also implemented stack unwinding, you could use the patchpoints' implicit stackmaps to figure out which registers (or stack slots) contained pointers.

These would be niche uses, I think. If you care about performance then you're not going to use an accurate GC that requires spilling roots; you'll go for some GC algorithm that can handle conservative stack roots. If you're using accurate GC support for moving objects then it's usually because you need to move *all* objects (after all, you can move *most* objects without any GC roots or stackmaps by using Bartlett's algorithm or similar), so the calls-as-patchpoints approach won't work. I could kind of see some real-time GCs using the alloca+stackmap approach, but it's a bit of a stretch.

So, I don't see stackmaps as being particularly practical for accurate GC, but I do concede that you *could* implement some kind of accurate GC that uses stackmaps for some part of its stack scanning.
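The alloca+stackmap idea discussed in this exchange (spill every root and record its frame-pointer offset per safepoint) can be sketched as a root scan over unwound frames. This is only an illustration under stated assumptions: ROOT_OFFSETS, scan_roots, the addresses, and the dictionary standing in for memory are all hypothetical, and nothing here reflects the actual record layout emitted by llvm.stackmap.

```python
# Hypothetical table: safepoint return address -> FP-relative offsets of
# the spilled GC roots that are live at that safepoint.
ROOT_OFFSETS = {
    0x4010: [-8, -24],
    0x4030: [-16],
}


def scan_roots(memory, frames):
    """Collect non-null root values from every frame on the stack.

    memory -- dict mapping addresses to words (a stand-in for real memory)
    frames -- list of (return_pc, frame_pointer) pairs produced by an
              unwinder, which this sketch assumes already exists
    """
    roots = []
    for pc, fp in frames:
        for offset in ROOT_OFFSETS.get(pc, []):
            value = memory[fp + offset]
            if value != 0:  # skip null roots
                roots.append(value)
    return roots


memory = {0x7000 - 8: 0xA0, 0x7000 - 24: 0, 0x7100 - 16: 0xB0}
frames = [(0x4010, 0x7000), (0x4030, 0x7100)]
roots = scan_roots(memory, frames)
```

Because every root has to live in a stack slot for this to work, the sketch also shows the cost mentioned in the thread: the roots are spilled at each safepoint rather than kept in registers.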
Adding Gael as someone who has previously discussed vmkit topics on the list. Since I'm assuming this is where the GC support came from, I wanted to draw this conversation to the attention of someone more familiar with the LLVM implementation than myself.

On 10/22/13 4:18 PM, Andrew Trick wrote:

> I think Philip R is asking a good question. To paraphrase: If we
> introduce a generically named feature, shouldn’t it be generically
> useful? Stack maps are used in other ways, and there are other kinds
> of patching. I agree and I think these are intended to be generically
> useful features, but not necessarily sufficient for every use.

Thank you for the restatement. You summarized my view well.

> The proposed stack maps are very different from LLVM’s gcroot because
> gcroot does not provide stack maps! llvm.gcroot effectively designates
> a stack location for each root for the duration of the current
> function, and forces the root to be spilled to the stack at all call
> sites (the client needs to disable StackColoring). This is really the
> opposite of a stack map and I’m not aware of any functionality that
> can be shared. It also requires a C++ plugin to process the roots.
> llvm.stackmap generates data in a section that MCJIT clients can parse.

Er, I think we're talking past each other again. Let me lay out my current understanding of the terminology and existing infrastructure in LLVM. Please correct me where I go wrong.

stack map - A mapping from "values" to storage locations. Storage locations primarily take the form of registers or stack offsets, but could in principle refer to other well known locations (i.e. offsets into thread local state). A stack map is specific to a particular PC and describes the state at that instruction only.

In a precise garbage collector, stack maps are used to ensure that the stack can be understood by the collector. When a stop-the-world safepoint is reached, the collector needs to be able to identify any pointers to heap objects which may exist on the stack. This explicitly includes both the frame which actually contains the safepoint and any caller frames back to the root of the thread. To accomplish this, a stack map is generated at any call site and a stack map is generated for the safepoint itself.

In LLVM currently, the GCStrategy records "safepoints" which are really points at which stack maps need to be remembered. (i.e. calls and actual stop-the-world safepoints) The GCMetadata mechanism gives a generic way to emit the binary encoding of a stack map in a collector specific way. The current stack maps supported by this mechanism only allow abstract locations on the stack, which forces all registers to be spilled around "safepoints" (i.e. calls and stop-the-world safepoints). Also, the set of roots (which are recorded in the stack map) must be provided separately using the gcroot intrinsic.

In code:
- GCPoint in llvm/include/llvm/CodeGen/GCMetadata.h describes a request for a location with a stack map. The SafePoints structure in GCFunctionInfo contains a list of these locations.
- The OCaml GC is probably the best example of usage. See llvm/lib/CodeGen/AsmPrinter/OcamlGCPrinter.cpp

Note: The summary of existing LLVM details above is based on reading the code. I haven't actually implemented anything which used this mechanism yet. As such, take it with a grain of salt.

In your change, you are adding a mechanism which is intended to enable runtime calls and inline cache patching. (Right?) Your stack maps seem to match the definition of a stack map I gave above and (I believe) the implementation currently in LLVM. The only difference might be that your stack maps are partial (i.e. might not contain all "values" which are live at a particular PC) and your implementation includes Register locations, which the current implementation in LLVM does not. One other possible difference: are you intending to include "values" which aren't of pointer type?

Before moving on, am I interpreting your proposal and changes correctly?

Assuming I'm still correct so far, how might we combine these implementations? It looks like your implementation is much more mature than what exists in tree at the moment. One possibility would be to express the needed GC stack maps in terms of your new infrastructure. (i.e. convert a GCStrategy request for a safepoint into a StackMap (as you've implemented it) with the list of explicit GC roots as its arguments). What would you think of this?

p.s. This discussion has gotten sufficiently abstract that it should in no way block your plan to submit these changes. I appreciate your willingness to discuss.

> If someone wanted to use stack maps for GC, I don’t know why they
> wouldn’t leverage llvm.stackmap. Maybe Filip can see a problem with
> this that I can't. The runtime can add GC roots to the stack map just
> like other live value, and it should know how to interpret the
> records. The intrinsic doesn’t bake in any particular interpretation
> of the mapped values.

I think this is a restatement of my last paragraph above, which would mean we're actually in agreement.

> That said, my proposal deliberately does not cover GC. I think that
> stack maps are the easy part of the problem. The hard problem is
> tracking interior pointers, or for that matter exterior/out-of-bounds
> or swizzled pointers. LLVM’s machine IR simply doesn’t have the
> necessary facilities for doing this. But if you don’t need a moving
> collector, then you don’t need to track derived pointers as long as
> the roots are kept live. In that case, llvm.stackmap might be a nice
> optimization over llvm.gcroot.

Oddly enough, I'll be raising the issue of how to go about supporting a relocating collector on list shortly. We've been looking into this independently, but are at the point where we'd like to get feedback from others. :)

> Now with regard to patching. I think llvm.patchpoint is generally
> useful for any type of patching I can imagine. It does look like a
> call site in IR, and it’s nice to be able to leverage calling
> conventions to inform the location of arguments.

Agreed. My concern is mostly about naming and documentation of intended usages. Speaking as someone who's likely to be using this in the very near future, I'd like to make sure I understand how you intend it to be used. The last thing I want to do is misconstrue your intent and become reliant on a quirk of the implementation you later want to change.

> But the patchpoint does not have to be a call after patching, and you
> can specify zero arguments to avoid using a calling convention.

Er, not quite true. Your calling convention also influences what registers stay live across the call. But in general, I see your point.

(Again, this is touching an area of LLVM I'm not particularly familiar with.)

> In fact, we only currently emit a call out of convenience. We could
> splat nops in place and assume the runtime will immediately find and
> patch all occurrences before the code executes. In the future we may
> want to handle NULL call target, bypass call emission, and allow the
> reserved bytes to be less than that required to emit a call.

If you were to do that, how would the implementation be different than the new stackmap intrinsic? Does that difference imply a clarification in intended usage or naming?

p.s. The naming discussion has gotten rather abstract and is starting to feel like a "what color is the bikeshed" discussion. Feel free to just tell me to go away at some point. :)

Philip
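The definition of a stack map given in this message (a per-PC mapping from values to registers or stack offsets) and the deoptimization use described earlier in the thread can be sketched together. Everything named here is an assumption made for illustration: Location, STACK_MAPS, reconstruct_state, the register name and the addresses are hypothetical and do not describe the encoding LLVM emits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    kind: str      # "register" or "stack"
    where: object  # register name, or FP-relative byte offset


# Hypothetical stack maps, keyed by the PC they describe. A map may be
# partial: it lists only the values the runtime asked to have recorded.
STACK_MAPS = {
    0x1234: {
        "local0": Location("register", "rax"),
        "local1": Location("stack", -16),
    },
}


def read_location(loc, registers, memory, fp):
    """Fetch the value stored at one recorded location."""
    if loc.kind == "register":
        return registers[loc.where]
    if loc.kind == "stack":
        return memory[fp + loc.where]
    raise ValueError("unknown location kind: " + loc.kind)


def reconstruct_state(pc, registers, memory, fp):
    """Rebuild baseline-engine state from the machine state at one PC."""
    stack_map = STACK_MAPS[pc]
    return {name: read_location(loc, registers, memory, fp)
            for name, loc in stack_map.items()}


state = reconstruct_state(0x1234, {"rax": 7}, {0x8000 - 16: 42}, 0x8000)
```

Only the frame containing the PC is consulted, which matches the earlier remark that deoptimization does not require unwinding the stack, whereas a collector would have to repeat the lookup for every caller frame.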