Kavon Farvardin via llvm-dev
2017-Apr-17 23:04 UTC
[llvm-dev] [RFC] Adding CPS call support
> Is there a reason you can't use the algorithm from the paper "A Correspondence between Continuation Passing Style and Static Single Assignment Form" to convert your IR to LLVM's SSA IR?

Yes, there are a few reasons.

Undoing the CPS transformation earlier in the pipeline would mean that we are using LLVM's built-in stack. The special layout and usage of the stack in GHC is achieved through CPS, so it is baked into the compiler and the garbage-collected runtime system.

~kavon

> On Apr 17, 2017, at 8:56 PM, Manuel Jacob <me at manueljacob.de> wrote:
>
> Hi Kavon,
>
> Is there a reason you can't use the algorithm from the paper "A Correspondence between Continuation Passing Style and Static Single Assignment Form" to convert your IR to LLVM's SSA IR?
>
> -Manuel
>
> On 2017-04-17 17:30, Kavon Farvardin via llvm-dev wrote:
>> Summary
>> =======
>> There is a need for dedicated continuation-passing style (CPS) calls in LLVM to
>> support functional languages. Herein I describe the problem and propose a
>> solution. Feedback and/or tips are greatly appreciated, as our goal is to
>> implement these changes so they can be merged into LLVM trunk.
>>
>> Problem
>> =======
>> Implementations of functional languages like Haskell and ML (e.g., GHC and
>> Manticore) use a continuation-passing style (CPS) transformation in order to
>> manage the call stack explicitly. This is done prior to generating LLVM IR, so
>> the implicit call stack within LLVM is not used for call and return.
>>
>> When making a non-tail call while in CPS, we initialize a stack frame for the
>> return through our own stack pointer, and then pass that stack pointer to the
>> callee when we jump to it. It is here that we run into a problem in LLVM.
>> Consider the following CPS call to @bar and how it will return:
>>
>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>> define void @foo (i8** %sp, ...) {
>> someBlk:
>>   ; ...
>>   ; finish the stack frame by writing the return address
>>   %retAddr = blockaddress(@foo, %retpt)
>>   store i8* %retAddr, i8** %sp
>>   ; jump to @bar
>>   tail call void @bar(i8** %sp, ...)
>>
>> retpt:                       ; <- how can @bar "call" %retpt?
>>   %sp2 = ???
>>   %val = ???
>>   ; ...
>> }
>>
>> define void @bar (i8** %sp, ...) {
>>   ; perform a return
>>   %retAddr0 = load i8*, i8** %sp
>>   %retAddr1 = bitcast i8* %retAddr0 to void (i8**, i64)*
>>   %val = bitcast i64 1 to i64
>>   ; jump back to %retpt in @foo, passing %sp and %val
>>   tail call void %retAddr1(i8** %sp, i64 %val)
>> }
>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>>
>> There is currently no way to jump back to %retpt from another function, as block
>> addresses have restricted usage in LLVM [1]. Our main difficulty is that we
>> cannot jump to a block address without knowing its calling convention, i.e., the
>> particular machine registers (or memory locations) in which the block expects
>> incoming values to be passed.
>>
>> The workaround we have been using in GHC for LLVM is to break apart every
>> function, placing the code for the continuation of each call into a new
>> function. We do this only so that we can store a function pointer instead of a
>> block address on our stack. This particularly gross transformation inhibits
>> optimizations in both GHC and LLVM, and we would like to remove the need for it.
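>>
>> To make that concrete, here is a rough sketch of what the split looks like for
>> the example above (the name @foo_retpt is made up for illustration):
>>
>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>> define void @foo (i8** %sp, ...) {
>> someBlk:
>>   ; ...
>>   ; instead of a block address, store the address of a top-level
>>   ; function that now holds the old %retpt code
>>   store i8* bitcast (void (i8**, i64)* @foo_retpt to i8*), i8** %sp
>>   ; jump to @bar
>>   tail call void @bar(i8** %sp, ...)
>>   ret void
>> }
>>
>> ; the continuation of the call, split out of @foo; any values live
>> ; across the call now have to reach it through our stack frame
>> define void @foo_retpt (i8** %sp2, i64 %val) {
>>   ; ... old code of %retpt ...
>>   ret void
>> }
>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;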
>>
>> Proposal
>> ========
>> I believe the lowest-impact method of fixing this problem with LLVM is the
>> following:
>>
>> First, we add a special 'cps' call instruction marker to be used on non-tail
>> calls. Then, we use a specialized calling convention for these non-tail calls,
>> which fixes the returned values to specific locations in the machine code [2].
>>
>> To help illustrate what's going on, let's rewrite the above example using the
>> proposed 'cps' call:
>>
>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>> define { ... } @foo (i8** %sp, ...) {
>> someBlk:
>>   ; ...
>>   ; finish the stack frame by writing the return address
>>   %retAddr = blockaddress(@foo, %retpt)
>>   store i8* %retAddr, i8** %sp
>>   ; jump to @bar
>>   %retVals = cps call ghccc {i8**, i64} @bar (i8** %sp, ...)
>>   br label %retpt
>>
>> retpt:
>>   %sp2 = extractvalue {i8**, i64} %retVals, 0
>>   %val = extractvalue {i8**, i64} %retVals, 1
>>   ; ...
>> }
>>
>> define {i8**, i64} @bar (i8** %sp, ...) {
>>   ; perform a return
>>   %retAddr0 = load i8*, i8** %sp
>>   %retAddr1 = bitcast i8* %retAddr0 to {i8**, i64} (i8**, i64)*
>>   %val = bitcast i64 1 to i64
>>   ; jump back to %retpt in @foo, passing %sp and %val
>>   tail call ghccc void %retAddr1(i8** %sp, i64 %val)
>>   unreachable  ; <- ideally this would be our terminator,
>>                ;    but 'unreachable' breaks TCO, so we will
>>                ;    emit a ret of the struct "returned" by the call.
>> }
>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>>
>> The important point here is that the 'cps'-marked call will lower to a jump. The
>> 'cps' call marker means that the callee knows how to return using the arguments
>> explicitly passed to it, i.e., the stack pointer %sp. The callee cannot use a
>> 'ret' instruction if it is 'cps' called.
>>
>> Either before or during 'cps' call lowering, any instructions following the
>> 'cps' call to @bar are sunk into the block %retpt, and the unconditional
>> branch to %retpt is deleted/ignored. We include that branch to preserve
>> control-flow information for LLVM IR optimization passes.
>>
>> The 'extractvalue' instructions are what ensure the calling convention of
>> %retpt, since the fields of the struct %retVals are returned in physical
>> registers dictated by the (modified) ghccc convention. Those same physical
>> registers are used by the ghccc tail call in @bar when it jumps back to %retpt.
>> So, the call & return convention of ghccc ensures that everything matches up.
>>
>> Interaction with LLVM
>> =====================
>>
>> (1) Caller-saved Values
>>
>> One may wonder how this would work if there are caller-saved values of the 'cps'
>> call. But in our first example, which closely matches what CPS code looks like,
>> the call to @bar was in tail position. Thus, in the second example, there are no
>> caller-saved values for the 'cps' call to @bar, as all live values were passed
>> as arguments in the call.
>>
>> This caller-saved part is a bit subtle. It works fine in my experience [2] when
>> @bar is a function not visible to LLVM. My impression is that even if @bar is
>> visible to LLVM, there is still no issue, but if you can think of any corner
>> cases, that would be great!
>>
>> (2) Inlining
>>
>> My gut feeling is that we cannot inline a 'cps'-marked call site without more
>> effort. This is because we might end up with something odd like this once the
>> dust settles:
>>
>>   %retAddr = blockaddress(@foo, %retpt)
>>   %retAddr1 = bitcast i8* %retAddr to {i8**, i64} (i8**, i64)*
>>   tail call ghccc %retAddr1 ( %sp, ... )
>>
>> We could add a pass that turns the above sequence into just an unconditional
>> branch to %retpt, using a phi-node to replace each 'extractvalue' instruction in
>> that block.
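>>
>> Roughly, that fix-up would leave behind something like this (a sketch only; the
>> block name %bar.exit is made up):
>>
>> bar.exit:
>>   ; was: tail call ghccc %retAddr1 ( %sp, ... )
>>   br label %retpt
>>
>> retpt:
>>   ; phi-nodes standing in for the 'extractvalue's of %retVals
>>   %sp2 = phi i8** [ %sp, %bar.exit ]
>>   %val = phi i64 [ 1, %bar.exit ]
>>   ; ...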
>>
>> I'm not sure whether inlining in LLVM is important for us yet, as we tend to
>> inline quite a lot before generating LLVM IR. I don't think this additional
>> fix-up pass would be too difficult to implement if it's desired.
>>
>> Implementation Sketch and Conclusion
>> ====================================
>> My current plan is to add this special lowering of 'cps' calls during the
>> translation from LLVM IR to the SelectionDAG. I welcome any suggestions or tips
>> on the best way to approach this. An important goal for us is to merge this into
>> trunk, since we do not want to bundle a special version of LLVM with GHC.
>>
>> Please let me know soon if you have any objections to this feature.
>>
>> Thanks for reading,
>> Kavon
>>
>> References
>> ==========
>> [1] http://llvm.org/docs/LangRef.html#blockaddress
>> [2] http://kavon.farvard.in/papers/ml16-cwc-llvm.pdf
Philip Reames
2017-Apr-18
[llvm-dev] [RFC] Adding CPS call support
On 04/17/2017 04:04 PM, Kavon Farvardin via llvm-dev wrote:
>> Is there a reason you can't use the algorithm from the paper "A Correspondence between Continuation Passing Style and Static Single Assignment Form" to convert your IR to LLVM's SSA IR?
>
> Yes, there are a few reasons.
>
> Undoing the CPS transformation earlier in the pipeline would mean that we are using LLVM's built-in stack. The special layout and usage of the stack in GHC is achieved through CPS, so it is baked into the compiler and the garbage-collected runtime system.

Can you give a bit more detail here? LLVM does provide support for describing GC frame maps.
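For example, a function can name a GC strategy and register its roots with the gcroot intrinsic, and the backend then tracks which stack slots hold GC pointers; a minimal sketch (the function and variable names are invented, and the statepoint intrinsics are the newer way to get the same information):

declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)

define void @baz(i8* %obj) gc "shadow-stack" {
entry:
  %root = alloca i8*                            ; slot the collector will scan
  call void @llvm.gcroot(i8** %root, i8* null)  ; register it as a root
  store i8* %obj, i8** %root
  ; ... calls that may trigger a collection ...
  ret void
}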
p.s. You're going to have to justify the design of the runtime a bit here. Extending the IR to work around a buggy or poorly structured runtime is not going to be sufficient justification. *Why* does the runtime need the specific runtime stack structure used? What alternatives exist, and why should those be rejected?
Kavon Farvardin via llvm-dev
2017-Apr-18 20:08 UTC
[llvm-dev] [RFC] Adding CPS call support
Before I try to respond to all of these points, let's consider limiting the scope of the proposed changes: instead of adding a 'cps' call marker, what if I were to add a custom lowering (during isel) for calls marked with the 'ghccc' calling convention? There is already language-specific lowering in isel for Swift, so I imagine this would be more acceptable?
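With that narrower change, the second example from my original message would stay the same except there would be no new 'cps' marker; an ordinary non-tail ghccc call would be the cue for the special lowering, roughly:

  ; no 'cps' marker; the non-tail ghccc call itself requests the CPS lowering
  %retVals = call ghccc {i8**, i64} @bar (i8** %sp, ...)
  br label %retpt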
~kavon

> On Apr 18, 2017, at 7:27 PM, Philip Reames <listmail at philipreames.com> wrote:
>
> Can you give a bit more detail here? LLVM does provide support for describing GC frame maps.
>
> p.s. You're going to have to justify the design of the runtime a bit here. Extending the IR to work around a buggy or poorly structured runtime is not going to be sufficient justification. *Why* does the runtime need the specific runtime stack structure used? What alternatives exist, and why should those be rejected?