Joseph Tremoulet via llvm-dev
2015-Dec-04 21:27 UTC
[llvm-dev] Support token type in struct for landingpad
> I don't have a concrete design right now and I am happy to take any other ideas

Three ideas come to mind, none of which are perfect:

1) I'm tempted to say that now that we have token type, landingpad should generally produce a token, the pointer should be extracted with the @llvm.eh.exceptionpointer intrinsic instead of an extractvalue, and the selector should likewise be extracted with a new @llvm.eh.exceptionselector intrinsic instead of extractvalue (and personalities that communicate other things via their landingpads would need to add similar intrinsics to extract them, like the @llvm.eh.exceptioncode intrinsic that SEH uses). But that would require updating all the front-ends generating landingpads, and be awkward for any target personality routines that literally do pass a struct to the landing pad (are there any?), and so probably just reflects my bias coming from dealing with a personality using catchpads/cleanuppads instead of landingpads.

2) Since you're not actually using the landingpad's exception selector nor, if I understand "This is enough to support the gc.statepoint work" correctly, its exception pointer, it's possible that TLI.getExceptionPointerRegister and TLI.getExceptionSelectorRegister should be returning NoRegister for your personality. That would require modifying the EHPersonality enum and corresponding string matching in Analysis/EHPersonalities.h to recognize your personality, but I think that would be fine (it highlights a potential scaling issue if we add lots of targets that each need this, but that's a somewhat independent and pre-existing issue, and in reality I doubt you'd be opening a floodgate here).

3) Maybe the default should be switched, so that TLI.getExceptionPointerRegister and TLI.getExceptionSelectorRegister return NoRegister for EHPersonality::Unknown, and only return actual registers for personalities they recognize. This would require any targets using landingpads with exception pointers / exception selectors to update their code and add themselves to Analysis/EHPersonalities.h, similar to how #2 would require adding your personality, so it seems more disruptive, if conceptually a touch cleaner, than #2.

4) Explicitly checking for token type in visitLandingPad as you suggest sounds okay to me as a pragmatic approach, too.

I'd probably lean toward #2 as being the least disruptive and most explicit/straightforward about the personality's expectations, but I'm curious what others think.

Thanks
-Joseph

From: Chen Li [mailto:meloli87 at gmail.com]
Sent: Thursday, December 3, 2015 4:06 PM
To: David Majnemer <david.majnemer at gmail.com>
Cc: Igor Laevsky <igor at azulsystems.com>; llvm-dev <llvm-dev at lists.llvm.org>; Joseph Tremoulet <jotrem at microsoft.com>
Subject: Re: Support token type in struct for landingpad

Hi David and Joseph,

I’ve just added landingpad with token type locally and changed gc.relocate to work in the following way:

  %0 = invoke token (i64, i32, void (i64 addrspace(1)*)*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidp1i64f(i64 0, i32 0, void (i64 addrspace(1)*)* @some_call, i32 1, i32 0, i64 addrspace(1)* %obj, i32 0, i32 5, i32 0, i32 -1, i32 0, i32 0, i32 0, i64 addrspace(1)* %obj, i64 addrspace(1)* %obj1)
          to label %invoke_safepoint_normal_dest unwind label %exceptional_return

invoke_safepoint_normal_dest:
  ...

exceptional_return:
  %landing_pad = landingpad token
          cleanup
  %obj.relocated1 = call coldcc i64 addrspace(1)* @llvm.experimental.gc.relocate.p1i64(token %landing_pad, i32 13, i32 13)
  %obj1.relocated1 = call coldcc i64 addrspace(1)* @llvm.experimental.gc.relocate.p1i64(token %landing_pad, i32 14, i32 14)
  ret i64 addrspace(1)* %obj1.relocated1

Now gc.statepoint returns a token type instead of an i32 type, and gc.relocate also takes a token type as its first argument (the first argument should either be the corresponding gc.statepoint, for a call statepoint or an invoke statepoint on the normal path, or a reference that could help find the corresponding gc.statepoint on the unwind path). And since landingpad produces a token type here as well, it can be passed as the reference to gc.relocate’s first argument.

To make this work, I have changed two parts of the code. The first is how gc.relocate looks up its corresponding gc.statepoint on the unwind path. It used to use the extracted selector value to find the landingpad and then use the landingpad to find the invoke instruction, which is the gc.statepoint. Now, it can use the landingpad directly to find the invoke instruction.

The second part is to make landingpad work with token type. In LLVM’s front end (passes before SelectionDAG), there are no restrictions on what type a landingpad should have (there are test cases in LLVM that have landingpads of i8 or i32 type). However, SelectionDAGBuilder::visitLandingPad enforces that a landingpad must be two-valued (of type { i8*, i32 }), so that it can handle the exception pointer and selector value inside it. As the first step, I’d like to just add a check to see if the landingpad is of token type, and if so stop there and not bother to create the DAG nodes for the exception pointer and selector value (same as what happens during SjLj exceptions). This is enough to support the gc.statepoint work but will not support C++-style exception handling with gc.statepoint.

As for follow-up work, I’d like to add some support to extract the selector value from a token landingpad. I think we could either do it explicitly in IR (maybe add an intrinsic call extract.selector or something similar) or implicitly during SelectionDAG (in visitLandingPad, check if it’s token type, and if so add an additional transform to extract the exception pointer and selector value from the token). I don't have a concrete design right now and I am happy to take any other ideas.

My plan is to get the first step checked in and incrementally work on the follow-up work. Does that sound like a reasonable approach to you guys?

thanks,
chen

On Dec 2, 2015, at 9:47 AM, Chen Li <meloli87 at gmail.com> wrote:

> On Dec 1, 2015, at 11:14 PM, David Majnemer <david.majnemer at gmail.com> wrote:
>
> > While we support 'opaque' types nested within struct types, they are not exactly battle tested:
> >
> >   $ cat t.ll
> >   %opaque_ty = type opaque
> >
> >   %struct_ty = type { i32, %opaque_ty }
> >
> >   define %struct_ty @f(%struct_ty* %p) {
> >     %load = load %struct_ty, %struct_ty* %p
> >     ret %struct_ty %load
> >   }
> >
> >   $ opt -O2 t.ll -S
> >   lib/IR/DataLayout.cpp:623: unsigned int llvm::DataLayout::getAlignment(llvm::Type *, bool) const: Assertion `Ty->isSized() && "Cannot getTypeInfo() on a type that is unsized!"' failed.
>
> Thanks David! I’ve actually hacked to add token type into struct type and ended up with the same failure as above. I will take a look at the catchpad and cleanuppad code, and create a patch to add token landingpad and have you review it.
>
> thanks,
> chen
>
> > As a practical matter, I fear nesting 'token' types within struct types will have similar issues. Beyond that, the design philosophy behind 'token' is that it is incredibly opaque and permitting it to nest inside a struct creates scenarios where one might try to GEP to the end of the field right before the token field in an attempt to examine or manipulate the token.
> >
> > Your other recommendation, having landingpad produce a token, is quite similar to how we've designed catchpad and cleanuppad. I think that direction will be quite nice.
> >
> > On Tue, Dec 1, 2015 at 8:07 PM, Chen Li <meloli87 at gmail.com> wrote:
> >
> > > Hi David,
> > >
> > > Sorry to bother you, but I would like to get some suggestions on your recent work on token type.
> > >
> > > I’m currently working on changing gc.statepoint to return a token type instead of an i32 type. The reason is that with the current implementation, gc.statepoint could potentially be fed into PHI nodes and break the RewriteStatepointsForGC pass later. Using token type would help us avoid this. I have most of the code working but ran into a problem when gc.statepoint is an InvokeInst and has an unwind path.
> > >
> > > Currently, gc.statepoint of InvokeInst works as below (the code snippet is from test/CodeGen/X86/statepoint-invoke.ll):
> > >
> > >   %0 = invoke i32 (i64, i32, void (i64 addrspace(1)*)*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidp1i64f(i64 0, i32 0, void (i64 addrspace(1)*)* @some_call, i32 1, i32 0, i64 addrspace(1)* %obj, i32 0, i32 5, i32 0, i32 -1, i32 0, i32 0, i32 0, i64 addrspace(1)* %obj, i64 addrspace(1)* %obj1)
> > >           to label %invoke_safepoint_normal_dest unwind label %exceptional_return
> > >
> > > invoke_safepoint_normal_dest:
> > >   …
> > >
> > > exceptional_return:
> > >   %landing_pad = landingpad { i8*, i32 }
> > >           cleanup
> > >   %relocate_token = extractvalue { i8*, i32 } %landing_pad, 1
> > >   %obj.relocated1 = call coldcc i64 addrspace(1)* @llvm.experimental.gc.relocate.p1i64(i32 %relocate_token, i32 13, i32 13)
> > >   %obj1.relocated1 = call coldcc i64 addrspace(1)* @llvm.experimental.gc.relocate.p1i64(i32 %relocate_token, i32 14, i32 14)
> > >   ret i64 addrspace(1)* %obj1.relocated1
> > >
> > > Each gc.relocate needs to take its corresponding gc.statepoint as its first argument. However, on the unwind path, there is no way to get gc.statepoint directly because the return value of the InvokeInst is undefined there. In this scenario, we tie gc.relocate to the landingpad, and use the landingpad to find its unique predecessor to get the corresponding gc.statepoint. We pick the selector value from the landingpad to feed into gc.relocate just because it has the same type (i32) as gc.statepoint's return type. The actual value of the selector doesn’t really matter because gc.relocate only uses it as a reference to find gc.statepoint and does not consume it during lowering.
> > >
> > > However, this will no longer work if we change gc.statepoint's return type to token type. To make it work, I could see two potential approaches: 1) add support for token type inside struct type so that we can have a landingpad with result type of { i8*, token }, or 2) add support for landingpad with a token result type. Approach 1 seems to be easier since all the other parts of statepoint handling do not need to be changed at all, and having a selector of token type also seems reasonable (furthermore, we don’t ever need to extract the selector value to do exception handling in our code base, so I think only supporting token type in struct should be enough for us). Approach 2 requires modifying how gc.relocate looks up its corresponding gc.statepoint through landingpad, but shouldn’t be hard either.
> > >
> > > Does either of the approaches sound reasonable to you? Other ideas are also welcome :)
> > >
> > > Thank you very much!
> > >
> > > Best,
> > > Chen
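The unwind-path lookup described in the quoted messages (gc.relocate holds the landingpad, and the landingpad's unique predecessor ends in the invoke of gc.statepoint) can be sketched as a small standalone C++ model. This is only an illustration under stated assumptions: Instr, Block, and findStatepoint are hypothetical stand-ins invented here, not LLVM classes and not the code from the patch.

```cpp
#include <string>
#include <vector>

// Toy stand-ins for IR objects. These are NOT LLVM classes.
struct Block;

struct Instr {
  std::string name;  // "statepoint", "landingpad", ...
  Block *parent;     // block that contains this instruction
};

struct Block {
  std::vector<Block *> preds;   // predecessor blocks
  Instr *terminator = nullptr;  // last instruction of the block
};

// Given the token operand of a relocate, find the statepoint it refers to.
// Normal path: the token is the statepoint itself.
// Unwind path: the token is a landingpad whose block has exactly one
// predecessor, and that predecessor ends in the invoke of the statepoint.
const Instr *findStatepoint(const Instr *token) {
  if (token->name == "statepoint")
    return token;
  if (token->name != "landingpad")
    return nullptr;
  const Block *pad = token->parent;
  if (pad->preds.size() != 1)
    return nullptr;  // no unique predecessor, so the lookup is ambiguous
  const Instr *term = pad->preds[0]->terminator;
  return (term && term->name == "statepoint") ? term : nullptr;
}
```

With a token-typed landingpad the relocate can hold the landingpad directly, so no extractvalue step is needed before this walk; the model returns nullptr when the landing block does not have exactly one predecessor.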
Chen Li via llvm-dev
2015-Dec-05 05:33 UTC
[llvm-dev] Support token type in struct for landingpad
> On Dec 4, 2015, at 1:27 PM, Joseph Tremoulet <jotrem at microsoft.com> wrote:
>
> > I don't have a concrete design right now and I am happy to take any other ideas
>
> Three ideas come to mind, none of which are perfect:
>
> 1) I'm tempted to say that now that we have token type, landingpad should generally produce a token, the pointer should be extracted with the @llvm.eh.exceptionpointer intrinsic instead of an extractvalue, and the selector should likewise be extracted with a new @llvm.eh.exceptionselector intrinsic instead of extractvalue (and personalities that communicate other things via their landingpads would need to add similar intrinsics to extract them, like the @llvm.eh.exceptioncode intrinsic that SEH uses). But that would require updating all the front-ends generating landingpads, and be awkward for any target personality routines that literally do pass a struct to the landing pad (are there any?), and so probably just reflects my bias coming from dealing with a personality using catchpads/cleanuppads instead of landingpads.
>
> 2) Since you're not actually using the landingpad's exception selector nor, if I understand "This is enough to support the gc.statepoint work" correctly, its exception pointer, it's possible that TLI.getExceptionPointerRegister and TLI.getExceptionSelectorRegister should be returning NoRegister for your personality. That would require modifying the EHPersonality enum and corresponding string matching in Analysis/EHPersonalities.h to recognize your personality, but I think that would be fine (it highlights a potential scaling issue if we add lots of targets that each need this, but that's a somewhat independent and pre-existing issue, and in reality I doubt you'd be opening a floodgate here).

Yes, we use neither the exception selector nor the exception pointer. I’ve taken an approach similar to the #2 you suggested, but instead of modifying the personality, I directly add a check in visitLandingPad to see if the landingpad is of token type. If so, we don’t create DAG nodes for the selector and exception pointer, even if TLI.getExceptionPointerRegister and TLI.getExceptionSelectorRegister are not zero or NoRegister. I think this works correctly and I’ve passed all the tests I have. I plan to check this in as the first step, but I’d like to actually support selector and exception pointer extraction for landingpad of token type. Personally I like the #1 you suggested, but I am also afraid that it requires updating all front-ends generating landingpads. I think #4 might be the most practical approach that I could think of.

thanks,
chen

> 3) Maybe the default should be switched, so that TLI.getExceptionPointerRegister and TLI.getExceptionSelectorRegister return NoRegister for EHPersonality::Unknown, and only return actual registers for personalities they recognize. This would require any targets using landingpads with exception pointers / exception selectors to update their code and add themselves to Analysis/EHPersonalities.h, similar to how #2 would require adding your personality, so it seems more disruptive, if conceptually a touch cleaner, than #2.
>
> 4) Explicitly checking for token type in visitLandingPad as you suggest sounds okay to me as a pragmatic approach, too.
>
> I'd probably lean toward #2 as being the least disruptive and most explicit/straightforward about the personality's expectations, but I'm curious what others think.
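The check order Chen describes for visitLandingPad (token type first, then the target's exception registers) can be sketched as a standalone C++ model. This is a minimal sketch, not the SelectionDAGBuilder code: PadType, TargetRegs, and shouldEmitExceptionValues are hypothetical names, and the "both registers are zero" early exit is an assumption drawn from the thread's remark that the token case behaves the same as SjLj exceptions.

```cpp
// Toy model of the check order discussed in the thread; not LLVM code.
enum class PadType { PtrSelectorStruct, Token };

struct TargetRegs {
  unsigned exceptionPointerReg;   // 0 stands in for "no register"
  unsigned exceptionSelectorReg;  // 0 stands in for "no register"
};

// Returns true when DAG nodes for the exception pointer and selector
// would be created for a landingpad of the given type.
bool shouldEmitExceptionValues(PadType type, const TargetRegs &regs) {
  // The token check comes first, so the register settings are never
  // consulted for a token-typed landingpad.
  if (type == PadType::Token)
    return false;
  // Assumed pre-existing behaviour: nothing is emitted when the target
  // reports no exception registers at all.
  return regs.exceptionPointerReg != 0 || regs.exceptionSelectorReg != 0;
}
```

In this model a token landingpad skips the exception values even when the target reports real registers, which is the property Chen relies on for the first step.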
Joseph Tremoulet via llvm-dev
2015-Dec-06 01:58 UTC
[llvm-dev] Support token type in struct for landingpad
It seems a little backwards to me to check the landingpad's type as the first check, since the personality dictates the landingpad's type. That said, yes, I see how #4 is expedient in allowing personalities using the two main types to lump themselves into EHPersonality::Unknown.

As for supporting selector and exception pointer extraction for landingpad of token type, I think you'll want to look at the handling of the @llvm.eh.exceptionpointer intrinsic that's used with catchpads and basically do the same thing with landingpads.

Thanks
-Joseph

From: Chen Li [mailto:meloli87 at gmail.com]
Sent: Saturday, December 5, 2015 12:34 AM
To: Joseph Tremoulet <jotrem at microsoft.com>
Cc: David Majnemer <david.majnemer at gmail.com>; Igor Laevsky <igor at azulsystems.com>; llvm-dev <llvm-dev at lists.llvm.org>; John McCall <rjmccall at apple.com>
Subject: Re: Support token type in struct for landingpad

On Dec 4, 2015, at 1:27 PM, Joseph Tremoulet <jotrem at microsoft.com> wrote:

> 2) Since you're not actually using the landingpad's exception selector nor, if I understand "This is enough to support the gc.statepoint work" correctly, its exception pointer, it's possible that TLI.getExceptionPointerRegister and TLI.getExceptionSelectorRegister should be returning NoRegister for your personality. That would require modifying the EHPersonality enum and corresponding string matching in Analysis/EHPersonalities.h to recognize your personality, but I think that would be fine (it highlights a potential scaling issue if we add lots of targets that each need this, but that's a somewhat independent and pre-existing issue, and in reality I doubt you'd be opening a floodgate here).

Yes, we use neither the exception selector nor the exception pointer. I’ve taken an approach similar to the #2 you suggested, but instead of modifying the personality, I directly add a check in visitLandingPad to see if the landingpad is of token type. If so, we don’t create DAG nodes for the selector and exception pointer, even if TLI.getExceptionPointerRegister and TLI.getExceptionSelectorRegister are not zero or NoRegister. I think this works correctly and I’ve passed all the tests I have. I plan to check this in as the first step, but I’d like to actually support selector and exception pointer extraction for landingpad of token type. Personally I like the #1 you suggested, but I am also afraid that it requires updating all front-ends generating landingpads. I think #4 might be the most practical approach that I could think of.

thanks,
chen

> 4) Explicitly checking for token type in visitLandingPad as you suggest sounds okay to me as a pragmatic approach, too.
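The personality-keyed alternatives from the list of ideas (#2, where one recognized personality opts out of exception registers, and #3, where the default is flipped) can be contrasted in a standalone C++ model. This is a sketch under stated assumptions: the Personality enum, classify, and both lookup functions are hypothetical and do not mirror the real EHPersonality enum or the TLI hooks, and "example_gc_personality" is an invented name. Only "__gxx_personality_v0" is a real personality routine name.

```cpp
#include <string>

// Toy model; none of these names are LLVM APIs.
enum class Personality { Unknown, GNU_CXX, TokenOnly };

constexpr unsigned NoRegister = 0;

// Stand-in for the string matching that recognizes a personality routine.
Personality classify(const std::string &fn) {
  if (fn == "__gxx_personality_v0")
    return Personality::GNU_CXX;
  if (fn == "example_gc_personality")  // hypothetical name
    return Personality::TokenOnly;
  return Personality::Unknown;
}

// Idea #2: only the recognized token-only personality opts out, so
// unknown personalities keep the target's default register.
unsigned exceptionPointerRegIdea2(Personality p, unsigned targetDefault) {
  return p == Personality::TokenOnly ? NoRegister : targetDefault;
}

// Idea #3: the default is flipped, so only recognized personalities
// that use an exception pointer get a register.
unsigned exceptionPointerRegIdea3(Personality p, unsigned targetDefault) {
  return p == Personality::GNU_CXX ? targetDefault : NoRegister;
}
```

The two functions differ only for Personality::Unknown, which is why #3 would require existing personalities to register themselves while #2 would not.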