thr3ads.net - llvm dev - [llvm-dev] Extracting values from tokens [Aug 2015]

Joseph Tremoulet via llvm-dev

2015-Aug-26 18:38 UTC

[llvm-dev] Extracting values from tokens

Hi,

Now that we have the token type (http://reviews.llvm.org/rL245029), I need an
operation that will "extract" a non-token value from a token.  I know
people have several use cases in mind for tokens, so I wanted to solicit
feedback on how general the solution should be (so I've cc'ed the people
from the review of the token change).  I'm also interested in getting
consensus so that as "extraction"s get added for each use case they
have similar look-and-feel.

My particular need here is very narrow:  I need the 'catchpad' operation
to define a value which is a pointer to the on-heap exception object it catches
(which my target's personality routine will supply to the handler code). 
Since the 'catchpad' operation is defined as producing a token, in order
to get at the exception pointer I need some operation that can take that token
as input and produce the exception pointer as output.

Going fully general, I could imagine having an operator with a name like
'tokenextract' that is parameterized by the type it produces and accepts
one argument of type token plus zero or more arguments of arbitrary type which
indicate what is being extracted.  If we're ever going to want to support
orthogonal kinds of extractions operating on the same token value, I think that
approach would break down because it doesn't give a good way to specify
which kind of extraction is being performed.  On the other hand, I think
it's entirely plausible that each token-producing operator will only ever
have a fixed set of extractions that make sense for it, so this could be a
workable solution under the assumption that the way to interpret '%x =
tokenextract %tok, ty1 %arg1, ty2 %arg2' (for the sake of e.g. lowering out
some construct that is represented using token linkage) is to first look at the
operator defining %tok, and then interpret the selector args in the context of
that operation.  This in turn implies that each token-producing operator's
definition (in the Lang Ref) should spell out what can be extracted from it and
what its convention for selector args is.  To my mind, that's a bit too
convoluted, and the informal description of an operator's selector arg
convention really seems like something that one ought to be able to specify as
typing rules.
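
As a rough illustration of the interpretation rule described above, here is a minimal Python model of a generic 'tokenextract'. It is only a sketch, not LLVM API: the `Token` class, the `SELECTOR_CONVENTIONS` table, both convention functions, and the statepoint-like second producer are all hypothetical names invented to show how the meaning of the selector args depends on the operator that defined the token.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass
class Token:
    producer: str            # operator that defined the token, e.g. "catchpad"
    payload: Dict[str, Any]  # values that operator makes extractable


def catchpad_convention(tok: Token, selectors: Tuple[Any, ...]) -> Any:
    # Assumed convention: catchpad exposes one value and takes no selectors.
    if selectors:
        raise ValueError("catchpad extraction takes no selector args")
    return tok.payload["exception_ptr"]


def statepoint_convention(tok: Token, selectors: Tuple[Any, ...]) -> Any:
    # Assumed convention: a single integer selector picks one result.
    (index,) = selectors
    return tok.payload["results"][index]


# Each token-producing operator documents its own selector convention.
SELECTOR_CONVENTIONS: Dict[str, Callable[[Token, Tuple[Any, ...]], Any]] = {
    "catchpad": catchpad_convention,
    "statepoint": statepoint_convention,
}


def tokenextract(tok: Token, *selectors: Any) -> Any:
    # Step 1: look at the operator that defined the token.
    convention = SELECTOR_CONVENTIONS[tok.producer]
    # Step 2: interpret the selector args in that operator's context.
    return convention(tok, selectors)
```

Note that nothing in the signature of `tokenextract` itself constrains the selector args; the rules live in per-producer documentation, which is the part that feels convoluted.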

So I find myself arguing against a fully general solution here.  I think instead
it makes sense for each kind of extraction to specify an intrinsic that
represents it, with the argument/return types specified in the usual way as the
signature of the intrinsic.  And on a case-by-case basis any intrinsic could be
replaced with an instruction, following the same process that any other
operation follows as it finds its way into the IR.

Ironically, the intrinsic approach that I'm advocating is awkward for my
actual use case of extracting an exception pointer from a catchpad -- the
argument and return types should really be dictated by the personality routine,
and so can vary from function to function, but intrinsics only support a limited
form of overloading.  But I think it would be ok to start with an intrinsic
(called @llvm.eh.get_pad_param or something) that can be overloaded to return
anyptr (or maybe anyptr + anyint) and not worry about more overloading
until/unless we have more use cases.
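
To make the "limited form of overloading" concrete, here is a small Python sketch of how a signature with an anyptr return could be checked. This is a model under stated assumptions, not how LLVM implements intrinsic matching: `IntrinsicSignature`, `matches_kind`, and `check_call` are hypothetical helpers, types are written as plain strings in typed-pointer syntax, and the token-in, anyptr-out signature for the proposed intrinsic is assumed.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class IntrinsicSignature:
    param_types: Tuple[str, ...]
    return_kind: str  # "anyptr", "anyint", or one concrete type name


def matches_kind(kind: str, concrete: str) -> bool:
    # "anyptr" accepts any pointer type, "anyint" any integer type iN.
    if kind == "anyptr":
        return concrete.endswith("*")
    if kind == "anyint":
        return concrete.startswith("i") and concrete[1:].isdigit()
    return kind == concrete


# Assumed signature for the proposed intrinsic: token in, any pointer out.
GET_PAD_PARAM = IntrinsicSignature(param_types=("token",), return_kind="anyptr")


def check_call(sig: IntrinsicSignature,
               arg_types: Sequence[str],
               return_type: str) -> bool:
    # A call type-checks if the args match exactly and the requested
    # return type falls within the overloadable kind.
    return (tuple(arg_types) == sig.param_types
            and matches_kind(sig.return_kind, return_type))
```

Under this model, a call returning `i8*` or `%struct.Exn*` is accepted, while one returning `i32` is rejected unless the signature is widened to cover integers as well.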

Thoughts?

Thanks
-Joseph

Philip Reames via llvm-dev

2015-Aug-27 22:24 UTC

On 08/26/2015 11:38 AM, Joseph Tremoulet wrote:
>
> Hi,
>
> Now that we have the token type (http://reviews.llvm.org/rL245029), I 
> need an operation that will "extract" a non-token value from a token.
> I know people have several use cases in mind for tokens, so I wanted 
> to solicit feedback on how general the solution should be (so I've 
> cc'ed the people from the review of the token change). I'm also 
> interested in getting consensus so that as "extraction"s get added for
> each use case they have similar look-and-feel.
>
> My particular need here is very narrow:  I need the 'catchpad' 
> operation to define a value which is a pointer to the on-heap 
> exception object it catches (which my target's personality routine 
> will supply to the handler code).  Since the 'catchpad' operation is
> defined as producing a token, in order to get at the exception pointer 
> I need some operation that can take that token as input and produce 
> the exception pointer as output.
>
> Going fully general, I could imagine having an operator with a name 
> like 'tokenextract' that is parameterized by the type it produces and
> accepts one argument of type token plus zero or more arguments of 
> arbitrary type which indicate what is being extracted.  If we're ever 
> going to want to support orthogonal kinds of extractions operating on 
> the same token value, I think that approach would break down because 
> it doesn't give a good way to specify which kind of extraction is 
> being performed.  On the other hand, I think it's entirely plausible 
> that each token-producing operator will only ever have a fixed set of 
> extractions that make sense for it, so this could be a workable 
> solution under the assumption that the way to interpret '%x = 
> tokenextract %tok, ty1 %arg1, ty2 %arg2' (for the sake of e.g. 
> lowering out some construct that is represented using token linkage) 
> is to first look at the operator defining %tok, and then interpret the 
> selector args in the context of that operation.  This in turn implies 
> that each token-producing operator's definition (in the Lang Ref) 
> should spell out what can be extracted from it and what its convention 
> for selector args is.  To my mind, that's a bit too convoluted, and 
> the informal description of an operator's selector arg convention 
> really seems like something that one ought to be able to specify as 
> typing rules.
>
> So I find myself arguing against a fully general solution here.  I 
> think instead it makes sense for each kind of extraction to specify an 
> intrinsic that represents it, with the argument/return types specified 
> in the usual way as the signature of the intrinsic.  And on a 
> case-by-case basis any intrinsic could be replaced with an 
> instruction, following the same process that any other operation 
> follows as it finds its way into the IR.
>

After reading your description, I find myself with no strong opinion in
either direction.  Your discussion of the pros and cons of each approach
covers the topic well.  Since there is no compelling argument for one
over the other, I'd be perfectly willing to go either way.  I'd probably
lean towards the generic version myself, but I'm happy to defer to the
people actually working on using the mechanism at the moment.

>
> Ironically, the intrinsic approach that I'm advocating is awkward for 
> my actual use case of extracting an exception pointer from a catchpad 
> -- the argument and return types should really be dictated by the 
> personality routine, and so can vary from function to function, but 
> intrinsics only support a limited form of overloading.  But I think it 
> would be ok to start with an intrinsic (called @llvm.eh.get_pad_param 
> or something) that can be overloaded to return anyptr (or maybe anyptr 
> + anyint) and not worry about more overloading until/unless we have 
> more use cases.
>

Seems reasonable to me.

We could also go with a generic mechanism based on a variadic intrinsic 
if we wanted.  We have all of the building blocks for this between 
gc.result and gc.statepoint.  If we combined a variadic argument list 
with anyany result, we'd get an intrinsic with close to the semantics of 
the instruction you were considering.  We could potentially use this to 
prototype both approaches and see which one appears less ugly.

>
> Thoughts?
>
> Thanks
>
> -Joseph
>

Reid Kleckner via llvm-dev

2015-Aug-27 23:33 UTC

I think you're right that intrinsics are better than 'tokenextract'.

The intrinsic tells you what kind of data you want out of the token, and
codegen will fail in an obvious way when you use an intrinsic on the wrong
kind of token. For example, if we tried to extract the SEH exception code
from a statepoint, codegen can abort rather than perhaps working
accidentally.
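
A minimal Python sketch of that failure mode, under loud assumptions: `Token`, `CodegenError`, and `lower_seh_exception_code` are hypothetical names, and treating 'catchpad' as the only valid producer for this extraction is assumed purely for illustration. The point is only that a per-extraction intrinsic knows which token producers it accepts and can refuse anything else.

```python
from dataclasses import dataclass


class CodegenError(Exception):
    """Raised when an extraction is applied to the wrong kind of token."""


@dataclass(frozen=True)
class Token:
    producer: str  # operator that defined the token


def lower_seh_exception_code(tok: Token) -> str:
    # The intrinsic names the data it wants, so lowering can verify that
    # the token comes from a producer that actually carries that data.
    if tok.producer != "catchpad":
        raise CodegenError(
            f"cannot extract an SEH exception code from a {tok.producer} token"
        )
    return "seh-exception-code"  # placeholder for the lowered value
```

Calling `lower_seh_exception_code(Token("statepoint"))` raises `CodegenError` in this model instead of silently producing a value.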

Sounds like a plan. :)

