thr3ads.net - llvm dev - [llvm-dev] [IR question] Switching on pointers [May 2017]

If this information is useful, please help other people find it:
Share via:

Benoit Vey via llvm-dev

2017-May-16 21:33 UTC

[llvm-dev] [IR question] Switching on pointers

Thanks everyone for your answers.

Mats, my goal is do an alternate implementation of the subtyping test  
in the Pony compiler. I'll try to keep the explanation short to avoid  
cluttering the mail with irrelevant information.

Every object in Pony has a type descriptor, which is used in the  
subtyping test to determine the real type of an object. The current  
algorithm is suboptimal and we're trying out various alternatives. The  
full details are on this github pull request:  
https://github.com/ponylang/ponyc/pull/1752

One of the alternatives, described by Sylvan Clebsch in the 6th  
message on the PR, would involve looking at whether the type  
descriptor of an object with an abstract type is the same as one of  
the subtypes of the abstract type being tested. Here's an equivalent C  
code from the github PR:

     bool SomeTraitTypeName__istype(pony_type_t* desc)
     {
       switch((uintptr_t)desc)
       {
         case 0x878AB00:
         case 0x878AB08:
         case 0x878AB16:
         case 0x878AC00:
         case 0x878AC08:
         case 0x878AD00:
           return true;

         default: {}
       }

       return false;
     }

Since this can be implemented with either a switch or an if/else  
sequence, I think I'll simply do the if/else instead.

mats petersson <mats at planetcatfish.com> a écrit :
> On 15 May 2017 at 17:29, Benoit Vey via llvm-dev <llvm-dev at
lists.llvm.org>
> wrote:
>
>> Hi.
>>
>> First of all, some context. I'm trying to implement a new
functionality in
>> an LLVM-based compiler and I need to take various actions based on the
>> value of a given pointer, the possible values being the addresses of
>> various global constants. I tried to use a `switch` instruction but I
>> encountered several problems.
>>
>> The "ideal switch" I'd like to have would look something
like this (with
>> way more cases):
>>
>>    ; ModuleID = 'module'
>>    source_filename = "module"
>>
>>    @var0 = global i32 0
>>    @var1 = global i32 0
>>
>>    define i32 @func(i32*) {
>>    entry:
>>      switch i32* %0, label %unreachable [
>>        i32* @var0, label %var0
>>        i32* @var1, label %var1
>>      ]
>>
>>    var0:                                             ; preds = %entry
>>      br label %post
>>
>>    var1:                                             ; preds = %entry
>>      br label %post
>>
>>    unreachable:                                      ; preds = %entry
>>      unreachable
>>
>>    post:                                             ; preds = %var1,
>> %var0
>>      %1 = phi i32 [ 0, %var0 ], [ 1, %var1 ]
>>      ret i32 %1
>>    }
>>
>> This example is impossible because a `switch` cannot have pointer
>> operands. So I tried with `ptrtoint`, turning the `switch` into this:
>>
>>      %1 = ptrtoint i32* %0 to i64
>>      switch i64 %1, label %unreachable [
>>        i64 ptrtoint (i32* @var0 to i64), label %var0
>>        i64 ptrtoint (i32* @var1 to i64), label %var1
>>      ]
>>
>> I'm a bit baffled by that one. According to the documentation the
address
>> of a global value is a constant, yet this `switch` is impossible
because
>> the result of the `ptrtoint` isn't a constant integer.
>>
>> At that point, I don't really know how to proceed. My questions
are:
>>
>> - Why is `switch` restricted to integers?
>> - What is the result of a constant `ptrtoint` and why isn't it a
constant
>> integer?
>> - Is there a way to switch on pointers despite these restrictions?
>>
>> Thanks.
>>
>> ----------------------------------------------------------------
>> This message was sent using IMP, the Internet Messaging Program.
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>
> Aside from the obvious answer of "the switch needs compile time
constants"
> (what I call "true constants" - they are constants that the LLVM
compiler
> knows the actual value of, not constants as in "won't change
during the
> execution of the program", such as variable and function addresses),
it
> should be noted that unless the compiler & linker can be certain that
for
> example the var0 and var1 address are "next to each other", this
sort of
> construct will not benefit from any "switch" optimisation, it
will be
> exactly like a switch with large, non-close values: an if/else if/else ...
> chain - probably not even able to do any cleverness of building a tree of
> which comes first [in a generic case of this, at the very least]. So if you
> have a case where you need to resolve a language construct where you can
> switch on a reference to a variable referring to a particular variable,
> then you probably have to make it into a compare + branch - or perhaps have
> a table and iterate over the choices. Even if LLVM knew how to do this, it
> wouldn't be possible to produce anything noticeably better [at least I
> can't think of anything - if anyone has some clever ways to solve this
kind
> of problem, I would appreciate a comment...]
>
> I still would like to hear from Benoit as to what the goal is here. I'm
> only guessing that it's a language where you can do something like
"ref > var0; ... switch(ref) case var0: ... ; case var1: ... " or
some such. Not
> that I know for sure what language that is.
>
> --
> Mats
>


----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.

Sanjoy Das via llvm-dev

2017-May-16 21:51 UTC

head link

[llvm-dev] [IR question] Switching on pointers

Hi Benoit,

This is slightly off topic for the list, but have you checked to see
if "Fast Subtype Checking in the HotSpot JVM" by Cliff Click and John
Rose applies to your use case?

-- Sanjoy

On Tue, May 16, 2017 at 2:33 PM, Benoit Vey via llvm-dev
<llvm-dev at lists.llvm.org> wrote:> Thanks everyone for your answers.
>
> Mats, my goal is do an alternate implementation of the subtyping test in
the
> Pony compiler. I'll try to keep the explanation short to avoid
cluttering
> the mail with irrelevant information.
>
> Every object in Pony has a type descriptor, which is used in the subtyping
> test to determine the real type of an object. The current algorithm is
> suboptimal and we're trying out various alternatives. The full details
are
> on this github pull request: https://github.com/ponylang/ponyc/pull/1752
>
> One of the alternatives, described by Sylvan Clebsch in the 6th message on
> the PR, would involve looking at whether the type descriptor of an object
> with an abstract type is the same as one of the subtypes of the abstract
> type being tested. Here's an equivalent C code from the github PR:
>
>     bool SomeTraitTypeName__istype(pony_type_t* desc)
>     {
>       switch((uintptr_t)desc)
>       {
>         case 0x878AB00:
>         case 0x878AB08:
>         case 0x878AB16:
>         case 0x878AC00:
>         case 0x878AC08:
>         case 0x878AD00:
>           return true;
>
>         default: {}
>       }
>
>       return false;
>     }
>
> Since this can be implemented with either a switch or an if/else sequence,
I
> think I'll simply do the if/else instead.
>
> mats petersson <mats at planetcatfish.com> a écrit :
>
>
>> On 15 May 2017 at 17:29, Benoit Vey via llvm-dev <llvm-dev at
lists.llvm.org>
>> wrote:
>>
>>> Hi.
>>>
>>> First of all, some context. I'm trying to implement a new
functionality
>>> in
>>> an LLVM-based compiler and I need to take various actions based on
the
>>> value of a given pointer, the possible values being the addresses
of
>>> various global constants. I tried to use a `switch` instruction but
I
>>> encountered several problems.
>>>
>>> The "ideal switch" I'd like to have would look
something like this (with
>>> way more cases):
>>>
>>>    ; ModuleID = 'module'
>>>    source_filename = "module"
>>>
>>>    @var0 = global i32 0
>>>    @var1 = global i32 0
>>>
>>>    define i32 @func(i32*) {
>>>    entry:
>>>      switch i32* %0, label %unreachable [
>>>        i32* @var0, label %var0
>>>        i32* @var1, label %var1
>>>      ]
>>>
>>>    var0:                                             ; preds =
%entry
>>>      br label %post
>>>
>>>    var1:                                             ; preds =
%entry
>>>      br label %post
>>>
>>>    unreachable:                                      ; preds =
%entry
>>>      unreachable
>>>
>>>    post:                                             ; preds =
%var1,
>>> %var0
>>>      %1 = phi i32 [ 0, %var0 ], [ 1, %var1 ]
>>>      ret i32 %1
>>>    }
>>>
>>> This example is impossible because a `switch` cannot have pointer
>>> operands. So I tried with `ptrtoint`, turning the `switch` into
this:
>>>
>>>      %1 = ptrtoint i32* %0 to i64
>>>      switch i64 %1, label %unreachable [
>>>        i64 ptrtoint (i32* @var0 to i64), label %var0
>>>        i64 ptrtoint (i32* @var1 to i64), label %var1
>>>      ]
>>>
>>> I'm a bit baffled by that one. According to the documentation
the address
>>> of a global value is a constant, yet this `switch` is impossible
because
>>> the result of the `ptrtoint` isn't a constant integer.
>>>
>>> At that point, I don't really know how to proceed. My questions
are:
>>>
>>> - Why is `switch` restricted to integers?
>>> - What is the result of a constant `ptrtoint` and why isn't it
a constant
>>> integer?
>>> - Is there a way to switch on pointers despite these restrictions?
>>>
>>> Thanks.
>>>
>>> ----------------------------------------------------------------
>>> This message was sent using IMP, the Internet Messaging Program.
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>
>> Aside from the obvious answer of "the switch needs compile time
constants"
>> (what I call "true constants" - they are constants that the
LLVM compiler
>> knows the actual value of, not constants as in "won't change
during the
>> execution of the program", such as variable and function
addresses), it
>> should be noted that unless the compiler & linker can be certain
that for
>> example the var0 and var1 address are "next to each other",
this sort of
>> construct will not benefit from any "switch" optimisation, it
will be
>> exactly like a switch with large, non-close values: an if/else if/else
...
>> chain - probably not even able to do any cleverness of building a tree
of
>> which comes first [in a generic case of this, at the very least]. So if
>> you
>> have a case where you need to resolve a language construct where you
can
>> switch on a reference to a variable referring to a particular variable,
>> then you probably have to make it into a compare + branch - or perhaps
>> have
>> a table and iterate over the choices. Even if LLVM knew how to do this,
it
>> wouldn't be possible to produce anything noticeably better [at
least I
>> can't think of anything - if anyone has some clever ways to solve
this
>> kind
>> of problem, I would appreciate a comment...]
>>
>> I still would like to hear from Benoit as to what the goal is here.
I'm
>> only guessing that it's a language where you can do something like
"ref >> var0; ... switch(ref) case var0: ... ; case var1: ... "
or some such. Not
>> that I know for sure what language that is.
>>
>> --
>> Mats
>>
>
>
>
> ----------------------------------------------------------------
> This message was sent using IMP, the Internet Messaging Program.
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Jeremy Lakeman via llvm-dev

2017-May-17 03:45 UTC

head link

[llvm-dev] [IR question] Switching on pointers

I would consider putting all the data tables in a new section in the
compiled binary. After runtime linking, you can then enumerate through all
the tables in this section and sort them. You can then write a single
istype function that performs a binary search at runtime.

On Wed, May 17, 2017 at 7:21 AM, Sanjoy Das via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi Benoit,
>
> This is slightly off topic for the list, but have you checked to see
> if "Fast Subtype Checking in the HotSpot JVM" by Cliff Click and
John
> Rose applies to your use case?
>
> -- Sanjoy
>
> On Tue, May 16, 2017 at 2:33 PM, Benoit Vey via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > Thanks everyone for your answers.
> >
> > Mats, my goal is do an alternate implementation of the subtyping test
in
> the
> > Pony compiler. I'll try to keep the explanation short to avoid
cluttering
> > the mail with irrelevant information.
> >
> > Every object in Pony has a type descriptor, which is used in the
> subtyping
> > test to determine the real type of an object. The current algorithm is
> > suboptimal and we're trying out various alternatives. The full
details
> are
> > on this github pull request:
https://github.com/ponylang/ponyc/pull/1752
> >
> > One of the alternatives, described by Sylvan Clebsch in the 6th
message
> on
> > the PR, would involve looking at whether the type descriptor of an
object
> > with an abstract type is the same as one of the subtypes of the
abstract
> > type being tested. Here's an equivalent C code from the github PR:
> >
> >     bool SomeTraitTypeName__istype(pony_type_t* desc)
> >     {
> >       switch((uintptr_t)desc)
> >       {
> >         case 0x878AB00:
> >         case 0x878AB08:
> >         case 0x878AB16:
> >         case 0x878AC00:
> >         case 0x878AC08:
> >         case 0x878AD00:
> >           return true;
> >
> >         default: {}
> >       }
> >
> >       return false;
> >     }
> >
> > Since this can be implemented with either a switch or an if/else
> sequence, I
> > think I'll simply do the if/else instead.
> >
> > mats petersson <mats at planetcatfish.com> a écrit :
> >
> >
> >> On 15 May 2017 at 17:29, Benoit Vey via llvm-dev <
> llvm-dev at lists.llvm.org>
> >> wrote:
> >>
> >>> Hi.
> >>>
> >>> First of all, some context. I'm trying to implement a new
functionality
> >>> in
> >>> an LLVM-based compiler and I need to take various actions
based on the
> >>> value of a given pointer, the possible values being the
addresses of
> >>> various global constants. I tried to use a `switch`
instruction but I
> >>> encountered several problems.
> >>>
> >>> The "ideal switch" I'd like to have would look
something like this
> (with
> >>> way more cases):
> >>>
> >>>    ; ModuleID = 'module'
> >>>    source_filename = "module"
> >>>
> >>>    @var0 = global i32 0
> >>>    @var1 = global i32 0
> >>>
> >>>    define i32 @func(i32*) {
> >>>    entry:
> >>>      switch i32* %0, label %unreachable [
> >>>        i32* @var0, label %var0
> >>>        i32* @var1, label %var1
> >>>      ]
> >>>
> >>>    var0:                                             ; preds =
%entry
> >>>      br label %post
> >>>
> >>>    var1:                                             ; preds =
%entry
> >>>      br label %post
> >>>
> >>>    unreachable:                                      ; preds =
%entry
> >>>      unreachable
> >>>
> >>>    post:                                             ; preds =
%var1,
> >>> %var0
> >>>      %1 = phi i32 [ 0, %var0 ], [ 1, %var1 ]
> >>>      ret i32 %1
> >>>    }
> >>>
> >>> This example is impossible because a `switch` cannot have
pointer
> >>> operands. So I tried with `ptrtoint`, turning the `switch`
into this:
> >>>
> >>>      %1 = ptrtoint i32* %0 to i64
> >>>      switch i64 %1, label %unreachable [
> >>>        i64 ptrtoint (i32* @var0 to i64), label %var0
> >>>        i64 ptrtoint (i32* @var1 to i64), label %var1
> >>>      ]
> >>>
> >>> I'm a bit baffled by that one. According to the
documentation the
> address
> >>> of a global value is a constant, yet this `switch` is
impossible
> because
> >>> the result of the `ptrtoint` isn't a constant integer.
> >>>
> >>> At that point, I don't really know how to proceed. My
questions are:
> >>>
> >>> - Why is `switch` restricted to integers?
> >>> - What is the result of a constant `ptrtoint` and why
isn't it a
> constant
> >>> integer?
> >>> - Is there a way to switch on pointers despite these
restrictions?
> >>>
> >>> Thanks.
> >>>
> >>>
----------------------------------------------------------------
> >>> This message was sent using IMP, the Internet Messaging
Program.
> >>>
> >>>
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> llvm-dev at lists.llvm.org
> >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>>
> >>
> >> Aside from the obvious answer of "the switch needs compile
time
> constants"
> >> (what I call "true constants" - they are constants that
the LLVM
> compiler
> >> knows the actual value of, not constants as in "won't
change during the
> >> execution of the program", such as variable and function
addresses), it
> >> should be noted that unless the compiler & linker can be
certain that
> for
> >> example the var0 and var1 address are "next to each
other", this sort of
> >> construct will not benefit from any "switch"
optimisation, it will be
> >> exactly like a switch with large, non-close values: an if/else
if/else
> ...
> >> chain - probably not even able to do any cleverness of building a
tree
> of
> >> which comes first [in a generic case of this, at the very least].
So if
> >> you
> >> have a case where you need to resolve a language construct where
you can
> >> switch on a reference to a variable referring to a particular
variable,
> >> then you probably have to make it into a compare + branch - or
perhaps
> >> have
> >> a table and iterate over the choices. Even if LLVM knew how to do
this,
> it
> >> wouldn't be possible to produce anything noticeably better [at
least I
> >> can't think of anything - if anyone has some clever ways to
solve this
> >> kind
> >> of problem, I would appreciate a comment...]
> >>
> >> I still would like to hear from Benoit as to what the goal is
here. I'm
> >> only guessing that it's a language where you can do something
like "ref
> > >> var0; ... switch(ref) case var0: ... ; case var1: ... "
or some such.
> Not
> >> that I know for sure what language that is.
> >>
> >> --
> >> Mats
> >>
> >
> >
> >
> > ----------------------------------------------------------------
> > This message was sent using IMP, the Internet Messaging Program.
> >
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170517/a3f4d6d6/attachment.html>

llvm dev - May 2017 - [IR question] Switching on pointers

[llvm-dev] [IR question] Switching on pointers

[llvm-dev] [IR question] Switching on pointers

[llvm-dev] [IR question] Switching on pointers