thr3ads.net - llvm dev - [llvm-dev] [RFC] IR-level Region Annotations [Jan 2017]

Hal Finkel via llvm-dev

2017-Jan-20 04:28 UTC

[llvm-dev] [RFC] IR-level Region Annotations

On 01/19/2017 03:36 PM, Mehdi Amini via llvm-dev wrote:
>
>> On Jan 19, 2017, at 1:32 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
>>
>> On Thu, Jan 19, 2017 at 1:12 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>>
>>>     On Jan 19, 2017, at 12:04 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
>>>
>>>     On Thu, Jan 19, 2017 at 11:46 AM, Mehdi Amini via llvm-dev
>>>     <llvm-dev at lists.llvm.org> wrote:
>>>
>>>         > On Jan 19, 2017, at 11:36 AM, Adve, Vikram Sadanand via
>>>         > llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>         >
>>>         > Hi Johannes,
>>>         >
>>>         >> I am especially curious where you get your data from.
>>>         >> Tapir [0] (and to some degree PIR [1]) have shown that,
>>>         >> counterintuitively, only a few changes to LLVM passes are
>>>         >> needed. Tapir was recently used in an MIT class with a
>>>         >> lot of students and it seemed to work well with only
>>>         >> minimal changes to analysis and especially transformation
>>>         >> passes.
>>>         >
>>>         > TAPIR is an elegant, small extension and, in particular, I
>>>         > think the idea of asymmetric parallel tasks and control flow
>>>         > is a clever way to express parallelism with serial
>>>         > semantics, as in Cilk.  Encoding the control flow extensions
>>>         > as explicit instructions is orthogonal to that, though
>>>         > arguably more elegant than using region tags + metadata.
>>>         >
>>>         > However, Cilk is a tiny language compared with the full
>>>         > complexity of other languages, like OpenMP.  To take just
>>>         > one example, TAPIR cannot express the ORDERED construct of
>>>         > OpenMP.  A more serious concern, IMO, is that TAPIR (like
>>>         > Cilk) requires serial semantics, whereas there are many
>>>         > parallel languages, OpenMP included, that do not obey that
>>>         > restriction. Third, OpenMP has *numerous* clauses, e.g.,
>>>         > REDUCTION or PRIVATE, that are needed because without that,
>>>         > you’d be dependent on fundamentally hard compiler analyses
>>>         > to extract the same information for satisfactory parallel
>>>         > performance; realistic applications cannot depend on the
>>>         > success of such analyses.
>>>
>>>         I agree with this, but I’m also wondering if it needs to be
>>>         first class in the IR?
>>>         For example we know our alias analysis is very basic, and
>>>         C/C++ have a higher constraint thanks to their type system,
>>>         but we didn’t inject this higher level information that
>>>         helps the optimizer as first class IR constructs.
>>>
>>>
>>>     FWIW, while i agree with the general point, i wouldn't use this
>>>     example.
>>>     Because we pretty much still suffer to this day because of it
>>>     (both in AA, and devirt, and ...)  :)
>>>     We can't always even tell fields apart
>>
>>     Is it inherent to the infrastructure, i.e. using metadata instead
>>     of first class IR construct or is it just a “quality of
>>     implementation” issue?
>>
>>
>> Not to derail this conversation:
>>
>> IMHO, At some point there is no real difference :)
>>
>> Because otherwise, everything is a QOI issue.
>>
>> IE if it's super tricky to get metadata that works well and works
>> right, doesn't get lost, etc, and that's inherent to using metadata,
>> that to me is not a QOI issue.
>>
>> So could it be done with metadata? Probably?
>> But at the same time,  if it had been done with more first class 
>> constructs, it would have happened years ago  and been much lower cost.
>
> This is what I meant by “inherent to the infrastructure”, thanks for 
> clarifying.

To clarify, we were proposing metadata that is used as arguments to the 
region-annotation intrinsics. This metadata has the nice property that 
it does not get dropped (so it is just being used as a way of encoding 
whatever data structures are necessary without predefining a syntactic 
schema).
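
The difference between metadata attached to an instruction and metadata passed as an intrinsic argument can be illustrated with a toy model. This is only a sketch in Python, not LLVM's API: the `Call` class, the `rebuild` function, and the `region.entry` name are hypothetical, and the clause tuple is made-up example data. The assumption it encodes is that a pass recreating an instruction must carry its operands over to stay correct, but may forget side attachments it does not understand.

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    """Toy stand-in for an IR call instruction (hypothetical, not LLVM's API)."""
    callee: str
    operands: tuple = ()                              # part of the call itself
    attachments: dict = field(default_factory=dict)   # optional side information


def rebuild(call: Call) -> Call:
    """Simulate a pass that recreates an instruction.

    Operands are copied because the call is meaningless without them;
    attachments are not copied, modelling metadata that gets lost.
    """
    return Call(call.callee, call.operands)


# Clause information encoded as an attachment: lost after the rebuild.
attached = Call("region.entry", attachments={"clauses": ("reduction", "+", "sum")})

# The same information encoded as an operand: still there after the rebuild.
as_operand = Call("region.entry", operands=(("reduction", "+", "sum"),))

print(rebuild(attached).attachments)   # {}
print(rebuild(as_operand).operands)    # (('reduction', '+', 'sum'),)
```

In this model the operand form keeps the clause data through the rebuild, which is the property described above as "does not get dropped".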

  -Hal
>
> —
> Mehdi
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Sanjoy Das via llvm-dev

2017-Jan-20 05:02 UTC

Hi,

My bias is to use both (b) and (d), since they have complementary
strengths.  We should use (b) for expressing concepts that can't be
semantically modeled as a call or invoke (this branch takes both its
successors), and (d) for expressing things that can be (this call may
never return), and annotation-like things (this region (denoted by
def-use of a token) is a reduction).
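
As a rough illustration of "this region (denoted by def-use of a token) is a reduction", here is a minimal Python sketch of how a region could be recovered by following a token from its definition to its use. Everything in it is an assumption made for illustration: the `Instr` class, the `region.entry` / `region.exit` names, and the `"reduction"` tag are hypothetical and are not taken from the RFC or from LLVM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str          # value defined by this instruction ("" if none)
    op: str            # opcode or callee name (hypothetical names)
    args: tuple = ()   # names of values used, or literal annotation tags


def find_regions(body):
    """Pair each region.exit with the region.entry whose token it uses."""
    entries = {
        instr.name: (idx, instr)
        for idx, instr in enumerate(body)
        if instr.op == "region.entry"
    }
    regions = []
    for idx, instr in enumerate(body):
        if instr.op == "region.exit":
            # Follow the def-use edge: the exit's first argument is the token.
            start, entry = entries[instr.args[0]]
            enclosed = [x.op for x in body[start + 1:idx]]
            regions.append((entry.args, enclosed))
    return regions


body = [
    Instr("%t", "region.entry", ("reduction",)),
    Instr("%a", "load", ("%p",)),
    Instr("%s", "add", ("%s0", "%a")),
    Instr("", "region.exit", ("%t",)),
]

print(find_regions(body))   # [(('reduction',), ['load', 'add'])]
```

The entry and exit markers here behave like ordinary calls, so the region boundary is carried entirely by the token's def-use edge rather than by a new kind of control flow.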

I don't grok OpenMP, but perhaps we can come up with one or two
"generalized control flow"-type instructions that can be used to model
the non-call/invoke-like semantics we'd like LLVM to know about, and
model the rest with (d)?

-- Sanjoy

Adve, Vikram Sadanand via llvm-dev

2017-Jan-20 05:27 UTC

Hi Sanjoy,

Yes, that's exactly what we have been looking at recently here, but the
region tags seem to make it possible to express the control flow as well, so I
think we could start with regions+metadata, as Hal and Xinmin proposed, and
then figure out what needs to be first class instructions.

--Vikram Adve

