thr3ads.net - llvm dev - [llvm-dev] Metadata in LLVM back-end [Oct 2020]

Lorenzo Casalino via llvm-dev

2020-Oct-10 11:13 UTC

[llvm-dev] Metadata in LLVM back-end

> That's the place to start, I think.  Gather a list of requirements/use
> cases along with the challenges we've discussed.  Then it's a matter of
> engineering a solution that fulfills the requirements while hitting as
> few of the challenges as possible.  Let's start by simply gathering some
> lists.  I'll take a quick stab and you and others can add to/edit it.
>
> Requirements
> ------------
> - Convey information not readily available in existing IR constructs to
>   very late-stage codegen (after regalloc/scheduling, right through
>   asm/object emission)
I see this more as the GOAL of the RFC, rather than a requirement.
> - Flexible format - it should be as simple as possible to express the
>   desired information while minimizing changes to APIs
I do not want to raise a philosophical discussion (although I would
find it quite interesting), but "flexible" does not necessarily mean
"simple".

We could split this requirement as:

- Flexible format - the format should be expressive enough to model
  *virtually* any kind of information.

- Simple interface - expressing information and attaching it to MIR
  elements (e.g., instructions) should be "easy" (what does *easy* mean?)
> - Preserve information by default, only drop if explicitly told (I'm
>   trying to capture the requirements for your use-case here and this
>   differs from IR-level metadata)
What about giving end users the possibility to define a custom default
policy, as well as the possibility to define different types of policies?

Further, we must cope with the combination of instructions: when two
instructions are eligible for combination, how is the information
associated with them combined?

- Information transformation - the information associated with two
  instructions A and B, which are combined into an instruction C, should
  be properly transformed according to a user-specified policy.

  A default policy may be "assign the information of both A and B to C"
  (gather-all/assign-all policy?)
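
A minimal sketch of the combination rule discussed above, in plain C++ with
no LLVM dependencies. `Annotation`, `AnnotationSet`, `MergePolicy`,
`gatherAll`, and `combine` are hypothetical names introduced only for
illustration; none of them is an existing LLVM API. The default policy
assigns everything from A and B to C, and a caller may pass a different
policy.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>

// Hypothetical: one piece of back-end information, modeled as a string tag.
using Annotation = std::string;
using AnnotationSet = std::set<Annotation>;

// Hypothetical: a policy decides what the combined instruction C receives,
// given the annotations of the two source instructions A and B.
using MergePolicy =
    std::function<AnnotationSet(const AnnotationSet &, const AnnotationSet &)>;

// Default "gather-all" policy: C receives the union of A's and B's
// annotations.
inline AnnotationSet gatherAll(const AnnotationSet &A, const AnnotationSet &B) {
  AnnotationSet C = A;
  C.insert(B.begin(), B.end());
  return C;
}

// Combine with a user-specified policy, falling back to gather-all.
inline AnnotationSet combine(const AnnotationSet &A, const AnnotationSet &B,
                             const MergePolicy &Policy = gatherAll) {
  assert(Policy && "a merge policy must be callable");
  return Policy(A, B);
}
```

Calling `combine(A, B)` yields the union; passing a lambda as the third
argument (for example, one that keeps only the annotations common to both)
shows how a user-specified policy would plug in.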
> - No bifurcation between "well-known"/"built-in" information and things
>   added later/locally
May I ask you to elaborate a bit more about this point?
> - Should not impact compile time excessively (what is "excessive?")
Probably, such estimation should be performed on

What about the granularity level?

- Granularity level - metadata should be attachable at different
  levels of granularity:

  - *Coarse*: MachineFunction level
  - *Medium*: MachineBasicBlock level
  - *Fine*:   MachineInstruction level

Clearly, there are other degrees of granularity and/or dimensions to be
considered (e.g., LiveInterval, MIBundles, Loops, ...).
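
One way to picture the three granularity levels is a side table per level.
The sketch below is only an illustration under stated assumptions:
`Instr`, `Block`, and `Function` are hypothetical stand-ins for the real
machine-level classes, and `LevelTable` and `BackendMetadata` are invented
names, not an existing or proposed LLVM interface.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for MachineFunction, MachineBasicBlock, and a
// machine instruction; the real LLVM classes are deliberately not used.
struct Instr { std::string Opcode; };
struct Block { std::vector<Instr> Instrs; };
struct Function { std::vector<Block> Blocks; };

// Hypothetical per-level side table: maps an element's address to the
// list of information strings attached to it.
template <typename ElemT> class LevelTable {
public:
  void attach(const ElemT &Elem, std::string Info) {
    assert(!Info.empty() && "attached information must not be empty");
    Table[&Elem].push_back(std::move(Info));
  }

  // Returns a copy of the attached information, or an empty list.
  std::vector<std::string> lookup(const ElemT &Elem) const {
    auto It = Table.find(&Elem);
    return It == Table.end() ? std::vector<std::string>{} : It->second;
  }

private:
  std::map<const ElemT *, std::vector<std::string>> Table;
};

// One table per granularity level: coarse, medium, fine.
struct BackendMetadata {
  LevelTable<Function> Functions;
  LevelTable<Block> Blocks;
  LevelTable<Instr> Instrs;
};
```

Keying by address is a simplification: it only works while elements stay
at a stable address, which is one of the maintenance problems a real design
would have to address.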
> Challenges of using intrinsics and other alternatives
> -----------------------------------------------------
> - Post-SSA annotation/how to associate intrinsics with
>   instructions/registers/types
>
> - Instruction selection fallout (inhibiting folding, etc.)
>
> - Register allocation impacts (extending live ranges, etc.)
>
> - Scheduling challenges (ensuring intrinsics can be found
>   post-scheduling, etc.)
>
> - Extending existing constructs (which ones?) requires hard-coding
>   aspects of information, reducing flexibility
>
> This is currently rather weasily-worded, because I didn't want to impose
> too many restrictions right off the bat.
>
>                   -David

Sorry for the long delay!

-- Lorenzo

David Greene via llvm-dev

2020-Oct-20 16:36 UTC

[llvm-dev] Metadata in LLVM back-end

Lorenzo Casalino <lorenzo.casalino93 at gmail.com> writes:
>> Requirements
>> ------------
>> - Convey information not readily available in existing IR constructs to
>>   very late-stage codegen (after regalloc/scheduling, right through
>>   asm/object emission)
>
> I see this more as the GOAL of the RFC, rather than a requirement.
Fair enough.
>> - Flexible format - it should be as simple as possible to express the
>>   desired information while minimizing changes to APIs
> I do not want to raise a philosophical discussion (although I would
> find it quite interesting), but "flexible" does not necessarily mean
> "simple".
>
> We could split this requirement as:
Good idea to separate these.
> - Flexible format - the format should be expressive enough to model
>   *virtually* any kind of information.
>
> - Simple interface - expressing information and attaching it to MIR
>   elements (e.g., instructions) should be "easy" (what does *easy* mean?)
I would say "easy" means:

- Utilities are available to make maintaining information as transparent
  (automatic) as possible.

- When not automatic, it is straightforward to apply the necessary APIs
  to keep information updated.
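
As a rough illustration of the two bullets above, here is a small
self-contained sketch. Everything in it (`InstrId`, `Info`, `InfoMap`,
`replaceInstr`, `dropInfo`) is hypothetical and only meant to show the
shape such utilities could take: an "automatic" helper that carries
information along when one instruction replaces another, and an explicit
call for the cases where information really should be dropped.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using InstrId = unsigned;               // hypothetical instruction handle
using Info = std::vector<std::string>;  // hypothetical payload
using InfoMap = std::map<InstrId, Info>;

// Automatic path: replacing Old with New carries Old's information along,
// so a pass that uses this helper does not lose anything by accident.
inline void replaceInstr(InfoMap &Map, InstrId Old, InstrId New) {
  assert(Old != New && "an instruction cannot replace itself");
  auto It = Map.find(Old);
  if (It == Map.end())
    return; // Old carried no information; nothing to transfer.
  Info &Dst = Map[New]; // std::map insertion keeps It valid.
  Dst.insert(Dst.end(), It->second.begin(), It->second.end());
  Map.erase(It);
}

// Manual path: information is dropped only on explicit request.
inline void dropInfo(InfoMap &Map, InstrId Id) { Map.erase(Id); }
```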
>> - Preserve information by default, only drop if explicitly told (I'm
>>   trying to capture the requirements for your use-case here and this
>>   differs from IR-level metadata)
> What about giving end users the possibility to define a custom default
> policy, as well as the possibility to define different types of policies?
Possibly, though that might be overkill.  We don't want to bog this down
so much that it doesn't make progress.  I would lean toward picking a
policy and then incrementally adding features as needed.
> Further, we must cope with the combination of instructions: when two
> instructions are eligible for combination, how is the information
> associated with them combined?
>
> - Information transformation - the information associated with two
>   instructions A and B, which are combined into an instruction C, should
>   be properly transformed according to a user-specified policy.
>
>   A default policy may be "assign the information of both A and B to C"
>   (gather-all/assign-all policy?)
Again, I would lean toward just assigning both pieces of information and
providing utilities to scrub the result if necessary.  If it turns out
that other cases are common, we can add other default policies.
>> - No bifurcation between "well-known"/"built-in" information and things
>>   added later/locally
> May I ask you to elaborate a bit more about this point?
Sure.  The current IR metadata is bifurcated.  Some pieces of
information are more "first-class" than others.  For example there are
specialized metadata nodes
(https://llvm.org/docs/LangRef.html#specialized-metadata-nodes) while
other pieces of metadata are simple strings or numbers.

It would be simplest/easiest if metadata were handled uniformly.
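
To make "handled uniformly" concrete, here is a minimal sketch, assuming
C++17. `Value` and `UniformMetadata` are hypothetical names, and the kind
strings used with them are made up. Every piece of information, whether
"built-in" or added locally, goes through the same key/value interface,
with no specialized node classes.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <variant>

// Hypothetical: every value is either a number or a string, with no
// dedicated class per kind of information.
using Value = std::variant<std::int64_t, std::string>;

class UniformMetadata {
public:
  // The same entry point serves "well-known" and locally added kinds.
  void set(const std::string &Kind, Value V) {
    assert(!Kind.empty() && "a metadata kind needs a name");
    Entries[Kind] = std::move(V);
  }

  // Returns the value for Kind, or std::nullopt if none is attached.
  std::optional<Value> get(const std::string &Kind) const {
    auto It = Entries.find(Kind);
    if (It == Entries.end())
      return std::nullopt;
    return It->second;
  }

private:
  std::map<std::string, Value> Entries;
};
```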
>> - Should not impact compile time excessively (what is "excessive?")
>
> Probably, such estimation should be performed on
Did something get cut off here?
> What about the granularity level?
>
> - Granularity level - metadata should be attachable at different
>   levels of granularity:
>
>   - *Coarse*: MachineFunction level
>   - *Medium*: MachineBasicBlock level
>   - *Fine*:   MachineInstruction level
>
> Clearly, there are other degrees of granularity and/or dimensions to be
> considered (e.g., LiveInterval, MIBundles, Loops, ...).
It's probably a good idea to list at least the levels of granularity we
expect to need.  I'd start with function/block/instruction as I can
imagine uses for all three.  I am less sure about the other levels you
mention.  We can add more capability later if needed.
> Sorry for the long delay!
No problem!  I know I'm extremely busy as I'm sure we all are.  :)

Since you initially raised the topic, do you want to take the lead in
writing up an RFC?  I can certainly do it too but I want to give you
right of first refusal.  :)

                    -David

Lorenzo Casalino via llvm-dev

2020-Oct-21 08:49 UTC

[llvm-dev] Metadata in LLVM back-end

> On 20 Oct 2020, at 6:37 PM, David Greene <dag at hpe.com> wrote:
> 
> Lorenzo Casalino <lorenzo.casalino93 at gmail.com> writes:
> 
>>> - Flexible format - it should be as simple as possible to express the
>>>  desired information while minimizing changes to APIs
>> I do not want to raise a philosophical discussion (although I would
>> find it quite interesting), but "flexible" does not necessarily mean
>> "simple".
>> 
>> We could split this requirement as:
> 
> Good idea to separate these.
> 
>> - Flexible format - the format should be expressive enough to model
>>   *virtually* any kind of information.
>> 
>> - Simple interface - expressing information and attaching it to MIR
>>   elements (e.g., instructions) should be "easy" (what does *easy* mean?)
> 
> I would say "easy" means:
> 
> - Utilities are available to make maintaining information as transparent
>  (automatic) as possible.
> 
> - When not automatic, it is straightforward to apply the necessary APIs
>  to keep information updated.
> 
Ok, perfect!
>>> - Preserve information by default, only drop if explicitly told (I'm
>>>  trying to capture the requirements for your use-case here and this
>>>  differs from IR-level metadata)
> 
>> What about giving end users the possibility to define a custom default
>> policy, as well as the possibility to define different types of policies?
> 
> Possibly, though that might be overkill.  We don't want to bog this down
> so much that it doesn't make progress.  I would lean toward picking a
> policy and then incrementally adding features as needed.
> 
>> Further, we must cope with the combination of instructions: when two
>> instructions are eligible for combination, how is the information
>> associated with them combined?
>> 
>> - Information transformation - the information associated with two
>>   instructions A and B, which are combined into an instruction C, should
>>   be properly transformed according to a user-specified policy.
>> 
>>   A default policy may be "assign the information of both A and B to C"
>>   (gather-all/assign-all policy?)
> 
> Again, I would lean toward just assigning both pieces of information and
> providing utilities to scrub the result if necessary.  If it turns out
> that other cases are common, we can add other default policies.
> 
I agree!
>>> - No bifurcation between "well-known"/"built-in" information and things
>>>  added later/locally
> 
>> May I ask you to elaborate a bit more about this point?
> 
> Sure.  The current IR metadata is bifurcated.  Some pieces of
> information are more "first-class" than others.  For example there are
> specialized metadata nodes
> (https://llvm.org/docs/LangRef.html#specialized-metadata-nodes) while
> other pieces of metadata are simple strings or numbers.
> 
> It would be simplest/easiest if metadata were handled uniformly.
> 
Ok, so this boils down to a uniform usage of the metadata.
>>> - Should not impact compile time excessively (what is "excessive?")
>> 
>> Probably, such estimation should be performed on
> 
> Did something get cut off here?
Oops. Yep, I removed a paragraph, but apparently I forgot to delete its
first sentence. In any case, we should discuss how to quantitatively
determine an acceptable upper bound on the compile-time overhead and
give a motivation for it. For instance, an overhead of at most n% on
compilation time must be guaranteed, because ** list of reasons **.

Of course, first we should identify the worst-case scenario; probably
the case where all the MIR elements are decorated with metadata, and all
the API functionalities are employed?
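
The "max n% overhead" criterion can be written down as a small check.
This is only a sketch: `overheadPercent` and `withinBudget` are
hypothetical helper names, and the two timings are assumed to come from
compiling the same input without metadata and in the worst-case scenario
described above.

```cpp
#include <cassert>

// Relative compile-time overhead in percent:
//   (time with metadata - baseline time) / baseline time * 100
inline double overheadPercent(double BaselineSeconds,
                              double WithMetadataSeconds) {
  assert(BaselineSeconds > 0.0 && "baseline time must be positive");
  return (WithMetadataSeconds - BaselineSeconds) / BaselineSeconds * 100.0;
}

// True if the measured overhead stays within the agreed budget of
// MaxPercent (the "n" in "max n% overhead").
inline bool withinBudget(double BaselineSeconds, double WithMetadataSeconds,
                         double MaxPercent) {
  return overheadPercent(BaselineSeconds, WithMetadataSeconds) <= MaxPercent;
}
```

For example, a baseline of 8 s and a worst-case time of 9 s give an
overhead of 12.5%, which passes a 15% budget and fails a 10% one.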
> 
>> What about the granularity level?
>> 
>> - Granularity level - metadata should be attachable at different
>>   levels of granularity:
>> 
>>   - *Coarse*: MachineFunction level
>>   - *Medium*: MachineBasicBlock level
>>   - *Fine*:   MachineInstruction level
>> 
>> Clearly, there are other degrees of granularity and/or dimensions to be
>> considered (e.g., LiveInterval, MIBundles, Loops, ...).
> 
> It's probably a good idea to list at least the levels of granularity we
> expect to need.  I'd start with function/block/instruction as I can
> imagine uses for all three.  I am less sure about the other levels you
> mention.  We can add more capability later if needed.
> 
>> Sorry for the long delay!
> 
> No problem!  I know I'm extremely busy as I'm sure we all are.  :)
> 
> Since you initially raised the topic, do you want to take the lead in
> writing up an RFC?  I can certainly do it too but I want to give you
> right of first refusal.  :)
>                    -David
Uhm... actually, it wasn't me but Son Tuan, so the right of first refusal
should be granted to him :) And I just noticed that he wasn't CC'd on
all our mails; I hope he was able to follow our discussion anyway. I am
adding him to this mail; let us wait and see whether he has any critical
feature or point to discuss.

Thank you, David :)

-- Lorenzo
