> That's the place to start, I think. Gather a list of requirements/use
> cases along with the challenges we've discussed. Then it's a matter of
> engineering a solution that fulfills the requirements while hitting as
> few of the challenges as possible. Let's start by simply gathering some
> lists. I'll take a quick stab and you and others can add to/edit it.
>
> Requirements
> ------------
> - Convey information not readily available in existing IR constructs to
>   very late-stage codegen (after regalloc/scheduling, right through
>   asm/object emission)

I see this more as the GOAL of the RFC, rather than a requirement.

> - Flexible format - it should be as simple as possible to express the
>   desired information while minimizing changes to APIs

I do not want to raise a philosophical discussion (although I would
find it quite interesting), but "flexible" does not necessarily mean
"simple".

We could split this requirement as:

- Flexible format - the format should be expressive enough to enable
  modeling of *virtually* any kind of information.

- Simple interface - expressing information and attaching it to MIR
  elements (e.g., instructions) should be "easy" (what does *easy* mean?)

> - Preserve information by default, only drop if explicitly told (I'm
>   trying to capture the requirements for your use-case here and this
>   differs from IR-level metadata)

What about giving end users the possibility to define a custom default
policy, as well as the possibility to define different types of policies?

Further, we must cope with the combination of instructions: how is the
information associated with two instructions eligible for combination
combined?

- Information transformation - the information associated with two
  instructions A and B, which are combined into an instruction C, should
  be properly transformed according to a user-specific policy.

  A default policy may be "assign the information of both A and B to C"
  (gather-all/assign-all policy?)

> - No bifurcation between "well-known"/"built-in" information and things
>   added later/locally

May I ask you to elaborate a bit more on this point?

> - Should not impact compile time excessively (what is "excessive?")

Probably, such estimation should be performed on

What about the granularity level?

- Granularity level - metadata should be attachable at different levels
  of granularity:

  - *Coarse*: MachineFunction level
  - *Medium*: MachineBasicBlock level
  - *Fine*: MachineInstruction level

Clearly, there are other degrees of granularity and/or dimensions to be
considered (e.g., LiveInterval, MIBundles, Loops, ...).

> Challenges of using intrinsics and other alternatives
> -----------------------------------------------------
> - Post-SSA annotation/how to associate intrinsics with
>   instructions/registers/types
>
> - Instruction selection fallout (inhibiting folding, etc.)
>
> - Register allocation impacts (extending live ranges, etc.)
>
> - Scheduling challenges (ensuring intrinsics can be found
>   post-scheduling, etc.)
>
> - Extending existing constructs (which ones?) requires hard-coding
>   aspects of information, reducing flexibility
>
> This is currently rather weasily-worded, because I didn't want to impose
> too many restrictions right off the bat.
>
> -David

Sorry for the long delay!

--
Lorenzo
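The three granularity levels proposed above could be modeled with a single side table, roughly as in the sketch below. Everything here is hypothetical and is not an existing LLVM API: `Granularity`, `ElementRef`, and `MetadataStore` are illustrative names, and a plain numeric ID stands in for a real MachineFunction, MachineBasicBlock, or MachineInstr. Entries are uniform (kind, value) string pairs, and nothing is dropped unless `drop` is called explicitly.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical granularity levels taken from the list in the mail.
enum class Granularity { Function, BasicBlock, Instruction };

// Hypothetical handle: a level plus an opaque ID standing in for the
// real MIR element (assumption: IDs are unique within one level).
struct ElementRef {
  Granularity Level;
  std::uint64_t Id;
  bool operator<(const ElementRef &O) const {
    return std::make_pair(Level, Id) < std::make_pair(O.Level, O.Id);
  }
};

using Entries = std::vector<std::pair<std::string, std::string>>;

// Side table that attaches (kind, value) pairs to elements of any level.
class MetadataStore {
  std::map<ElementRef, Entries> Table;

public:
  // Attach one more entry; existing entries are preserved.
  void attach(ElementRef E, std::string Kind, std::string Value) {
    assert(!Kind.empty() && "metadata kind must be named");
    Table[E].emplace_back(std::move(Kind), std::move(Value));
  }

  // Return a copy of the entries for E, or an empty list if none exist.
  Entries lookup(ElementRef E) const {
    auto It = Table.find(E);
    return It == Table.end() ? Entries{} : It->second;
  }

  // Information is only removed when explicitly requested.
  void drop(ElementRef E) { Table.erase(E); }
};
```

Keying on a (level, ID) pair keeps the interface identical for all three levels; further levels (LiveInterval, bundles, loops) would only add enumerators under this assumption.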
Lorenzo Casalino <lorenzo.casalino93 at gmail.com> writes:

>> Requirements
>> ------------
>> - Convey information not readily available in existing IR constructs to
>>   very late-stage codegen (after regalloc/scheduling, right through
>>   asm/object emission)
>
> I see this more as the GOAL of the RFC, rather than a requirement.

Fair enough.

>> - Flexible format - it should be as simple as possible to express the
>>   desired information while minimizing changes to APIs
>
> I do not want to raise a philosophical discussion (although I would
> find it quite interesting), but "flexible" does not necessarily mean
> "simple".
>
> We could split this requirement as:

Good idea to separate these.

> - Flexible format - the format should be expressive enough to enable
>   modeling of *virtually* any kind of information.
>
> - Simple interface - expressing information and attaching it to MIR
>   elements (e.g., instructions) should be "easy" (what does *easy* mean?)

I would say "easy" means:

- Utilities are available to make maintaining information as transparent
  (automatic) as possible.

- When not automatic, it is straightforward to apply the necessary APIs
  to keep information updated.

>> - Preserve information by default, only drop if explicitly told (I'm
>>   trying to capture the requirements for your use-case here and this
>>   differs from IR-level metadata)
>
> What about giving end users the possibility to define a custom default
> policy, as well as the possibility to define different types of policies?

Possibly, though that might be overkill. We don't want to bog this down
so much that it doesn't make progress. I would lean toward picking a
policy and then incrementally adding features as needed.

> Further, we must cope with the combination of instructions: how is the
> information associated with two instructions eligible for combination
> combined?
>
> - Information transformation - the information associated with two
>   instructions A and B, which are combined into an instruction C, should
>   be properly transformed according to a user-specific policy.
>
>   A default policy may be "assign the information of both A and B to C"
>   (gather-all/assign-all policy?)

Again, I would lean toward just assigning both pieces of information and
providing utilities to scrub the result if necessary. If it turns out
that other cases are common, we can add other default policies.

>> - No bifurcation between "well-known"/"built-in" information and things
>>   added later/locally
>
> May I ask you to elaborate a bit more on this point?

Sure. The current IR metadata is bifurcated. Some pieces of
information are more "first-class" than others. For example, there are
specialized metadata nodes
(https://llvm.org/docs/LangRef.html#specialized-metadata-nodes) while
other pieces of metadata are simple strings or numbers.

It would be simplest/easiest if metadata were handled uniformly.

>> - Should not impact compile time excessively (what is "excessive?")
>
> Probably, such estimation should be performed on

Did something get cut off here?

> What about the granularity level?
>
> - Granularity level - metadata should be attachable at different levels
>   of granularity:
>
>   - *Coarse*: MachineFunction level
>   - *Medium*: MachineBasicBlock level
>   - *Fine*: MachineInstruction level
>
> Clearly, there are other degrees of granularity and/or dimensions to be
> considered (e.g., LiveInterval, MIBundles, Loops, ...).

It's probably a good idea to list at least the levels of granularity we
expect to need. I'd start with function/block/instruction as I can
imagine uses for all three. I am less sure about the other levels you
mention. We can add more capability later if needed.

> Sorry for the long delay!

No problem! I know I'm extremely busy as I'm sure we all are. :)

Since you initially raised the topic, do you want to take the lead in
writing up an RFC? I can certainly do it too but I want to give you right
of first refusal. :)

-David
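The "assign both, then scrub if necessary" default discussed in this mail could look roughly like the sketch below. It is only an illustration under stated assumptions: `MetadataEntry`, `gatherAll`, `scrubDuplicates`, and `scrubKind` are hypothetical names, not part of LLVM, and metadata is assumed to be a flat list of uniform (kind, value) pairs with no "built-in" kinds.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, uniformly handled metadata entry (no special-cased kinds).
struct MetadataEntry {
  std::string Kind;
  std::string Value;
  bool operator==(const MetadataEntry &O) const {
    return Kind == O.Kind && Value == O.Value;
  }
};

using MetadataList = std::vector<MetadataEntry>;

// "Gather-all" default: the combined instruction C receives every entry
// of A followed by every entry of B. Nothing is dropped.
MetadataList gatherAll(const MetadataList &A, const MetadataList &B) {
  MetadataList C = A;
  C.insert(C.end(), B.begin(), B.end());
  return C;
}

// Scrub utility: remove exact duplicates, keeping the first occurrence.
void scrubDuplicates(MetadataList &List) {
  MetadataList Unique;
  for (const MetadataEntry &E : List)
    if (std::find(Unique.begin(), Unique.end(), E) == Unique.end())
      Unique.push_back(E);
  List = std::move(Unique);
}

// Scrub utility: explicitly drop every entry of one kind.
void scrubKind(MetadataList &List, const std::string &Kind) {
  List.erase(std::remove_if(List.begin(), List.end(),
                            [&](const MetadataEntry &E) {
                              return E.Kind == Kind;
                            }),
             List.end());
}

// Usage: combine A and B into C, then scrub the duplicated entry.
MetadataList combineExample() {
  MetadataList A = {MetadataEntry{"secret", "key"},
                    MetadataEntry{"origin", "foo.c:10"}};
  MetadataList B = {MetadataEntry{"secret", "key"}};
  MetadataList C = gatherAll(A, B);
  assert(C.size() == 3 && "gather-all keeps every entry");
  scrubDuplicates(C);
  return C;
}
```

Keeping the merge itself trivial and moving all clean-up into separate utilities matches the incremental approach suggested above: other default policies could be added later without changing `gatherAll`.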
> On 20 Oct 2020, at 6:37 PM, David Greene <dag at hpe.com> wrote:
>
> Lorenzo Casalino <lorenzo.casalino93 at gmail.com> writes:
>
>>> - Flexible format - it should be as simple as possible to express the
>>>   desired information while minimizing changes to APIs
>>
>> I do not want to raise a philosophical discussion (although I would
>> find it quite interesting), but "flexible" does not necessarily mean
>> "simple".
>>
>> We could split this requirement as:
>
> Good idea to separate these.
>
>> - Flexible format - the format should be expressive enough to enable
>>   modeling of *virtually* any kind of information.
>>
>> - Simple interface - expressing information and attaching it to MIR
>>   elements (e.g., instructions) should be "easy" (what does *easy* mean?)
>
> I would say "easy" means:
>
> - Utilities are available to make maintaining information as transparent
>   (automatic) as possible.
>
> - When not automatic, it is straightforward to apply the necessary APIs
>   to keep information updated.

Ok, perfect!

>>> - Preserve information by default, only drop if explicitly told (I'm
>>>   trying to capture the requirements for your use-case here and this
>>>   differs from IR-level metadata)
>>
>> What about giving end users the possibility to define a custom default
>> policy, as well as the possibility to define different types of policies?
>
> Possibly, though that might be overkill. We don't want to bog this down
> so much that it doesn't make progress. I would lean toward picking a
> policy and then incrementally adding features as needed.
>
>> Further, we must cope with the combination of instructions: how is the
>> information associated with two instructions eligible for combination
>> combined?
>>
>> - Information transformation - the information associated with two
>>   instructions A and B, which are combined into an instruction C, should
>>   be properly transformed according to a user-specific policy.
>>
>>   A default policy may be "assign the information of both A and B to C"
>>   (gather-all/assign-all policy?)
>
> Again, I would lean toward just assigning both pieces of information and
> providing utilities to scrub the result if necessary. If it turns out
> that other cases are common, we can add other default policies.

I agree!

>>> - No bifurcation between "well-known"/"built-in" information and things
>>>   added later/locally
>>
>> May I ask you to elaborate a bit more on this point?
>
> Sure. The current IR metadata is bifurcated. Some pieces of
> information are more "first-class" than others. For example, there are
> specialized metadata nodes
> (https://llvm.org/docs/LangRef.html#specialized-metadata-nodes) while
> other pieces of metadata are simple strings or numbers.
>
> It would be simplest/easiest if metadata were handled uniformly.

Ok, so this boils down to a uniform usage of the metadata.

>>> - Should not impact compile time excessively (what is "excessive?")
>>
>> Probably, such estimation should be performed on
>
> Did something get cut off here?

Oops. Yep, I removed a paragraph but apparently forgot the first
sentence.

In any case, we should discuss how to quantitatively determine an
acceptable upper bound on the compile-time overhead, and give a
motivation for it. For instance: a maximum of n% compile-time overhead
must be guaranteed, because ** list of reasons **.

Of course, first we should identify the worst-case scenario; probably
the case where all the MIR elements are decorated with metadata and all
the API functionalities are employed?

>> What about the granularity level?
>>
>> - Granularity level - metadata should be attachable at different levels
>>   of granularity:
>>
>>   - *Coarse*: MachineFunction level
>>   - *Medium*: MachineBasicBlock level
>>   - *Fine*: MachineInstruction level
>>
>> Clearly, there are other degrees of granularity and/or dimensions to be
>> considered (e.g., LiveInterval, MIBundles, Loops, ...).
>
> It's probably a good idea to list at least the levels of granularity we
> expect to need. I'd start with function/block/instruction as I can
> imagine uses for all three. I am less sure about the other levels you
> mention. We can add more capability later if needed.
>
>> Sorry for the long delay!
>
> No problem! I know I'm extremely busy as I'm sure we all are. :)
>
> Since you initially raised the topic, do you want to take the lead in
> writing up an RFC? I can certainly do it too but I want to give you
> right of first refusal. :)
>
> -David

Uhm... actually, it wasn't me but Son Tuan, so the right of refusal
should be granted to him :)

I just noticed that he wasn't CC'd on all our mails; I hope he was able
to follow our discussion anyway. I am adding him to this mail; let us
wait and see whether he has any critical feature or point to discuss.

Thank you, David :)

--
Lorenzo
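The "max n% overhead" criterion raised in this mail can be written down as a simple check, sketched below. The function names and the idea of comparing two wall-clock measurements (a baseline build and the worst-case build with every MIR element decorated) are assumptions for illustration only; the thread does not fix a value for n or a measurement method.

```cpp
#include <cassert>

// Relative compile-time overhead in percent, given a baseline time and
// the time measured with metadata enabled (both in seconds).
double overheadPercent(double BaselineSeconds, double WithMetadataSeconds) {
  assert(BaselineSeconds > 0.0 && "baseline must be a positive duration");
  return (WithMetadataSeconds - BaselineSeconds) / BaselineSeconds * 100.0;
}

// True if the measured overhead stays within the agreed budget of
// MaxPercent (the "n" in "max n% overhead").
bool withinBudget(double BaselineSeconds, double WithMetadataSeconds,
                  double MaxPercent) {
  return overheadPercent(BaselineSeconds, WithMetadataSeconds) <= MaxPercent;
}
```

For example, a baseline of 8 s against 9 s with metadata is a 12.5% overhead, which would pass a 15% budget and fail a 10% one.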