Ivan Llopard
2012-Jun-08 15:11 UTC
[LLVMdev] Instruction bundles before RA: Rematerialization
Hi Sergei, Jakob, Thanks for your comments ! On 07/06/2012 20:41, Sergei Larin wrote:> > Jakob, > > Please see my comments below. Hope this helps. > > Sergei > > -- > > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum. > > *From:*Jakob Stoklund Olesen [mailto:stoklund at 2pi.dk] > *Sent:* Thursday, June 07, 2012 1:02 PM > *To:* Sergei Larin > *Cc:* 'Ivan Llopard'; 'LLVM Developers Mailing List' > *Subject:* Re: [LLVMdev] Instruction bundles before RA: Rematerialization > > On Jun 7, 2012, at 10:25 AM, "Sergei Larin" <slarin at codeaurora.org > <mailto:slarin at codeaurora.org>> wrote: > > > > Generally as far as I concern, there is no way "generic" (platform > independent) code can add instructions to bundles optimally > > I agree, there are too many ways of modeling stuff with bundles. That > is why I took the philosophical stance of treating bundles as black > boxes during RA. I think the target should be involved whenever > bundles are formed, and we shouldn't delete instructions from inside > bundles without permission from the target. >I also agree. Adding instructions into a bundle strongly depends on the target and may be a quite complex task, sometimes too complex to be done at allocation time if allocation speed is an issue. Removing them should relax resource constraints in almost every conventional VLIW target and it's something that the RA can potentially handle in a simple way by *consulting the target before*.> I think we need to tweak some of the TargetInstrInfo hooks to make > bundle remat possible. I would like your input. > > Rematerialization has multiple steps: > > 1. Feasibility. RA knows the bundle defining a given SSA value of a > virtual register. It calls TII.isTriviallyReMaterializable() to > determine if the defining instruction can (and should) be > rematerialized. See LiveRangeEdit::anyRematerializable(). > > */[Larin, Sergei] Obviously if you treat bundle as a black box this > does not make much sense... or isTriviallyReMaterializable() should be > able to parse bundle and produce compound answer. Problem is -- you > will not be able to find many opportunities like that. On the other > hand if you detect a _single_ instruction as a remat candidate inside > a bundle, you might chose to dissolve (disassemble) the bundle (if > possible, as I said before) and treat new serial group as you normally > would for remat. There should be a platform dependent pass to rebundle > this new serial sequence again, but even if it is not done, but if > the dissolution was performed properly, this will be a performance, > not a correctness issue. As a side note, you might chose simply remove > desired instruction from the bundle (often it is possible and trivial > to do without affecting semantics) and proceed as described above. > _BUT_ instruction removal does need back end support. Example:/* > > */{ /* > > */r0 = add (r0, r1);/* > > */P0 = cmp (r0.new, r0);/* > > */}/* > > */ The r0.new means that the new value of r0 is used (reg file bypass > in the same cycle). You can see all possible implications of this. To > offload this mental logic to the back end, we need a utility of form > CanMoveMIBeforeBundle(MI, BundleHeader)/ CanMoveMIAfterBundle(MI, > BundleHeader)/ MoveMIBeforeBundle(MI, BundleHeader)/ > MoveMIAfterBundle(MI, BundleHeader). Calling this repeatedly should > achieve desired effect -- remove what could be removed, and live what > needs to remain bundled intact. The move utility can change > instruction properly for each target./* >That's explains the hook isLegalToPruneDependencies() in the packetizer :-). I don't see why that case cannot be handled by the existent model and needs that kind of API's. For example, if 'add' needs to be rematted, it's possible as long as all its operands are available at the remat position. The same holds for 'cmp'. The original 'add' cannot be removed because of its internal read and the RA should consult the target before. Anyway, I think we both agree that instruction removal is a target-specific task. It also remains simple compared to the insertion of instructions into a bundle.> *//* > > 2. Feasibility at desired > location. LiveRangeEdit::canRematerializeAt() then checks that the > instruction can be rematerialized at the new location. This can fail > if the instruction depends on virtual register values that are not > available at the new location. > > */[Larin, Sergei] This looks similar to the previous case. Good thing > that you can potentially have zero cost remat (if you can place your > new instruction inside existing bundle), but to know this you need an > answer _while_ you are computing the cost. For that I can easily > imagine a back end hook of something like CanAddMIToBundle(MI, > BundleHeader). This is often easier than the previous task./* >I like the idea. The cost can potentially be zero, at least from a latency point of view (we care about power consumption also). Conversely, I don't like the fact that the RA looks for bundling opportunities while rematting, but it's just a personal feeling. As you pointed out for Hexagon, bundling is a complex task and sometimes it cannot be managed in a simple way. In our BE, a DFA to model resource constraints is not enough and a wrapper has been created which adds complexity to the process.> 3. Remat. LiveRangeEdit::rematerializeAt() calls TII.reMaterialize() > (sic) to insert a copy of the defining instruction at the new location. > > */[Larin, Sergei] At this point you need to update liveness on bundle > level, and then update global picture. Updating liveness on bundle > level also might need help from the back end. See the above example > with .new, and you can easily imagine local defs/kills inside a bundle > that should not even be visible outside the black box. As of now I > consider this mechanism somewhat broken on trunk (it is overly > pessimistic)... but API in this case is rather straightforward. /* >I thought internals def/use were already modelled, is it right ?> 4. Shrink original live range. The original live range may be smaller > after some uses have been rematerialized. This may discover dead defs > if there are no remaining uses. >IMHO, as long as internal defs/uses are taken into account, I don't see any particular problem.> 5. LiveRangeEdit::eliminateDeadDefs() erases the dead defs. > > */[Larin, Sergei] Last two should be easy with the above API support > -- if you can CanMoveMIBeforeBundle(..) outside the bundle, you can > use existing API to delete it./* >How can be dead defs eliminated by calling CanMoveMIBeforeBundle() ? Ivan> It looks like each of these steps require some help from the target if > they are to handle bundles. > > /jakob >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120608/d69aeeaf/attachment.html>
Ivan Llopard
2012-Jun-08 16:35 UTC
[LLVMdev] Instruction bundles before RA: Rematerialization
Hi again! On 08/06/2012 17:11, Ivan Llopard wrote:> Hi Sergei, Jakob, > > Thanks for your comments ! > > On 07/06/2012 20:41, Sergei Larin wrote: >> >> Jakob, >> >> Please see my comments below. Hope this helps. >> >> Sergei >> >> -- >> >> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum. >> >> *From:*Jakob Stoklund Olesen [mailto:stoklund at 2pi.dk] >> *Sent:* Thursday, June 07, 2012 1:02 PM >> *To:* Sergei Larin >> *Cc:* 'Ivan Llopard'; 'LLVM Developers Mailing List' >> *Subject:* Re: [LLVMdev] Instruction bundles before RA: Rematerialization >> >> On Jun 7, 2012, at 10:25 AM, "Sergei Larin" <slarin at codeaurora.org >> <mailto:slarin at codeaurora.org>> wrote: >> >> >> >> Generally as far as I concern, there is no way “generic” (platform >> independent) code can add instructions to bundles optimally >> >> I agree, there are too many ways of modeling stuff with bundles. That >> is why I took the philosophical stance of treating bundles as black >> boxes during RA. I think the target should be involved whenever >> bundles are formed, and we shouldn't delete instructions from inside >> bundles without permission from the target. >> > > I also agree. Adding instructions into a bundle strongly depends on the > target and may be a quite complex task, sometimes too complex to be done > at allocation time if allocation speed is an issue. Removing them should > relax resource constraints in almost every conventional VLIW target and > it's something that the RA can potentially handle in a simple way by > *consulting the target before*. > >> I think we need to tweak some of the TargetInstrInfo hooks to make >> bundle remat possible. I would like your input. >> >> Rematerialization has multiple steps: >> >> 1. Feasibility. RA knows the bundle defining a given SSA value of a >> virtual register. It calls TII.isTriviallyReMaterializable() to >> determine if the defining instruction can (and should) be >> rematerialized. See LiveRangeEdit::anyRematerializable(). >> >> */[Larin, Sergei] Obviously if you treat bundle as a black box this >> does not make much sense… or isTriviallyReMaterializable() should be >> able to parse bundle and produce compound answer. Problem is – you >> will not be able to find many opportunities like that. On the other >> hand if you detect a _single_ instruction as a remat candidate inside >> a bundle, you might chose to dissolve (disassemble) the bundle (if >> possible, as I said before) and treat new serial group as you normally >> would for remat. There should be a platform dependent pass to rebundle >> this new serial sequence again, but even if it is not done, but if the >> dissolution was performed properly, this will be a performance, not a >> correctness issue. As a side note, you might chose simply remove >> desired instruction from the bundle (often it is possible and trivial >> to do without affecting semantics) and proceed as described above. >> _BUT_ instruction removal does need back end support. Example:/* >> >> */{ /* >> >> */r0 = add (r0, r1);/* >> >> */P0 = cmp (r0.new, r0);/* >> >> */}/* >> >> */The r0.new means that the new value of r0 is used (reg file bypass >> in the same cycle). You can see all possible implications of this. To >> offload this mental logic to the back end, we need a utility of form >> CanMoveMIBeforeBundle(MI, BundleHeader)/ CanMoveMIAfterBundle(MI, >> BundleHeader)/ MoveMIBeforeBundle(MI, BundleHeader)/ >> MoveMIAfterBundle(MI, BundleHeader). Calling this repeatedly should >> achieve desired effect – remove what could be removed, and live what >> needs to remain bundled intact. The move utility can change >> instruction properly for each target./* >> > > That's explains the hook isLegalToPruneDependencies() in the packetizer :-). > > I don't see why that case cannot be handled by the existent model and > needs that kind of API's. For example, if 'add' needs to be rematted, > it's possible as long as all its operands are available at the remat > position. The same holds for 'cmp'. The original 'add' cannot be removed > because of its internal read and the RA should consult the target before. > Anyway, I think we both agree that instruction removal is a > target-specific task. It also remains simple compared to the insertion > of instructions into a bundle.I rushed my answer, I apologize. I had virtual registers dancing in my head instead of those you put in the example! Please, correct me if I'm wrong: as long as the second operand of 'cmp' has not been allocated (and if it's available of course), it can be remated. Otherwise, it's tied to the bundle. Is that correct?> >> *//* >> >> 2. Feasibility at desired location. >> LiveRangeEdit::canRematerializeAt() then checks that the instruction >> can be rematerialized at the new location. This can fail if the >> instruction depends on virtual register values that are not available >> at the new location. >> >> */[Larin, Sergei] This looks similar to the previous case. Good thing >> that you can potentially have zero cost remat (if you can place your >> new instruction inside existing bundle), but to know this you need an >> answer _while_ you are computing the cost. For that I can easily >> imagine a back end hook of something like CanAddMIToBundle(MI, >> BundleHeader). This is often easier than the previous task./* >> > > I like the idea. The cost can potentially be zero, at least from a > latency point of view (we care about power consumption also). > Conversely, I don't like the fact that the RA looks for bundling > opportunities while rematting, but it's just a personal feeling. As you > pointed out for Hexagon, bundling is a complex task and sometimes it > cannot be managed in a simple way. In our BE, a DFA to model resource > constraints is not enough and a wrapper has been created which adds > complexity to the process. > >> 3. Remat. LiveRangeEdit::rematerializeAt() calls TII.reMaterialize() >> (sic) to insert a copy of the defining instruction at the new location. >> >> */[Larin, Sergei] At this point you need to update liveness on bundle >> level, and then update global picture. Updating liveness on bundle >> level also might need help from the back end. See the above example >> with .new, and you can easily imagine local defs/kills inside a bundle >> that should not even be visible outside the black box. As of now I >> consider this mechanism somewhat broken on trunk (it is overly >> pessimistic)… but API in this case is rather straightforward. /* >> > > I thought internals def/use were already modelled, is it right ? > >> 4. Shrink original live range. The original live range may be smaller >> after some uses have been rematerialized. This may discover dead defs >> if there are no remaining uses. >> > > IMHO, as long as internal defs/uses are taken into account, I don't see > any particular problem. > >> 5. LiveRangeEdit::eliminateDeadDefs() erases the dead defs. >> >> */[Larin, Sergei] Last two should be easy with the above API support – >> if you can CanMoveMIBeforeBundle(..) outside the bundle, you can use >> existing API to delete it./* >> > > How can be dead defs eliminated by calling CanMoveMIBeforeBundle() ? > > > Ivan > >> It looks like each of these steps require some help from the target if >> they are to handle bundles. >> >> /jakob >>
Jakob Stoklund Olesen
2012-Jun-08 17:02 UTC
[LLVMdev] Instruction bundles before RA: Rematerialization
On Jun 8, 2012, at 8:11 AM, Ivan Llopard <ivanllopard at gmail.com> wrote:> Adding instructions into a bundle strongly depends on the target and may be a quite complex task, sometimes too complex to be done at allocation time if allocation speed is an issue.Yes, we won't be adding instructions to bundles during RA. RA is already inserting copies and spill code as free-standing instructions between bundles. The target can add a post-RA bundling pass to mop up all that stuff if required.>> [Larin, Sergei] At this point you need to update liveness on bundle level, and then update global picture. Updating liveness on bundle level also might need help from the back end. See the above example with .new, and you can easily imagine local defs/kills inside a bundle that should not even be visible outside the black box. As of now I consider this mechanism somewhat broken on trunk (it is overly pessimistic)… but API in this case is rather straightforward. >> > I thought internals def/use were already modelled, is it right ?Yes. MO.isInternalRead() models exactly that.>> 4. Shrink original live range. The original live range may be smaller after some uses have been rematerialized. This may discover dead defs if there are no remaining uses. >> > > IMHO, as long as internal defs/uses are taken into account, I don't see any particular problem.It is easy to recognize that a virtual register has no uses outside the bundle, but we need help from the target to determine if that means some instructions are dead and can be deleted, and if so, which instructions to delete. Target-independent code should never be deleting instructions from inside bundles without the target's say so. It is not enough to check for internal uses. There could be other reasons to keep an instruction in a bundle. /jakob -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120608/f9729c05/attachment.html>
Sergei Larin
2012-Jun-08 18:32 UTC
[LLVMdev] Instruction bundles before RA: Rematerialization
Ivan, Jacob, Yes, you are right here. In this trivial example the bundle could be legally dissolved in the following way: { r0 = add (r0, r1); p0 = cmp (r0.new, r2); } ...becomes the following (equivalent) sequence: r0 = add (r0, r1); p0 = cmp (r0, r2); <<< no .new But this bundle: { r0 = add (r0, r1); p0 = cmp (r0.new, r0); } Could only be replaced with something like this: r3 = r0; <<< temp reg r0 = add (r0, r1); p0 = cmp (r0, r3); which of cause is a bit trickier after RA...> > In our BE, a DFA to model > > resource constraints is not enough and a wrapper has been created > > which adds complexity to the process.Exact same here.> > I thought internals def/use were already modelled, is it right ?Yes, as Jakob noted in his answer "MO.isInternalRead() models exactly that." But in the code base I am on at the moment (at about 3.1 branch time), the internal dead def is exposed to external live list, which is not wrong, but conservative. In fairness I just observed the fact, but did not have a chance to determine the root cause. It well might be that I am still missing some back end hooks...> > How can be dead defs eliminated by calling CanMoveMIBeforeBundle() ?Jacob says: " It is easy to recognize that a virtual register has no uses outside the bundle, but we need help from the target to determine if that means some instructions are dead and can be deleted, and if so, which instructions to delete." Here is what I was trying to say - I want to split the decision in two steps - If a dead instruction is detected inside a bundle, try to move that instruction out of the packet/bundle. This is where target gives its "OK". If that is legal, then "regular" DCE actions could be taken against that candidate instruction. This seems to simplify the task considerably and require least amount of massaging to the current implementation. As a summary - we all agree that more BE hooks and utilities are needed for bundle support. I am also very happy that we are discussing this ahead of time :) Sergei -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum.> -----Original Message----- > From: Ivan Llopard [mailto:ivanllopard at gmail.com] > Sent: Friday, June 08, 2012 11:36 AM > To: Sergei Larin > Cc: 'Jakob Stoklund Olesen'; 'LLVM Developers Mailing List'; 'Andrew > Trick' > Subject: Re: [LLVMdev] Instruction bundles before RA: Rematerialization > > Hi again! > > On 08/06/2012 17:11, Ivan Llopard wrote: > > Hi Sergei, Jakob, > > > > Thanks for your comments ! > > > > On 07/06/2012 20:41, Sergei Larin wrote: > >> > >> Jakob, > >> > >> Please see my comments below. Hope this helps. > >> > >> Sergei > >> > >> -- > >> > >> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum. > >> > >> *From:*Jakob Stoklund Olesen [mailto:stoklund at 2pi.dk] > >> *Sent:* Thursday, June 07, 2012 1:02 PM > >> *To:* Sergei Larin > >> *Cc:* 'Ivan Llopard'; 'LLVM Developers Mailing List' > >> *Subject:* Re: [LLVMdev] Instruction bundles before RA: > >> Rematerialization > >> > >> On Jun 7, 2012, at 10:25 AM, "Sergei Larin" <slarin at codeaurora.org > >> <mailto:slarin at codeaurora.org>> wrote: > >> > >> > >> > >> Generally as far as I concern, there is no way "generic" (platform > >> independent) code can add instructions to bundles optimally > >> > >> I agree, there are too many ways of modeling stuff with bundles. > That > >> is why I took the philosophical stance of treating bundles as black > >> boxes during RA. I think the target should be involved whenever > >> bundles are formed, and we shouldn't delete instructions from inside > >> bundles without permission from the target. > >> > > > > I also agree. Adding instructions into a bundle strongly depends on > > the target and may be a quite complex task, sometimes too complex to > > be done at allocation time if allocation speed is an issue. Removing > > them should relax resource constraints in almost every conventional > > VLIW target and it's something that the RA can potentially handle in > a > > simple way by *consulting the target before*. > > > >> I think we need to tweak some of the TargetInstrInfo hooks to make > >> bundle remat possible. I would like your input. > >> > >> Rematerialization has multiple steps: > >> > >> 1. Feasibility. RA knows the bundle defining a given SSA value of a > >> virtual register. It calls TII.isTriviallyReMaterializable() to > >> determine if the defining instruction can (and should) be > >> rematerialized. See LiveRangeEdit::anyRematerializable(). > >> > >> */[Larin, Sergei] Obviously if you treat bundle as a black box this > >> does not make much sense. or isTriviallyReMaterializable() should be > >> able to parse bundle and produce compound answer. Problem is - you > >> will not be able to find many opportunities like that. On the other > >> hand if you detect a _single_ instruction as a remat candidate > inside > >> a bundle, you might chose to dissolve (disassemble) the bundle (if > >> possible, as I said before) and treat new serial group as you > >> normally would for remat. There should be a platform dependent pass > >> to rebundle this new serial sequence again, but even if it is not > >> done, but if the dissolution was performed properly, this will be a > >> performance, not a correctness issue. As a side note, you might > chose > >> simply remove desired instruction from the bundle (often it is > >> possible and trivial to do without affecting semantics) and proceed > as described above. > >> _BUT_ instruction removal does need back end support. Example:/* > >> > >> */{ /* > >> > >> */r0 = add (r0, r1);/* > >> > >> */P0 = cmp (r0.new, r0);/* > >> > >> */}/* > >> > >> */The r0.new means that the new value of r0 is used (reg file bypass > >> in the same cycle). You can see all possible implications of this. > To > >> offload this mental logic to the back end, we need a utility of form > >> CanMoveMIBeforeBundle(MI, BundleHeader)/ CanMoveMIAfterBundle(MI, > >> BundleHeader)/ MoveMIBeforeBundle(MI, BundleHeader)/ > >> MoveMIAfterBundle(MI, BundleHeader). Calling this repeatedly should > >> achieve desired effect - remove what could be removed, and live what > >> needs to remain bundled intact. The move utility can change > >> instruction properly for each target./* > >> > > > > That's explains the hook isLegalToPruneDependencies() in the > packetizer :-). > > > > I don't see why that case cannot be handled by the existent model and > > needs that kind of API's. For example, if 'add' needs to be rematted, > > it's possible as long as all its operands are available at the remat > > position. The same holds for 'cmp'. The original 'add' cannot be > > removed because of its internal read and the RA should consult the > target before. > > Anyway, I think we both agree that instruction removal is a > > target-specific task. It also remains simple compared to the > insertion > > of instructions into a bundle. > > I rushed my answer, I apologize. I had virtual registers dancing in my > head instead of those you put in the example! > > Please, correct me if I'm wrong: as long as the second operand of 'cmp' > has not been allocated (and if it's available of course), it can be > remated. Otherwise, it's tied to the bundle. Is that correct? > > > > >> *//* > >> > >> 2. Feasibility at desired location. > >> LiveRangeEdit::canRematerializeAt() then checks that the instruction > >> can be rematerialized at the new location. This can fail if the > >> instruction depends on virtual register values that are not > available > >> at the new location. > >> > >> */[Larin, Sergei] This looks similar to the previous case. Good > thing > >> that you can potentially have zero cost remat (if you can place your > >> new instruction inside existing bundle), but to know this you need > an > >> answer _while_ you are computing the cost. For that I can easily > >> imagine a back end hook of something like CanAddMIToBundle(MI, > >> BundleHeader). This is often easier than the previous task./* > >> > > > > I like the idea. The cost can potentially be zero, at least from a > > latency point of view (we care about power consumption also). > > Conversely, I don't like the fact that the RA looks for bundling > > opportunities while rematting, but it's just a personal feeling. As > > you pointed out for Hexagon, bundling is a complex task and sometimes > > it cannot be managed in a simple way. In our BE, a DFA to model > > resource constraints is not enough and a wrapper has been created > > which adds complexity to the process. > > > >> 3. Remat. LiveRangeEdit::rematerializeAt() calls TII.reMaterialize() > >> (sic) to insert a copy of the defining instruction at the new > location. > >> > >> */[Larin, Sergei] At this point you need to update liveness on > bundle > >> level, and then update global picture. Updating liveness on bundle > >> level also might need help from the back end. See the above example > >> with .new, and you can easily imagine local defs/kills inside a > >> bundle that should not even be visible outside the black box. As of > >> now I consider this mechanism somewhat broken on trunk (it is overly > >> pessimistic). but API in this case is rather straightforward. /* > >> > > > > I thought internals def/use were already modelled, is it right ? > > > >> 4. Shrink original live range. The original live range may be > smaller > >> after some uses have been rematerialized. This may discover dead > defs > >> if there are no remaining uses. > >> > > > > IMHO, as long as internal defs/uses are taken into account, I don't > > see any particular problem. > > > >> 5. LiveRangeEdit::eliminateDeadDefs() erases the dead defs. > >> > >> */[Larin, Sergei] Last two should be easy with the above API support > >> - if you can CanMoveMIBeforeBundle(..) outside the bundle, you can > >> use existing API to delete it./* > >> > > > > How can be dead defs eliminated by calling CanMoveMIBeforeBundle() ? > > > > > > Ivan > > > >> It looks like each of these steps require some help from the target > >> if they are to handle bundles. > >> > >> /jakob > >>
Reasonably Related Threads
- [LLVMdev] Instruction bundles before RA: Rematerialization
- [LLVMdev] Instruction bundles before RA: Rematerialization
- [LLVMdev] Instruction bundles before RA: Rematerialization
- [LLVMdev] Instruction bundles before RA: Rematerialization
- [LLVMdev] Instruction bundles before RA: Rematerialization