Steve Montgomery
2012-Aug-13 10:29 UTC
[LLVMdev] Load serialisation during selection DAG building
I've got a question about how SelectionDAGBuilder treats loads. The LLVM Language Reference Manual explicitly states that the order of volatile operations may be changed relative to non-volatile operations. However, when the SelectionDAGBuilder in LLVM 3.1 encounters a volatile load, it flushes all pending loads and then chains the volatile load onto them meaning that the volatile load must be scheduled after those loads. While this behaviour isn't wrong, it seems to reduce the scope for efficient instruction selection. Is there any reason not to modify the behaviour of SelectionDAGBuilder::visitLoad() so that volatile loads don't cause SelectionDAGBuilder::getRoot() to be called. Instead, they can be chained together with the head of the chain being stored in PendingLoads. Then when something else calls SelectionDAGBuilder::getRoot(), the chain of volatile loads is TokenFactored together with the non-volatile loads. I've tried this out and it seems to do what I want but as I'm fairly inexperienced with LLVM, I'm not sure whether there's something else preventing this strategy from working. The reason I noticed this is because I have been developing a back-end for a target in which some instructions are implemented as pseudos which will be expanded into a pair of instructions. Each of the two operands of the pseudo can be a load but as the expansion accesses the right operand twice, I don't want to match a volatile load for this operand. For example, in: %0 = load i16* @y, align 2, !tbaa !0 %1 = load volatile i16* @x, align 2, !tbaa !0 %add = add i16 %0, %1 the volatile load is sequenced after the non-volatile load which means that the non-volatile load can't match the left operand of the add because this would create a scheduling cycle. This means both loads are selected as load instructions, resulting in use of an extra register and an extra instruction. If I change the behaviour of SelectionDAGBuilder::visitLoad() as described then I get just two instructions.
Hal Finkel
2012-Aug-13 19:09 UTC
[LLVMdev] Load serialisation during selection DAG building
Steve, I had created a patch last year that does something similar to what you describe for regular loads and stores using aliasing information. I think that the last message in the thread was: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120402/140299.html This approach has worked for me, but it is not the preferred solution going forward. The preferred solution is to keep the "critical chain" as is (partly because there are places in some of the backend lowering code that assume it exists in its current form), and just update the scheduler to be more intelligent about how it forms the dependency graph. This has since been developed for the new MI scheduling framework by Sergei Larin, and was committed in r156842 (last message in the discussion thread was http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120507/142659.html) I would recommend trying something like Sergei's solution first, and fall back to trying to play with the critical chain only if that can't or won't work. -Hal On Mon, 13 Aug 2012 11:29:18 +0100 Steve Montgomery <stephen.montgomery3 at btinternet.com> wrote:> I've got a question about how SelectionDAGBuilder treats loads. > > The LLVM Language Reference Manual explicitly states that the order > of volatile operations may be changed relative to non-volatile > operations. However, when the SelectionDAGBuilder in LLVM 3.1 > encounters a volatile load, it flushes all pending loads and then > chains the volatile load onto them meaning that the volatile load > must be scheduled after those loads. While this behaviour isn't > wrong, it seems to reduce the scope for efficient instruction > selection. > > Is there any reason not to modify the behaviour of > SelectionDAGBuilder::visitLoad() so that volatile loads don't cause > SelectionDAGBuilder::getRoot() to be called. Instead, they can be > chained together with the head of the chain being stored in > PendingLoads. Then when something else calls > SelectionDAGBuilder::getRoot(), the chain of volatile loads is > TokenFactored together with the non-volatile loads. I've tried this > out and it seems to do what I want but as I'm fairly inexperienced > with LLVM, I'm not sure whether there's something else preventing > this strategy from working. > > The reason I noticed this is because I have been developing a > back-end for a target in which some instructions are implemented as > pseudos which will be expanded into a pair of instructions. Each of > the two operands of the pseudo can be a load but as the expansion > accesses the right operand twice, I don't want to match a volatile > load for this operand. For example, in: > > %0 = load i16* @y, align 2, !tbaa !0 > %1 = load volatile i16* @x, align 2, !tbaa !0 > %add = add i16 %0, %1 > > the volatile load is sequenced after the non-volatile load which > means that the non-volatile load can't match the left operand of the > add because this would create a scheduling cycle. This means both > loads are selected as load instructions, resulting in use of an extra > register and an extra instruction. If I change the behaviour of > SelectionDAGBuilder::visitLoad() as described then I get just two > instructions. _______________________________________________ LLVM > Developers mailing list LLVMdev at cs.uiuc.edu > http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev-- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory
Steve Montgomery
2012-Aug-13 19:15 UTC
[LLVMdev] Load serialisation during selection DAG building
Thanks Hal, I'll look into it Sergei's solution. Steve On 13 Aug 2012, at 20:09, Hal Finkel wrote:> Steve, > > I had created a patch last year that does something similar to what you > describe for regular loads and stores using aliasing information. I > think that the last message in the thread was: > http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120402/140299.html > > This approach has worked for me, but it is not the preferred solution > going forward. The preferred solution is to keep the "critical > chain" as is (partly because there are places in some of the backend > lowering code that assume it exists in its current form), and just > update the scheduler to be more intelligent about how it forms the > dependency graph. This has since been developed for the new MI > scheduling framework by Sergei Larin, and was committed in r156842 > (last message in the discussion thread was > http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120507/142659.html) > > I would recommend trying something like Sergei's solution first, and > fall back to trying to play with the critical chain only if that can't > or won't work. > > -Hal > > On Mon, 13 Aug 2012 11:29:18 +0100 > Steve Montgomery <stephen.montgomery3 at btinternet.com> wrote: > >> I've got a question about how SelectionDAGBuilder treats loads. >> >> The LLVM Language Reference Manual explicitly states that the order >> of volatile operations may be changed relative to non-volatile >> operations. However, when the SelectionDAGBuilder in LLVM 3.1 >> encounters a volatile load, it flushes all pending loads and then >> chains the volatile load onto them meaning that the volatile load >> must be scheduled after those loads. While this behaviour isn't >> wrong, it seems to reduce the scope for efficient instruction >> selection. >> >> Is there any reason not to modify the behaviour of >> SelectionDAGBuilder::visitLoad() so that volatile loads don't cause >> SelectionDAGBuilder::getRoot() to be called. Instead, they can be >> chained together with the head of the chain being stored in >> PendingLoads. Then when something else calls >> SelectionDAGBuilder::getRoot(), the chain of volatile loads is >> TokenFactored together with the non-volatile loads. I've tried this >> out and it seems to do what I want but as I'm fairly inexperienced >> with LLVM, I'm not sure whether there's something else preventing >> this strategy from working. >> >> The reason I noticed this is because I have been developing a >> back-end for a target in which some instructions are implemented as >> pseudos which will be expanded into a pair of instructions. Each of >> the two operands of the pseudo can be a load but as the expansion >> accesses the right operand twice, I don't want to match a volatile >> load for this operand. For example, in: >> >> %0 = load i16* @y, align 2, !tbaa !0 >> %1 = load volatile i16* @x, align 2, !tbaa !0 >> %add = add i16 %0, %1 >> >> the volatile load is sequenced after the non-volatile load which >> means that the non-volatile load can't match the left operand of the >> add because this would create a scheduling cycle. This means both >> loads are selected as load instructions, resulting in use of an extra >> register and an extra instruction. If I change the behaviour of >> SelectionDAGBuilder::visitLoad() as described then I get just two >> instructions. _______________________________________________ LLVM >> Developers mailing list LLVMdev at cs.uiuc.edu >> http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > -- > Hal Finkel > Postdoctoral Appointee > Leadership Computing Facility > Argonne National Laboratory
Steve Montgomery
2012-Aug-14 15:43 UTC
[LLVMdev] Load serialisation during selection DAG building
I looked into those patches but I don't think they will help in my situation because my problems occur during instruction selection rather than scheduling. A simple and concrete example is a pattern like: [(set GR:$dst (add GR:$src (nvload addr:$mem)))] where nvload matches a load provided that isVolatile() is false. If the selection DAG looks like: | | LD1 LD2 ^ ^ | | \ / add ^ | \ / ST and the chain like: LD1 LD2 ^ ^ | | \ / TokenFactor ^ | ST then the add instruction is selected. However, if operand 1 of the add is a volatile load, then the chain looks like: LD1 ^ | LD2Volatile | ST In this case the add instruction cannot be selected. The operator is commutative, so instruction selection matches the non-volatile load against the operand 1 but then fails because to select this instruction would mean that the volatile load into a register (to match operand 0) would occur before the non-volatile load that's folded into the instruction. I can get this instruction to be selected by changing the way loads are built into the selection DAG, i.e. making the chain: LD1 LD2Volatile ^ ^ | | \ / TokenFactor ^ | ST Given what the LRM says about volatile accesses, this seems valid but I take Hal's point about there being code that might not be expecting such a chain. It's a matter of changing a few lines in SelectionDAGBuilder.cpp to achieve what I want so I think I'll go ahead and see what happens. If Hal or anyone else has any further thoughts on the potential problems with doing so then I'd be glad to know. Thanks Steve Montgomery On 13 Aug 2012, at 20:09, Hal Finkel wrote:> Steve, > > I had created a patch last year that does something similar to what you > describe for regular loads and stores using aliasing information. I > think that the last message in the thread was: > http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120402/140299.html > > This approach has worked for me, but it is not the preferred solution > going forward. The preferred solution is to keep the "critical > chain" as is (partly because there are places in some of the backend > lowering code that assume it exists in its current form), and just > update the scheduler to be more intelligent about how it forms the > dependency graph. This has since been developed for the new MI > scheduling framework by Sergei Larin, and was committed in r156842 > (last message in the discussion thread was > http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120507/142659.html) > > I would recommend trying something like Sergei's solution first, and > fall back to trying to play with the critical chain only if that can't > or won't work. > > -Hal > > On Mon, 13 Aug 2012 11:29:18 +0100 > Steve Montgomery <stephen.montgomery3 at btinternet.com> wrote: > >> I've got a question about how SelectionDAGBuilder treats loads. >> >> The LLVM Language Reference Manual explicitly states that the order >> of volatile operations may be changed relative to non-volatile >> operations. However, when the SelectionDAGBuilder in LLVM 3.1 >> encounters a volatile load, it flushes all pending loads and then >> chains the volatile load onto them meaning that the volatile load >> must be scheduled after those loads. While this behaviour isn't >> wrong, it seems to reduce the scope for efficient instruction >> selection. >> >> Is there any reason not to modify the behaviour of >> SelectionDAGBuilder::visitLoad() so that volatile loads don't cause >> SelectionDAGBuilder::getRoot() to be called. Instead, they can be >> chained together with the head of the chain being stored in >> PendingLoads. Then when something else calls >> SelectionDAGBuilder::getRoot(), the chain of volatile loads is >> TokenFactored together with the non-volatile loads. I've tried this >> out and it seems to do what I want but as I'm fairly inexperienced >> with LLVM, I'm not sure whether there's something else preventing >> this strategy from working. >> >> The reason I noticed this is because I have been developing a >> back-end for a target in which some instructions are implemented as >> pseudos which will be expanded into a pair of instructions. Each of >> the two operands of the pseudo can be a load but as the expansion >> accesses the right operand twice, I don't want to match a volatile >> load for this operand. For example, in: >> >> %0 = load i16* @y, align 2, !tbaa !0 >> %1 = load volatile i16* @x, align 2, !tbaa !0 >> %add = add i16 %0, %1 >> >> the volatile load is sequenced after the non-volatile load which >> means that the non-volatile load can't match the left operand of the >> add because this would create a scheduling cycle. This means both >> loads are selected as load instructions, resulting in use of an extra >> register and an extra instruction. If I change the behaviour of >> SelectionDAGBuilder::visitLoad() as described then I get just two >> instructions. _______________________________________________ LLVM >> Developers mailing list LLVMdev at cs.uiuc.edu >> http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > -- > Hal Finkel > Postdoctoral Appointee > Leadership Computing Facility > Argonne National Laboratory-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120814/54546950/attachment.html>
Possibly Parallel Threads
- [LLVMdev] Load serialisation during selection DAG building
- [LLVMdev] Load serialisation during selection DAG building
- [LLVMdev] Load serialisation during selection DAG building
- [LLVMdev] [INCOMPLETE] [GC] Support wrapping vararg functions in statepoint
- [LLVMdev] FoldingSetNodeID operations inefficiency