Jay Foad via llvm-dev
2021-Mar-30 10:32 UTC
[llvm-dev] globalisel: cross-bank constant propagation?
On Mon, 29 Mar 2021 at 19:51, Nicolai Hähnle <nhaehnle at gmail.com> wrote:> On Mon, Mar 29, 2021 at 3:34 PM Jay Foad <jay.foad at gmail.com> wrote: >> On Mon, 29 Mar 2021 at 14:04, Matt Arsenault <arsenm2 at gmail.com> wrote: >> > I don’t think we actually want to have to look through copies, and the places we do are just working around the status quo. >> > >> > The folding SGPR/constants into instructions should be a new and improved version of SIFoldOperands. I think optimizing this is beyond the scope of what RegBankSelect and selection patterns. Far too much code would need to be taught to respect and preserve the constant bus limitation otherwise, so that’s why everything uses VGPRs. >> >> I can understand leaving it to a later pass to fold //sgprs or >> constants// into an instruction. What I can't understand is how you do >> the same kind of thing for more complex selection patterns like: >> >> t:sgpr = G_ADD y:sgpr, z:sgpr >> t':vgpr = COPY t:sgpr >> r:vgpr = G_ADD x:vgpt, t':vgpr >> >> How can we select v_add3_u32 from this? I can only think of two options: >> >> 1. Select s_add and v_add and leave it to a later pass to combine >> them. This seems to be giving up on doing decent pattern-based >> instruction selection. >> 2. Match it in the instruction selector, using a pattern that >> (explicitly or implicitly) looks through the cross-bank copy. But then >> you're back to the problem that two of the inputs are sgprs, which may >> or may not be valid according to complex operand restrictions. > > The SelectionDAG pattern for v_add3 and friends already checks this using a C++ code fragment, doesn't it?Yes but I had always assumed that was just a heuristic. Are we saying it's required for correctness? Actually I'm confused about the whole concept of checking for constant bus violations at this stage. In a normal compiler, the instruction selector would be allowed to select any instruction that works, regardless of register classes, and it would be the register allocator's job to copy the input values into suitable input registers. So given this GMIR: t:sgpr = G_ADD y:sgpr, z:sgpr t':vgpr = COPY t:sgpr r:vgpr = G_ADD x:vgpt, t':vgpr I would naively hope that the instruction selector could select this without worrying about register banks or constant bus restrictions: %4:vgpr_32 = V_ADD3_U32 %0:vgpr_32, %1:vgpr_32, %2:vgpr_32 Then optionally SIFoldOperands (which does know about constant bus restrictions) could modify some of the input register classes to take sgprs instead. Then the register allocator would do its job, inserting sgpr-to-vgpr copies as and where necessary. Jay.
Nicolai Hähnle via llvm-dev
2021-Mar-31 21:30 UTC
[llvm-dev] globalisel: cross-bank constant propagation?
On Tue, Mar 30, 2021 at 12:32 PM Jay Foad <jay.foad at gmail.com> wrote:> On Mon, 29 Mar 2021 at 19:51, Nicolai Hähnle <nhaehnle at gmail.com> wrote: > > On Mon, Mar 29, 2021 at 3:34 PM Jay Foad <jay.foad at gmail.com> wrote: > >> On Mon, 29 Mar 2021 at 14:04, Matt Arsenault <arsenm2 at gmail.com> wrote: > >> > I don’t think we actually want to have to look through copies, and > the places we do are just working around the status quo. > >> > > >> > The folding SGPR/constants into instructions should be a new and > improved version of SIFoldOperands. I think optimizing this is beyond the > scope of what RegBankSelect and selection patterns. Far too much code would > need to be taught to respect and preserve the constant bus limitation > otherwise, so that’s why everything uses VGPRs. > >> > >> I can understand leaving it to a later pass to fold //sgprs or > >> constants// into an instruction. What I can't understand is how you do > >> the same kind of thing for more complex selection patterns like: > >> > >> t:sgpr = G_ADD y:sgpr, z:sgpr > >> t':vgpr = COPY t:sgpr > >> r:vgpr = G_ADD x:vgpt, t':vgpr > >> > >> How can we select v_add3_u32 from this? I can only think of two options: > >> > >> 1. Select s_add and v_add and leave it to a later pass to combine > >> them. This seems to be giving up on doing decent pattern-based > >> instruction selection. > >> 2. Match it in the instruction selector, using a pattern that > >> (explicitly or implicitly) looks through the cross-bank copy. But then > >> you're back to the problem that two of the inputs are sgprs, which may > >> or may not be valid according to complex operand restrictions. > > > > The SelectionDAG pattern for v_add3 and friends already checks this > using a C++ code fragment, doesn't it? > > Yes but I had always assumed that was just a heuristic. Are we saying > it's required for correctness? Actually I'm confused about the whole > concept of checking for constant bus violations at this stage. >Point 1: We need to define what legal MIR must satisfy. Legal final ISA must satisfy the constant bus limitation, so that's what legal MIR must satisfy -- at least for most of the codegen pipeline. Point 2: To help with development, we have a MIR verifier to catch issues. Given point 1, the MIR verifier should catch constant bus limitation violations. Point 3: Therefore, MIR has to satisfy that limitation at all times. There were a few times when I've thought that perhaps we need to be more explicit about different flavors of MIR, which would affect what the MIR verifier checks. So far, there's MIR in SSA form and MIR out of SSA form. That would imply an LLVM-wide change of philosophy, but may well be worth it compared to the status quo which is comparatively unprincipled.> In a normal compiler, the instruction selector would be allowed to > select any instruction that works, regardless of register classes, and > it would be the register allocator's job to copy the input values into > suitable input registers. So given this GMIR: > > t:sgpr = G_ADD y:sgpr, z:sgpr > t':vgpr = COPY t:sgpr > r:vgpr = G_ADD x:vgpt, t':vgpr > > I would naively hope that the instruction selector could select this > without worrying about register banks or constant bus restrictions: > > %4:vgpr_32 = V_ADD3_U32 %0:vgpr_32, %1:vgpr_32, %2:vgpr_32 >This should be possible, no? Actually, I imagine that isel would immediately assign an sgpr class to %1 and %2. I thought the only thing missing is for the pattern to actually match the incoming GMIR despite the COPY? Cheers, Nicolai> Then optionally SIFoldOperands (which does know about constant bus > restrictions) could modify some of the input register classes to take > sgprs instead. > > Then the register allocator would do its job, inserting sgpr-to-vgpr > copies as and where necessary. > > Jay. >-- Lerne, wie die Welt wirklich ist, aber vergiss niemals, wie sie sein sollte. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210331/ec8c6a33/attachment.html>
Jay Foad via llvm-dev
2021-Apr-01 14:19 UTC
[llvm-dev] globalisel: cross-bank constant propagation?
Thanks for your thoughts. I realise there are more and more things about instruction selection and register classes that I don't fully understand yet, so I'll try to educate myself before coming back to this. Jay. On Wed, 31 Mar 2021 at 22:30, Nicolai Hähnle <nhaehnle at gmail.com> wrote:> > On Tue, Mar 30, 2021 at 12:32 PM Jay Foad <jay.foad at gmail.com> wrote: >> >> On Mon, 29 Mar 2021 at 19:51, Nicolai Hähnle <nhaehnle at gmail.com> wrote: >> > On Mon, Mar 29, 2021 at 3:34 PM Jay Foad <jay.foad at gmail.com> wrote: >> >> On Mon, 29 Mar 2021 at 14:04, Matt Arsenault <arsenm2 at gmail.com> wrote: >> >> > I don’t think we actually want to have to look through copies, and the places we do are just working around the status quo. >> >> > >> >> > The folding SGPR/constants into instructions should be a new and improved version of SIFoldOperands. I think optimizing this is beyond the scope of what RegBankSelect and selection patterns. Far too much code would need to be taught to respect and preserve the constant bus limitation otherwise, so that’s why everything uses VGPRs. >> >> >> >> I can understand leaving it to a later pass to fold //sgprs or >> >> constants// into an instruction. What I can't understand is how you do >> >> the same kind of thing for more complex selection patterns like: >> >> >> >> t:sgpr = G_ADD y:sgpr, z:sgpr >> >> t':vgpr = COPY t:sgpr >> >> r:vgpr = G_ADD x:vgpt, t':vgpr >> >> >> >> How can we select v_add3_u32 from this? I can only think of two options: >> >> >> >> 1. Select s_add and v_add and leave it to a later pass to combine >> >> them. This seems to be giving up on doing decent pattern-based >> >> instruction selection. >> >> 2. Match it in the instruction selector, using a pattern that >> >> (explicitly or implicitly) looks through the cross-bank copy. But then >> >> you're back to the problem that two of the inputs are sgprs, which may >> >> or may not be valid according to complex operand restrictions. >> > >> > The SelectionDAG pattern for v_add3 and friends already checks this using a C++ code fragment, doesn't it? >> >> Yes but I had always assumed that was just a heuristic. Are we saying >> it's required for correctness? Actually I'm confused about the whole >> concept of checking for constant bus violations at this stage. > > > Point 1: We need to define what legal MIR must satisfy. Legal final ISA must satisfy the constant bus limitation, so that's what legal MIR must satisfy -- at least for most of the codegen pipeline. > > Point 2: To help with development, we have a MIR verifier to catch issues. Given point 1, the MIR verifier should catch constant bus limitation violations. > > Point 3: Therefore, MIR has to satisfy that limitation at all times. > > There were a few times when I've thought that perhaps we need to be more explicit about different flavors of MIR, which would affect what the MIR verifier checks. So far, there's MIR in SSA form and MIR out of SSA form. That would imply an LLVM-wide change of philosophy, but may well be worth it compared to the status quo which is comparatively unprincipled. > > >> >> In a normal compiler, the instruction selector would be allowed to >> select any instruction that works, regardless of register classes, and >> it would be the register allocator's job to copy the input values into >> suitable input registers. So given this GMIR: >> >> t:sgpr = G_ADD y:sgpr, z:sgpr >> t':vgpr = COPY t:sgpr >> r:vgpr = G_ADD x:vgpt, t':vgpr >> >> I would naively hope that the instruction selector could select this >> without worrying about register banks or constant bus restrictions: >> >> %4:vgpr_32 = V_ADD3_U32 %0:vgpr_32, %1:vgpr_32, %2:vgpr_32 > > > This should be possible, no? Actually, I imagine that isel would immediately assign an sgpr class to %1 and %2. I thought the only thing missing is for the pattern to actually match the incoming GMIR despite the COPY? > > Cheers, > Nicolai > > >> >> Then optionally SIFoldOperands (which does know about constant bus >> restrictions) could modify some of the input register classes to take >> sgprs instead. >> >> Then the register allocator would do its job, inserting sgpr-to-vgpr >> copies as and where necessary. >> >> Jay. > > > > -- > Lerne, wie die Welt wirklich ist, > aber vergiss niemals, wie sie sein sollte.