Dominik Montada via llvm-dev
2020-Feb-27 13:40 UTC
[llvm-dev] Correct modelling of instructions with types smaller than the register class
Hi Quentin, Hi Amara, I was following your discussion on D75086 regarding declaring types as legal even if they are smaller than the actual register class (e.g. s16 and gpr32). We are working on a backend which only has 32 and 64-bit registers and we recently had a problem regarding exactly this where we had to declare G_UNMERGE_VALUES and G_MERGE_VALUES with a smaller type of <s32 as legal, even though we don't have a register class that matches and instruction selection implicitly sign extends them to s32. The reasoning was that for example in the case of G_UNMERGE_VALUES, the legalizer widens the source type when widening the destination type, which we don't want (as the big type then doesn't fit our registers anymore). So we decided to only make sure that the bigger type matches a register class and simply allowed smaller types <s32. What I am gathering now from this discussion is that this is basically an incorrect modelling and only works around another problem. Could you expand on this in more detail? I was thinking about implementing some custom legalization which uses G_EXTRACT or shifts to get what I want, but G_EXTRACT extracts as many bits as the destination type, so I would end up with the same problem again. Shifts are also not optimal, since we have a target instruction available to extract n bits from a given offset. Is it valid to emit target instructions during legalization, i.e. before instruction selection, or would the best modelling be to just use the shifts and then use some pre-instruction-selection combiner to merge them to the desired target instruction? Best regards, Dominik -- ---------------------------------------------------------------------- Dominik Montada Email: dominik.montada at hightec-rt.com HighTec EDV-Systeme GmbH Phone: +49 681 92613 19 Europaallee 19 Fax: +49-681-92613-26 D-66113 Saarbrücken WWW: http://www.hightec-rt.com Managing Director: Vera Strothmann Register Court: Saarbrücken, HRB 10445, VAT ID: DE 138344222 This e-mail may contain confidential and/or privileged information. If you are not the intended recipient please notify the sender immediately and destroy this e-mail. Any unauthorised copying, disclosure or distribution of the material in this e-mail is strictly forbidden. --- -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 5409 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200227/3af96735/attachment.bin>
Quentin Colombet via llvm-dev
2020-Feb-29 02:15 UTC
[llvm-dev] Correct modelling of instructions with types smaller than the register class
Hi Dominik, I’ll do a brief reply here and if you want more information we can talk further :).> On Feb 27, 2020, at 5:40 AM, Dominik Montada <dominik.montada at hightec-rt.com> wrote: > > Hi Quentin, Hi Amara, > > I was following your discussion on D75086 regarding declaring types as legal even if they are smaller than the actual register class (e.g. s16 and gpr32). We are working on a backend which only has 32 and 64-bit registers and we recently had a problem regarding exactly this where we had to declare G_UNMERGE_VALUES and G_MERGE_VALUES with a smaller type of <s32 as legal, even though we don't have a register class that matches and instruction selection implicitly sign extends them to s32. The reasoning was that for example in the case of G_UNMERGE_VALUES, the legalizer widens the source type when widening the destination type, which we don't want (as the big type then doesn't fit our registers anymore). So we decided to only make sure that the bigger type matches a register class and simply allowed smaller types <s32. > > What I am gathering now from this discussion is that this is basically an incorrect modelling and only works around another problem. Could you expand on this in more detail?A few years ago we discussed the possibility and/or desire of allowing registers to be larger than the actual legalized type or whether they should be in sync with the legalized type. For instance, consider s8 = G_AND s8, s8 If your target supports `s32 = G_AND` then allowing registers to be larger than legalized type would yield that `s8 = G_AND s8, s8` is legal (and really any type smaller than or equal to s32 would be legal for that instruction). Then the question was how do we represent that: the relevant bits are the low 8 bits, but the container is 32-bit. E.g., s8(32) = G_AND s8(32), s8(32) We explored that a little bit and found that it was dangerous to carry undefined bits around (the upper 24 bits in that case) and that there was no real use case for this. Tim (Northover) might remember more details (CC’ed). Therefore, we decided that the legal types should exactly match what the container (the register) will be. So in theory, if you’re not following that model, you’re in violation of that and that’s where problems arise (and thus D75086 to workaround them).> > I was thinking about implementing some custom legalization which uses G_EXTRACT or shifts to get what I want, but G_EXTRACT extracts as many bits as the destination type, so I would end up with the same problem again. Shifts are also not optimal, since we have a target instruction available to extract n bits from a given offset. Is it valid to emit target instructions during legalization,Yes, that’s completely legal. The whole idea of GISel is that the backends should be able to insert target specific instructions whenever they want. For instance, we use that extensively in the lowering of calls directly in the IRTranslator, i.e., at the very beginning of GISel.> i.e. before instruction selection, or would the best modelling be to just use the shifts and then use some pre-instruction-selection combiner to merge them to the desired target instruction?That’s a question I cannot answer directly, it really depends on what are the implementation trade-offs on your end. I have a personal preference for generic shifts as it is possible that other generic optimizations can get rid of them. Now, if those shifts are too difficult to combine away or if you know that they are not optimizable anyway, you may want to go directly with the target specific instruction. Just keep in mind that whenever you use a target specific instruction, this is just basically an opaque object that the generic optimizations have to deal with. Cheers, -Quentin> > Best regards, > > Dominik > > -- > ---------------------------------------------------------------------- > Dominik Montada Email: dominik.montada at hightec-rt.com > HighTec EDV-Systeme GmbH Phone: +49 681 92613 19 > Europaallee 19 Fax: +49-681-92613-26 > D-66113 Saarbrücken WWW: http://www.hightec-rt.com > > Managing Director: Vera Strothmann > Register Court: Saarbrücken, HRB 10445, VAT ID: DE 138344222 > > This e-mail may contain confidential and/or privileged information. If > you are not the intended recipient please notify the sender immediately > and destroy this e-mail. Any unauthorised copying, disclosure or > distribution of the material in this e-mail is strictly forbidden. > --- >
Dominik Montada via llvm-dev
2020-Mar-02 08:44 UTC
[llvm-dev] Correct modelling of instructions with types smaller than the register class
Hi Quentin, thank you for the reply! This clears up a lot of the questions I was having. It seems like we should definitely invest some time in rewriting some of our legalization rules then! I also posted some questions further down below. I would appreciate getting your opinion on them.> Hi Dominik, > > I’ll do a brief reply here and if you want more information we can talk further :). > >> On Feb 27, 2020, at 5:40 AM, Dominik Montada <dominik.montada at hightec-rt.com> wrote: >> >> Hi Quentin, Hi Amara, >> >> I was following your discussion on D75086 regarding declaring types as legal even if they are smaller than the actual register class (e.g. s16 and gpr32). We are working on a backend which only has 32 and 64-bit registers and we recently had a problem regarding exactly this where we had to declare G_UNMERGE_VALUES and G_MERGE_VALUES with a smaller type of <s32 as legal, even though we don't have a register class that matches and instruction selection implicitly sign extends them to s32. The reasoning was that for example in the case of G_UNMERGE_VALUES, the legalizer widens the source type when widening the destination type, which we don't want (as the big type then doesn't fit our registers anymore). So we decided to only make sure that the bigger type matches a register class and simply allowed smaller types <s32. >> >> What I am gathering now from this discussion is that this is basically an incorrect modelling and only works around another problem. Could you expand on this in more detail? > A few years ago we discussed the possibility and/or desire of allowing registers to be larger than the actual legalized type or whether they should be in sync with the legalized type. > > For instance, consider > s8 = G_AND s8, s8 > > If your target supports `s32 = G_AND` then allowing registers to be larger than legalized type would yield that `s8 = G_AND s8, s8` is legal (and really any type smaller than or equal to s32 would be legal for that instruction). > > Then the question was how do we represent that: the relevant bits are the low 8 bits, but the container is 32-bit. > E.g., > s8(32) = G_AND s8(32), s8(32) > > We explored that a little bit and found that it was dangerous to carry undefined bits around (the upper 24 bits in that case) and that there was no real use case for this. Tim (Northover) might remember more details (CC’ed). > > Therefore, we decided that the legal types should exactly match what the container (the register) will be. So in theory, if you’re not following that model, you’re in violation of that and that’s where problems arise (and thus D75086 to workaround them).This leads me to another question. What would the correct modelling look like for special registers that cannot be read/written, like a carry-bit register? The G_UADDO/UADDE instructions have an s1 carry and our target only supports reading this special register, but not writing to it (at least not directly and without doing some hacks). We found this especially hard to model, since GlobalISel doesn't really have a notion of special registers in generic instructions. In the end we went with an approach of defining a pseudo-register-class, which is not allocatable and copyable and emit a COPY to/from our physical carry bit register. This generally works the way we want and the COPY is eliminated. However it is not eliminated when using O0. I guess this probably has something to do with not defining an extra register bank and teaching our regbank info that such a carry-bit vreg should be assigned to this regbank? Right now it is simply assigned to the GPR bank, which seems to be the problem for O0. Nevertheless I still dislike this approach as I don't want to rely on the COPY being eliminated. I would much rather not emit it at all, but this obviously means that I would throw away the carry vreg of G_UADDO/UADDE, potentially changing the semantics of the instruction.> > >> I was thinking about implementing some custom legalization which uses G_EXTRACT or shifts to get what I want, but G_EXTRACT extracts as many bits as the destination type, so I would end up with the same problem again. Shifts are also not optimal, since we have a target instruction available to extract n bits from a given offset. Is it valid to emit target instructions during legalization, > Yes, that’s completely legal. > > The whole idea of GISel is that the backends should be able to insert target specific instructions whenever they want. For instance, we use that extensively in the lowering of calls directly in the IRTranslator, i.e., at the very beginning of GISel.That is very good to know! Thanks for the clarification.> >> i.e. before instruction selection, or would the best modelling be to just use the shifts and then use some pre-instruction-selection combiner to merge them to the desired target instruction? > That’s a question I cannot answer directly, it really depends on what are the implementation trade-offs on your end. > > I have a personal preference for generic shifts as it is possible that other generic optimizations can get rid of them. Now, if those shifts are too difficult to combine away or if you know that they are not optimizable anyway, you may want to go directly with the target specific instruction. > > Just keep in mind that whenever you use a target specific instruction, this is just basically an opaque object that the generic optimizations have to deal with.Hm, good point. It should be a straightforward combine rule, so going with the shifts and the combiner seems to be a better approach at first glance. Thanks! Best regards, Dominik> > Cheers, > -Quentin > >> Best regards, >> >> Dominik >> >> -- >> ---------------------------------------------------------------------- >> Dominik Montada Email: dominik.montada at hightec-rt.com >> HighTec EDV-Systeme GmbH Phone: +49 681 92613 19 >> Europaallee 19 Fax: +49-681-92613-26 >> D-66113 Saarbrücken WWW: http://www.hightec-rt.com >> >> Managing Director: Vera Strothmann >> Register Court: Saarbrücken, HRB 10445, VAT ID: DE 138344222 >> >> This e-mail may contain confidential and/or privileged information. If >> you are not the intended recipient please notify the sender immediately >> and destroy this e-mail. Any unauthorised copying, disclosure or >> distribution of the material in this e-mail is strictly forbidden. >> --- >> >-- ---------------------------------------------------------------------- Dominik Montada Email: dominik.montada at hightec-rt.com HighTec EDV-Systeme GmbH Phone: +49 681 92613 19 Europaallee 19 Fax: +49-681-92613-26 D-66113 Saarbrücken WWW: http://www.hightec-rt.com Managing Director: Vera Strothmann Register Court: Saarbrücken, HRB 10445, VAT ID: DE 138344222 This e-mail may contain confidential and/or privileged information. If you are not the intended recipient please notify the sender immediately and destroy this e-mail. Any unauthorised copying, disclosure or distribution of the material in this e-mail is strictly forbidden. --- -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 5409 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200302/38fa8991/attachment.bin>
Maybe Matching Threads
- Correct modelling of instructions with types smaller than the register class
- [GlobalISel] Narrowing uneven/non-pow-2 types
- [GlobalISel] Narrowing uneven/non-pow-2 types
- Exceptions not getting caught on bare-metal target
- Correct modelling of instructions with types smaller than the register class