thr3ads.net - llvm dev - [llvm-dev] Lowering ISD::TRUNCATE [Aug 2018]

Michael Stellmann via llvm-dev

2018-Aug-06 19:08 UTC

[llvm-dev] Lowering ISD::TRUNCATE

I'm working on defining the instructions and implementing the lowering 
code for a Z80 backend. For now, the backend supports only the native 
CPU-supported datatypes, which are 8 and 16 bits wide (i.e. no 32 bit 
long, float, ... yet).

So far, a lot of the simple stuff like immediate loads and return values 
is very straightforward, but now I got stuck with ISD::TRUNCATE, as in:

     typedef unsigned char uint8_t;
     uint8_t Func(uint8_t val1) { return val1 + val1; }

built with -O0 results in:

     target datalayout = "e-m:o-S8-p:16:8-p1:8:8-i16:8-i32:8-a:8-n8:16"
     target triple = "z80"
     ; Function Attrs: noinline nounwind optnone
     define dso_local zeroext i8 @Func(i8 zeroext %val1) #0 {
     entry:
       %val1.addr = alloca i8, align 1
       store i8 %val1, i8* %val1.addr, align 1
       %0 = load i8, i8* %val1.addr, align 1
       %conv = zext i8 %0 to i16
       %1 = load i8, i8* %val1.addr, align 1
       %conv1 = zext i8 %1 to i16
       %add = add nsw i16 %conv, %conv1
       %conv2 = trunc i16 %add to i8
       ret i8 %conv2
     }

I looked into the X86 backend, which has a Z80-like register design, 
i.e. being able to access the subregs AL (and AH) from AX directly, 
without any specific truncation operation necessary. But, to be honest, 
I do not really understand from the code where and how the i16 to i8 
case is handled.

So returning an 8-bit result would simply require copying the lower 8 
bits ("AL" on X86) of the resulting 16-bit value (%add) into the 8-bit 
return register defined by the calling convention.
(Or, to be Z80-specific: the 16-bit add operation will be "ADD HL,DE", 
and the calling convention defines register "A" as the i8 return value, 
so the last two IR lines should emit something like "LD A,L / RET".)

That said, what is the correct way to implement ISD::TRUNCATE in the 
backend, using the CPU's capability that truncating i16 to i8 is simply 
accessing an i16 register's subreg?

Should this be handled in "LowerOperation" or in "PerformDAGCombine"?
Or could this be done with a target-independent combine?
Would returning true in "isTruncateFree" suffice?
Is any lowering code needed at all?

The X86 backend seems to do both: it calls 
"setTargetDAGCombine(ISD::TRUNCATE)", but it also registers a lot of 
MVTs via "setOperationAction(...,Custom)", depending on things like 
soft-float.
I guess I'm

And second:
In my case, with only i16 and i8 data types, are there other truncation 
operations to be supported? Is there any scenario where i8 to i1 is 
needed? My first guess was conditional branching, but my tests showed 
that it works with flags, comparing "not equal" or "not zero", so I 
assume not.

Michael

Craig Topper via llvm-dev

2018-Aug-06 19:10 UTC

[llvm-dev] Lowering ISD::TRUNCATE

The X86 i16->i8 case is handled with these two patterns in
X86InstrCompiler.td. The first is for 32-bit mode, where we have to be
careful to ensure we are starting from AX/BX/CX/DX. 64-bit mode uses a
separate, simpler pattern, since SP/BP/SI/DI gain SPL/BPL/SIL/DIL there.

def : Pat<(i8 (trunc GR16:$src)),
          (EXTRACT_SUBREG (i16 (COPY_TO_REGCLASS GR16:$src, GR16_ABCD)),
                          sub_8bit)>,
      Requires<[Not64BitMode]>;

def : Pat<(i8 (trunc GR16:$src)),
          (EXTRACT_SUBREG GR16:$src, sub_8bit)>,
      Requires<[In64BitMode]>;


~Craig



Michael Stellmann via llvm-dev

2018-Aug-06 19:24 UTC

[llvm-dev] Lowering ISD::TRUNCATE

Ah, I see... Clever, no custom code required.

I was hoping for that, but wasn't sure, looking at the X86 code.

Thanks Craig
