thr3ads.net - llvm dev - [LLVMdev] Help with new backend: byte-sized loads being generated for 'int' array access [Jun 2014]

Jeff Kuskin

2014-Jun-10 15:04 UTC

[LLVMdev] Help with new backend: byte-sized loads being generated for 'int' array access

First, apologies because I'm quite new to LLVM backend development.  I very
much appreciate any help from more experienced folks.


I'm running into a problem in which byte-sized loads are _sometimes_ being
generated for a read access to an external array of 4-byte ints, depending on
how the array is declared.

I am hoping someone can perhaps point me to possible sources of the problem in
my backend code.  I would be happy to supply additional details; I'm trying
to keep this message relatively short.



The issue arises when I compile the following C code:

     extern int EI[];
     int MYFUNC() { return EI[1288]; }



I run the code through 'clang -emit-llvm' and end up with bitcode of:

    ; ModuleID = './clang2.c'
    target datalayout = "e-m:e-p:32:32-i8:8:32-i16:16:32-i64:64-n32-S64"
    target triple = "dgc"
    @EI = external global [0 x i32]
    ; Function Attrs: nounwind
    define i32 @MYFUNC() #0 {
    entry:
      %0 = load i32* getelementptr inbounds ([0 x i32]* @EI, i32 0, i32 1288), align 1
      ret i32 %0
    }

    attributes #0 = { nounwind "less-precise-fpmad"="false"
                   "no-frame-pointer-elim"="true"
                   "no-frame-pointer-elim-non-leaf"
                   "no-infs-fp-math"="false"
                   "no-nans-fp-math"="false"
                   "stack-protector-buffer-size"="8"
                   "unsafe-fp-math"="false"
                   "use-soft-float"="false" }
    !llvm.ident = !{!0}
    !0 = metadata !{metadata !"clang version 3.5.0 (209307)"}



When I then run the bitcode through llc, the memory load for the
'EI[1288]' reference is generated with a series of four byte-sized
loads, followed by the appropriate shifting and OR'ing to get all the bytes
into the proper place in the result.
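
To make the expanded sequence concrete, here is a minimal C sketch of what four byte-sized loads plus shifts and ORs compute on a little-endian target (the datalayout string above starts with 'e'). The function name load_le32_bytewise is made up for illustration; it is not taken from the backend or from actual llc output.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: assemble a 32-bit little-endian value from four
   byte loads, mirroring a load-byte / shift / OR sequence. */
uint32_t load_le32_bytewise(const uint8_t *p)
{
    assert(p != 0);
    uint32_t b0 = p[0];   /* lowest address holds bits 7..0    */
    uint32_t b1 = p[1];   /* next byte holds bits 15..8         */
    uint32_t b2 = p[2];   /* next byte holds bits 23..16        */
    uint32_t b3 = p[3];   /* highest address holds bits 31..24  */
    return b0 | (b1 << 8) | (b2 << 16) | (b3 << 24);
}
```

Because it only ever touches single bytes, this form is valid at any address, which is why it is the fallback when nothing better than 1-byte alignment is known.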

This is not what I want, of course.  I want a single, word-sized load to be
generated.  I have various sizes of load instructions defined in my TableGen
file, an excerpt of which I've included at the end of this message.

Other backends built from the same source tree -- mipsel and xcore, for instance
-- do indeed generate a single word-sized load, as expected, so I'm
confident the problem is in my backend code.

What's interesting is that my backend *DOES* generate a single word-sized
load if I make either of the following changes to the declaration of
'EI':

   (1) Provide an array size in the EI declaration:
             extern int EI[5000];

       This yields the following in the bitcode, replacing the like lines
       from above:
           @EI = external global [5000 x i32]
           ; Function Attrs: nounwind
           define i32 @MYFUNC() #0 {
           entry:
             %0 = load i32* getelementptr inbounds ([5000 x i32]* @EI, i32 0, i32 1288), align 4
             ret i32 %0
           }



   (2) Change EI to be an int*:
             extern int* EI;

       This yields the following in the bitcode:
         @EI = external global i32*
         ; Function Attrs: nounwind
         define i32 @MYFUNC() #0 {
         entry:
           %0 = load i32** @EI, align 4
           %arrayidx = getelementptr inbounds i32* %0, i32 1288
           %1 = load i32* %arrayidx, align 4
           ret i32 %1
         }



I have tried a number of things to figure out this issue, but to no avail.  For
some reason the 'EI[1288]' reference is being treated as possibly
unaligned ('align 1' in the IR above), but I can't figure out why.
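
For reference, a small C sketch of the language-level expectation: every element of an int array has at least the alignment of int, whether or not the declaration gives a bound. This does not explain where 'align 1' comes from; it only shows what the source program is entitled to assume. The definition of EI and the helper name element_is_int_aligned are added purely so the sketch links and runs; neither is part of the original test case.

```c
#include <assert.h>
#include <stdint.h>

extern int EI[];   /* incomplete array type, as in the failing case */
int EI[5000];      /* definition added only so this sketch links     */

/* Hypothetical helper: returns 1 if &EI[index] is a multiple of the
   alignment of int, 0 otherwise. Needs C11 for _Alignof. */
int element_is_int_aligned(unsigned index)
{
    assert(index < 5000);
    return ((uintptr_t)&EI[index] % _Alignof(int)) == 0;
}
```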





TD file excerpt (modeled after the MIPS .td file):


def DGCAddrDefault :
        ComplexPattern<iPTR, 2, "selectAddrDefault", [frameindex]>;
def DGCAddrInt :
        ComplexPattern<iPTR, 2, "selectAddrInt", [frameindex]>;

def DGCMemSrc : Operand<iPTR> {
  let MIOperandInfo = (ops ptr_rc, i32imm);
  let OperandType = "OPERAND_MEMORY";
}

let canFoldAsLoad = 1,
    mayLoad = 1 in
{
def LB : InstrDGC64_s__s_s<
             (outs IntRegs:$rd),
             (ins DGCMemSrc:$addr),
             !strconcat("lb", "\t$rd, $addr"),
             [(set i32:$rd, (sextloadi8 DGCAddrInt:$addr))],
             0b10011, 0b000, 0, 0>;
def LH : InstrDGC64_s__s_s<
             (outs IntRegs:$rd),
             (ins DGCMemSrc:$addr),
             !strconcat("lh", "\t$rd, $addr"),
             [(set i32:$rd, (sextloadi16 DGCAddrDefault:$addr))],
             0b10011, 0b001, 0, 0>;
def LBU : InstrDGC64_s__s_s<
             (outs IntRegs:$rd),
             (ins DGCMemSrc:$addr),
             !strconcat("lbu", "\t$rd, $addr"),
             [(set i32:$rd, (zextloadi8 DGCAddrDefault:$addr))],
             0b10011, 0b100, 0, 0>;
def LHU : InstrDGC64_s__s_s<
             (outs IntRegs:$rd),
             (ins DGCMemSrc:$addr),
             !strconcat("lhu", "\t$rd, $addr"),
             [(set i32:$rd, (zextloadi16 DGCAddrInt:$addr))],
             0b10011, 0b101, 0, 0>;
def LW : InstrDGC64_s__s_s<
             (outs IntRegs:$rd),
             (ins DGCMemSrc:$addr),
             !strconcat("lw", "\t$rd, $addr"),
             [(set i32:$rd, (load DGCAddrDefault:$addr))],
             0b10011, 0b010, 0, 0>;
}

Hal Finkel

2014-Jun-10 15:54 UTC

[LLVMdev] Help with new backend: byte-sized loads being generated for 'int' array access

----- Original Message -----
> From: "Jeff Kuskin" <jk500500 at yahoo.com>
> To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Tuesday, June 10, 2014 10:04:45 AM
> Subject: [LLVMdev] Help with new backend: byte-sized loads being generated for 'int' array access
> 
> First, apologies because I'm quite new to LLVM backend development.
>   I very much appreciate any help from more experienced folks.
> 
> 
> I'm running into a problem in which byte-sized loads are _sometimes_
> being generated for a read access to an external array of 4-byte
> ints, depending on how the array is declared.
> 
> I am hoping someone can perhaps point me to possible sources of the
> problem in my backend code.  I would be happy to supply additional
> details; I'm trying to keep this message relatively short.
> 
> 
> 
> The issue arises when I compile the following C code:
> 
>      extern int EI[];
>      int MYFUNC() { return EI[1288]; }
> 
> 
> 
> I run the code through 'clang -emit-llvm' and end up with bitcode of:
> 
>     ; ModuleID = './clang2.c'
>     target datalayout = "e-m:e-p:32:32-i8:8:32-i16:16:32-i64:64-n32-S64"
>     target triple = "dgc"
>     @EI = external global [0 x i32]
>     ; Function Attrs: nounwind
>     define i32 @MYFUNC() #0 {
>     entry:
>       %0 = load i32* getelementptr inbounds ([0 x i32]* @EI, i32 0,
>       i32 1288), align 1
>       ret i32 %0
>     }
> 
>     attributes #0 = { nounwind "less-precise-fpmad"="false"
>                    "no-frame-pointer-elim"="true"
>                    "no-frame-pointer-elim-non-leaf"
>                    "no-infs-fp-math"="false"
>                    "no-nans-fp-math"="false"
>                    "stack-protector-buffer-size"="8"
>                    "unsafe-fp-math"="false"
>                    "use-soft-float"="false" }
>     !llvm.ident = !{!0}
>     !0 = metadata !{metadata !"clang version 3.5.0 (209307)"}
> 
> 
> 
> When I then run the bitcode through llc, the memory load for the
> 'EI[1288]' reference is generated with a series of four byte-sized
> loads, followed by the appropriate shifting and OR'ing to get all
> the bytes into the proper place in the result.
> 
> This is not what I want, of course.  I want a single, word-sized load
> to be generated.  I have various sizes of load instructions defined
> in my TableGen file, an excerpt of which I've included at the end of
> this message.
> 
> Other backends built from the same source tree -- mipsel and xcore,
> for instance -- do indeed generate a single word-sized load, as
> expected, so I'm confident the problem is in my backend code.
> 
> What's interesting is that my backend *DOES* generate a single
> word-sized load if I make either of the following changes to the
> declaration of 'EI':
> 
>    (1) Provide an array size in the EI declaration:
>              extern int EI[5000];
> 
>        This yields the following in the bitcode, replacing the like
>        lines from above:
>            @EI = external global [5000 x i32]
>            ; Function Attrs: nounwind
>            define i32 @MYFUNC() #0 {
>            entry:
>              %0 = load i32* getelementptr inbounds ([5000 x i32]*
>              @EI, i32 0, i32 1288), align 4
>              ret i32 %0
>            }
> 
> 
> 
>    (2) Change EI to be an int*:
>              extern int* EI;
> 
>        This yields the following in the bitcode:
>          @EI = external global i32*
>          ; Function Attrs: nounwind
>          define i32 @MYFUNC() #0 {
>          entry:
>            %0 = load i32** @EI, align 4
>            %arrayidx = getelementptr inbounds i32* %0, i32 1288
>            %1 = load i32* %arrayidx, align 4
>            ret i32 %1
>          }
> 
> 
> 
> I have tried a number of things to figure out this issue, but to no
> avail.  For some reason the 'EI[1288]' reference is being treated as
> possibly unaligned ("align=1"), but I can't figure out why.
> 
For the question of how C is being translated into LLVM IR (why there is
'align 1' vs 'align 4'), you should ask on the cfe-dev list (not here).

To mention a related point, if your target supports unaligned loads for 4-byte
integers, then you need to override the
*TargetLowering::allowsUnalignedMemoryAccesses callback for your target.

 -Hal
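
As a rough illustration of why that callback matters, the following C sketch is a toy model, not LLVM code. It assumes that when a load's known alignment is smaller than its size and the target reports that unaligned access is not allowed, the load is split in half repeatedly until each piece is sufficiently aligned. The function name loads_needed and its parameters are invented for this sketch.

```c
#include <assert.h>

/* Toy model (hypothetical, not an LLVM API): number of memory operations
   for a load of `size` bytes at an address known to be `align`-byte
   aligned. `size` must be a power of two. */
unsigned loads_needed(unsigned size, unsigned align, int target_allows_unaligned)
{
    assert(size > 0 && (size & (size - 1)) == 0);
    assert(align > 0);
    /* Sufficiently aligned, or the target accepts unaligned access:
       a single load of the full width is enough. */
    if (align >= size || target_allows_unaligned)
        return 1;
    /* Otherwise split into two half-width loads with the same known
       alignment and recombine them with shifts and ORs. */
    return 2 * loads_needed(size / 2, align, target_allows_unaligned);
}
```

Under this model, a 4-byte load with 'align 1' costs four byte loads when unaligned access is not allowed, and a single load either with 'align 4' or when the target reports that unaligned 4-byte access is fine.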
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
