thr3ads.net - llvm dev - [llvm-dev] Handling misaligned array accesses [May 2016]

Anna Thomas via llvm-dev

2016-May-12 21:20 UTC

[llvm-dev] Handling misaligned array accesses

Hi,

I have tried a couple of C test cases with LLVM to see whether we handle
misaligned accesses, but it seems we do not have any transformations that
align loop accesses. Misaligned accesses can hurt performance, depending on
the underlying target (for example, on how costly it is to cross a cache-line
boundary).

One example:
// unaligned load and store
int foo(short *a, int m) {
  int i;
  for (i = 1; i < m; i++)
    a[i] *= 2;
  return a[3];
}

The IR was generated through clang -O3 -mllvm -disable-llvm-optzns. I passed
this through opt -O3, and the loop vectorizer adds vector code for this loop,
but the GEP access starts at offset 1.

vector.body:                                      ; preds = %vector.body, %vector.body.preheader.new
  %index = phi i64 [ 0, %vector.body.preheader.new ], [ %index.next.3, %vector.body ]
  %niter = phi i64 [ %unroll_iter, %vector.body.preheader.new ], [ %niter.nsub.3, %vector.body ]
  %offset.idx = or i64 %index, 1
  %9 = getelementptr inbounds i16, i16* %a, i64 %offset.idx
  %10 = bitcast i16* %9 to <8 x i16>*
  %wide.load = load <8 x i16>, <8 x i16>* %10, align 2, !tbaa !2
  %11 = shl <8 x i16> %wide.load, <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>
  %12 = bitcast i16* %9 to <8 x i16>*
  store <8 x i16> %11, <8 x i16>* %12, align 2, !tbaa !2
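
To make the misalignment concrete, here is a small C sketch (not part of the
original test case) that computes how far an address sits past a 16-byte
boundary and how many scalar iterations would have to run before the next
boundary is reached. The helper names misalignment and peel_count are
hypothetical, the 16-byte width is an assumption taken from the <8 x i16>
accesses, and the sketch assumes sizeof(short) == 2 with naturally aligned
elements.

```c
#include <stdint.h>

/* Assumed vector width in bytes: <8 x i16> is 16 bytes wide. */
#define VEC_BYTES 16u

/* Hypothetical helper: byte offset of p past the previous VEC_BYTES boundary. */
static unsigned misalignment(const void *p) {
    return (unsigned)((uintptr_t)p % VEC_BYTES);
}

/* Hypothetical helper: number of scalar short elements to process before
   p reaches the next VEC_BYTES boundary (0 if p is already aligned).
   Assumes p is at least sizeof(short)-aligned. */
static unsigned peel_count(const short *p) {
    unsigned mis = misalignment(p);
    return mis == 0 ? 0 : (VEC_BYTES - mis) / (unsigned)sizeof(short);
}
```

Under those assumptions, if a itself is 16-byte aligned then &a[1] sits 2
bytes past a boundary, so peel_count(&a[1]) is 7: seven scalar iterations
would be needed before the vector accesses could start on an aligned address.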

Is there a reason we don’t support loop peeling for alignment handling?
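
For reference, a hand-written sketch of what such peeling might look like at
the source level. This is only an illustration of the idea, not what LLVM
emits or how a pass would implement it; foo_peeled and the 16-byte VEC_BYTES
width are assumptions chosen to match the <8 x i16> accesses in the IR.

```c
#include <stdint.h>

/* Assumed vector width in bytes; matches the <8 x i16> accesses. */
#define VEC_BYTES 16u

/* Hypothetical hand-peeled version of foo(); same result as the original. */
int foo_peeled(short *a, int m) {
    long long i = 1, n = m;   /* wider index type avoids overflow in i + 8 */

    /* Prologue (peeled iterations): run scalar code until &a[i] is
       VEC_BYTES-aligned or the loop is finished. */
    while (i < n && ((uintptr_t)&a[i] % VEC_BYTES) != 0) {
        a[i] *= 2;
        i++;
    }

    /* Main loop: each group of 8 shorts now starts on an aligned address,
       so a vectorizer could use aligned loads and stores here. */
    for (; i + 8 <= n; i += 8) {
        short *p = &a[i];
        for (int k = 0; k < 8; k++)
            p[k] *= 2;
    }

    /* Epilogue: leftover elements that do not fill a whole vector. */
    for (; i < n; i++)
        a[i] *= 2;

    return a[3];
}
```

The trade-off is an extra runtime check and a scalar prologue in exchange for
aligned accesses in the hot loop, which only pays off on targets where
misaligned vector accesses are noticeably slower.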

Thanks,
Anna


Hal Finkel via llvm-dev

2016-May-12 23:42 UTC

----- Original Message -----
> From: "Anna Thomas" <anna at azul.com>
> To: llvm-dev at lists.llvm.org
> Cc: hfinkel at anl.gov, anemet at apple.com
> Sent: Thursday, May 12, 2016 4:20:24 PM
> Subject: Handling misaligned array accesses
> 
> Hi,
> 
> 
> I have tried a couple of C test cases with LLVM to see whether we
> handle misaligned accesses, but it seems we do not have any
> transformations that align loop accesses. Misaligned accesses can hurt
> performance, depending on the underlying target (for example, on how
> costly it is to cross a cache-line boundary).
> 
> 
> One example:
> //unaligned load and store
> 
> 
> int foo(short *a, int m) {
>   int i;
>   for (i = 1; i < m; i++)
>     a[i] *= 2;
>   return a[3];
> }
> 
> 
> The IR was generated through clang -O3 -mllvm -disable-llvm-optzns. I
> passed this through opt -O3, and the loop vectorizer adds vector code
> for this loop, but the GEP access starts at offset 1.
> 
> 
> 
> vector.body: ; preds = %vector.body, %vector.body.preheader.new
>   %index = phi i64 [ 0, %vector.body.preheader.new ], [ %index.next.3, %vector.body ]
>   %niter = phi i64 [ %unroll_iter, %vector.body.preheader.new ], [ %niter.nsub.3, %vector.body ]
>   %offset.idx = or i64 %index, 1
>   %9 = getelementptr inbounds i16, i16* %a, i64 %offset.idx
>   %10 = bitcast i16* %9 to <8 x i16>*
>   %wide.load = load <8 x i16>, <8 x i16>* %10, align 2, !tbaa !2
>   %11 = shl <8 x i16> %wide.load, <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>
>   %12 = bitcast i16* %9 to <8 x i16>*
>   store <8 x i16> %11, <8 x i16>* %12, align 2, !tbaa !2
> 
> 
> Is there a reason we don’t support loop peeling for alignment
> handling?
> 
No. AFAIK, just no one has done the work to implement it yet. I'd certainly
be quite interested in it, however. Several targets I care about would benefit
from alignment-based peeling.

 -Hal
> 
> Thanks,
> 
> Anna
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
