Scott Manley via llvm-dev
2019-May-02 21:14 UTC
[llvm-dev] llvm is illegally vectorizing with a recurrence on skylake
Hi -- I have found a bug in an HPC code where llvm is vectorizing a loop on Skylake that has an obvious recurrence. I derived a small test case based on the original benchmark below:

/*****************************************************************/
static void __attribute__ ((always_inline)) one(
    const int *restrict in, const int *const end,
    const unsigned shift, int *const restrict index,
    int *const restrict out)
{
  do {
    int a_idx = *in>>shift;
    int b_idx = index[a_idx];
    out[b_idx] = *in;   // <-- recurrence, as index[a_idx] can be the
    index[a_idx]++;     //     same and incremented within the vector,
  } while(++in!=end);   //     which leads to incorrect results
}

#ifndef NO_TWO
static void __attribute__ ((noinline)) two(
    const int *restrict in, const int *const end,
    const unsigned shift, int *const restrict index,
    int *const restrict out)
{
  do out[index[(*in>>shift)]++]=*in; while(++in!=end);
}
#endif

void parent(
    int digits, int n, int *restrict work, int * restrict idx,
    int *restrict shift, int **restrict indicies)
{
  int *in = work;
  int *dst = work+n;
  // int *indicies[1024];
  // int shift[1024];
  int d;
  for(d=1;d!=digits-1;++d) {
    int *t;
    one(in,in+n,shift[d],indicies[d],dst);
    t=in,in=dst,dst=t;
  }
#ifndef NO_TWO
  two(in,in+n,shift[d],indicies[d],idx);
#endif
}
/*****************************************************************/

clang -S -O2 -Rpass=loop-vectorize small.c -march=skylake-avx512
small.c:6:3: remark: vectorized loop (vectorization width: 16, interleaved count: 1) [-Rpass=loop-vectorize]
  do {
  ^

I believe the problem is that dependency information is getting destroyed, because if you remove the two() function (or compile one() on its own, or prevent inlining of one()), llvm correctly refuses to vectorize the loop.

clang -S -O2 -Rpass=loop-vectorize -Rpass-missed=loop-vectorize small.c -march=skylake-avx512 -DNO_TWO
small.c:6:3: remark: loop not vectorized [-Rpass-missed=loop-vectorize]
  do {

I traced it down to what may be a problem within DepChecker->areDepsSafe(), as it returns true for the incorrect case.

Thanks,

Scott
Finkel, Hal J. via llvm-dev
2019-May-02 23:53 UTC
[llvm-dev] llvm is illegally vectorizing with a recurrence on skylake
Hi, Scott,

Thanks for reporting this problem. We should get a bug filed on this issue at bugs.llvm.org. If you're not able to do this, please let us know, and someone else can take care of it.

-Hal

Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
Scott Manley via llvm-dev
2019-May-02 23:55 UTC
[llvm-dev] llvm is illegally vectorizing with a recurrence on skylake
I can file a bug, no problem. I've just seen folks start on the list first.

Cheers,

Scott