2014 Oct 24
[LLVMdev] IndVar widening in IndVarSimplify causing performance regression on GPU programs
...12;
    setp.lt.s32 %p2, %r6, %r3;
    @%p2 bra BB0_2;

in which %r6 is the induction variable i. With widening, the loop body becomes:

BB0_2: // =>This Inner Loop Header: Depth=1
    mul.lo.s64 %rd8, %rd10, %rd10;
    st.u32 [%rd9], %rd8;
    add.s64 %rd10, %rd10, 3;
    add.s64 %rd9, %rd9, 12;
    setp.lt.s64 %p2, %rd10, %rd1;
    @%p2 bra BB0_2;

Although the number of PTX instructions is the same in both versions, the version with widening uses more mul.lo.s64, add.s64, and...
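For readers who want a source-level picture, the widened PTX quoted above is consistent with a strided loop that stores i * i into a[i] and advances i by 3 (the pointer step of 12 bytes matches three 4-byte elements). The C++ sketch below is a reconstruction under that assumption, not the code from the original post. The names fill_narrow, fill_wide, and widening_preserves_results are hypothetical, and fill_wide only mimics by hand what the widening transformation appears to do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed original form: 32-bit induction variable, which would map to
// 32-bit PTX arithmetic (mul.lo.s32, add.s32, setp.lt.s32).
void fill_narrow(int *a, int n) {
    for (int i = 0; i < n; i += 3) {
        a[i] = i * i;
    }
}

// Hand-widened form: the induction variable and the bound are 64-bit, so the
// multiply, increment, and compare all become 64-bit operations, as in the
// mul.lo.s64 / add.s64 / setp.lt.s64 sequence in the snippet.
void fill_wide(int *a, int n) {
    const std::int64_t limit = n;
    for (std::int64_t i = 0; i < limit; i += 3) {
        // The store is still 32 bits wide (st.u32), so truncate the product.
        a[i] = static_cast<int>(i * i);
    }
}

// Checks that both forms write identical values. The bound keeps i * i
// inside the int range, so the narrow loop never overflows.
bool widening_preserves_results(int n) {
    assert(n >= 0 && n <= 46340);
    std::vector<int> narrow(static_cast<std::size_t>(n), -1);
    std::vector<int> wide(static_cast<std::size_t>(n), -1);
    fill_narrow(narrow.data(), n);
    fill_wide(wide.data(), n);
    return narrow == wide;
}
```

Both loops produce the same array contents, which is why the transformation is legal. The concern raised in the post is cost rather than correctness: the widened loop does the same work with 64-bit instructions.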