Displaying 2 results from an estimated 2 matches for "vec_remainder_body".
2018 Aug 02
Vectorizing remainder loop
...for (i = 0; i < M; i++) { // where M is a multiple of 2048
  if (i < N) {
    Body
  }
}
If your HW can't execute a vector version of the above loop efficiently enough, it's already busted. Typically, when VF is that large, what you'll get in the remainder is a masked vector loop like the one below, and vec_remainder_body is reasonably hot, as you say in your original mail. As such, remainder loop vectorization isn't a solution for that problem.
for (i = 0; i + 2048 <= N; i += 2048) { // full 2048-wide iterations only
  Vec_body
}
for (; i < M; i += 1024) { // where M is the smallest multiple of 1024 that is >= N
  if (i < N) { // per-lane mask
    Vec_Remainder_Body
  }
}
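To make the shape of that main loop plus masked remainder concrete, here is a minimal scalar sketch in C. It only emulates the vector lanes with inner loops. The widths (8 and 4 rather than 2048 and 1024), the `body` function, and the name `process_with_masked_remainder` are all made up for illustration and are not from the original mail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical widths, kept small for illustration; the thread uses 2048 and 1024. */
enum { VF_MAIN = 8, VF_REM = 4 };

/* Hypothetical stand-in for the loop body. */
static int body(int x) { return 2 * x + 1; }

/* Runs the full-width main loop, then a half-width masked remainder.
 * Returns the number of remainder lanes that were masked off. */
size_t process_with_masked_remainder(const int *in, int *out, size_t n) {
    assert(in != NULL && out != NULL);
    size_t i = 0;
    size_t masked_off = 0;

    /* Main vector loop: only whole VF_MAIN-wide iterations, no mask needed. */
    for (; i + VF_MAIN <= n; i += VF_MAIN) {
        for (size_t lane = 0; lane < VF_MAIN; lane++) {
            out[i + lane] = body(in[i + lane]);
        }
    }

    /* Masked remainder: m is the smallest multiple of VF_REM that is >= n. */
    size_t m = (n + VF_REM - 1) / VF_REM * VF_REM;
    for (; i < m; i += VF_REM) {
        for (size_t lane = 0; lane < VF_REM; lane++) {
            if (i + lane < n) {          /* lane is active under the mask */
                out[i + lane] = body(in[i + lane]);
            } else {                     /* lane is masked off, no memory access */
                masked_off++;
            }
        }
    }
    return masked_off;
}
```

With these widths and n = 21, the main loop covers elements 0..15, the remainder runs two 4-wide iterations, and the last three lanes are masked off. Since the remainder width is half the main width, the remainder runs at most twice per call.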
If you...
2018 Aug 03
Vectorizing remainder loop
...if (i < N) {
    Body
  }
}
If your HW can't execute a vector version of the above loop efficiently enough, it's already busted. Typically, when VF is that large, what you'll get in the remainder is a masked vector loop like the one below, and vec_remainder_body is reasonably hot, as you say in your original mail. As such, remainder loop vectorization isn't a solution for that problem.
for (i = 0; i + 2048 <= N; i += 2048) { // full 2048-wide iterations only
  Vec_body
}
for (; i < M; i += 1024) { // where M is the smallest multiple of 1024 that is >= N
  if...