Displaying 2 results from an estimated 2 matches for "f544".
2018 Jun 21
2
NVPTX - Reordering load instructions
...peri_col[idx][i] -= peri_col[idx][j] * dia[j][i];
> peri_col[idx][i] /= dia[i][i];
> }
NVCC emits PTX instructions where all loads from shared memory are
packed together:
> ...
> ld.shared.f32 %f546, [kernel_dia+440];
> ld.shared.f32 %f545, [%r4+-996];
> ld.shared.f32 %f544, [kernel_dia+56];
> ld.shared.f32 %f543, [kernel_dia+88];
> ld.shared.f32 %f542, [kernel_dia+500];
> ld.shared.f32 %f541, [kernel_dia+84];
> ld.shared.f32 %f540, [%r4+-972];
> ld.shared.f32 %f539, [%r4+-1008];
> ld.shared.f32 %f538, [kernel_dia+496];
> ld.shared.f32...
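
To make the "loads packed together" observation concrete, here is a minimal host-side C++ sketch (it also compiles as CUDA C++). It is not the original kernel: the loop bounds, the tile size N, and both function names are assumptions, since the excerpt above shows only the two update statements. The first function reads each dia operand where it is used; the second copies every operand into locals first and only then does the arithmetic, which is roughly the shape the PTX listing suggests.

```cpp
#include <cstddef>

constexpr std::size_t N = 4;  // hypothetical tile size, not taken from the thread

// Interleaved form: each dia operand is read right where it is consumed.
// The loop nest is an assumed shape; the excerpt shows only the two statements.
void solve_row_interleaved(float (&row)[N], const float (&dia)[N][N]) {
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = 0; j < i; ++j)
            row[i] -= row[j] * dia[j][i];
        row[i] /= dia[i][i];
    }
}

// Loads-first form: copy every needed dia operand into locals up front,
// then run the same arithmetic in the same order on the local copies.
void solve_row_loads_first(float (&row)[N], const float (&dia)[N][N]) {
    float d[N][N] = {};
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j <= i; ++j)
            d[j][i] = dia[j][i];  // all loads happen here

    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = 0; j < i; ++j)
            row[i] -= row[j] * d[j][i];
        row[i] /= d[i][i];
    }
}
```

Both functions perform the same operations in the same order, so they should produce the same values; the difference is only where the reads of dia sit relative to the arithmetic, and how many values have to be held live at once.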
2018 Jun 21
2
NVPTX - Reordering load instructions
...i] /= dia[i][i];
> >> }
> > NVCC emits PTX instructions where all loads from shared memory are
> > packed together:
> >
> >> ...
> >> ld.shared.f32 %f546, [kernel_dia+440];
> >> ld.shared.f32 %f545, [%r4+-996];
> >> ld.shared.f32 %f544, [kernel_dia+56];
> >> ld.shared.f32 %f543, [kernel_dia+88];
> >> ld.shared.f32 %f542, [kernel_dia+500];
> >> ld.shared.f32 %f541, [kernel_dia+84];
> >> ld.shared.f32 %f540, [%r4+-972];
> >> ld.shared.f32 %f539, [%r4+-1008];
> >> ld.sh...