search for: f535

Displaying 2 results from an estimated 2 matches for "f535".

2018 Jun 21
2
NVPTX - Reordering load instructions
..., [kernel_dia+500];
> ld.shared.f32 %f541, [kernel_dia+84];
> ld.shared.f32 %f540, [%r4+-972];
> ld.shared.f32 %f539, [%r4+-1008];
> ld.shared.f32 %f538, [kernel_dia+496];
> ld.shared.f32 %f537, [kernel_dia+136];
> ld.shared.f32 %f536, [%r4+-976];
> ld.shared.f32 %f535, [kernel_dia+428];
> ... # hundreds of these

Even though this heavily bloats register usage (and NVCC seems to do this unconditionally, even with launch configurations where this could hurt performance), it allows the CUDA PTX JIT to emit 128-bit loads:

> LDS.128 R76, [0x2f0];
> LDS.128...
2018 Jun 21
2
NVPTX - Reordering load instructions
...l_dia+84];
> >> ld.shared.f32 %f540, [%r4+-972];
> >> ld.shared.f32 %f539, [%r4+-1008];
> >> ld.shared.f32 %f538, [kernel_dia+496];
> >> ld.shared.f32 %f537, [kernel_dia+136];
> >> ld.shared.f32 %f536, [%r4+-976];
> >> ld.shared.f32 %f535, [kernel_dia+428];
> >> ... # hundreds of these
> > Even though this heavily bloats register usage (and NVCC seems to do
> > this unconditionally, even with launch configurations where this could
>...