2018 Jun 21
NVPTX - Reordering load instructions
...y laptop, I forget the exact filename, but it's called load-store vectorizer. I think the question is, why is LSV not vectorizing this code? I think the answer is, LLVM can't tell that the loads are aligned. Ptxas can, but only because it's (apparently) doing vectorization *after* it resolves the shmem variables to physical addresses. That is a cool trick, and LLVM can't do it, because LLVM never sees the physical shmem addresses. If you told LLVM that the shmem variables were aligned to 16 bytes, LSV might do what you want here. LLVM and ptxas should be able to cooperate to gi...
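To make the alignment suggestion concrete, here is a minimal CUDA C sketch of a shared-memory array declared with 16-byte alignment. The kernel `row_sums` and the array `tile` are hypothetical and not taken from the benchmark in the thread (which is generated from Julia). Whether the load-store vectorizer actually merges the loads depends on the compiler version and the surrounding code, so treat this as an illustration of the annotation only.

```cuda
#include <cuda_runtime.h>

#define BLOCK_SIZE 16

// Hypothetical kernel, not the benchmark from the thread: one thread per row
// sums that row of a tile held in shared memory.
// Intended launch shape: a single block of BLOCK_SIZE x BLOCK_SIZE threads.
__global__ void row_sums(const float *in, float *out) {
    // __align__(16) tells the compiler the tile starts on a 16-byte boundary.
    // Each row is 16 floats (64 bytes), so every row start, and every group
    // of four consecutive floats within a row, is 16-byte aligned as well.
    __shared__ __align__(16) float tile[BLOCK_SIZE][BLOCK_SIZE];

    const int tx = threadIdx.x;
    const int ty = threadIdx.y;

    // Stage the input tile in shared memory.
    tile[ty][tx] = in[ty * BLOCK_SIZE + tx];
    __syncthreads();

    if (tx == 0) {
        float acc = 0.0f;
        // Once fully unrolled, this loop is 16 scalar loads from consecutive
        // addresses, which are candidates for merging into wider loads.
        #pragma unroll
        for (int k = 0; k < BLOCK_SIZE; ++k) {
            acc += tile[ty][k];
        }
        out[ty] = acc;
    }
}
```

Without the annotation, the compiler can only assume the natural 4-byte alignment of `float`, which is the situation the reply describes.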
2018 Jun 21
NVPTX - Reordering load instructions
Hi all,

I'm looking into the performance difference of a benchmark compiled with NVCC vs NVPTX (coming from Julia, not CUDA C) and I'm seeing a significant difference due to PTX instruction ordering. The relevant source code consists of two nested loops that get fully unrolled, doing some basic arithmetic with values loaded from shared memory:

> #define BLOCK_SIZE 16