2018 Jun 21
2
NVPTX - Reordering load instructions
...y laptop, I forget the exact filename, but it's called
load-store vectorizer.
I think the question is, why is LSV not vectorizing this code?
I think the answer is, llvm can't tell that the loads are aligned. Ptxas
can, but only because it's (apparently) doing vectorization *after* it
resolves the shmem variables to physical addresses. That is a cool trick,
and llvm can't do it, because llvm never sees the physical shmem addresses.
If you told llvm that the shmem variables were aligned to 16 bytes, LSV
might do what you want here. llvm and ptxas should be able to cooperate to
gi...
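
To make the alignment suggestion above concrete, here is a minimal CUDA C sketch. It is not the benchmark from the thread (which comes from Julia); the kernel name `row_sums`, the array name `tile`, and the host driver are hypothetical and only for illustration. The point is the `__align__(16)` on the shared array, which states the alignment in the source so the compiler does not have to infer it. Whether the load-store vectorizer then actually merges the loads depends on the compiler version and the surrounding code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK_SIZE 16

// Hypothetical kernel for illustration only; not the code discussed in the thread.
__global__ void row_sums(const float *in, float *out)
{
    // Explicit 16-byte alignment on the shared array. Each row is
    // BLOCK_SIZE * 4 = 64 bytes, so every row start is also 16-byte aligned.
    __shared__ __align__(16) float tile[BLOCK_SIZE][BLOCK_SIZE];

    int tx = threadIdx.x;
    int ty = threadIdx.y;
    tile[ty][tx] = in[ty * BLOCK_SIZE + tx];
    __syncthreads();

    if (tx == 0) {
        float sum = 0.0f;
        // Fully unrolled, this becomes a run of adjacent 4-byte shared loads,
        // which are candidates for merging into wider vector loads.
        #pragma unroll
        for (int k = 0; k < BLOCK_SIZE; ++k)
            sum += tile[ty][k];
        out[ty] = sum;
    }
}

int main()
{
    const int n = BLOCK_SIZE * BLOCK_SIZE;
    float h_in[n], h_out[BLOCK_SIZE];
    for (int i = 0; i < n; ++i)
        h_in[i] = 1.0f;

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, BLOCK_SIZE * sizeof(float));
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    row_sums<<<1, dim3(BLOCK_SIZE, BLOCK_SIZE)>>>(d_in, d_out);
    cudaMemcpy(h_out, d_out, BLOCK_SIZE * sizeof(float), cudaMemcpyDeviceToHost);

    printf("row 0 sum = %f\n", h_out[0]);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

One way to check the effect would be to compile the kernel to PTX with and without the `__align__(16)` and compare the shared-memory loads: merged loads should show up as vector forms such as `ld.shared.v4.f32` rather than four scalar `ld.shared.f32` instructions.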
2018 Jun 21
2
NVPTX - Reordering load instructions
Hi all,
I'm looking into the performance difference of a benchmark compiled with
NVCC vs NVPTX (coming from Julia, not CUDA C) and I'm seeing a
significant difference due to PTX instruction ordering. The relevant
source code consists of two nested loops that get fully unrolled, doing
some basic arithmetic with values loaded from shared memory:
> #define BLOCK_SIZE 16
>
>