Displaying 5 results from an estimated 5 matches for "matrixmul".
2012 Jun 29
0
[LLVMdev] Another LLVM JIT extension to Python
...tions. Also, the
generated source code seems to be a good start. It would be interesting
to try it with Polly [1]. I believe that this could give great speedups
for the naive matrix multiply implementation.
Is there a way I can dump the content of the entire LLVM-IR module
generated in the demo/matrixmul/matrixmul.py example?
Cheers
Tobi
[1] http://polly.llvm.org
2012 Jun 29
1
[LLVMdev] Another LLVM JIT extension to Python
...nerated source code seems to be a good start. It would be
> interesting to try it with Polly [1]. I believe that this could give
> great speedups for the naive matrix multiply implementation.
> Is there a way I can dump the content of the entire LLVM-IR module
> generated in the demo/matrixmul/matrixmul.py example?
>
> Cheers
> Tobi
>
> [1] http://polly.llvm.org
>
Hi Tobi,
Thank you for your feedback. I will be looking at Polly for better
locality optimization. Can I simply include Polly as optimization
passes? If so, the pymothoa/llvm_backend/default_passes.py can b...
2012 Jun 28
3
[LLVMdev] Another LLVM JIT extension to Python
Dear LLVM,
I am a young developer who has just uploaded my first open-source
project based on LLVM. I would like to know what professionals think of
my project.
I have started a JIT extension to Python called Pymothoa (
http://code.google.com/p/pymothoa/). Unlike other similar projects, I
did not modify the interpreter. Pymothoa uses Python decorators to mark
functions for JIT compiling. It
2013 Nov 19
7
Quadrified GTX 480 VT-d passthrough. CUDA 5.5 in Linux partial success
...at can be taken to resolve this issue.
If I restart the VM I can run a single CUDA app again, once. It's
still pretty impressive to be able to do that without having to patch
Xen or reboot the entire machine =) It doesn't seem to matter what CUDA app
I'm running, here is matrixMul
for example:
matrixMul# ./matrixMul
[Matrix Multiply Using CUDA] - Starting...
GPU Device 0: "Quadro 6000" with compute capability 2.0
MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel...
done
Performance= 227.22 GFlop/s, Time= 0.577 msec, Size= 131072000 Ops,
Workgr...
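For readers checking the figures in the log above, here is a minimal sketch of how the "Size" and "Performance" values relate. It assumes the printed dimensions are (width, height) pairs and that the sample counts two floating-point operations per multiply-add; the helper names are hypothetical and are not taken from the CUDA sample itself.

```python
# Dimensions as printed in the log. Reading them as (width, height)
# is an assumption made for this sketch.
a_width, a_height = 320, 320
b_width, b_height = 640, 320

def matmul_flops(m, k, n):
    """Operation count for an (m x k) times (k x n) product,
    counting each multiply-add as 2 floating-point operations."""
    return 2 * m * k * n

def gflops(ops, msec):
    """Convert an operation count and a time in milliseconds to GFlop/s."""
    return ops * 1e-9 / (msec / 1000.0)

# A is a_height x a_width, B is b_height x b_width, so the product
# has a_height rows, b_width columns, and an inner dimension of a_width.
ops = matmul_flops(a_height, a_width, b_width)
rate = gflops(ops, 0.577)  # 0.577 msec is the rounded time from the log

print(ops)
print(rate)
```

The operation count matches the "Size" field in the log. The computed rate comes out slightly below the logged 227.22 GFlop/s, presumably because the time printed in the log is rounded to three decimals.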
2013 Nov 18
0
Quadrified GTX 480 VT-d passthrough. CUDA 5.5 in Linux partial success!
...t can be taken to resolve this issue.
If I restart the VM I can run a single CUDA app again, once. It's
still pretty impressive to be able to do that without having to patch
Xen or reboot the entire machine =)
It doesn't seem to matter what CUDA app I'm running, here is matrixMul
for example:
matrixMul# ./matrixMul
[Matrix Multiply Using CUDA] - Starting...
GPU Device 0: "Quadro 6000" with compute capability 2.0
MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel...
done
Performance= 227.22 GFlop/s, Time= 0.577 msec, Size= 131072000 Ops,
Workgro...