search for: matrixmul

Displaying 5 results from an estimated 5 matches for "matrixmul".

2012 Jun 29
0
[LLVMdev] Another LLVM JIT extension to Python
...tions. Also, the generated source code seems to be a good start. It would be interesting to try it with Polly [1]. I believe that this could give great speedups for the naive matrix multiply implementation. Is there a way I can dump the content of the entire LLVM-IR module generated in the demo/matrixmul/matrixmul.py example? Cheers Tobi [1] http://polly.llvm.org
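[Aside: to make the "dump the LLVM-IR module" question concrete, with the llvm-py style bindings of that era the textual IR of a module is obtained by converting the module object to a string. The sketch below is only an illustration; how Pymothoa exposes the module object built by demo/matrixmul/matrixmul.py is an assumption, not the project's documented API.

    # Hedged sketch: dumping the textual LLVM IR of a module with llvm-py.
    # The module here is a stand-in for whatever object the Pymothoa backend
    # actually builds for the matrixmul demo (an assumption on my part).
    from llvm.core import Module

    mod = Module.new('matrixmul_demo')
    # ... the JIT front end would add functions to `mod` here ...

    with open('matrixmul.ll', 'w') as f:
        f.write(str(mod))          # llvm-py renders a Module as its textual IR

Once the .ll file exists it can be inspected or fed to external tools such as opt.]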
2012 Jun 29
1
[LLVMdev] Another LLVM JIT extension to Python
...nerated source code seems to be a good start. It would be > interesting to try it with Polly [1]. I believe that this could give > great speedups for the naive matrix multiply implementation. > Is there a way I can dump the content of the entire LLVM-IR module > generated in the demo/matrixmul/matrixmul.py example? > > Cheers > Tobi > > [1] http://polly.llvm.org > Hi Tobi, Thank you for your feedback. I will be looking at Polly for better locality optimization. Can I simply include Polly as optimization passes? If so, the pymothoa/llvm_backend/default_passes.py can b...
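[Aside: before wiring Polly into pymothoa/llvm_backend/default_passes.py, one way to experiment is to dump the module's IR (as sketched above) and run it through opt with the Polly plugin loaded. The plugin path and flag spelling below vary across Polly releases, so treat this as an assumption-laden sketch rather than the canonical invocation.

    # Hedged sketch: running Polly on IR dumped from the JIT, via the opt tool.
    # Plugin location and flags are installation/version dependent.
    import subprocess

    subprocess.check_call([
        'opt',
        '-load', 'LLVMPolly.so',   # path to the Polly plugin is site-specific
        '-O3', '-polly',           # standard pipeline with Polly's loop optimizations
        '-S', 'matrixmul.ll',
        '-o', 'matrixmul.polly.ll',
    ])

Comparing matrixmul.ll and matrixmul.polly.ll shows what Polly does to the naive matrix multiply before committing to an in-process pass setup.]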
2012 Jun 28
3
[LLVMdev] Another LLVM JIT extension to Python
Dear LLVM, I am a young developer who has just uploaded my first open-source project based on LLVM. I would like to know what professionals think of my project. I have started a JIT extension to Python called Pymothoa ( http://code.google.com/p/pymothoa/). Unlike other similar projects, I did not modify the interpreter. Pymothoa uses Python decorators to mark functions for JIT compiling. It
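[Aside: the excerpt cuts off before showing the decorator itself. Purely as an illustration of the "mark functions for JIT compiling with a decorator" idea, and not Pymothoa's actual API, a minimal self-contained sketch could look like this:

    # Illustrative sketch of decorator-based JIT marking (hypothetical names,
    # not Pymothoa's documented interface): the decorator records the function
    # and its declared signature so a backend could compile it lazily.
    _registry = {}

    def jit(ret=None, args=()):
        def mark(fn):
            _registry[fn.__name__] = (fn, ret, args)  # a real backend would emit LLVM IR here
            return fn                                  # interpreter semantics stay untouched
        return mark

    @jit(ret=int, args=(int, int))
    def matrix_size(rows, cols):
        return rows * cols

    print(matrix_size(320, 640))

The point of the pattern is that the decorated function keeps working under CPython while the registry gives the JIT backend everything it needs to compile it on demand.]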
2013 Nov 19
7
Quadrified GTX 480 VT-d passthrough. CUDA 5.5 in Linux partial success
...at can be taken to resolve this issue. If I restart the VM I can run a single CUDA app again, once. It's still pretty impressive to be able to do that without having to patch Xen or reboot the entire machine =) It doesn't seem to matter what CUDA app I'm running, here is matrixMul for example: matrixMul# ./matrixMul [Matrix Multiply Using CUDA] - Starting... GPU Device 0: "Quadro 6000" with compute capability 2.0 MatrixA(320,320), MatrixB(640,320) Computing result using CUDA Kernel... done Performance= 227.22 GFlop/s, Time= 0.577 msec, Size= 131072000 Ops, Workgr...
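[Aside: the throughput figure in the sample output follows directly from the reported operation count and time, which is a quick way to sanity-check a passthrough run:

    # Sanity-checking the matrixMul output quoted above.
    ops = 131072000          # "Size= 131072000 Ops"
    time_s = 0.577e-3        # "Time= 0.577 msec"
    gflops = ops / time_s / 1e9
    print(round(gflops, 2))  # ~227.2, consistent with the reported 227.22 GFlop/s
                             # (the small gap is rounding in the printed time)
]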
2013 Nov 18
0
Quadrified GTX 480 VT-d passthrough. CUDA 5.5 in Linux partial success!
...t can be taken to resolve this issue. If I restart the VM I can run a single CUDA app again, once. It's still pretty impressive to be able to do that without having to patch Xen or reboot the entire machine =) It doesn't seem to matter what CUDA app I'm running, here is matrixMul for example: matrixMul# ./matrixMul [Matrix Multiply Using CUDA] - Starting... GPU Device 0: "Quadro 6000" with compute capability 2.0 MatrixA(320,320), MatrixB(640,320) Computing result using CUDA Kernel... done Performance= 227.22 GFlop/s, Time= 0.577 msec, Size= 131072000 Ops, Workgro...