search for: matmult

Displaying 19 results from an estimated 19 matches for "matmult".

2012 Sep 19
3
[LLVMdev] counting branch frequencies
Thanks everyone for the replies. After some experimentation, I found that the order in which the passes are specified matters: opt -O3 -profile-loader matmult.bc -o matmult.opt.bc (works) opt -profile-loader -O3 matmult.bc -o matmult.opt.bc (does not work) Also, I am able to avoid the inconsistency warning only for optimization levels -O3 and -O2. I get that warning when using -O1 and -disable-opt. Anyone else have this experience? Or, any ideas why...
2009 Jan 31
1
[LLVMdev] -msse3 can degrade performance
...The -msse3 flag? Does the -msse2 flag have a similar effect? Yes: $ llvm-gcc -Wall -lm -O3 -msse2 *.c -o scimark2 $ ./scimark2 Composite Score: 525.99 FFT Mflops: 538.35 (N=1024) SOR Mflops: 472.29 (100 x 100) MonteCarlo: Mflops: 120.92 Sparse matmult Mflops: 585.14 (N=1000, nz=5000) LU Mflops: 913.27 (M=100, N=100) But -msse does not: $ llvm-gcc -Wall -lm -O3 -msse *.c -o scimark2 $ ./scimark2 Composite Score: 540.08 FFT Mflops: 535.04 (N=1024) SOR Mflops: 469.99 (100 x 10...
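The Composite Score in the -msse2 run above is consistent with SciMark 2.0 reporting the unweighted mean of its five kernel scores. A minimal sketch that checks the arithmetic, using only the figures quoted in the snippet (the names `kernel_mflops` and `composite_score` are illustrative, not part of SciMark):

```python
# Kernel scores (Mflops) copied from the -msse2 run quoted in the snippet.
kernel_mflops = {
    "FFT": 538.35,
    "SOR": 472.29,
    "MonteCarlo": 120.92,
    "Sparse matmult": 585.14,
    "LU": 913.27,
}


def composite_score(scores):
    """Unweighted arithmetic mean of the kernel scores (hypothetical helper)."""
    return sum(scores.values()) / len(scores)


composite = composite_score(kernel_mflops)
print(f"{composite:.2f}")  # 525.99, matching the reported Composite Score
```

Because the composite is a plain average, a large swing in a single kernel (MonteCarlo or Sparse matmult in these threads) moves the headline number by one fifth of that swing.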
2009 Jan 31
2
[LLVMdev] -msse3 can degrade performance
...Monte Carlo part of the test: $ llvm-gcc -Wall -lm -O3 *.c -o scimark2 $ ./scimark2 Using 2.00 seconds min time per kenel. Composite Score: 432.84 FFT Mflops: 358.90 (N=1024) SOR Mflops: 473.45 (100 x 100) MonteCarlo: Mflops: 210.54 Sparse matmult Mflops: 354.25 (N=1000, nz=5000) LU Mflops: 767.04 (M=100, N=100) $ llvm-gcc -Wall -lm -O3 -msse3 *.c -o scimark2 $ ./scimark2 Composite Score: 548.53 FFT Mflops: 609.87 (N=1024) SOR Mflops: 497.92 (100 x 100) MonteCarlo: M...
2012 Sep 19
0
[LLVMdev] counting branch frequencies
...sumption is not wrong, is there any way to fix the results? Thanks. -Apala On 09/19/2012 09:53 AM, apala guha wrote: > Thanks everyone for the replies. After some experimentation, I found > that the order in which the passes are specified matters: > > opt -O3 -profile-loader matmult.bc -o matmult.opt.bc (works) > opt -profile-loader -O3 matmult.bc -o matmult.opt.bc (does not work) > > > Also, I am able to avoid the inconsistency warning only for > optimization levels -O3 and -O2. I get that warning when using -O1 and > -disable-opt. > > Anyone else ha...
2011 Apr 24
2
random roundoff?
...s matrix multiplications, I have a situation in which the result depends on the nature of nearby I/O. Thus, with all arithmetic done with type double, and where values are mostly in the range [-1.0e0,+1.0e0] or nearby, I do: cerr << "some stuff" << endl; mat3 = matmult(mat1,mat2); I get a difference of the order 1.0e-15 depending on whether the cerr line does or does not end in "endl" as shown. I am imagining that there is some "randomness" in the roundoff that depends on the I/O situation. Is this credible? Any other suggestions? Tha...
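A difference of order 1.0e-15 on values near 1.0 is about the size of double-precision rounding error. The sketch below only illustrates that evaluating the same sum in a different order can change the last bits of a double; it is an assumption-free demonstration of rounding, not a diagnosis of the poster's case, and the snippet does not establish why the I/O call would alter the computation.

```python
# Floating-point addition is not associative: grouping changes the rounding.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # rounds after a + b, then again after adding c
right = a + (b + c)  # rounds after b + c, then again after adding a

print(left == right)       # False
print(abs(left - right))   # a difference on the order of 1e-16
```

Any change that alters the order or the intermediate precision of the accumulations inside a matrix product could, in principle, produce discrepancies of this magnitude.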
2012 Sep 19
0
[LLVMdev] counting branch frequencies
Hi Apala, Dibyendu is correct that this is likely due to pass order, but things get a bit complicated with -O[1-9] or -std-compile-opts as they insert early passes *before* the profiling code. I recommend that you use identical optimizations to insert instrumentation and to load the profiling data. E.g.: opt -insert-edge-profiling -O3 foo.bc -o foo.2.bc opt -profile-loader -O3 foo.bc
2015 Feb 17
6
[LLVMdev] [3.6 Release] RC3 has been tagged
...of 3.6-rc1 at > http://www.phoronix.com/scan.php?page=article&item=llvm-clang-3.5-3.6-rc1). > This same issue has also been reported in > http://llvm.org/bugs/show_bug.cgi?id=22058. In the case of the 22% > performance degradation in SciMark2's Sparse matmult benchmark, I have > identified both commits that contribute equally to this regression in > http://llvm.org/bugs/show_bug.cgi?id=22589... Thank you very much for trying out the release candidate. I asked a few of the other developers, and the consensus was that while unfortu...
2009 Jan 31
0
[LLVMdev] -msse3 can degrade performance
On Fri, Jan 30, 2009 at 5:43 PM, Jon Harrop <jon at ffconsultancy.com> wrote: > > I just remembered an anomalous result that I stumbled upon whilst tweaking the > command-line options to llvm-gcc. Specifically, the -msse3 flag The -msse3 flag? Does the -msse2 flag have a similar effect? -Eli
2009 Feb 04
0
[LLVMdev] -msse3 can degrade performance
...erformance results completely (still x86): > > $ llvm-gcc -O3 -msse3 -lm all.c -o all > $ ./all > Composite Score: 570.07 > FFT Mflops: 599.40 (N=1024) > SOR Mflops: 476.97 (100 x 100) > MonteCarlo: Mflops: 278.17 > Sparse matmult Mflops: 582.54 (N=1000, nz=5000) > LU Mflops: 913.27 (M=100, N=100) > $ gcc -O3 -msse3 -lm all.c -o all > $ ./all > Composite Score: 539.20 > FFT Mflops: 516.05 (N=1024) > SOR Mflops: 472.29 (100 x 100) > Mo...
2012 Sep 18
4
[LLVMdev] counting branch frequencies
I tried getting profile data from LLVM 3.1, using the method mentioned below. I tried it out on a simple matrix multiplication program. However, I noticed the following problems: 1. There is a warning message: "WARNING: profile information is inconsistent with the current program!" 2. The basic block counts (obtained from ProfileInfo::getExecutionCount(const BasicBlock*)) are
2009 Jan 30
5
[LLVMdev] Performance vs other VMs
The release of a new code generator in Mono 2.2 prompted me to benchmark the performance of various VMs using the SciMark2 benchmark on an 8x 2.1GHz 64-bit Opteron and I have published the results here: http://flyingfrogblog.blogspot.com/2009/01/mono-22.html The LLVM results were generated using llvm-gcc 4.2.1 on the C version of SciMark2 with the following command-line
2015 Feb 13
10
[LLVMdev] [3.6 Release] RC3 has been tagged
Hello testers, Start your engines, RC3 has just been tagged (at r229050 on the branch). If this one looks good, it will become the release. There has been quite a bit of activity on the branch since RC2; let's hope it's all goodness :-) Please let me know how it looks, and upload binaries to the sftp as usual. Thanks for all your efforts so far! - Hans
2009 Jan 31
0
[LLVMdev] Performance vs other VMs
----- Original Message ----- From: "Jon Harrop" <jon at ffconsultancy.com> To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> Sent: Saturday, January 31, 2009 6:56 AM Subject: [LLVMdev] Performance vs other VMs > > The release of a new code generator in Mono 2.2 prompted me to benchmark > the > performance of various VMs using the SciMark2
2010 May 08
0
[LLVMdev] Auto-Vectorization in LLVM
...first very simple tests started to work. At the moment we can detect matrix multiplication, create polyhedral information and code generate it again. Exporting the test case, optimizing it, and importing will be done in the next weeks. As soon as this is done, we can show impressive results for matmult, and we compile the llvm-testsuite without crashing, I will write a mail on the mailing list. Anybody who wants to try Polly earlier will probably trigger some unimplemented stuff. However, you could try anyway. ;-) I will be glad to help you with it. @Andreas: Do you believe your vectorization would...
2010 May 05
5
[LLVMdev] Auto-Vectorization in LLVM
Hi, I found out that Auto-Vectorization was implemented as a part of GSoC 2009. Can someone point me to the code repository including any documentation available? I would also like to know if there is any progress/future plans to include this in the main trunk? Best Regards, Raj
2009 Feb 01
0
[LLVMdev] Performance vs other VMs
This is not quite a fair comparison. Other virtual machines must be doing garbage collection, while LLVM, as it is using C code, is taking advantage of memory allocation by hand. On Fri, Jan 30, 2009 at 9:56 PM, Jon Harrop <jon at ffconsultancy.com> wrote: > > The release of a new code generator in Mono 2.2 prompted me to benchmark the > performance of various VMs using the
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
Using the SciMark 2.0 code from http://math.nist.gov/scimark2/scimark2_1c.zip compiled with the same... make CFLAGS="-O3 -march=native" I am able to reproduce the 22% performance regression in the run time of the Sparse matmult benchmark. For 10 runs of the scimark2 benchmark, I get 998.439+/-0.4828 with the release llvm clang 3.5.1 compiler and 1217.363+/-1.1004 for the current clang 3.6svn from 3.6 branch. Not good. Jack On Sat, Feb 14, 2015 at 11:19 AM, Jack Howarth <howarth.mailing.lis...
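The 22% figure can be checked against the two means quoted above. This is a rough sketch: the snippet does not state the units of the two numbers, so the code assumes only that they are comparable measurements and takes the clang 3.5.1 value as the baseline (variable names are illustrative).

```python
# Means over 10 runs, copied from the snippet; units are not stated there.
clang_351_mean = 998.439   # release clang 3.5.1
clang_36_mean = 1217.363   # clang 3.6svn from the 3.6 branch

# Relative change with the 3.5.1 result as the baseline.
relative_change_pct = (clang_36_mean - clang_351_mean) / clang_351_mean * 100

print(f"{relative_change_pct:.1f}%")  # 21.9%
```

The gap between the two means (about 219) is far larger than either reported spread (0.4828 and 1.1004), so run-to-run noise is an unlikely explanation for the difference.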
2004 Dec 07
0
Installation of R-2.0.1 failure
...html latex example match text html latex example match.arg text html latex example match.call text html latex example match.fun text html latex example matmult text html latex example matrix text html latex example maxCol text html latex example mean text html latex example missing link(s): weighted.mean...
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
The regressions in the performance of generated code, introduced by the llvm 3.6 release, don't seem to be limited to this "8 queens puzzle" solver test case. See... http://www.phoronix.com/scan.php?page=article&item=llvm-clang-3.5-3.6-rc1&num=1 where a big hit in the performance of the Sparse Matrix Multiply test of the SciMark v2.0 benchmark was observed as well as