Displaying 20 results from an estimated 89 matches for "msse3".

2009 Feb 04
0
[LLVMdev] -msse3 can degrade performance
On Feb 2, 2009, at 3:00 PM, Jon Harrop wrote: > On Monday 02 February 2009 20:37:47 you wrote: >> On Feb 2, 2009, at 12:39 PM, Jon Harrop wrote: >>> On Monday 02 February 2009 06:10:26 Chris Lattner wrote: >>>> I'm seeing exactly identical .s files with -msse2 and -msse3 on the >>>> scimark version I have. Can you please send the output of: >>>> >>>> llvm-gcc -O3 MonteCarlo.c -S -msse2 -o MonteCarlo.2.s >>>> llvm-gcc -O3 MonteCarlo.c -S -msse3 -o MonteCarlo.3.s >>>> >>>> llvm-gcc -O3 MonteCarl...
2006 Jun 26
1
Patch for rgl with gcc 4.0 in R 2.3.0 on OS X
...config.status: creating src/Makevars ** libs ** arch - i386 g++-4.0 -arch i386 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386 -DRGL_USE_CARBON -I/System/Library/Frameworks/AGL.framework/Headers -DHAVE_PNG_H -I/opt/local/include/libpng12 -msse3 -fPIC -fno-common -g -O2 -march=pentium-m -mtune=prescott -c BBoxDeco.cpp -o BBoxDeco.o g++-4.0 -arch i386 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386 -DRGL_USE_CARBON -I/System/Library/Frameworks/AGL.framework/Headers -DHAVE_PNG_H...
2009 Jan 31
1
[LLVMdev] -msse3 can degrade performance
On Saturday 31 January 2009 03:42:04 Eli Friedman wrote: > On Fri, Jan 30, 2009 at 5:43 PM, Jon Harrop <jon at ffconsultancy.com> wrote: > > I just remembered an anomalous result that I stumbled upon whilst > > tweaking the command-line options to llvm-gcc. Specifically, the -msse3 > > flag > > The -msse3 flag? Does the -msse2 flag have a similar effect? Yes: $ llvm-gcc -Wall -lm -O3 -msse2 *.c -o scimark2 $ ./scimark2 Composite Score: 525.99 FFT Mflops: 538.35 (N=1024) SOR Mflops: 472.29 (100 x 100) MonteCarlo: Mf...
2009 Jan 31
2
[LLVMdev] -msse3 can degrade performance
I just remembered an anomalous result that I stumbled upon whilst tweaking the command-line options to llvm-gcc. Specifically, the -msse3 flag does a great job improving the performance of floating point intensive code on the SciMark2 benchmark but it also degrades the performance of the int-intensive Monte Carlo part of the test: $ llvm-gcc -Wall -lm -O3 *.c -o scimark2 $ ./scimark2 Using 2.00 seconds min time per kenel. C...
2009 Jan 31
0
[LLVMdev] -msse3 can degrade performance
On Fri, Jan 30, 2009 at 5:43 PM, Jon Harrop <jon at ffconsultancy.com> wrote: > > I just remembered an anomalous result that I stumbled upon whilst tweaking the > command-line options to llvm-gcc. Specifically, the -msse3 flag The -msse3 flag? Does the -msse2 flag have a similar effect? -Eli
2012 Apr 03
3
[LLVMdev] pb05 results for current llvm/dragonegg
...o do almost nothing in terms of vectorization. Do we need to pass any additional flags to actually achieve autovectorization via llvm (in absence of -ftree-vectorize and -fplugin-arg-dragonegg-enable-gcc-optzns)? Jack > > Ciao, Duncan. > > The benchmarks >> for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate >> since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom >> (http://llvm.org/bugs/show_bug.cgi?id=12434). >> Jack >> >> llvm/dragonegg r153877 >...
2011 Jun 09
3
[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
Duncan, Below are the tabulated compile times and executable sizes. A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize Compile time (seconds) Benchmark A) stock B) gcc 4.5.4/ C) g...
2012 Apr 02
6
[LLVMdev] pb05 results for current llvm/dragonegg
Attached are the Polyhedron 2005 benchmark results for current llvm/dragonegg svn on x86_64-apple-darwin11 built against Xcode 4.3.2 and FSF gcc 4.6.3. The benchmarks for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom (http://llvm.org/bugs/show_bug.cgi?id=12434). Jack llvm/dragonegg r153877 dragonegg: de-gfortran46 -msse3 -ffast-math -fun...
2012 Apr 03
0
[LLVMdev] pb05 results for current llvm/dragonegg
...rrent default of 6 (for example, -bb-vectorize-req-chain-depth=3) will cause a lot more vectorization. -Hal (in > absence of -ftree-vectorize and > -fplugin-arg-dragonegg-enable-gcc-optzns)? Jack > > > > > Ciao, Duncan. > > > > The benchmarks > >> for -msse3 and -msse4 appear identical (at least for degg+optnz). > >> This is fortunate since there seems to be a bug in -msse4 on 2.33 > >> GHz (T7600) Intel Core 2 Duo Merom > >> (http://llvm.org/bugs/show_bug.cgi?id=12434). Jack > >> > >> llvm/dragonegg r153877...
2011 Jun 09
3
[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
...2005 benchmarks compared to stock dragonegg and stock gcc 4.5.4. The runtime benchmarks below show that we average slightly faster than stock gcc 4.5.4 and significantly faster than stock dragonegg through the use of -fplugin-arg-dragonegg-enable-gcc-optzns. x86_64 darwin A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize Benchmark A) stock B) gcc 4.5.4/ C) gcc 4.5.4/...
2011 Jun 09
3
[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
On Thu, Jun 09, 2011 at 03:44:40PM +0200, Duncan Sands wrote: > Hi Jack, thanks for doing this. > >> Below are the tabulated compile times and executable sizes. >> >> A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize >> B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns >> C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize > > These numbers really surprised me: the GCC c...
2008 May 24
1
RSPerl & OS X
...lude -I/Library/Frameworks/R.framework/Resources/include/i386 -I. -g -pipe -fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -I/usr/local/include -I/System/Library/Perl/5.8.6/darwin-thread-multi-2level/CORE -DPERL_POLLUTE -D_R_=1 -DUSE_R=1 -DUSE_TOPLEVEL_EXEC=1 -DWITH_R_IN_PERL=1 -msse3 -fPIC -g -O2 -march=nocona -c Converters.c -o Converters.o gcc -arch i386 -isysroot /Developer/SDKs/MacOSX10.4u.sdk -mmacosx-version-min=10.4 -std=gnu99 -no-cpp-precomp -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386 -I. -g -pipe -fn...
2012 Apr 03
0
[LLVMdev] pb05 results for current llvm/dragonegg
...> Attached are the Polyhedron 2005 benchmark results for current llvm/dragonegg svn > on x86_64-apple-darwin11 built against Xcode 4.3.2 and FSF gcc 4.6.3. thanks for the numbers. How does this compare to LLVM 3.0 - were there any regressions? Ciao, Duncan. The benchmarks > for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate > since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom > (http://llvm.org/bugs/show_bug.cgi?id=12434). > Jack > > llvm/dragonegg r153877 > > dragonegg: >...
2011 Jun 10
0
[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
...--with-ppl=/sw --with-cloog=/sw --with-mpc=/sw --with-system-zlib --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib --program-suffix=-fsf-4.5 --enable-lto --enable-checking=release Thread model: posix gcc version 4.5.4 20110608 (prerelease) (GCC) x86_64 darwin A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize D) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vector...
2011 Jun 09
0
[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
Hi Jack, thanks for doing this. > Below are the tabulated compile times and executable sizes. > > A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize > B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns > C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize These numbers really surprised me: the GCC code generators mu...
2012 Dec 09
3
[LLVMdev] pb05 benchmarks for llvm/dragonegg 3.2
...austive effort been made yet to insure that llvm/dragonegg isn't still unnecessarily scalarizing the vector code generated by FSF gcc? If that issue were completely solved, llvm/dragonegg might become faster than vanilla FSF gcc. FSF gcc 4.7.2 with llvm/dragonegg 3.2 branch a) de-gfortran47 -msse3 -ffast-math -funroll-loops -O3 %n.f90 -o %n b) de-gfortran47 -msse3 -ffast-math -funroll-loops -O3 -fplugin-arg-dragonegg-enable-gcc-optzns %n.f90 -o %n c) gfortran-fsf-4.7 msse3 -ffast-math -funroll-loops -O3 %n.f90 -o %n Run time (secs) Benchmark de-gfortran47 de-gfortran47+optzns gfortran...
2011 Jun 09
0
[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
...o stock dragonegg and stock gcc 4.5.4. The runtime > benchmarks below show that we average slightly faster than stock gcc 4.5.4 and significantly > faster than stock dragonegg through the use of -fplugin-arg-dragonegg-enable-gcc-optzns. > > x86_64 darwin > > A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize > B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns > C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize > > > Benchmark A) stock B) gcc 4.5.4/ C) g...
2012 Oct 13
0
XML_3.95-0.1.tar.gz does not build on FreeBSD
...ML_3.95-0.1.tar.gz [..snip..] Checking for 1.8: -DR_HAS_REMOVE_FINALIZERS=1 -I/usr/local/include/libxml2 -I/usr/local/include Using libxml2.* checking for gzopen in -lz... gcc46 -std=gnu99 -I/usr/local/lib/R/include -DNDEBUG -I/usr/local/include -fpic -O2 -pipe -O2 -fno-strict-aliasing -pipe -msse3 -Wl,-rpath=/usr/local/lib/gcc46 -c testRemoveFinalizers.c -o testRemoveFinalizers.o testRemoveFinalizers.c: In function 'foo': testRemoveFinalizers.c:7:2: warning: implicit declaration of function 'R_RemoveExtPtrWeakRef_direct' [-Wimplicit-function-declaration] gcc46 -std=gnu99 -fp...
2012 Dec 10
0
[LLVMdev] pb05 benchmarks for llvm/dragonegg 3.2
...ved, llvm/dragonegg might become faster > than vanilla FSF gcc. Another issue is that, until recently, LLVM didn't have much in the way of fast-math optimizations. It should be better in 3.3. Ciao, Duncan. > > FSF gcc 4.7.2 with llvm/dragonegg 3.2 branch > > a) de-gfortran47 -msse3 -ffast-math -funroll-loops -O3 %n.f90 -o %n > b) de-gfortran47 -msse3 -ffast-math -funroll-loops -O3 -fplugin-arg-dragonegg-enable-gcc-optzns %n.f90 -o %n > c) gfortran-fsf-4.7 msse3 -ffast-math -funroll-loops -O3 %n.f90 -o %n > > Run time (secs) > > Benchmark de-gfortran47 d...
2012 Apr 03
1
[LLVMdev] pb05 results for current llvm/dragonegg
Attached are the Polyhedron 2005 benchmark results for current llvm/dragonegg svn on x86_64-apple-darwin11 built against Xcode 4.3.2 and FSF gcc 4.6.3. The benchmarks for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom (http://llvm.org/bugs/show_bug.cgi?id=12434). I've added two additional entries to the table. The first, degg+novect+optnz, should show the...