Displaying 20 results from an estimated 89 matches for "msse3".
2009 Feb 04
0
[LLVMdev] -msse3 can degrade performance
On Feb 2, 2009, at 3:00 PM, Jon Harrop wrote:
> On Monday 02 February 2009 20:37:47 you wrote:
>> On Feb 2, 2009, at 12:39 PM, Jon Harrop wrote:
>>> On Monday 02 February 2009 06:10:26 Chris Lattner wrote:
>>>> I'm seeing exactly identical .s files with -msse2 and -msse3 on the
>>>> scimark version I have. Can you please send the output of:
>>>>
>>>> llvm-gcc -O3 MonteCarlo.c -S -msse2 -o MonteCarlo.2.s
>>>> llvm-gcc -O3 MonteCarlo.c -S -msse3 -o MonteCarlo.3.s
>>>>
>>>> llvm-gcc -O3 MonteCarl...
2006 Jun 26
1
Patch for rgl with gcc 4.0 in R 2.3.0 on OS X
...config.status: creating src/Makevars
** libs
** arch - i386
g++-4.0 -arch i386 -I/Library/Frameworks/R.framework/Resources/include
-I/Library/Frameworks/R.framework/Resources/include/i386 -DRGL_USE_CARBON
-I/System/Library/Frameworks/AGL.framework/Headers -DHAVE_PNG_H
-I/opt/local/include/libpng12 -msse3 -fPIC -fno-common -g -O2
-march=pentium-m -mtune=prescott -c BBoxDeco.cpp -o BBoxDeco.o
g++-4.0 -arch i386 -I/Library/Frameworks/R.framework/Resources/include
-I/Library/Frameworks/R.framework/Resources/include/i386 -DRGL_USE_CARBON
-I/System/Library/Frameworks/AGL.framework/Headers -DHAVE_PNG_H...
2009 Jan 31
1
[LLVMdev] -msse3 can degrade performance
On Saturday 31 January 2009 03:42:04 Eli Friedman wrote:
> On Fri, Jan 30, 2009 at 5:43 PM, Jon Harrop <jon at ffconsultancy.com> wrote:
> > I just remembered an anomalous result that I stumbled upon whilst
> > tweaking the command-line options to llvm-gcc. Specifically, the -msse3
> > flag
>
> The -msse3 flag? Does the -msse2 flag have a similar effect?
Yes:
$ llvm-gcc -Wall -lm -O3 -msse2 *.c -o scimark2
$ ./scimark2
Composite Score: 525.99
FFT Mflops: 538.35 (N=1024)
SOR Mflops: 472.29 (100 x 100)
MonteCarlo: Mf...
2009 Jan 31
2
[LLVMdev] -msse3 can degrade performance
I just remembered an anomalous result that I stumbled upon whilst tweaking the
command-line options to llvm-gcc. Specifically, the -msse3 flag does a great
job improving the performance of floating point intensive code on the
SciMark2 benchmark but it also degrades the performance of the int-intensive
Monte Carlo part of the test:
$ llvm-gcc -Wall -lm -O3 *.c -o scimark2
$ ./scimark2
Using 2.00 seconds min time per kenel.
C...
2009 Jan 31
0
[LLVMdev] -msse3 can degrade performance
On Fri, Jan 30, 2009 at 5:43 PM, Jon Harrop <jon at ffconsultancy.com> wrote:
>
> I just remembered an anomalous result that I stumbled upon whilst tweaking the
> command-line options to llvm-gcc. Specifically, the -msse3 flag
The -msse3 flag? Does the -msse2 flag have a similar effect?
-Eli
2012 Apr 03
3
[LLVMdev] pb05 results for current llvm/dragonegg
...o do almost
nothing in terms of vectorization. Do we need to pass any additional flags
to actually achieve autovectorization via llvm (in absence of -ftree-vectorize
and -fplugin-arg-dragonegg-enable-gcc-optzns)?
Jack
>
> Ciao, Duncan.
>
> The benchmarks
>> for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate
>> since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom
>> (http://llvm.org/bugs/show_bug.cgi?id=12434).
>> Jack
>>
>> llvm/dragonegg r153877
>...
2011 Jun 09
3
[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
Duncan,
Below are the tabulated compile times and executable sizes.
A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize
B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns
C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize
Compile time (seconds)
Benchmark A) stock B) gcc 4.5.4/ C) g...
2012 Apr 02
6
[LLVMdev] pb05 results for current llvm/dragonegg
Attached are the Polyhedron 2005 benchmark results for current llvm/dragonegg svn
on x86_64-apple-darwin11 built against Xcode 4.3.2 and FSF gcc 4.6.3. The benchmarks
for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate
since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom
(http://llvm.org/bugs/show_bug.cgi?id=12434).
Jack
llvm/dragonegg r153877
dragonegg:
de-gfortran46 -msse3 -ffast-math -fun...
2012 Apr 03
0
[LLVMdev] pb05 results for current llvm/dragonegg
...rrent default of 6 (for
example, -bb-vectorize-req-chain-depth=3) will cause a lot more
vectorization.
-Hal
(in
> absence of -ftree-vectorize and
> -fplugin-arg-dragonegg-enable-gcc-optzns)? Jack
>
> >
> > Ciao, Duncan.
> >
> > The benchmarks
> >> for -msse3 and -msse4 appear identical (at least for degg+optnz).
> >> This is fortunate since there seems to be a bug in -msse4 on 2.33
> >> GHz (T7600) Intel Core 2 Duo Merom
> >> (http://llvm.org/bugs/show_bug.cgi?id=12434). Jack
> >>
> >> llvm/dragonegg r153877...
2011 Jun 09
3
[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
...2005 benchmarks compared to stock dragonegg and stock gcc 4.5.4. The runtime
benchmarks below show that we average slightly faster than stock gcc 4.5.4 and significantly
faster than stock dragonegg through the use of -fplugin-arg-dragonegg-enable-gcc-optzns.
x86_64 darwin
A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize
B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns
C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize
Benchmark A) stock B) gcc 4.5.4/ C) gcc 4.5.4/...
2011 Jun 09
3
[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
On Thu, Jun 09, 2011 at 03:44:40PM +0200, Duncan Sands wrote:
> Hi Jack, thanks for doing this.
>
>> Below are the tabulated compile times and executable sizes.
>>
>> A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize
>> B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns
>> C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize
>
> These numbers really surprised me: the GCC c...
2008 May 24
1
RSPerl & OS X
...lude
-I/Library/Frameworks/R.framework/Resources/include/i386 -I. -g -pipe
-fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing
-I/usr/local/include
-I/System/Library/Perl/5.8.6/darwin-thread-multi-2level/CORE -DPERL_POLLUTE
-D_R_=1 -DUSE_R=1 -DUSE_TOPLEVEL_EXEC=1 -DWITH_R_IN_PERL=1 -msse3 -fPIC
-g -O2 -march=nocona -c Converters.c -o Converters.o
gcc -arch i386 -isysroot /Developer/SDKs/MacOSX10.4u.sdk
-mmacosx-version-min=10.4 -std=gnu99 -no-cpp-precomp
-I/Library/Frameworks/R.framework/Resources/include
-I/Library/Frameworks/R.framework/Resources/include/i386 -I. -g -pipe
-fn...
2012 Apr 03
0
[LLVMdev] pb05 results for current llvm/dragonegg
...> Attached are the Polyhedron 2005 benchmark results for current llvm/dragonegg svn
> on x86_64-apple-darwin11 built against Xcode 4.3.2 and FSF gcc 4.6.3.
thanks for the numbers. How does this compare to LLVM 3.0 - were there any
regressions?
Ciao, Duncan.
The benchmarks
> for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate
> since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom
> (http://llvm.org/bugs/show_bug.cgi?id=12434).
> Jack
>
> llvm/dragonegg r153877
>
> dragonegg:
>...
2011 Jun 10
0
[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
...--with-ppl=/sw --with-cloog=/sw --with-mpc=/sw --with-system-zlib --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib --program-suffix=-fsf-4.5 --enable-lto --enable-checking=release
Thread model: posix
gcc version 4.5.4 20110608 (prerelease) (GCC)
x86_64 darwin
A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize
B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns
C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize
D) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vector...
2011 Jun 09
0
[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
Hi Jack, thanks for doing this.
> Below are the tabulated compile times and executable sizes.
>
> A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize
> B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns
> C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize
These numbers really surprised me: the GCC code generators mu...
2012 Dec 09
3
[LLVMdev] pb05 benchmarks for llvm/dragonegg 3.2
...austive effort been made yet to ensure that llvm/dragonegg isn't still unnecessarily scalarizing
the vector code generated by FSF gcc? If that issue were completely solved, llvm/dragonegg might become faster
than vanilla FSF gcc.
FSF gcc 4.7.2 with llvm/dragonegg 3.2 branch
a) de-gfortran47 -msse3 -ffast-math -funroll-loops -O3 %n.f90 -o %n
b) de-gfortran47 -msse3 -ffast-math -funroll-loops -O3 -fplugin-arg-dragonegg-enable-gcc-optzns %n.f90 -o %n
c) gfortran-fsf-4.7 -msse3 -ffast-math -funroll-loops -O3 %n.f90 -o %n
Run time (secs)
Benchmark de-gfortran47 de-gfortran47+optzns gfortran...
2011 Jun 09
0
[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
...o stock dragonegg and stock gcc 4.5.4. The runtime
> benchmarks below show that we average slightly faster than stock gcc 4.5.4 and significantly
> faster than stock dragonegg through the use of -fplugin-arg-dragonegg-enable-gcc-optzns.
>
> x86_64 darwin
>
> A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize
> B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns
> C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize
>
>
> Benchmark A) stock B) gcc 4.5.4/ C) g...
2012 Oct 13
0
XML_3.95-0.1.tar.gz does not build on FreeBSD
...ML_3.95-0.1.tar.gz
[..snip..]
Checking for 1.8: -DR_HAS_REMOVE_FINALIZERS=1
-I/usr/local/include/libxml2 -I/usr/local/include
Using libxml2.*
checking for gzopen in -lz... gcc46 -std=gnu99
-I/usr/local/lib/R/include -DNDEBUG -I/usr/local/include -fpic -O2
-pipe -O2 -fno-strict-aliasing -pipe -msse3
-Wl,-rpath=/usr/local/lib/gcc46 -c testRemoveFinalizers.c -o
testRemoveFinalizers.o
testRemoveFinalizers.c: In function 'foo':
testRemoveFinalizers.c:7:2: warning: implicit declaration of function
'R_RemoveExtPtrWeakRef_direct' [-Wimplicit-function-declaration]
gcc46 -std=gnu99 -fp...
2012 Dec 10
0
[LLVMdev] pb05 benchmarks for llvm/dragonegg 3.2
...ved, llvm/dragonegg might become faster
> than vanilla FSF gcc.
Another issue is that, until recently, LLVM didn't have much in the way of
fast-math optimizations. It should be better in 3.3.
Ciao, Duncan.
>
> FSF gcc 4.7.2 with llvm/dragonegg 3.2 branch
>
> a) de-gfortran47 -msse3 -ffast-math -funroll-loops -O3 %n.f90 -o %n
> b) de-gfortran47 -msse3 -ffast-math -funroll-loops -O3 -fplugin-arg-dragonegg-enable-gcc-optzns %n.f90 -o %n
> c) gfortran-fsf-4.7 -msse3 -ffast-math -funroll-loops -O3 %n.f90 -o %n
>
> Run time (secs)
>
> Benchmark de-gfortran47 d...
2012 Apr 03
1
[LLVMdev] pb05 results for current llvm/dragonegg
Attached are the Polyhedron 2005 benchmark results for current llvm/dragonegg svn
on x86_64-apple-darwin11 built against Xcode 4.3.2 and FSF gcc 4.6.3. The benchmarks
for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate
since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom
(http://llvm.org/bugs/show_bug.cgi?id=12434). I've added two additional entries to
the table. The first, degg+novect+optnz, should show the...