Displaying 15 results from an estimated 15 matches for "msse4".
2009 Jan 24
1
[LLVMdev] -O4 limitations in llvm/llvm-gcc-4.2 2.5?
...ARCH=Darwin_9_x86 lib);\
done
[ -d bin.Darwin_9_x86 ] || mkdir bin.Darwin_9_x86
cd bin.Darwin_9_x86; make -f /Users/howarth/xplor-nih-2.21/fortlib/libfft/Makefile \
ARCH=Darwin_9_x86 SRCDIR=/Users/howarth/xplor-nih-2.21/fortlib/libfft/ ARCHDEP=TRUE lib
llvm-gfortran -c -O4 -fPIC -ffast-math -msse4 -ffixed-line-length-120 -fno-second-underscore -DZOMPLEX /Users/howarth/xplor-nih-2.21/fortlib/libfft/dsint.f
llvm-gfortran -c -O4 -fPIC -ffast-math -msse4 -ffixed-line-length-120 -fno-second-underscore -DZOMPLEX /Users/howarth/xplor-nih-2.21/fortlib/libfft/dsinti.f
llvm-gfortran -c -O4 -fPIC...
2011 Oct 12
0
[LLVMdev] dragonegg svn benchmarks
Hi Chris,
>> PS: With -fplugin-arg-dragonegg-enable-gcc-optzns the LLVM optimizers are run at
>> the following levels:
>>
>> Command line option    LLVM optimizers run at
>> -------------------    ----------------------
>> -O1                    tiny amount of optimization
>> -O2 or -O3             -O1
>> -O4 or -O5
2011 Oct 11
4
[LLVMdev] dragonegg svn benchmarks
On Oct 8, 2011, at 12:05 PM, Duncan Sands wrote:
> PS: With -fplugin-arg-dragonegg-enable-gcc-optzns the LLVM optimizers are run at
> the following levels:
>
> Command line option    LLVM optimizers run at
> -------------------    ----------------------
> -O1                    tiny amount of optimization
> -O2 or -O3             -O1
> -O4 or -O5
2012 Apr 02
6
[LLVMdev] pb05 results for current llvm/dragonegg
Attached are the Polyhedron 2005 benchmark results for current llvm/dragonegg svn
on x86_64-apple-darwin11 built against Xcode 4.3.2 and FSF gcc 4.6.3. The benchmarks
for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate
since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom
(http://llvm.org/bugs/show_bug.cgi?id=12434).
Jack
llvm/dragonegg r153877
dragonegg:
de-gfortran46 -msse3 -ffast-math -funroll-loops...
2012 Apr 03
0
[LLVMdev] pb05 results for current llvm/dragonegg
...ttached are the Polyhedron 2005 benchmark results for current llvm/dragonegg svn
> on x86_64-apple-darwin11 built against Xcode 4.3.2 and FSF gcc 4.6.3.
thanks for the numbers. How does this compare to LLVM 3.0 - were there any
regressions?
Ciao, Duncan.
The benchmarks
> for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate
> since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom
> (http://llvm.org/bugs/show_bug.cgi?id=12434).
> Jack
>
> llvm/dragonegg r153877
>
> dragonegg:
> de-gfortr...
2012 Apr 03
3
[LLVMdev] pb05 results for current llvm/dragonegg
...nothing in terms of vectorization. Do we need to pass any additional flags
to actually achieve autovectorization via llvm (in absence of -ftree-vectorize
and -fplugin-arg-dragonegg-enable-gcc-optzns)?
Jack
>
> Ciao, Duncan.
>
> The benchmarks
>> for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate
>> since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom
>> (http://llvm.org/bugs/show_bug.cgi?id=12434).
>> Jack
>>
>> llvm/dragonegg r153877
>>
>>...
2012 Apr 03
0
[LLVMdev] pb05 results for current llvm/dragonegg
...lt of 6 (for
example, -bb-vectorize-req-chain-depth=3) will cause a lot more
vectorization.
-Hal
(in
> absence of -ftree-vectorize and
> -fplugin-arg-dragonegg-enable-gcc-optzns)? Jack
>
> >
> > Ciao, Duncan.
> >
> > The benchmarks
> >> for -msse3 and -msse4 appear identical (at least for degg+optnz).
> >> This is fortunate since there seems to be a bug in -msse4 on 2.33
> >> GHz (T7600) Intel Core 2 Duo Merom
> >> (http://llvm.org/bugs/show_bug.cgi?id=12434). Jack
> >>
> >> llvm/dragonegg r153877
> >>...
2016 Jan 14
1
Antw: Test still failing in old CPUs
...rstood. We just don't have a
solution for it yet.
What happens is that the unit tests have to directly #include the C
files they test because some of the functions tested are static. But
some of the #included C files use intrinsics with CPU detection so they
require being compiled with (e.g.) -msse4, but as soon as we include
those compile flags, the compiler is free to use these instructions
anywhere. And this causes the Illegal instruction failure.
Jean-Marc
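One common way to keep such instructions out of the rest of a translation unit is to enable them per function instead of per file. The sketch below is not the Opus code and not the fix the project adopted; it assumes GCC or Clang on x86, where the target("sse4.1") function attribute and __builtin_cpu_supports() are available, and every function name in it (max_i32, max_i32_c, max_i32_sse41) is hypothetical. The file is compiled with baseline flags (no -msse4), only the attributed function may contain SSE4.1 instructions, and a run-time check decides whether to call it.

```c
#include <assert.h>
#include <stddef.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <smmintrin.h>  /* SSE4.1 intrinsics */
#define HAVE_SSE41_DISPATCH 1

/* Only this function is allowed to use SSE4.1; the rest of the file
   is built with the baseline flags. Hypothetical helper name. */
__attribute__((target("sse4.1")))
static int max_i32_sse41(const int *x, size_t n)
{
    size_t i = 0;
    int best = x[0];

    if (n >= 4) {
        __m128i m = _mm_loadu_si128((const __m128i *)x);
        for (i = 4; i + 4 <= n; i += 4)
            m = _mm_max_epi32(m, _mm_loadu_si128((const __m128i *)(x + i)));
        /* Horizontal reduction of the four lanes. */
        m = _mm_max_epi32(m, _mm_shuffle_epi32(m, _MM_SHUFFLE(1, 0, 3, 2)));
        m = _mm_max_epi32(m, _mm_shuffle_epi32(m, _MM_SHUFFLE(2, 3, 0, 1)));
        best = _mm_cvtsi128_si32(m);
    }
    /* Scalar tail (or the whole array when n < 4). */
    for (; i < n; i++)
        if (x[i] > best)
            best = x[i];
    return best;
}
#endif

/* Portable fallback, safe on any CPU. */
int max_i32_c(const int *x, size_t n)
{
    int best = x[0];
    for (size_t i = 1; i < n; i++)
        if (x[i] > best)
            best = x[i];
    return best;
}

/* Run-time dispatch: take the SSE4.1 path only when the CPU has it. */
int max_i32(const int *x, size_t n)
{
    assert(n > 0);
#ifdef HAVE_SSE41_DISPATCH
    if (__builtin_cpu_supports("sse4.1"))
        return max_i32_sse41(x, n);
#endif
    return max_i32_c(x, n);
}
```

With this layout a test file can still #include the C source to reach its static functions, because no file-wide -msse4 flag is needed. Whether this suits a given code base depends on the compilers it has to support.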
2011 Oct 08
4
[LLVMdev] dragonegg svn benchmarks
The Polyhedron 2005 benchmark results for dragonegg svn at r141492
using FSF gcc 4.6.2svn measured on x86_64-apple-darwin11 are listed below.
The benchmarks used the optimization flags...
-msse4 -ffast-math -funroll-loops -O3
in all cases. The use of -fplugin-arg-dragonegg-enable-gcc-optzns to allow
for autovectorization from the FSF gcc front-end only produces a single run-time
regression, fatigue, which is PR10892.
Run time
Benchmark gfortran dragonegg dragonegg+optnz
-----------...
2012 Apr 03
1
[LLVMdev] pb05 results for current llvm/dragonegg
Attached are the Polyhedron 2005 benchmark results for current llvm/dragonegg svn
on x86_64-apple-darwin11 built against Xcode 4.3.2 and FSF gcc 4.6.3. The benchmarks
for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate
since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom
(http://llvm.org/bugs/show_bug.cgi?id=12434). I've added two additional entries to
the table. The first, degg+novect+optnz, should show the optimizati...
2012 Apr 03
2
[LLVMdev] pb05 results for current llvm/dragonegg
...-chain-depth=3?
Jack
>
> -Hal
>
> (in
> > absence of -ftree-vectorize and
> > -fplugin-arg-dragonegg-enable-gcc-optzns)? Jack
> >
> > >
> > > Ciao, Duncan.
> > >
> > > The benchmarks
> > >> for -msse3 and -msse4 appear identical (at least for degg+optnz).
> > >> This is fortunate since there seems to be a bug in -msse4 on 2.33
> > >> GHz (T7600) Intel Core 2 Duo Merom
> > >> (http://llvm.org/bugs/show_bug.cgi?id=12434). Jack
> > >>
> > >> llvm/dra...
2009 Jan 23
0
[LLVMdev] llvm/llvm-gcc-4.2 and xplor-nih
I am happy to report that current llvm/llvm-gcc-4.2 svn
builds all of xplor-nih (a complex mix of c, c++ and fortran)
with -O3 -fPIC -msse4 -ffast-math. A single fortran file
exposes PR3376 which is triggered by -O3 -ffinite-math-only.
The resulting build of xplor-nih completely passes its testsuite
and compares very well to the same build against gcc trunk for
gcc 4.4 in terms of execution time.
gcc 4.4...
2011 Oct 08
0
[LLVMdev] dragonegg svn benchmarks
Hi Jack,
> The Polyhedron 2005 benchmark results for dragonegg svn at r141492
> using FSF gcc 4.6.2svn measured on x86_64-apple-darwin11 are listed below.
> The benchmarks used the optimization flags...
>
> -msse4 -ffast-math -funroll-loops -O3
>
> in all cases. The use of -fplugin-arg-dragonegg-enable-gcc-optzns to allow
> for autovectorization from the FSF gcc front-end only produces a single run-time
> regression, fatigue, which is PR10892.
thanks for these numbers. I suggest you also try -O...
2011 Oct 01
1
[LLVMdev] Vector-select status update
Hi,
The LLVM code generator recently started supporting vector-select instructions (select instructions where the predicate operand is a vector of booleans).
This support includes efficient sequences for targets which have dedicated blend instructions (such as SSE4 and AVX), a slower implementation using
vector AND/OR/XOR instructions for unoptimized targets, and scalarization for
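To illustrate what a select built from AND/OR/XOR looks like, here is a minimal lane-by-lane sketch in plain C. It is not LLVM's lowering code: the function names are hypothetical, and it assumes each mask lane is either all-ones (true) or all-zeros (false), which is what vector compares usually produce. A target with a blend instruction would do the same job in a single operation per vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* AND/OR form: out = (a & m) | (b & ~m). */
void select_and_or(uint32_t *out, const uint32_t *mask,
                   const uint32_t *a, const uint32_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Precondition assumed by this sketch: all-ones or all-zeros. */
        assert(mask[i] == 0 || mask[i] == UINT32_MAX);
        out[i] = (a[i] & mask[i]) | (b[i] & ~mask[i]);
    }
}

/* XOR form: out = b ^ ((a ^ b) & m), same result with one fewer mask use. */
void select_xor(uint32_t *out, const uint32_t *mask,
                const uint32_t *a, const uint32_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(mask[i] == 0 || mask[i] == UINT32_MAX);
        out[i] = b[i] ^ ((a[i] ^ b[i]) & mask[i]);
    }
}
```

Both functions pick a[i] where the mask lane is set and b[i] elsewhere, so they are interchangeable; the choice between them is a matter of which instructions are cheapest on the target.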
2016 Jan 13
5
Test still failing in old CPUs
Opus 1.1.2.
As experienced in previous release:
"""
./test-driver: line 107: 25185 Illegal instruction "$@" > $log_file 2>&1
FAIL: celt/tests/test_unit_mathops
"""
--
Jesús Cea Avión _/_/ _/_/_/ _/_/_/
jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/
Twitter: @jcea