thr3ads.net - search: "msse4"

Displaying 15 results from an estimated 15 matches for "msse4".

[LLVMdev] -O4 limitations in llvm/llvm-gcc-4.2 2.5?

2009 Jan 24

...ARCH=Darwin_9_x86 lib);\
done
[ -d bin.Darwin_9_x86 ] || mkdir bin.Darwin_9_x86
cd bin.Darwin_9_x86; make -f /Users/howarth/xplor-nih-2.21/fortlib/libfft/Makefile \
ARCH=Darwin_9_x86 SRCDIR=/Users/howarth/xplor-nih-2.21/fortlib/libfft/ ARCHDEP=TRUE lib
llvm-gfortran -c -O4 -fPIC -ffast-math -msse4 -ffixed-line-length-120 -fno-second-underscore -DZOMPLEX /Users/howarth/xplor-nih-2.21/fortlib/libfft/dsint.f
llvm-gfortran -c -O4 -fPIC -ffast-math -msse4 -ffixed-line-length-120 -fno-second-underscore -DZOMPLEX /Users/howarth/xplor-nih-2.21/fortlib/libfft/dsinti.f
llvm-gfortran -c -O4 -fPIC...

[LLVMdev] dragonegg svn benchmarks

2011 Oct 12

Hi Chris,
>> PS: With -fplugin-arg-dragonegg-enable-gcc-optzns the LLVM optimizers are run at
>> the following levels:
>>
>> Command line option    LLVM optimizers run at
>> -------------------    ----------------------
>> -O1                    tiny amount of optimization
>> -O2 or -O3             -O1
>> -O4 or -O5

[LLVMdev] dragonegg svn benchmarks

2011 Oct 11

On Oct 8, 2011, at 12:05 PM, Duncan Sands wrote:
> PS: With -fplugin-arg-dragonegg-enable-gcc-optzns the LLVM optimizers are run at
> the following levels:
>
> Command line option    LLVM optimizers run at
> -------------------    ----------------------
> -O1                    tiny amount of optimization
> -O2 or -O3             -O1
> -O4 or -O5

[LLVMdev] pb05 results for current llvm/dragonegg

2012 Apr 02

Attached are the Polyhedron 2005 benchmark results for current llvm/dragonegg svn on x86_64-apple-darwin11 built against Xcode 4.3.2 and FSF gcc 4.6.3. The benchmarks for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom (http://llvm.org/bugs/show_bug.cgi?id=12434).
Jack
llvm/dragonegg r153877
dragonegg:
de-gfortran46 -msse3 -ffast-math -funroll-loops...

[LLVMdev] pb05 results for current llvm/dragonegg

2012 Apr 03

...ttached are the Polyhedron 2005 benchmark results for current llvm/dragonegg svn
> on x86_64-apple-darwin11 built against Xcode 4.3.2 and FSF gcc 4.6.3.
thanks for the numbers. How does this compare to LLVM 3.0 - were there any regressions?
Ciao, Duncan.
The benchmarks
> for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate
> since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom
> (http://llvm.org/bugs/show_bug.cgi?id=12434).
> Jack
>
> llvm/dragonegg r153877
>
> dragonegg:
> de-gfortr...

[LLVMdev] pb05 results for current llvm/dragonegg

2012 Apr 03

...nothing in terms of vectorization. Do we need to pass any additional flags to actually achieve autovectorization via llvm (in absence of -ftree-vectorize and -fplugin-arg-dragonegg-enable-gcc-optzns)?
Jack
>
> Ciao, Duncan.
>
> The benchmarks
>> for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate
>> since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom
>> (http://llvm.org/bugs/show_bug.cgi?id=12434).
>> Jack
>>
>> llvm/dragonegg r153877
>>
>>...

[LLVMdev] pb05 results for current llvm/dragonegg

2012 Apr 03

...lt of 6 (for example, -bb-vectorize-req-chain-depth=3) will cause a lot more vectorization.
-Hal
(in
> absence of -ftree-vectorize and
> -fplugin-arg-dragonegg-enable-gcc-optzns)? Jack
>
> >
> > Ciao, Duncan.
> >
> > The benchmarks
> >> for -msse3 and -msse4 appear identical (at least for degg+optnz).
> >> This is fortunate since there seems to be a bug in -msse4 on 2.33
> >> GHz (T7600) Intel Core 2 Duo Merom
> >> (http://llvm.org/bugs/show_bug.cgi?id=12434). Jack
> >>
> >> llvm/dragonegg r153877
> >>...

Antw: Test still failing in old CPUs

2016 Jan 14

...rstood. We just don't have a solution for it yet. What happens is that the unit tests have to directly #include the C files they test because some of the functions tested are static. But some of the #included C files use intrinsics with CPU detection so they require being compiled with (e.g.) -msse4, but as soon as we include those compile flags, the compiler is free to use these instructions anywhere. And this causes the Illegal instruction failure.
Jean-Marc
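The conflict described in this post can be sketched in C. This is a minimal illustration, assuming GCC or Clang on x86, of one way to keep SSE4.1 code generation confined to a single function rather than enabling -msse4 for the whole file: the function carries a per-function target attribute and is only called after a runtime CPU check. It is not the fix Opus adopted (the post says no solution existed yet), and the names sum_scalar, sum_sse41 and sum_dispatch are hypothetical.

```c
#include <stddef.h>

/* Portable fallback: built with the file's baseline flags, safe on any CPU. */
static int sum_scalar(const int *v, size_t n) {
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <smmintrin.h>
#define HAVE_SSE41_PATH 1

/* Only this function may contain SSE4.1 instructions; the rest of the
   translation unit is still compiled for the baseline target. */
__attribute__((target("sse4.1")))
static int sum_sse41(const int *v, size_t n) {
    __m128i acc = _mm_setzero_si128();
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        acc = _mm_add_epi32(acc, _mm_loadu_si128((const __m128i *)(v + i)));
    /* _mm_extract_epi32 is an SSE4.1 intrinsic. */
    int s = _mm_extract_epi32(acc, 0) + _mm_extract_epi32(acc, 1) +
            _mm_extract_epi32(acc, 2) + _mm_extract_epi32(acc, 3);
    for (; i < n; i++)   /* leftover elements */
        s += v[i];
    return s;
}
#endif

/* Runtime dispatch: take the SSE4.1 path only if the CPU reports support. */
int sum_dispatch(const int *v, size_t n) {
#ifdef HAVE_SSE41_PATH
    if (__builtin_cpu_supports("sse4.1"))
        return sum_sse41(v, n);
#endif
    return sum_scalar(v, n);
}
```

With this layout a test file that #includes the source can still be compiled without -msse4, so the compiler has no licence to emit SSE4 instructions outside sum_sse41.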

[LLVMdev] dragonegg svn benchmarks

2011 Oct 08

The Polyhedron 2005 benchmark results for dragonegg svn at r141492 using FSF gcc 4.6.2svn measured on x86_64-apple-darwin11 are listed below. The benchmarks used the optimization flags...
-msse4 -ffast-math -funroll-loops -O3
in all cases. The use of -fplugin-arg-dragonegg-enable-gcc-optzns to allow for autovectorization from the FSF gcc front-end only produces a single run-time regression, fatigue, which is PR10892.
Run time
Benchmark gfortran dragonegg dragonegg+optnz
-----------...

[LLVMdev] pb05 results for current llvm/dragonegg

2012 Apr 03

Attached are the Polyhedron 2005 benchmark results for current llvm/dragonegg svn on x86_64-apple-darwin11 built against Xcode 4.3.2 and FSF gcc 4.6.3. The benchmarks for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom (http://llvm.org/bugs/show_bug.cgi?id=12434). I've added two additional entries to the table. The first, degg+novect+optnz, should show the optimizati...

[LLVMdev] pb05 results for current llvm/dragonegg

2012 Apr 03

...-chain-depth=3?
Jack
>
> -Hal
>
> (in
> > absence of -ftree-vectorize and
> > -fplugin-arg-dragonegg-enable-gcc-optzns)? Jack
> >
> > >
> > > Ciao, Duncan.
> > >
> > > The benchmarks
> > >> for -msse3 and -msse4 appear identical (at least for degg+optnz).
> > >> This is fortunate since there seems to be a bug in -msse4 on 2.33
> > >> GHz (T7600) Intel Core 2 Duo Merom
> > >> (http://llvm.org/bugs/show_bug.cgi?id=12434). Jack
> > >>
> > >> llvm/dra...

[LLVMdev] llvm/llvm-gcc-4.2 and xplor-nih

2009 Jan 23

I am happy to report that current llvm/llvm-gcc-4.2 svn builds all of xplor-nih (a complex mix of c, c++ and fortran) with -O3 -fPIC -msse4 -ffast-math. A single fortran file exposes PR3376 which is triggered by -O3 -ffinite-math-only. The resulting build of xplor-nih completely passes its testsuite and compares very well to the same build against gcc trunk for gcc 4.4 in terms of execution time. gcc 4.4...

[LLVMdev] dragonegg svn benchmarks

2011 Oct 08

Hi Jack,
> The Polyhedron 2005 benchmark results for dragonegg svn at r141492
> using FSF gcc 4.6.2svn measured on x86_64-apple-darwin11 are listed below.
> The benchmarks used the optimization flags...
>
> -msse4 -ffast-math -funroll-loops -O3
>
> in all cases. The use of -fplugin-arg-dragonegg-enable-gcc-optzns to allow
> for autovectorization from the FSF gcc front-end only produces a single run-time
> regression, fatigue, which is PR10892.
thanks for these numbers. I suggest you also try -O...

[LLVMdev] Vector-select status update

2011 Oct 01

Hi, As of recently, the LLVM code-generator started supporting vector-select instructions (select instructions where the predicate operand is a vector of booleans). This support includes efficient sequences for targets which have dedicated blend instructions (such as SSE4 and AVX), a slower implementation using vector AND/OR/XOR instructions for unoptimized targets, and scalarization for
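The AND/OR/XOR fallback mentioned in this post can be modelled in scalar C. The sketch below is not LLVM's code; it only illustrates the lane-wise bit trick, assuming each mask lane is either all-ones (take the lane from a) or all-zeros (take the lane from b). The function names select_and_or and select_xor are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* AND/OR form of a lane-wise select: out = (mask & a) | (~mask & b). */
void select_and_or(const uint32_t *mask, const uint32_t *a,
                   const uint32_t *b, uint32_t *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = (mask[i] & a[i]) | (~mask[i] & b[i]);
}

/* Equivalent XOR form: out = b ^ ((a ^ b) & mask). */
void select_xor(const uint32_t *mask, const uint32_t *a,
                const uint32_t *b, uint32_t *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = b[i] ^ ((a[i] ^ b[i]) & mask[i]);
}
```

A dedicated blend instruction performs the same per-lane choice in one operation; the bitwise forms need three or four, which is why the post calls that path slower.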

Test still failing in old CPUs

2016 Jan 13

Opus 1.1.2. As experienced in previous release:
"""
./test-driver: line 107: 25185 Illegal instruction "$@" > $log_file 2>&1
FAIL: celt/tests/test_unit_mathops
"""
--
Jesús Cea Avión
jcea at jcea.es - http://www.jcea.es/
Twitter: @jcea
