thr3ads.net - similar to: "patch to build theora-mmx on AMD64"

2004 Aug 24

5

MMX/mmxext optimisations

Quite some speed improvement indeed. Attached is the updated patch to apply to svn/trunk.
j
-------------- next part --------------
A non-text attachment was scrubbed...
Name: theora-mmx.patch.gz
Type: application/x-gzip
Size: 8648 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/theora-dev/attachments/20040824/5a5f2731/theora-mmx.patch-0001.bin

2008 Apr 10

2

Delay occurred when the makefile change

I have tried to add a plugin to "libtheora-1.0beta2" (a network bandwidth measuring component) and it has worked so far. The problem is that with the plugin added, the encoding process gets extremely slow (around 20 seconds of delay). I think the problem is with my modified Makefile (some flag may be missing). The following is my modified Makefile.am, which is in the

2008 Apr 23

1

Theora got extreamly slow (Makefile.am was changed)

I have tried to add a plugin to "libtheora-1.0beta2" (a network bandwidth measuring component) and it has worked so far. The problem is that with the plugin added, the encoding process gets extremely slow (around 20 seconds of delay). I think the problem is with my modified Makefile (some flag may be missing). The following is my modified Makefile.am, which is in the

2006 Mar 30

2

compiling theora-mmx on AMD64

Hi all, I'm a Theora noob and just taking a look at the theora-mmx package in hopes of making Thoggen run faster for DVD ripping. I've checked out the latest svn of the theora-mmx branch and am trying to compile it on Ubuntu Dapper AMD64. I run autogen.sh, then make, and soon get the following errors:
make[2]: Entering directory `/home/dlenski/theora-mmx/lib'
if /bin/sh ../libtool

2006 Jun 21

2

Theora MMX and Mac OS X Intel

Hi, I was trying to enable the MMX code on Mac OS X. To get to that point one has to replace some inline assembler code: .balign 16 -> .p2align 4, and replace .rept .. .endr with #defines. But to make things more complicated, Apple's GAS does not support movsx instructions, and thus the following line does not work: " movsx %%di, %%edi \n\t" [ more details at

2007 Oct 09

1

VC6 Patch

Here is a patch that gets the theora_static.dsp project for VC6 building again.
Aaron
-------------- next part --------------
Index: win32/theora_static.dsp
===================================================================
--- win32/theora_static.dsp (revision 13945)
+++ win32/theora_static.dsp (working copy)
@@ -41,7 +41,7 @@
 # PROP Intermediate_Dir "Static_Release"
 # PROP

2011 Apr 22

2

Can't compile libtheora vs2010

I'm getting errors like so on initial build of libtheora -
1>c1 : fatal error C1083: Cannot open source file: '..\lib\dec\x86_vc\x86stat.c': No such file or directory
1> mmxstate.c (TaskId:16)
1>c1 : fatal error C1083: Cannot open source file: '..\lib\dec\x86_vc\mmxstate.c': No such file or directory
1> mmxloopfilter.c (TaskId:16)
1>c1 : fatal error C1083:

2005 Aug 17

2

MMX loop filter for theora-exp

Hello, I would like to announce the semi-optimized oc_state_loop_filter_frag_rows. It gains about a 7% speedup. Unfortunately it has some issues: 1) it won't compile on 64-bit (I will fix it later, hopefully) 2) it is not yet fully optimized (instruction stalls). Here are the results. CPU: Athlon, speed 1466.91 MHz (estimated) Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask

2005 Mar 23

3

[PATCH] promised MMX patches rc1

Hello, here is my first speedup patch. It gains about 10-11%. No IDCT yet. Please feel free to comment on my code or, even better, think about improvements. :) I believe my routines are not so bad; maybe one day they will be even faster. What needs to be optimized is the loop filter function. I have no ideas now how to do it. It does not leave much space for parallel stuff, copying memory from lot of

2005 Mar 23

0

[PATCH]

Hello, here is my first speedup patch. It gains about 10-11%. No IDCT yet. Please feel free to comment on my code or, even better, think about improvements. :) I believe my routines are not so bad; maybe one day they will be even faster. What needs to be optimized is the loop filter function. I have no ideas now how to do it. It does not leave much space for parallel stuff, copying memory from lot of

2003 May 08

3

MMX and extended-MMX acceleration patch for encoding

Hello, attached is a gzipped patch file to the lib/mcomp.c source file of theora (as of AnonCVS current version) that implements MMX and extended-MMX optimizations in the most frequently used functions of the encoder (as shown by gprof). This is more a proof of concept than a real request for inclusion into the source tree. My personal intent was more to look deeper into the MMX instruction set

2010 Jul 20

0

MMX version of Theora

Hi all, I am trying to build the MMX version of Theora and the encoderwin project is throwing the following errors.
1>------ Build started: Project: encoderwin, Configuration: Debug Win32 ------
1>Linking...
1> Creating library encoderwin.lib and object encoderwin.exp
1>LINK : warning LNK4098: defaultlib 'LIBCMTD' conflicts with use of other libs; use /NODEFAULTLIB:library

2015 Jan 07

1

Optimizing on AMD Geode (MMX, no SSE)

I'm trying to improve Opus on an AMD Geode CPU, which has limited SSE support (called 3DNow!), but has MMX. Without optimizations I can only encode 16-bit audio at 16 kHz with complexity up to 2-3 without underruns. I tried compiling with SSE2/4 optimizations, but all I got was a crash with SIGILL, so I looked into the optimized code and found that a good starting point was the dot product, so I

2008 Nov 20

0

[LLVMdev] changing -mattr behavior with mmx and sse

Might you instead consider just adding a -disable-mmx option?
Preston
On Thu, 2008-20-11 at 02:57 -0500, Mon Ping Wang wrote:
> Hi,
>
> When setting -mattr option on X86, I would like to treat MMX
> separately from SSE levels. This would allow a client who sets the
> attributes directly to set the SSE level independent of MMX, e.g., llc
> -march=x86 -mattr=sse41, one would get

2008 Nov 20

0

[LLVMdev] changing -mattr behavior with mmx and sse

On Nov 19, 2008, at 11:57 PM PST, Mon Ping Wang wrote:
> Hi,
>
> When setting -mattr option on X86, I would like to treat MMX
> separately from SSE levels. This would allow a client who sets the
> attributes directly to set the SSE level independent of MMX, e.g., llc
> -march=x86 -mattr=sse41, one would get sse4.1 with mmx disabled while
> llc -march=x86 -mattr=mmx

2009 Mar 19

1

[LLVMdev] Implementing MMX and SSE shifts

Hi all, Recently some great work has been done to implement vector shifts as described in the language reference, and I'd like to contribute by attempting to match these operations on x86 to MMX and SSE instructions whenever possible. I'm experienced in writing MMX and SSE assembly but I'm unfamiliar with how LLVM performs instruction selection. So every bit of information to

2010 Sep 07

0

[LLVMdev] LLVM 2.8 and MMX

Hi all, I've tested a recent revision and noticed that using 64-bit vectors became very slow. It looks like they are expanded to non-MMX instructions to avoid breaking code which does not clear the MMX state using emms? For my project I'm already manually inserting emms instructions in the right places, so I'd really like 64-bit vector operations to be lowered to MMX

2010 Sep 21

0

[LLVMdev] LLVM 2.8 and MMX

Hi Dale, I suspect that these patches were intended to improve 128-bit vector performance but caused certain 64-bit vector operations to no longer lower to MMX instructions. Anyway, now that I've narrowed it down to these patches I think I can narrow it down further to a specific case so I can file a bug... Will Bruno be back soon or is he no longer working on the project for good? Cheers,

2011 Jul 01

2

[LLVMdev] [cfe-dev] should -mno-sse -mno-mmx -msse -mmmx work?

Hi Andrew-
> fatal error: error in backend: SSE2 register return with SSE2 disabled
Is this for 32-bit or 64-bit x86? If it's the latter, the ABI demands that the return value in this case is in xmm0 - SSE is required.
Alistair

2011 Jul 01

0

[LLVMdev] [cfe-dev] should -mno-sse -mno-mmx -msse -mmmx work?

Hi Andrew-
> Well -no-sse -mno-mmx works for EFI as it is pre-boot firmware and does not have any floating point C code. We use -no-sse and -mno-mmx code to prevent optimized code gen using these registers for optimizations.
Whether it's optimised or not doesn't particularly matter, the x86_64 ABI says that floating-point return values go into SSE registers, so that is where LLVM is
