thr3ads.net - similar to: "MMX loop filter for theora-exp"

Displaying 20 results from an estimated 110 matches similar to: "MMX loop filter for theora-exp"

2010 Jul 24

theorarm build

Hi all-- I tried building the ARM-optimized theora codec from the theorarm-merge-branch, and encountered the following compile and runtime problems before getting something to run. If there is another way to build it, it would be nice to know, but I got the sense that its current state in svn is incomplete. I'm using a gcc cross-compiler for ARM on an x86 Linux PC. After running

2005 Jul 20

MMX IDCT for theora-exp

Hello, I'm attaching an IDCT MMX patch. I reused the IDCT from theora-a3-MMXd.zip. It should work on 64-bit x86 platforms too. Here are the most-used functions when playing a video of jet aircraft (gripen): Ogg logical stream 310b2968 is Theora 720x480 29.97 fps video Encoded frame content is 720x480 with 0x0 offset I can play this video with like 200-300 frame drops on Athlon XP 1700+ CPU load (with

2005 Aug 20

[PATCH] remove some FZIGZAG

Hello, As we discussed with derf some time ago, it seems it is not necessary to enforce "forward" order of dct_coeffs. This patch gains .99366902855226196000%, so approximately a 1% speedup. Measurement method: time nice -n -19 ./dump /mnt/disc4/theora/unix/gripen.ogg > /dev/null Ogg logical stream 310b2968 is Theora 720x480 29.97 fps video Encoded frame content is 720x480 with 0x0 offset

2005 Mar 23

[PATCH]

Hello, Here is my first speedup patch. Like 10-11%. No IDCT yet. Please feel free to comment on my code or, even better, think about improvements. :) I believe my routines are not so bad; maybe one day they will be even faster. What needs to be optimized is the loop filter function. I have no ideas now how to do it. It does not leave much space for parallel stuff, copying memory from lot of

2005 Mar 23

[PATCH] promised MMX patches rc1

2009 Aug 30

experimental patch for libtheora1.1beta3

Good morning in the Lord. Regarding the port of libtheora1.1beta3 for OpenBSD for amd64 and the problem I described at: http://lists.xiph.org/pipermail/theora/2009-August/002640.html Attached is a patch for libtheora/patches/patch-lib_x86_mmxencfrag_c I can play videos with it. Does it work for you? Best regards -- God, thank you for your infinite love.

2007 Sep 26

Theora decoding problem on PowerPC

Hi, I'm attempting to decode Theora videos on a PowerPC running a Linux 2.6.19 kernel. The version of GCC I'm cross-compiling from is 3.4.4. The software versions I'm running are: libogg-1.1.3 libpng-1.2.20 libtheora-1.0beta1 libvorbis-1.2.0 These are all the latest I was able to download. Here's a back trace I got while running "dump_video" under

2004 Sep 10

An assembly optimization and fix

I have optimized the FLAC__fixed_compute_best_predictor_asm_ia32_mmx_cmov function and fixed a bug when data_len == 0. Now the function is about 50% faster and flac -5 is about 5% faster on my box. I have tested it thoroughly, I think it can go to flac 1.0.4. -- Miroslav Lichvar -------------- next part -------------- --- src/libFLAC/ia32/fixed_asm.nasm.orig 2002-01-26 19:05:12.000000000 +0100 +++

2012 Feb 17

[LLVMdev] Folding an insertelt chain

On Feb 17, 2012, at 12:50 AM, Ivan Llopard wrote: > Hello, > > I've added a little combining operation in DAGCombiner to fold a chain of insertelt nodes if that chain is proved to fully overwrite the very first source vector. In which case, I supposed a build_vector is better. It seems to be safe but I don't know if it is correctly implemented or if it is already done somewhere

2009 Oct 13

Proposal for replacing asm code with intrinsics

Hi, I'm new to Theora and would like to propose several performance optimizations using advanced instructions in x86 CPUs (SSE2-SSE4.2). There are several source files in \x86 and \x86_vc which were developed using inline assembler. However, this causes several maintenance problems: 1) Need to sync gcc & msvc versions 2) Only 32bit environment is supported 3) No support for newer than MMX

2012 Feb 17

[LLVMdev] Folding an insertelt chain

Hello, I've added a little combining operation in DAGCombiner to fold a chain of insertelt nodes if that chain is proved to fully overwrite the very first source vector. In which case, I supposed a build_vector is better. It seems to be safe but I don't know if it is correctly implemented or if it is already done somewhere else. Please find attached the patch. Regards, Ivan

2012 Nov 28

[LLVMdev] [llvm-commits] [dragonegg] r168787 - in /dragonegg/trunk: src/x86/Target.cpp src/x86/x86_builtins test/validator/c/copysignp.c

Hi Pawel, can you please pull this dragonegg patch into 3.2. I am the code owner for dragonegg. Thanks a lot, Duncan. On 28/11/12 13:44, Duncan Sands wrote: > Author: baldrick > Date: Wed Nov 28 06:44:50 2012 > New Revision: 168787 > > URL: http://llvm.org/viewvc/llvm-project?rev=168787&view=rev > Log: > Add support for GCC's vector copysign builtins, fixing

2002 Jan 22

glm.predict?

I've been attempting to calculate the predictions from a poisson glm object, along these lines: predict(foo.glm, type = "response") and predict(foo.glm, type = "response", se.fit = TRUE) foo.glm is arrived at this way: foo.glm <- glm(Insects ~ Dad * Mum + Location, offset = log(MM), family = "poisson", data = model.df) There are two

2010 Oct 20

[LLVMdev] llvm register reload/spilling around calls

On Oct 20, 2010, at 7:46 AM, Roland Scheidegger wrote: > On 20.10.2010 05:00, Jakob Stoklund Olesen wrote: >> Look in X86InstrControl.td. The call instructions are all prefixed >> by: >> >> let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2, >> FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, >> XMM0, XMM1, XMM2, XMM3,

2010 Oct 20

[LLVMdev] llvm register reload/spilling around calls

(repost with right sender address) On 20.10.2010 18:13, Jakob Stoklund Olesen wrote: > On Oct 20, 2010, at 7:46 AM, Roland Scheidegger wrote: > >> On 20.10.2010 05:00, Jakob Stoklund Olesen wrote: >>> Look in X86InstrControl.td. The call instructions are all prefixed >>> by: >>> >>> let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1,

2010 Oct 20

[LLVMdev] llvm register reload/spilling around calls

On Oct 19, 2010, at 6:37 PM, Roland Scheidegger wrote: > Thanks for giving it a look! > > On 19.10.2010 23:21, Jakob Stoklund Olesen wrote: >> On Oct 19, 2010, at 11:40 AM, Roland Scheidegger wrote: >> >>> So I saw that the code is doing lots of register >>> spilling/reloading. Now I understand that due to calling >>> conventions, there's not

2007 Jun 19

[LLVMdev] TargetRegisterClass for Physical Register

On Monday 18 June 2007 19:02, Christopher Lamb wrote: > Take a look at getPhysicalRegisterRegClass( > const MRegisterInfo *MRI, > MVT::ValueType VT, > unsigned reg) > > in ScheduleDAG.cpp. Yuck. I was afraid of that. What is the ValueType needed for? Isn't the register id itself an indication of the ValueType it represents? Where I'm at I

2010 Oct 20

[LLVMdev] llvm register reload/spilling around calls

On 20.10.2010 05:00, Jakob Stoklund Olesen wrote: > On Oct 19, 2010, at 6:37 PM, Roland Scheidegger wrote: > >> Thanks for giving it a look! >> >> On 19.10.2010 23:21, Jakob Stoklund Olesen wrote: >>> On Oct 19, 2010, at 11:40 AM, Roland Scheidegger wrote: >>> >>>> So I saw that the code is doing lots of register >>>>

2008 Sep 04

[LLVMdev] Codegen/Register allocation question.

On Sep 3, 2008, at 5:58 AM, Lang Hames wrote: > Hi LLVMers, > > I have finally sorted out licensing issues and found some time, so I'm > trying to port my PBQP register allocator to 2.4 in order to Nice! We would definitely welcome your contribution. > > contribute it (if you want it). I've run into a bug that has me > confused though. > > I'm currently

2007 Jun 27

[LLVMdev] Live Intervals Question

On Jun 26, 2007, at 12:57 PM, David Greene wrote: > Evan, thanks for responding so quickly. > > On Tuesday 26 June 2007 14:11, Evan Cheng wrote: >> On Jun 26, 2007, at 11:20 AM, David A. Greene wrote: >>> 28 %AL<dead> = MOV8rr %reg1024<kill>, %EAX<imp-def> >>> MOV8rr %mreg(2)<d> %reg1024 %mreg(17)<d> >>> 32 CALL64pcrel32
