Felipe Portavales Goldstein
2007-Oct-02 14:55 UTC
[theora-dev] Multi-Thread Theora Encoder
Hello,

I'm happy to announce that I have developed a multi-threaded version of the
Theora encoder. I changed the motion vector search part of the
algorithm to execute in parallel.

I chose the motion search after careful profiling, which showed
that the motion vector search is responsible for 70% of CPU time
on average and up to 95% of CPU time in some cases. I also
chose this part because it is easier to parallelize.

Running this multi-threaded version I got:

- On a dual-core machine:
  * Average speed-up of 28% in running time
  * Up to 38% speed-up (the best case in my set of tests)

- On a quad-core machine:
  * Average speed-up of 36% in running time
  * Up to 58% speed-up (the best case in my set of tests)

As you can see, it does not scale very well (yet), because part of
the source code is still sequential and runs after the initial
parallel part (the motion search). It is not like a double buffer or a
pipeline (there are some data dependencies that make that difficult).

This multi-threaded version will be turned into a port of the Theora
encoder to the IBM CELL processor.

I am working on this project in my free time during my undergrad
course, so development may be slow.
If someone wants to contribute, please send me an email.

The source code used was libtheora 1.0alpha8, but I think that my
patch can be easily applied to the current libtheora 1.0beta1.

I will clean up the source code and maybe upload it to the Theora SVN,
or send the patch to somebody.

Cheers,
Felipe

--
________________________________________
Felipe Portavales Goldstein <portavales at gmail>
Undergraduate Student - IC-UNICAMP
Computer Systems Laboratory
http://www.students.ic.unicamp.br/~ra023772/
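The scaling limit described above can be estimated with Amdahl's law. The sketch below is not part of the patch; it is a small illustration that assumes the profiled 70% (average) and 95% (best case) motion-search share is perfectly parallel and everything else stays sequential. The function names amdahl_speedup and time_saved are made up for this example.

```c
#include <assert.h>

/*
 * Amdahl's law: upper bound on the speed-up factor when a fraction
 * `parallel_fraction` of the single-threaded running time is spread
 * evenly over `cores` cores and the rest stays sequential.
 * Hypothetical helper, not a libtheora function.
 */
double amdahl_speedup(double parallel_fraction, int cores)
{
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(cores >= 1);
    double serial_fraction = 1.0 - parallel_fraction;
    return 1.0 / (serial_fraction + parallel_fraction / cores);
}

/*
 * Fraction of the original running time that is saved for a given
 * speed-up factor (e.g. a 2x speed-up saves 0.5 of the time).
 */
double time_saved(double speedup)
{
    assert(speedup > 0.0);
    return 1.0 - 1.0 / speedup;
}
```

With a 70% parallel share, amdahl_speedup(0.70, 2) is about 1.54 and amdahl_speedup(0.70, 4) is about 2.11, which corresponds to saving at most about 35% and 52.5% of the running time. With a 95% share the bounds rise to about 47.5% and 71% of the running time. If the percentages in the message are read as reductions in running time (the message does not say which definition is used), the reported averages and best cases fall under these bounds, which fits the explanation that the remaining sequential code limits scaling.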
A multi-threaded encoder is such a good thing that it justifies a new release of libtheora.

> Date: Tue, 2 Oct 2007 18:51:10 -0300
> From: portavales@gmail.com
> To: theora-dev@xiph.org
> Subject: [theora-dev] Multi-Thread Theora Encoder
>
> [...]
On Tue, Oct 02, 2007 at 06:51:10PM -0300, Felipe Portavales Goldstein wrote:

> I'm happy to announce I developed a Multi-Threaded version of the
> Theora encoder. I changed the Motion Vector Search part of the
> algorithm to be executed in parallel.

Excellent news! Please do check it into svn under a branch so we can try it out.

What's the slowdown running the code on a uniprocessor?

-r
>>>>> "Felipe" == Felipe Portavales Goldstein <portavales@gmail.com> writes:

> This multi-threaded version will be converted to a port of the Theora
> encoder to the IBM CELL processor.

Neat, this might finally convince me to buy a PS3 :)

> I am working on this project in my free time during my undergrad
> course. So, the development can be slow. If someone wants to
> contribute, please send me an email.

I've recently done some Cell programming on Linux for my master's thesis
and might be able to help with porting.

BTW, may I recommend not using the official Cell-SDK, but instead only
the three packages ppe-gcc, spu-gcc and libspe, which recently became
available natively for many (Power-PC) Linux distros? Installing the
official SDK would be a major hassle for people wishing to participate.
Also, some of the IBM libs in the SDK might never make it into mainstream
distros due to licensing restrictions...

cheers,
David
--
GnuPG public key: http://user.cs.tu-berlin.de/~dvdkhlng/dk.gpg
Fingerprint: B17A DC95 D293 657B 4205 D016 7DEF 5323 C174 7D40