Hi,
Here is a decoding speed comparison between new-trunk (theora-exp) and
old-trunk. I found that I had missed some things last time, so theora-exp
looked slower than it should have. First, it is faster to
disable striped decoding in dump_video.c. Second, the CPU
detection does not recognize the Geode used in the OLPC, so MMX was not
enabled. After fixing these two things, here are the numbers for some
clips I collected (svn r12867):
                               Athlon-XP        Geode-GX
clip                 size      new  old new/old new  old  new/old
320x240              320x240   462  800  0.57    63   88   0.71
322x242_not-divisibl 336x256   429  706  0.60    56   76   0.73
Building_On_The_Past 320x240   407  388  1.05    50   57   0.87
Elephants_Dream_1024 1024x576   51   36  1.42   7.9  7.5   1.05
Elephants_Dream_512- 512x288   216  155  1.39    29   29   1.00
MSD_ORACLE.vlog_2    368x240   511  457  1.12    68   79   0.86
chained_streams      320x240   462  750  0.61    61   83   0.73
monday_0930_Welcome  320x240   503  464  1.08    62   71   0.87
pixel_aspect_ratio   352x288   317  275  1.15    41   44   0.93
romalcc              352x288   503  657  0.76    76  103   0.73
timeskew1-t2s1       320x240   542  596  0.90    71   88   0.80
videotestsrc-720x576 720x576    89  148  0.60    15   21   0.71
The numbers are frames per second (fps) on the Athlon-XP and the Geode-GX.
The new/old ratio is the fps of new-trunk divided by the fps of old-trunk,
so a value above 1 means new-trunk is faster. The results show that
new-trunk shines at larger resolutions, but at smaller resolutions
old-trunk is faster (by about 20% on the Geode). I intended to do some
optimization of theora for the OLPC, but now I am not sure which code base
I should work on...
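As a quick illustration of how the new/old column is derived, here is a minimal Python sketch. The helper name `fps_ratio` is hypothetical (it is not part of theora or dump_video.c), and the inputs are the rounded fps values from the table, so for some rows the last digit can differ from the listed ratio, which was presumably computed from unrounded measurements.

```python
def fps_ratio(new_fps, old_fps):
    """Return new-trunk fps divided by old-trunk fps (hypothetical helper)."""
    if old_fps <= 0:
        raise ValueError("old_fps must be positive")
    return new_fps / old_fps


# Rounded Athlon-XP fps values copied from the table: clip -> (new, old).
athlon_xp = {
    "Elephants_Dream_1024": (51, 36),
    "videotestsrc-720x576": (89, 148),
}

for clip, (new, old) in athlon_xp.items():
    ratio = fps_ratio(new, old)
    # A ratio above 1 means new-trunk decodes faster than old-trunk.
    verdict = "new-trunk faster" if ratio > 1 else "old-trunk faster"
    print(f"{clip}: {ratio:.2f} ({verdict})")
```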
The Geode MMX detection patch is attached.
Thanks,
Chih-Chung Chang
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.cpu
Type: application/octet-stream
Size: 570 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/theora-dev/attachments/20070416/b52e7e77/patch.obj
It seems that the latest revision includes some more MMX optimizations. These
are my results using revision 12882, on a dual Opteron 252 (2.6 GHz) running
64-bit Linux.
th-old     : http://svn.xiph.org/trunk/theora-old/ compiled with --disable-asm
th         : http://svn.xiph.org/trunk/theora/ compiled with --disable-asm
th-old-MMX : http://svn.xiph.org/trunk/theora-old/ (with MMX)
th-MMX     : http://svn.xiph.org/trunk/theora/ (with MMX)
Times are in seconds; new/old is the speed ratio of new to old, i.e. the old
time divided by the new time (higher is better).
          frames  th-old     th  new/old  th-old-MMX  th-MMX  new/old
Romalcc    25178   12.75  15.02     0.85       11.44   12.74      0.9
Elep_1024  15293     167    144     1.16         131     113     1.16
Monday     55988   26.38  28.65     0.92       23.29   23.25        1
MSD_Oracl  42674    33.5   31.5     1.06       28.31   23.72     1.19
320x240      120    0.08   0.08        1        0.08    0.08      0.9
Videotest     49    0.12   0.22     0.54        0.12     0.2     0.61
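Because these columns are times rather than fps, the ratio is inverted relative to an fps ratio. The following is a minimal Python sketch of that arithmetic; the helper names `speed_ratio` and `decode_fps` are hypothetical, and the inputs are the rounded times from the table, so results for the very short clips may not match the listed ratios exactly.

```python
def speed_ratio(old_seconds, new_seconds):
    """Speed of new relative to old: old time / new time (>1 means new is faster)."""
    if new_seconds <= 0:
        raise ValueError("new_seconds must be positive")
    return old_seconds / new_seconds


def decode_fps(frames, seconds):
    """Average decoding rate in frames per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return frames / seconds


# Non-MMX builds, values copied from the table: clip -> (frames, th-old s, th s).
runs = {
    "Romalcc": (25178, 12.75, 15.02),
    "Elep_1024": (15293, 167.0, 144.0),
}

for clip, (frames, old_s, new_s) in runs.items():
    print(
        f"{clip}: new/old = {speed_ratio(old_s, new_s):.2f}, "
        f"th-old {decode_fps(frames, old_s):.0f} fps, "
        f"th {decode_fps(frames, new_s):.0f} fps"
    )
```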
It also seems that, when using theora trunk, dump_video.c from theora-old is about 10% faster than dump_video.c from theora.