Displaying 4 results from an estimated 4 matches for "oc_state_borders_fill_rows".
2008 Jul 07
2
GSoC - Theora multithread decoder
...with the parallel implementation.
T1 should be at most 0.66To.
I will use the pthread implementation to try a pipelined version and see if
we obtain more gains.
This version will run the functions (c_dec_dc_unpredict_mcu_plane +
oc_dec_frags_recon_mcu_plane) and
(oc_state_loop_filter_frag_rows + oc_state_borders_fill_rows) in parallel.
The upper bound for the gain is 60%; that is, if T2 is the time to decode a
video with the pipelined implementation, T2 should be at most 0.4To.
Here is the branch for the OpenMP implementation:
http://svn.xiph.org/branches/theora_multithread_decode_omp/
Here is the branch for the PThread impl...
2008 Aug 15
1
GSoC - Theora multithread decoder
...l is to inform what I have been doing since the mid-term.
After the mid-term I worked on a pipeline implementation with OpenMP.
As I said before, I implemented a pipelined version of these functions:
(c_dec_dc_unpredict_mcu_plane + oc_dec_frags_recon_mcu_plane) and
(oc_state_loop_filter_frag_rows + oc_state_borders_fill_rows), as
explained in my previous email.
But the results were not good. They were equal to those of the
implementation without the pipeline.
http://lampiao.lsc.ic.unicamp.br/~piga/gsoc_2008/comparison.png
http://lampiao.lsc.ic.unicamp.br/~piga/gsoc_2008/speedup.png
http://lampiao.lsc.ic.unicamp.br/~piga/gsoc_2008/syst...
2008 Mar 25
0
No subject
...at it is not the case. For coarse grain jobs they
>> are equivalent
>>
>>>
>>>
>>> > These version will run the functions (c_dec_dc_unpredict_mcu_plane +
>>> > oc_dec_frags_recon_mcu_plane) and
>>> > (oc_state_loop_filter_frag_rows + oc_state_borders_fill_rows) in
>>> > parallel. The upper bound for the gain is 60%, that is, let T2 be a
>>> > video decoded with the pipelined implementation. T2 should be at most
>>> > 0.4To.
>>>
>>> I think you mean "at least". Let us know what your test resu...
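The "at least" correction in the reply above can be checked with one line of arithmetic. This is only a sketch: it assumes "gain" means the fractional reduction in decode time relative to the original time To, which the excerpt does not define explicitly, and the symbol g is introduced here purely for illustration.

```latex
g = 1 - \frac{T_2}{T_o} \le 0.6
\quad\Longrightarrow\quad
T_2 \ge (1 - 0.6)\,T_o = 0.4\,T_o
```

Under that assumption, 0.4To is a lower bound on the pipelined decode time T2, not an upper bound.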
2005 Jul 20
1
MMX IDCT for theora-exp
...  0.0901   15  0.0039  dump  oc_frag_recon_intra
470  0.0832  505  0.1301  dump  oc_token_dec1val_cat2
445  0.0788   96  0.0247  dump  oc_token_dec1val_zrl
344  0.0609   81  0.0209  dump  oc_state_borders_fill_rows
324  0.0574  237  0.0611  dump  oc_restore_fpu_mmx
260  0.0460   36  0.0093  dump  main
218  0.0386   83  0.0214  dump  stripe_decoded
213  0.0377   67  0.0173  dump  oc_token_skip_zr...