Displaying 3 results from an estimated 3 matches for "c_dec_dc_unpredict_mcu_plane".
2008 Aug 15
GSoC - Theora multithread decoder
Hi,
This email is to report what I have been doing since the mid-term.
After the mid-term I worked on a pipeline implementation with OpenMP.
As explained in my previous email, I pipelined these two groups of
functions:
(c_dec_dc_unpredict_mcu_plane + oc_dec_frags_recon_mcu_plane) and
(oc_state_loop_filter_frag_rows + oc_state_borders_fill_rows).
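A minimal sketch of such a two-stage pipeline over MCU rows follows. It is not the code from the GSoC branch: the stage functions are hypothetical placeholders for the two function groups named above, the row count is made up, and the ordering is expressed with OpenMP task dependencies (the depend clause, OpenMP 4.0 and later), which may differ from how the actual implementation synchronized its stages.

```c
#include <assert.h>

#define NUM_MCU_ROWS 8  /* hypothetical number of MCU rows in a frame */

/* Index 0 is a sentinel "row before the first row"; real rows are 1..N. */
static int recon[NUM_MCU_ROWS + 1];     /* stand-in for reconstructed rows */
static int filtered[NUM_MCU_ROWS + 1];  /* stand-in for loop-filtered rows */

/* Stage 1 placeholder (DC unprediction + fragment reconstruction).
   It reads the previous row, as DC prediction does. */
static void stage1_recon_row(int row) {
    assert(row >= 1 && row <= NUM_MCU_ROWS);
    recon[row] = recon[row - 1] + row;
}

/* Stage 2 placeholder (loop filter + border fill).
   It needs this row's stage 1 output and the previous filtered row. */
static void stage2_filter_row(int row) {
    assert(row >= 1 && row <= NUM_MCU_ROWS);
    filtered[row] = filtered[row - 1] + recon[row];
}

/* One thread creates the tasks; the runtime may run stage 2 of row r
   while stage 1 is already working on row r + 1. */
static void decode_frame_pipelined(void) {
    int row;
    #pragma omp parallel
    #pragma omp single
    for (row = 1; row <= NUM_MCU_ROWS; row++) {
        /* Stage 1 rows run in order: each waits for the previous row. */
        #pragma omp task firstprivate(row) \
            depend(in: recon[row - 1]) depend(out: recon[row])
        stage1_recon_row(row);

        /* Stage 2 waits for stage 1 of the same row and for stage 2
           of the previous row. */
        #pragma omp task firstprivate(row) \
            depend(in: recon[row], filtered[row - 1]) \
            depend(out: filtered[row])
        stage2_filter_row(row);
    }
    /* The implicit barrier at the end of the region waits for all tasks. */
}
```

Calling decode_frame_pipelined() once processes one frame. Built with OpenMP enabled (for example -fopenmp with GCC), the two stages can overlap; built without it, the pragmas are ignored and the rows are processed sequentially with the same result.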
But the results were not good: they were equal to those of the
implementation without the pipeline.
http://lampiao.lsc.ic.unicamp.br/~piga/gsoc_2008/comparison.png
http://lampiao.ls...
2008 Jul 07
GSoC - Theora multithread decoder
...To be the time spent in decoding a video with the current
implementation. Let T1 be the time spent decoding it with the parallel implementation.
T1 should be at most 0.66To.
I will use the pthread implementation to try a pipelined version and see if
we obtain more gains.
This version will run the functions (c_dec_dc_unpredict_mcu_plane +
oc_dec_frags_recon_mcu_plane) and
(oc_state_loop_filter_frag_rows + oc_state_borders_fill_rows) in parallel.
The upper bound for the gain is 60%. That is, if T2 is the time spent
decoding a video with the pipelined implementation, T2 can be at best 0.4To.
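The arithmetic behind the 0.4To figure can be written out as follows, assuming "gain" means the fraction of the original decoding time that is saved (the symbol g is introduced here only for illustration):

```latex
g = 1 - \frac{T}{T_o}, \qquad
g \le 0.60 \;\Longrightarrow\; T_2 \ge (1 - 0.60)\,T_o = 0.4\,T_o
```

Under the same reading, a decoding time of 0.66To corresponds to a gain of 1 - 0.66 = 0.34, so 0.4To is the best case the pipelined version could reach, not a guaranteed result.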
Here is the branch for the OpenMP implementation:
ht...
2008 Mar 25
No subject
...> the overhead of pthreads, but I guess that stuff's just not "there" yet.
>>
>> The results above show that it is not the case. For coarse grain jobs they
>> are equivalent
>>
>>>
>>>
>>> > These version will run the functions (c_dec_dc_unpredict_mcu_plane +
>>> > oc_dec_frags_recon_mcu_plane) and
>>> > (oc_state_loop_filter_frag_rows + oc_state_borders_fill_rows) in
>>> > parallel. The upper bound for the gain is 60%, that is, let T2 be a
>>> > video decoded with the pipelined implementation. T2 should...