Leonardo de Paula Rosa Piga
2008-Aug-15 00:49 UTC
[theora-dev] GSoC - Theora multithread decoder
Hi,

This email is to report what I have been doing since the mid-term.

After the mid-term I worked on a pipeline implementation with OpenMP. As I said before, I did a pipelined implementation of these functions:
(oc_dec_dc_unpredict_mcu_plane + oc_dec_frags_recon_mcu_plane) and
(oc_state_loop_filter_frag_rows + oc_state_borders_fill_rows), as explained in my previous email.

But the results were not good. They were equal to the implementation without the pipeline.

http://lampiao.lsc.ic.unicamp.br/~piga/gsoc_2008/comparison.png
http://lampiao.lsc.ic.unicamp.br/~piga/gsoc_2008/speedup.png
http://lampiao.lsc.ic.unicamp.br/~piga/gsoc_2008/systime.png

Next I tried to make some improvements, but without success. I think that the implementation, as it is, cannot be improved further with parallelism. Another approach would be to decode the next frame while the current frame is still being decoded, but this is very challenging, because most frames have a data dependency on the frame before them.

The gain from adding parallelism is 13% with OpenMP. Not so bad.

My project has little code, but I tried to do my best. When I proposed this project I knew that it would be very difficult. It was challenging and I have learned a lot, especially about the OpenMP library.

Also, I will understand if you don't approve me. The project was very difficult, more than I expected. I wanted to obtain better results but I couldn't. I have tried a lot.

--
Leonardo de Paula Rosa Piga
http://lampiao.lsc.ic.unicamp.br/~piga

On 14-Aug-08, at 5:49 PM, Leonardo de Paula Rosa Piga wrote:

> This email is to report what I have been doing since the mid-term.

Thanks, it's great to hear a report. A few comments/questions below.

> As I said before, I did a pipelined implementation of these functions:
> (oc_dec_dc_unpredict_mcu_plane + oc_dec_frags_recon_mcu_plane) and
> (oc_state_loop_filter_frag_rows + oc_state_borders_fill_rows), as
> explained in my previous email.
> But the results were not good. They were equal to the implementation
> without the pipeline.

That's too bad. Do you have any hope that explicit threading would do better?

> http://lampiao.lsc.ic.unicamp.br/~piga/gsoc_2008/comparison.png

Why are the error bars so large on three points on this plot? The data doesn't distinguish anything in that mcu range.

> My project has little code, but I tried to do my best. When I proposed
> this project I knew that it would be very difficult. It was
> challenging and I have learned a lot, especially about the OpenMP
> library.

You shouldn't be so discouraged. We were all aware this was a difficult proposal, but actually trying things is still useful work.

You should also commit all the code for these tests. That way others can reproduce and build on your work. One advantage of open source code repositories is that you can publish negative results. Google also wants all your code at the end of the program, and it's easier if they can pull it from our repository.

-r