Along the same lines as Frederick, another university student and I implemented a multithreaded FLAC encoder, but using Intel's Threading Building Blocks (TBB) package. We saw a similar near-linear speedup. Our solution is a bit more convoluted, since we were learning the API and TBB and writing the encoder all in one six-week period.

We used a pipeline model on the input stream, treating each token in the pipeline as a single block of audio. We also had to add some pipeline-safe functions to libFLAC. All of this is detailed in our project report, so I won't repeat it here.

I agree with Frederick that the existing encoding API was obviously designed for serial execution, and any internal threading support would be messy. We had to come up with quite a few tricks to modify as little of libFLAC as possible.

Project report, patches and example encoder: http://rhubarbtech.com/things/pflac.tar.gz

Chris Peplin
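As a rough illustration of the pipeline model described above (not the code from the project report), the sketch below cuts the input into fixed-size blocks, encodes several of them concurrently, and emits the results in input order while capping the number of blocks in flight. It uses std::async from the C++ standard library in place of TBB. The names Block, Frame, encode_block and run_pipeline are hypothetical, and encode_block only serializes samples to bytes where a real encoder would call into libFLAC.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <future>
#include <utility>
#include <vector>

using Block = std::vector<std::int16_t>;  // one block of PCM samples (one pipeline token)
using Frame = std::vector<std::uint8_t>;  // encoded bytes produced from one block

// Hypothetical stand-in for per-block encoding: writes each sample as two
// little-endian bytes. A real encoder would compress the block here.
Frame encode_block(const Block& block) {
    Frame out;
    out.reserve(block.size() * 2);
    for (std::int16_t s : block) {
        const auto u = static_cast<std::uint16_t>(s);
        out.push_back(static_cast<std::uint8_t>(u & 0xFF));
        out.push_back(static_cast<std::uint8_t>(u >> 8));
    }
    return out;
}

// Serial input stage, parallel encode stage, serial in-order output stage.
// At most max_tokens blocks are being encoded at any time, which bounds memory.
std::vector<Frame> run_pipeline(const std::vector<std::int16_t>& samples,
                                std::size_t block_size,
                                std::size_t max_tokens) {
    assert(block_size > 0 && max_tokens > 0);
    std::deque<std::future<Frame>> in_flight;  // oldest block at the front
    std::vector<Frame> output;

    // Output stage: wait for the oldest block so frames stay in input order.
    auto drain_one = [&] {
        output.push_back(in_flight.front().get());
        in_flight.pop_front();
    };

    for (std::size_t pos = 0; pos < samples.size(); pos += block_size) {
        const std::size_t end = std::min(pos + block_size, samples.size());
        Block block(samples.begin() + static_cast<std::ptrdiff_t>(pos),
                    samples.begin() + static_cast<std::ptrdiff_t>(end));
        if (in_flight.size() == max_tokens) {
            drain_one();  // free a token before admitting a new block
        }
        in_flight.push_back(
            std::async(std::launch::async, encode_block, std::move(block)));
    }
    while (!in_flight.empty()) {
        drain_one();
    }
    return output;
}
```

Calling run_pipeline(samples, 4096, 8) would keep at most eight blocks in flight. The token limit is what bounds memory use, since the input stage cannot run ahead of the output stage by more than that many blocks.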
On Tue, May 6, 2008 at 4:52 PM, Christopher Peplin <chris.peplin at rhubarbtech.com> wrote:
> Along the same line as Frederick, myself and another university student were able to implement a multi threaded FLAC
> encoder, but using Intel's Threading Building Blocks (TBB) package. We saw similar near-linear speedup.

Great! Your approach is better in that it bounds the memory usage of the encoding, which is nice, and it also is amenable to streaming; a similar pipelined approach would probably be best for the production encoder. Using mmap() for the input file like I did might simplify your first pipeline stage and reduce the number of modifications you need to make to libFLAC (as well as improve I/O performance) at the cost of having to write callbacks yourself. Using aio for the output, too, might improve performance but it might not be worth the extra trickiness.

> I agree with Frederick in that the existing encoding API was obviously designed for serial execution, and any internal
> threading support would be messy. We had to come up with quite a few tricks to modify as little of libFLAC as possible.

It looks like implementing parallel-friendly APIs is indeed the logical next step.

--
Frederick Akalin
http://www.akalin.cx
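To make the mmap() suggestion concrete, here is a minimal POSIX sketch of a first pipeline stage that maps the input file read-only and hands out fixed-size blocks as pointers into the mapping, with no intermediate read buffers. It is not taken from either encoder: MappedFile, map_input, unmap_input, block_count and block_at are hypothetical names, and the block size is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Read-only view of a whole input file (hypothetical helper type).
struct MappedFile {
    const unsigned char* data = nullptr;
    std::size_t size = 0;
};

// Maps `path` read-only. Returns false on any failure or for an empty file.
bool map_input(const char* path, MappedFile& out) {
    const int fd = open(path, O_RDONLY);
    if (fd < 0) return false;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return false;
    }
    const auto len = static_cast<std::size_t>(st.st_size);
    void* p = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    if (p == MAP_FAILED) return false;
    out.data = static_cast<const unsigned char*>(p);
    out.size = len;
    return true;
}

// Releases the mapping and resets the view.
void unmap_input(MappedFile& f) {
    if (f.data != nullptr) {
        munmap(const_cast<unsigned char*>(f.data), f.size);
    }
    f = MappedFile{};
}

// Number of blocks of block_bytes needed to cover the file (the last may be short).
std::size_t block_count(const MappedFile& f, std::size_t block_bytes) {
    assert(block_bytes > 0);
    return (f.size + block_bytes - 1) / block_bytes;
}

// Returns a pointer to block `index` and stores its length in `len`.
const unsigned char* block_at(const MappedFile& f, std::size_t block_bytes,
                              std::size_t index, std::size_t& len) {
    const std::size_t offset = index * block_bytes;
    assert(offset < f.size);
    const std::size_t remaining = f.size - offset;
    len = remaining < block_bytes ? remaining : block_bytes;
    return f.data + offset;
}
```

Because every block is just an offset into the same mapping, the input stage needs no locking or copying; the trade-off is that the encoder must accept data through its own callbacks or pointers instead of reading the file itself.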
There's a problem with Intel's TBB package: it won't run on PowerPC or other processors.

_______________________________________________
Flac-dev mailing list
Flac-dev at xiph.org
http://lists.xiph.org/mailman/listinfo/flac-dev