Michael Shatz
2007-Jun-14 13:26 UTC
[Speex-dev] Blackfin inline assembler and VisualDSP++ toolchain
> Actually, you're the first I know using the VisualDSP++ toolchain :-)

I guess that's because Speex has a pretty big memory footprint. So developers who integrate Speex tend to have plenty of RAM, and once one has plenty of RAM one can install a biggish OS. Among biggish OSes for Blackfin the most popular choice is uClinux, and uClinux works best with GNU tools. Something like that.

On the other hand, developers who use Blackfin in a manner similar to the traditional 16-bit DSP usage model, i.e. without external RAM or with relatively small internal SRAM, normally use no OS at all (like me) or ADI's VDK. These people naturally prefer the ADI toolchain because it gives you good visibility of what's going on within a small "bare metal" target. But such developers are less likely to integrate Speex because it simply doesn't fit. I guess I am one of the few who try to run Speex entirely from internal RAM, and that has already forced me to move from BF531 to BF533.

BTW, the above exercise in deep philosophy at shallow shores shouldn't be taken too seriously ;)

> About the inline assembly, I was under the impression that the syntax
> was compatible. Could you tell me what's the problem? If it's just one
> bit, you can easily remove the function and things will work as before.

Just about everything fails. Some things fail during compilation, the rest during the final assembling pass. Thinking about it, the problem is probably not in the asm syntax, but in the way in which the compiler treats the asm keyword.
For example, for the following function:

-----
static inline spx_word16_t MAX16(spx_word16_t a, spx_word16_t b)
{
   spx_word32_t res;
   __asm__ (
      "%1 = %1.L (X);\n\t"
      "%2 = %2.L (X);\n\t"
      "%0 = MAX(%1,%2);"
      : "=d" (res)
      : "%d" (a), "d" (b)
   );
   return res;
}
---

the compiler says:

"libspeex\fixed_bfin.h", line 48: cc1101: error: invalid constraint in asm statement
   : "%d" (a), "d" (b)
     ^

The following modification compiled successfully:

   : "d" (a), "d" (b)

Similarly, the compiler doesn't understand the following line:

   : "=m" (res)

It claims that m is not a valid constraint. Looking into the manual (including the GNU manual), I agree with the compiler.

> Here's some more information about Speex on the Blackfin from David
> Rowe's blog:
> <http://www.rowetel.com/blog/?p=5>
> <http://www.rowetel.com/blog/?p=6>
>
> BTW, when you say it's slow, can you be more precise? What performance
> do you expect and what do you get? Using gcc, I think David got it down
> to ~20 MIPS at 15 kbps, so I assume VisualDSP++ should be able to do
> better than that.
>
> Cheers,
>
> Jean-Marc

I talked to David. He got 22 MIPS _with_ inline asm. I am getting around 34 MIPS for exactly the same mode (15 kbps, complexity=1, vbr=off) without inline asm. I don't know the scores for gcc without asm, so I can't tell whether the ADI compiler is better. It's surely not better than your assembler.
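One way to keep both toolchains building from the same header might look like the sketch below. It is only an illustration, not the way libspeex is actually organised: the macro SPX_USE_GNU_BFIN_ASM is a hypothetical build flag (set it only when compiling with gcc for Blackfin), and the typedefs stand in for the real Speex fixed-point types. As far as the GNU manual goes, a leading % in a constraint only declares the operand commutative with the one that follows, so dropping it, as in the modification above, should cost nothing more than some freedom for the register allocator.

```c
#include <stdint.h>

/* Stand-ins for the Speex fixed-point types (assumption for this sketch). */
typedef int16_t spx_word16_t;
typedef int32_t spx_word32_t;

#ifdef SPX_USE_GNU_BFIN_ASM   /* hypothetical flag: gcc targeting Blackfin */

/* GNU-style inline asm, as quoted in the message above. */
static inline spx_word16_t MAX16(spx_word16_t a, spx_word16_t b)
{
   spx_word32_t res;
   __asm__ (
      "%1 = %1.L (X);\n\t"
      "%2 = %2.L (X);\n\t"
      "%0 = MAX(%1,%2);"
      : "=d" (res)
      : "%d" (a), "d" (b)
   );
   return res;
}

#else

/* Portable fallback: plain C, left to the compiler's own optimizer. */
static inline spx_word16_t MAX16(spx_word16_t a, spx_word16_t b)
{
   return (a > b) ? a : b;
}

#endif
```

With the flag left undefined, any C99 compiler takes the plain C path, which makes it possible to bring the asm routines back one at a time and check each against the C version.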
Jean-Marc Valin
2007-Jun-14 14:17 UTC
[Speex-dev] Blackfin inline assembler and VisualDSP++ toolchain
Michael Shatz wrote:

>> Actually, you're the first I know using the VisualDSP++ toolchain
>> :-)
>
> I guess that's because Speex has a pretty big memory footprint.

Actually, you'll find that the data footprint in the latest versions is pretty small. There's a bit more code/tables, but you'll find that many can go away if you're not actually using them.

> So developers who integrate Speex tend to have plenty of RAM, and once
> one has plenty of RAM one can install a biggish OS. Among biggish OSes
> for Blackfin the most popular choice is uClinux, and uClinux works
> best with GNU tools. Something like that. On the other hand,
> developers who use Blackfin in a manner similar to the traditional
> 16-bit DSP usage model, i.e. without external RAM or with relatively
> small internal SRAM, normally use no OS at all (like me) or ADI's VDK.
> These people naturally prefer the ADI toolchain because it gives you
> good visibility of what's going on within a small "bare metal" target.
> But such developers are less likely to integrate Speex because it
> simply doesn't fit.

What do they use? I don't think Speex is really much more expensive than other codecs when you compare apples to apples (e.g. if you compare with G.729, then first disable anything that isn't used by the 8 kbps mode).

> I guess I am one of the few who try to run Speex entirely from
> internal RAM, and that has already forced me to move from BF531 to
> BF533.

That's an interesting exercise indeed.

> Just about everything fails. Some things fail during compilation, the
> rest during the final assembling pass. Thinking about it, the problem
> is probably not in the asm syntax, but in the way in which the
> compiler treats the asm keyword.
> For example, for the following function:
> -----
> static inline spx_word16_t MAX16(spx_word16_t a, spx_word16_t b)
> {
>    spx_word32_t res;
>    __asm__ (
>       "%1 = %1.L (X);\n\t"
>       "%2 = %2.L (X);\n\t"
>       "%0 = MAX(%1,%2);"
>       : "=d" (res)
>       : "%d" (a), "d" (b)
>    );
>    return res;
> }
> ---
> the compiler says:
> "libspeex\fixed_bfin.h", line 48: cc1101: error: invalid constraint in asm statement
>    : "%d" (a), "d" (b)
>      ^
>
> The following modification compiled successfully:
>    : "d" (a), "d" (b)
>
> Similarly, the compiler doesn't understand the following line:
>    : "=m" (res)
> It claims that m is not a valid constraint. Looking into the manual
> (including the GNU manual), I agree with the compiler.

BTW, gcc accepts these constraints fine. It's been too long, so I don't quite remember how all of that worked, though (IIRC, the % means "input may share a register with output"). What happens if you make all the changes needed to make it compile? Does it run fine? I don't have VisualDSP++, so it's hard to help with exact constraints.

>> BTW, when you say it's slow, can you be more precise? What
>> performance do you expect and what do you get? Using gcc, I think
>> David got it down to ~20 MIPS at 15 kbps, so I assume VisualDSP++
>> should be able to do better than that.
>
> I talked to David. He got 22 MIPS _with_ inline asm. I am getting
> around 34 MIPS for exactly the same mode (15 kbps, complexity=1,
> vbr=off) without inline asm. I don't know the scores for gcc without
> asm, so I can't tell whether the ADI compiler is better. It's surely
> not better than your assembler.

IIRC, gcc alone (no asm) was using something on the order of 100 MIPS (back when it couldn't do hardware loops, MACs, conditional moves, ...), so as you can see, there's a fair bit of difference. So yes, with the assembly working, VDSP++ should be able to achieve better than 20 MIPS.

Jean-Marc
Jean-Marc Valin
2007-Jun-14 14:19 UTC
[Speex-dev] Blackfin inline assembler and VisualDSP++ toolchain
> I talked to David. He got 22 MIPS _with_ inline asm. I am getting
> around 34 MIPS for exactly the same mode (15 kbps, complexity=1,
> vbr=off) without inline asm. I don't know the scores for gcc without
> asm, so I can't tell whether the ADI compiler is better. It's surely
> not better than your assembler.

Note, David wasn't using the instruction SRAM (only the I1 cache), and I suspect that this was becoming the bottleneck (I1 cache too small for both Speex and the OS). If you get everything into SRAM, I'd expect a significant improvement.

Jean-Marc
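To compare the MIPS figures in this thread against a profiler's cycle counts, a conversion along the following lines may help. This is a rough sketch under two assumptions: the narrowband codec processes 20 ms frames (50 per second), and "MIPS" is used in the usual DSP sense of millions of core cycles consumed per second. The helper name is made up for the example.

```c
/* 20 ms frames give 50 frames per second (assumption stated above). */
#define FRAMES_PER_SECOND 50

/* Hypothetical helper: turn a measured cycle count for one frame
   into the processor load in millions of cycles per second. */
static double mips_from_cycles_per_frame(unsigned long cycles_per_frame)
{
   return (double)cycles_per_frame * FRAMES_PER_SECOND / 1e6;
}
```

Under these assumptions, 34 MIPS corresponds to about 680,000 cycles per frame and 22 MIPS to about 440,000, so the gap under discussion is on the order of 240,000 cycles per frame.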