thr3ads.net - Speex dev - [Speex-dev] Speex on ARM7 [Aug 2007]

Eliso

2007-Aug-24 16:42 UTC

[Speex-dev] Speex on ARM7

Hello,

I'm testing Speex on an embedded board with an ARM7 (Atmel). The ARM7 has no
floating-point unit, so I'm building with FIXED_POINT. Unfortunately, encoding
is about 5 times slower than real time requires.

The ARM7 is slow at 8- and 16-bit operations.

The sequence:

static inline spx_word32_t compute_pitch_error(spx_word16_t *C, spx_word16_t *g,
                                               spx_word16_t pitch_control)
{
   spx_word32_t sum = 0;
   sum = ADD32(sum,MULT16_16(MULT16_16_16(g[0],pitch_control),C[0]));
   sum = ADD32(sum,MULT16_16(MULT16_16_16(g[1],pitch_control),C[1]));
   sum = ADD32(sum,MULT16_16(MULT16_16_16(g[2],pitch_control),C[2]));
   sum = SUB32(sum,MULT16_16(MULT16_16_16(g[0],g[1]),C[3]));
   sum = SUB32(sum,MULT16_16(MULT16_16_16(g[2],g[1]),C[4]));
   sum = SUB32(sum,MULT16_16(MULT16_16_16(g[2],g[0]),C[5]));
   sum = SUB32(sum,MULT16_16(MULT16_16_16(g[0],g[0]),C[6]));
   sum = SUB32(sum,MULT16_16(MULT16_16_16(g[1],g[1]),C[7]));
   sum = SUB32(sum,MULT16_16(MULT16_16_16(g[2],g[2]),C[8]));
   return sum;
}

 

is about 30 times slower than the equivalent operation using 32-bit (int)
arithmetic, shown below.

 

static inline long compute_pitch_errorL(int *C, int *g, int pitch_control)
{
   spx_word32_t sum=0;
   sum+=g[0] * pitch_control * C[0]; // ADD32(sum,MULT16_16(MULT16_16_16(g[0],pitch_control),C[0]));
   sum+=g[1] * pitch_control * C[1]; // ADD32(sum,MULT16_16(MULT16_16_16(g[1],pitch_control),C[1]));
   sum+=g[2] * pitch_control * C[2]; // ADD32(sum,MULT16_16(MULT16_16_16(g[2],pitch_control),C[2]));
   sum-=g[0] * g[1] * C[3]; // SUB32(sum,MULT16_16(MULT16_16_16(g[0],g[1]),C[3]));
   sum-=g[2] * g[1] * C[4]; // SUB32(sum,MULT16_16(MULT16_16_16(g[2],g[1]),C[4]));
   sum-=g[2] * g[0] * C[5]; // SUB32(sum,MULT16_16(MULT16_16_16(g[2],g[0]),C[5]));
   sum-=g[0] * g[0] * C[6]; // SUB32(sum,MULT16_16(MULT16_16_16(g[0],g[0]),C[6]));
   sum-=g[1] * g[1] * C[7]; // SUB32(sum,MULT16_16(MULT16_16_16(g[1],g[1]),C[7]));
   sum-=g[2] * g[2] * C[8]; // SUB32(sum,MULT16_16(MULT16_16_16(g[2],g[2]),C[8]));
   return sum;
}

Not using 16-bit operations seems to be a possible solution. I'd like to know
whether there is an option to build Speex this way, or whether the algorithm
relies on 16-bit operations and cannot easily be converted to 32 bits.
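
One possible reason 16-bit operations cost more on this core, offered as a guess and not a measurement: ARM7TDMI implements ARMv4T, which has no register-to-register sign-extend instruction (SXTH arrived with ARMv6). Each narrowing cast that the compiler cannot prove redundant therefore turns into extra shift instructions. The sketch below shows the same narrowing written with 32-bit arithmetic only; `sext16` is a hypothetical helper name, not something from Speex.

```c
#include <stdint.h>

/* Hypothetical helper (not part of Speex): reduce a 32-bit value to the
   signed 16-bit range using only 32-bit arithmetic. This is the work a
   (spx_word16_t) cast implies whenever the compiler cannot drop it. */
static int32_t sext16(int32_t x)
{
    /* Keep the low 16 bits, flip the sign bit, then subtract its weight.
       All steps are well defined in standard C. */
    return (int32_t)(((uint32_t)x & 0xFFFFu) ^ 0x8000u) - 0x8000;
}
```

Values already inside the 16-bit range pass through unchanged, so the operation is pure overhead when the inputs are known to be small.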

Best regards,

Eliso Cavalli

Planeta Informatica Ltda.
Rua Roxo Moreira, 1178,
Campinas/SP/BRASIL.
CEP 13083-591.
phone: +55 19 32897755
fax: +55 19 32491717


Jean-Marc Valin

2007-Aug-24 17:03 UTC

[Speex-dev] Speex on ARM7

Hi,

I'm quite surprised that doing a*b*c is faster than doing MULT16_16(a,
MULT16_16_16(b, c)) on ARM. Probably your compiler doesn't realise that
it can ignore pretty much all the casts. Are you using some crappy MS
compiler by any chance? (I think gcc is usually smart enough for that.)

In any case, the workaround would be to override MULT16_16 and
MULT16_16_16 so that they just do:

#define MULT16_16(a,b) ((a)*(b))
#define MULT16_16_16(a,b) ((a)*(b))

I think that should solve the problem. Let me know whether that works
(and doesn't have undesirable side effects), and what compiler you're
using.
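
To check for the side effects mentioned above, the two forms can be compared off-target. The following is a minimal sketch under stated assumptions: the `CAST_` macros are modeled on how the generic fixed-point macros are understood to cast their operands to 16 bits and are not copied from the Speex headers; `word16`, `word32` and the `triple_*` functions are hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

typedef int16_t word16;  /* stand-in for spx_word16_t (assumption) */
typedef int32_t word32;  /* stand-in for spx_word32_t (assumption) */

/* Cast-based forms: every operand is narrowed to 16 bits first.
   Narrowing an out-of-range value is implementation-defined in C;
   common compilers wrap modulo 2^16. */
#define CAST_MULT16_16_16(a,b) (((word16)(a)) * ((word16)(b)))
#define CAST_MULT16_16(a,b)    (((word32)(word16)(a)) * ((word32)(word16)(b)))

/* Plain forms, as in the suggested override. */
#define PLAIN_MULT16_16_16(a,b) ((a) * (b))
#define PLAIN_MULT16_16(a,b)    ((a) * (b))

/* One term of compute_pitch_error: (g * p) * c. */
static word32 triple_cast(int g, int p, int c)
{
    return CAST_MULT16_16(CAST_MULT16_16_16(g, p), c);
}

static word32 triple_plain(int g, int p, int c)
{
    return PLAIN_MULT16_16(PLAIN_MULT16_16_16(g, p), c);
}

/* Debug guard: the plain form matches the cast form only while the
   intermediate product g * p still fits in 16 bits. */
static word32 triple_plain_checked(int g, int p, int c)
{
    int gp = g * p;
    assert(gp >= INT16_MIN && gp <= INT16_MAX);
    return triple_plain(g, p, c);
}
```

With g = 100, p = 200, c = 3 the intermediate product is 20000 and both forms agree. With g = 300, p = 200 the intermediate is 60000, which no longer fits in 16 bits, so the cast form truncates it while the plain form does not. Whether such values can occur in the encoder depends on the ranges Speex guarantees for the gains, which this sketch does not establish.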

	Jean-Marc

