Displaying 19 results from an estimated 19 matches for "in7".
2017 Feb 06
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
...corr_QC[ order ] += silk_RSHIFT64( silk_SMULL( tmp1_QS, state_QS[
0 ] ), 2 * QS - QC );
}
in which corr_QC[0, 1, ..., order] is the only output.
Suppose order = 10, and each stage of the inner loop is denoted by s0, s1,
..., s9. Suppose also that we process 8 inputs simultaneously in SIMD, from
in0 to in7. Let PROC(inx(sy)) denote processing input[x] at stage y.
If there is no dependency between inx(sy) and in(x+1)(sy), then we can do
this:
FOR in=0 TO N WITH in+=8
  FOR y=0 TO order-1 WITH y++
    PROC(in0(sy) in1(sy) in2(sy) in3(sy) in4(sy) in5(sy) in6(sy) in7(sy))
  END FOR
END FOR
Definitely th...
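The batched loop in the snippet above can be sketched in plain C. This is only an illustration of the loop structure, not the Opus code: proc_stage() is a hypothetical stateless stage, and LANES, ORDER, run_scalar() and run_batched() are names made up for this sketch. Because the toy stage has no dependency between neighbouring inputs, processing 8 inputs in lockstep gives the same result as processing them one at a time.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8   /* inputs handled together, in0..in7 in the snippet */
#define ORDER 10  /* stages s0..s9, as assumed in the snippet */

/* Hypothetical stateless stage: the result depends only on the value
   passed in and on the stage index, never on a neighbouring input. */
static uint32_t proc_stage(uint32_t v, int y)
{
    return v * 3u + (uint32_t)y;
}

/* Reference: one input at a time, all stages in order. */
void run_scalar(const uint32_t *in, int n, uint32_t *out)
{
    assert(n >= 0);
    for (int x = 0; x < n; x++) {
        uint32_t v = in[x];
        for (int y = 0; y < ORDER; y++) {
            v = proc_stage(v, y);
        }
        out[x] = v;
    }
}

/* Batched: FOR in=0 TO N WITH in+=8, FOR y=0 TO order-1.
   For simplicity n must be a multiple of LANES. */
void run_batched(const uint32_t *in, int n, uint32_t *out)
{
    assert(n >= 0 && n % LANES == 0);
    for (int base = 0; base < n; base += LANES) {
        uint32_t v[LANES];
        for (int l = 0; l < LANES; l++) {
            v[l] = in[base + l];
        }
        for (int y = 0; y < ORDER; y++) {
            /* This lane loop stands in for a single SIMD operation:
               PROC(in0(sy) in1(sy) ... in7(sy)). */
            for (int l = 0; l < LANES; l++) {
                v[l] = proc_stage(v[l], y);
            }
        }
        for (int l = 0; l < LANES; l++) {
            out[base + l] = v[l];
        }
    }
}
```

Calling run_scalar() and run_batched() on the same buffer should fill both output arrays identically. That equality holds only under the no-dependency assumption stated in the snippet.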
2017 Feb 07
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
...in3(s0))
> PROC( in0(s4) in1(s3) in2(s2) in3(s1) in4(s0))
> PROC( in0(s5) in1(s4) in2(s3) in3(s2) in4(s1) in5(s0))
> PROC( in0(s6) in1(s5) in2(s4) in3(s3) in4(s2) in5(s1) in6(s0))
> PROC(in0(s7) in1(s6) in2(s5) in3(s4) in4(s3) in5(s2) in6(s1) in7(s0))
> PROC(in1(s7) in2(s6) in3(s5) in4(s4) in5(s3) in6(s2) in7(s1) in8(s0))
> PROC(in2(s7) in3(s6) in4(s5) in5(s4) in6(s3) in7(s2) in8(s1) in9(s0))
> PROC(in3(s7) in4(s6) in5(s5) in6(s4) in7(s3) in8(s2) in9(s1)in10(s0))
> PROC(in4(s7) in5(s6) in6(s5) in7(s4) in8(s3) in9(s2)in10(s1)in11...
2017 Feb 07
3
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
...in0(s4) in1(s3) in2(s2) in3(s1) in4(s0))
> > PROC( in0(s5) in1(s4) in2(s3) in3(s2) in4(s1) in5(s0))
> > PROC( in0(s6) in1(s5) in2(s4) in3(s3) in4(s2) in5(s1) in6(s0))
> > PROC(in0(s7) in1(s6) in2(s5) in3(s4) in4(s3) in5(s2) in6(s1) in7(s0))
> > PROC(in1(s7) in2(s6) in3(s5) in4(s4) in5(s3) in6(s2) in7(s1) in8(s0))
> > PROC(in2(s7) in3(s6) in4(s5) in5(s4) in6(s3) in7(s2) in8(s1) in9(s0))
> > PROC(in3(s7) in4(s6) in5(s5) in6(s4) in7(s3) in8(s2) in9(s1)in10(s0))
> > PROC(in4(s7) in5(s6) in6(s5)...
2017 Feb 06
0
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
...s3) in1(s2) in2(s1) in3(s0))
PROC( in0(s4) in1(s3) in2(s2) in3(s1) in4(s0))
PROC( in0(s5) in1(s4) in2(s3) in3(s2) in4(s1) in5(s0))
PROC( in0(s6) in1(s5) in2(s4) in3(s3) in4(s2) in5(s1) in6(s0))
PROC(in0(s7) in1(s6) in2(s5) in3(s4) in4(s3) in5(s2) in6(s1) in7(s0))
PROC(in1(s7) in2(s6) in3(s5) in4(s4) in5(s3) in6(s2) in7(s1) in8(s0))
PROC(in2(s7) in3(s6) in4(s5) in5(s4) in6(s3) in7(s2) in8(s1) in9(s0))
PROC(in3(s7) in4(s6) in5(s5) in6(s4) in7(s3) in8(s2) in9(s1)in10(s0))
PROC(in4(s7) in5(s6) in6(s5) in7(s4) in8(s3) in9(s2)in10(s1)in11(s0))
...and so on u...
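In every PROC row above that is shown in full, the input index and the stage index of each lane add up to the same number, so step t handles input t-y at stage y. The following C sketch, under stated assumptions, shows why such a staggered schedule is safe when each stage keeps state from the previous input. toy_stage() is a hypothetical stage, NUM_STAGES is set to 8 to match the widest rows shown, and run_sequential() and run_wavefront() are names made up for this sketch; none of this is the Opus kernel.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_STAGES 8  /* assumption: one stage per lane, s0..s7 */

/* Hypothetical stateful stage: the output mixes the incoming value with
   the value this stage saw for the previous input. */
static uint32_t toy_stage(uint32_t *state, uint32_t in)
{
    uint32_t out = *state + (in >> 1);
    *state = in;
    return out;
}

/* Reference order: each input runs through all stages before the next. */
void run_sequential(const uint32_t *in, int n, uint32_t *acc)
{
    uint32_t state[NUM_STAGES] = {0};
    assert(n >= 0);
    for (int y = 0; y < NUM_STAGES; y++) acc[y] = 0;
    for (int x = 0; x < n; x++) {
        uint32_t v = in[x];
        for (int y = 0; y < NUM_STAGES; y++) {
            v = toy_stage(&state[y], v);
            acc[y] += v;  /* per-stage accumulator */
        }
    }
}

/* Staggered order: at step t, lane y processes input t - y at stage y. */
void run_wavefront(const uint32_t *in, int n, uint32_t *acc)
{
    uint32_t state[NUM_STAGES] = {0};
    uint32_t pipe[NUM_STAGES] = {0};  /* pipe[y]: value waiting for stage y */
    assert(n >= 0);
    for (int y = 0; y < NUM_STAGES; y++) acc[y] = 0;
    for (int t = 0; t < n + NUM_STAGES - 1; t++) {
        /* Lanes are visited from the last stage to the first so that each
           lane reads the value produced in the previous step; a SIMD
           version would update all lanes at once. */
        for (int y = NUM_STAGES - 1; y >= 0; y--) {
            int x = t - y;
            if (x < 0 || x >= n) continue;  /* ramp-up and ramp-down */
            uint32_t v = (y == 0) ? in[x] : pipe[y];
            uint32_t out = toy_stage(&state[y], v);
            acc[y] += out;
            if (y + 1 < NUM_STAGES) pipe[y + 1] = out;
        }
    }
}
```

Both functions should produce the same accumulators for any input, because in(x)(sy) only needs in(x-1)(sy) and in(x)(s(y-1)), and both were finished in the previous step.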
2017 Apr 05
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
...>> > PROC( in0(s5) in1(s4) in2(s3) in3(s2) in4(s1)
>>> in5(s0))
>>> > PROC( in0(s6) in1(s5) in2(s4) in3(s3) in4(s2) in5(s1)
>>> in6(s0))
>>> > PROC(in0(s7) in1(s6) in2(s5) in3(s4) in4(s3) in5(s2) in6(s1)
>>> in7(s0))
>>> > PROC(in1(s7) in2(s6) in3(s5) in4(s4) in5(s3) in6(s2) in7(s1)
>>> in8(s0))
>>> > PROC(in2(s7) in3(s6) in4(s5) in5(s4) in6(s3) in7(s2) in8(s1)
>>> in9(s0))
>>> > PROC(in3(s7) in4(s6) in5(s5) in6(s4) in7(s3) in8(s2)
>>>...
2017 Feb 07
0
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
...PROC( in0(s4) in1(s3) in2(s2) in3(s1) in4(s0))
> PROC( in0(s5) in1(s4) in2(s3) in3(s2) in4(s1) in5(s0))
> PROC( in0(s6) in1(s5) in2(s4) in3(s3) in4(s2) in5(s1) in6(s0))
> PROC(in0(s7) in1(s6) in2(s5) in3(s4) in4(s3) in5(s2) in6(s1) in7(s0))
> PROC(in1(s7) in2(s6) in3(s5) in4(s4) in5(s3) in6(s2) in7(s1) in8(s0))
> PROC(in2(s7) in3(s6) in4(s5) in5(s4) in6(s3) in7(s2) in8(s1) in9(s0))
> PROC(in3(s7) in4(s6) in5(s5) in6(s4) in7(s3) in8(s2) in9(s1)in10(s0))
> PROC(in4(s7) in5(s6) in6(s5) in7(s4) in8(s3) in9...
2017 Apr 03
0
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
...>> in4(s0))
>> > PROC( in0(s5) in1(s4) in2(s3) in3(s2) in4(s1)
>> in5(s0))
>> > PROC( in0(s6) in1(s5) in2(s4) in3(s3) in4(s2) in5(s1)
>> in6(s0))
>> > PROC(in0(s7) in1(s6) in2(s5) in3(s4) in4(s3) in5(s2) in6(s1)
>> in7(s0))
>> > PROC(in1(s7) in2(s6) in3(s5) in4(s4) in5(s3) in6(s2) in7(s1)
>> in8(s0))
>> > PROC(in2(s7) in3(s6) in4(s5) in5(s4) in6(s3) in7(s2) in8(s1)
>> in9(s0))
>> > PROC(in3(s7) in4(s6) in5(s5) in6(s4) in7(s3) in8(s2)
>> in9(s1)in10(s0))
>...
2017 Apr 05
4
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
...in1(s4) in2(s3) in3(s2)
> > in4(s1) in5(s0))
> > > PROC( in0(s6) in1(s5) in2(s4) in3(s3) in4(s2)
> > in5(s1) in6(s0))
> > > PROC(in0(s7) in1(s6) in2(s5) in3(s4) in4(s3) in5(s2)
> > in6(s1) in7(s0))
> > > PROC(in1(s7) in2(s6) in3(s5) in4(s4) in5(s3) in6(s2)
> > in7(s1) in8(s0))
> > > PROC(in2(s7) in3(s6) in4(s5) in5(s4) in6(s3) in7(s2)
> > in8(s1) in9(s0))
> > > PROC(in3(s7) in4(...
2017 Apr 05
0
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
...( in0(s5) in1(s4) in2(s3) in3(s2)
> in4(s1) in5(s0))
> > PROC( in0(s6) in1(s5) in2(s4) in3(s3) in4(s2)
> in5(s1) in6(s0))
> > PROC(in0(s7) in1(s6) in2(s5) in3(s4) in4(s3) in5(s2)
> in6(s1) in7(s0))
> > PROC(in1(s7) in2(s6) in3(s5) in4(s4) in5(s3) in6(s2)
> in7(s1) in8(s0))
> > PROC(in2(s7) in3(s6) in4(s5) in5(s4) in6(s3) in7(s2)
> in8(s1) in9(s0))
> > PROC(in3(s7) in4(s6) in5(s5) in6(s4) in7(s...
2017 Jan 31
6
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi,
Attached is a patch with arm neon optimizations for
silk_warped_autocorrelation_FIX(). Please review.
Thanks,
Felicia
2017 Apr 06
0
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
...s2)
> > in4(s1) in5(s0))
> > > PROC( in0(s6) in1(s5) in2(s4) in3(s3) in4(s2)
> > in5(s1) in6(s0))
> > > PROC(in0(s7) in1(s6) in2(s5) in3(s4) in4(s3) in5(s2)
> > in6(s1) in7(s0))
> > > PROC(in1(s7) in2(s6) in3(s5) in4(s4) in5(s3) in6(s2)
> > in7(s1) in8(s0))
> > > PROC(in2(s7) in3(s6) in4(s5) in5(s4) in6(s3) in7(s2)
> > in8(s1) in9(s0))
> > >...
2008 Mar 31
1
[03/15][PATCH] kvm/ia64: Add header files for kvm/ia64. V8
...al_retval ret;
> +} pal_call_t;
> +
> +/* Sal data structure */
> +typedef struct sal_call{
and again...
> + /*In area*/
> + uint64_t in0;
> + uint64_t in1;
> + uint64_t in2;
> + uint64_t in3;
> + uint64_t in4;
> + uint64_t in5;
> + uint64_t in6;
> + uint64_t in7;
> + /*Our area*/
> + struct sal_ret_values ret;
> +} sal_call_t;
2008 Feb 25
6
[PATCH 0/4] ia64/xen: paravirtualization of hand written assembly code
Hi. The patch I sent before was too large, so it was dropped from
the mailing list. I'm sending it again in a smaller size.
This patch set is the Xen paravirtualization of hand-written assembly
code, and I expect that much cleanup is necessary before merge.
We really need feedback before starting the actual cleanup, as Eddie
already said.
Eddie discussed how to clean up and suggested
2008 Feb 26
8
[PATCH 0/8] RFC: ia64/xen TAKE 2: paravirtualization of hand written assembly code
Hi. I rewrote the patch according to the comments. I adopted in-place
code generation because it looks like the quickest way.
The point Eddie wanted to discuss is how to generate code and its ABI,
i.e. in-place generation vs. direct jump vs. indirect function call.
An indirect function call doesn't make sense because ivt.S is compiled
multiple times. And it is up to pv instances to choose in-place
2008 Mar 05
51
[PATCH 00/50] ia64/xen take 3: ia64/xen domU paravirtualization
Hi. This patchset implements xen/ia64 domU support.
Qing He and Eddie Dong have also been working on pv_ops, so
I want to discuss before going further and avoid duplicated work.
I suppose that Eddie will also post his own patch. By reviewing both
patches, we can reach a better pv_ops interface.
- I didn't change the ia64 intrinsic paravirtualization ABI from
the last post. Presumably it