thr3ads.net - search: "in2"

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Feb 06

2

..., ..., s9. And suppose we simultaneously process 8 inputs in SIMD, from in0 to in7. Let PROC(inx(sy)) denote processing input[x] at stage y. If there is no dependency between inx(sy) and in(x+1)(sy), then we can do this:

FOR in=0 TO N WITH in+=8
  FOR y=0 TO order-1 WITH y++
    PROC(in0(sy) in1(sy) in2(sy) in3(sy) in4(sy) in5(sy) in6(sy) in7(sy))
  END FOR
END FOR

In that case no prolog or epilog is needed. However, the critical thing is that all the states in each stage when processing input[i] are reused by the next input[i+1]. That is, input[i+1] must wait one stage for input[i], and i...

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Feb 07

2

..."order" in chunks of N. Using your notation, you would
> be doing:
>
> PROC( in0(s0))
> PROC( in0(s1) in1(s0))
> PROC( in0(s2) in1(s1) in2(s0))
> PROC( in0(s3) in1(s2) in2(s1) in3(s0))
> PROC( in0(s4) in1(s3) in2(s2) in3(s1) in4(s0))
> PROC( in0(s5) in1(s4) in2(s3) in3(s2) in4(s1) in5(s0))
> PROC( in0(s6) in1(s5) in2(s4) in3(s3) in4(s2) in5(s1) in6...

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Feb 07

3

...tation, you
> would
> > be doing:
> >
> > PROC( in0(s0))
> > PROC( in0(s1) in1(s0))
> > PROC( in0(s2) in1(s1) in2(s0))
> > PROC( in0(s3) in1(s2) in2(s1) in3(s0))
> > PROC( in0(s4) in1(s3) in2(s2) in3(s1) in4(s0))
> > PROC( in0(s5) in1(s4) in2(s3) in3(s2) in4(s1) in5(s0))
> > PROC( in0(s6) in1(s5)...

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Feb 06

0

...is to instead chop the "order" in chunks of N. Using your notation, you would be doing:

PROC( in0(s0))
PROC( in0(s1) in1(s0))
PROC( in0(s2) in1(s1) in2(s0))
PROC( in0(s3) in1(s2) in2(s1) in3(s0))
PROC( in0(s4) in1(s3) in2(s2) in3(s1) in4(s0))
PROC( in0(s5) in1(s4) in2(s3) in3(s2) in4(s1) in5(s0))
PROC( in0(s6) in1(s5) in2(s4) in3(s3) in4(s2) in5(s1) in6(s0))
PROC(in0(s7) i...
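The skewed schedule in this snippet follows a simple rule: at step t, input i is at stage t - i, provided that stage exists. The Python sketch below only generates and prints that ordering; it does not implement the SILK filter itself. The function names `wavefront_schedule` and `format_step`, and the choice of 8 inputs and 10 stages in the usage line, are illustrative assumptions rather than anything taken from the patch.

```python
def wavefront_schedule(num_inputs, order):
    """Return the skewed schedule as a list of steps.

    Each step is a list of (input_index, stage) pairs that can be
    processed together. Input i lags input i-1 by exactly one stage,
    so at step t input i is at stage t - i.
    """
    steps = []
    for t in range(num_inputs + order - 1):
        step = [(i, t - i) for i in range(num_inputs) if 0 <= t - i < order]
        steps.append(step)
    return steps


def format_step(step):
    """Render one step in the PROC(...) notation used in the thread."""
    return "PROC(" + " ".join(f"in{i}(s{s})" for i, s in step) + ")"


if __name__ == "__main__":
    # Hypothetical sizes: 8 inputs, 10 stages (s0..s9).
    for step in wavefront_schedule(8, 10):
        print(format_step(step))
```

The first and last few steps process fewer than `num_inputs` items, which corresponds to the ramp-up and ramp-down that the thread refers to as prolog and epilog.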

DAHDI with (CDR(userfield)

2013 Nov 14

1

...and incoming the IVR was thinking about using the userfield field, which is what I'm trying to do. At the moment I have 4 DAHDI channels:

; DAHDI CHANNEL 3=23XXXXX6
context=in
callerid=asreceived
group=1
signalling=fxs_ks
channel => 3

[in]
exten => s,1,Set(CDR(userfield)=23XXXXX6)
same=> n,Goto(in2)

[in2]
exten => s,1,GotoIfTime(08:00-17:00|mon-fri|*|*?s,dentro)
exten => s,2,Playback(custom/fuera)
exten => s,n,Set(CHANNEL(language)=es)
etc etc etc ..

-- Starting simple switch on 'DAHDI/3-1'
-- Executing [s@in:1] Set("DAHDI/3-1", "CDR(userfield)=23XXXX...

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Apr 05

2

...gt;>> > be doing: >>> > >>> > PROC( >>> in0(s0)) >>> > PROC( in0(s1) >>> in1(s0)) >>> > PROC( in0(s2) in1(s1) >>> in2(s0)) >>> > PROC( in0(s3) in1(s2) in2(s1) >>> in3(s0)) >>> > PROC( in0(s4) in1(s3) in2(s2) in3(s1) >>> in4(s0)) >>> > PROC( in0(s5) in1(s4) in2(s3) in3(s2) in4(s1) &...

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

2016 Aug 29

2

Hello everyone,

I think I have found a GVN / alias analysis related bug, but before opening an issue on the tracker I wanted to see if I am missing something. I have the following testcase:

define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32* %out) {
> entry:
> ; Just some temporary storage
> %tmp.0 = alloca i32
> %tmp.1 = alloca i32
> %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0
> %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1
> ; Read from in1 and in2
>...

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Feb 07

0

...in chunks of N. Using your notation, you would
> be doing:
>
> PROC( in0(s0))
> PROC( in0(s1) in1(s0))
> PROC( in0(s2) in1(s1) in2(s0))
> PROC( in0(s3) in1(s2) in2(s1) in3(s0))
> PROC( in0(s4) in1(s3) in2(s2) in3(s1) in4(s0))
> PROC( in0(s5) in1(s4) in2(s3) in3(s2) in4(s1) in5(s0))
> PROC( in0(s6) in1(s5) in2(s4) in3(s3) in4...

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Apr 03

0

...r notation, you
>> would
>> > be doing:
>> >
>> > PROC(
>> in0(s0))
>> > PROC( in0(s1)
>> in1(s0))
>> > PROC( in0(s2) in1(s1)
>> in2(s0))
>> > PROC( in0(s3) in1(s2) in2(s1)
>> in3(s0))
>> > PROC( in0(s4) in1(s3) in2(s2) in3(s1)
>> in4(s0))
>> > PROC( in0(s5) in1(s4) in2(s3) in3(s2) in4(s1)
>> in5(s0))
>...

SCEV and LoopStrengthReduction Formulae

2018 Apr 07

0

...o x86 SIMD optimization for a living, I did similar tricks pretty much everywhere in DSP functions. It’d be pretty nice if the compiler could do it too. There is one alternate approach that I recall, which looks like this:

Original code (example, pseudocode):

int add_delta_256(uint8 *in1, uint8 *in2) {
  int accum = 0;
  for (int i = 0; i < 16; ++i) {
    uint8x16 a = load16(in1 + i *16); // NOTE: takes an extra addressing op because x86
    uint8x16 b = load16(in2 + i *16); // NOTE: takes an extra addressing op because x86
    accum += psadbw(a, b);
  }
  return accum;
}

end of loop: inc i c...
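For readers unfamiliar with `psadbw`, the pseudocode in this snippet accumulates a sum of absolute differences over 256 bytes, 16 bytes at a time. The Python sketch below is a plain scalar reference for that result. It treats `psadbw` as returning a single total per 16-byte block, as the pseudocode does, and says nothing about the addressing-mode point under discussion. The helper name `sad16` is an assumption introduced here for illustration.

```python
def sad16(a: bytes, b: bytes) -> int:
    """Scalar stand-in for the pseudocode's psadbw(a, b):
    sum of absolute differences over 16 unsigned bytes."""
    if len(a) != 16 or len(b) != 16:
        raise ValueError("sad16 expects two 16-byte blocks")
    return sum(abs(x - y) for x, y in zip(a, b))


def add_delta_256(in1: bytes, in2: bytes) -> int:
    """Reference version of the loop: 16 iterations of 16 bytes each."""
    accum = 0
    for i in range(16):
        a = in1[i * 16:(i + 1) * 16]  # load16(in1 + i * 16)
        b = in2[i * 16:(i + 1) * 16]  # load16(in2 + i * 16)
        accum += sad16(a, b)
    return accum


if __name__ == "__main__":
    # Every byte differs by 2, so the total is 256 * 2.
    print(add_delta_256(bytes([3]) * 256, bytes([1]) * 256))
```

A scalar reference like this can be handy for checking a hand-written SIMD version against known inputs.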

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Apr 05

4

...e doing:
> > >
> > > PROC(
> > in0(s0))
> > > PROC(
> > in0(s1) in1(s0))
> > > PROC( in0(s2)
> > in1(s1) in2(s0))
> > > PROC( in0(s3) in1(s2)
> > in2(s1) in3(s0))
> > > PROC( in0(s4) in1(s3) in2(s2)
> > in3(s1) in4(s0))
> > > PROC(...

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Apr 05

0

...(
> in0(s0))
> > PROC(
> in0(s1) in1(s0))
> > PROC( in0(s2)
> in1(s1) in2(s0))
> > PROC( in0(s3) in1(s2)
> in2(s1) in3(s0))
> > PROC( in0(s4) in1(s3) in2(s2)
> in3(s1) in4(s0))
> > PROC( in0(s5) in1(s4) in2(s...

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

2016 Aug 29

2

...> I think I have found a GVN / alias analysis related bug, but before
> opening
> > an issue on the tracker I wanted to see if I am missing something. I have
> > the following testcase:
> >
> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32*
> %out)
> >> {
> >> entry:
> >> ; Just some temporary storage
> >> %tmp.0 = alloca i32
> >> %tmp.1 = alloca i32
> >> %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0
> >> %tmp = insertelement <2 x...

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

2016 Aug 29

2

...ated bug, but before
>>> opening
>>> > an issue on the tracker I wanted to see if I am missing something. I
>>> have
>>> > the following testcase:
>>> >
>>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32*
>>> %out)
>>> >> {
>>> >> entry:
>>> >> ; Just some temporary storage
>>> >> %tmp.0 = alloca i32
>>> >> %tmp.1 = alloca i32
>>> >> %tmp.i = insertelement <2 x i32*> undef, i32* %tm...

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

2016 Aug 29

2

...>>>> opening
>>>> > an issue on the tracker I wanted to see if I am missing something. I
>>>> have
>>>> > the following testcase:
>>>> >
>>>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32*
>>>> %out)
>>>> >> {
>>>> >> entry:
>>>> >> ; Just some temporary storage
>>>> >> %tmp.0 = alloca i32
>>>> >> %tmp.1 = alloca i32
>>>> >> %tmp.i = insertelement <...

Evaluating a multivariable function XXXX

2011 May 07

1

...)}
> >
> > fn2(-5,-2,3)
> [1] 0.8
> >
>
> No problems.
>
> ===
>
> If, however, I call the function using a vector substitution for the
> arguments, R sees this as 3 separate calls to the function while supplying
> only the first argument:
>
> > in2<-c(-5,-2,3)
> > in2
> [1] -5 -2 3
> >
> > fn2(in2)
> Error in fn2(in2) : argument "y" is missing, with no default
>
> ===
>
> How should I call the function using the vector substitution method so that
> R sees that this is a single call to the f...

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Apr 06

0

...>
> > > PROC(
> > in0(s0))
> > > PROC(
> > in0(s1) in1(s0))
> > > PROC( in0(s2)
> > in1(s1) in2(s0))
> > > PROC( in0(s3) in1(s2)
> > in2(s1) in3(s0))
> > > PROC( in0(s4) in1(s3) in2(s2)
> > in3(s1) in4(s0))
> > >...

[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates

2013 Feb 14

2

...tigating one of the existing tests (test/CodeGen/X86/tailcallpic2.ll), I ran into IR that produces some interesting code. The IR is very straightforward:

define protected fastcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 %a4) {
entry:
  ret i32 %a3
}

define fastcc i32 @tailcaller(i32 %in1, i32 %in2) {
entry:
  %tmp11 = tail call fastcc i32 @tailcallee( i32 %in1, i32 %in2, i32 %in1, i32 %in2)
  ret i32 %tmp11
}

define i32 @foo(i32 %in1, i32 %in2) {
entry:
  %q = call fastcc i32 @tailcaller(i32 %in2, i32 %in1)
  %ww = sub i32 %q, 6
  ret i32 %ww
}

Built with (ToT LLVM): llc < ~/temp/z.ll -mar...

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

2016 Aug 30

2

...> > > > > > > > > > > > > > > > > > > > > > > > > >> define spir_kernel void @test(<2 x i32*> %in1, <2 x > > > > > > > >> i32*> > > > > > > > >> %in2, > > > > > > > >> i32* %out) > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> { > > > > > > > > > > > > > > > > >...

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

2016 Aug 31

2

...ssue on the tracker I wanted to see if I am missing something.
>>>>>>> I have
>>>>>>> > the following testcase:
>>>>>>> >
>>>>>>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2,
>>>>>>> i32* %out)
>>>>>>> >> {
>>>>>>> >> entry:
>>>>>>> >> ; Just some temporary storage
>>>>>>> >> %tmp.0 = alloca i32
>>>>>>> >> %tmp.1 = alloca i32
>>>>>>> >> %tm...
