Displaying 20 results from an estimated 3000 matches similar to: "merging issue........."
2017 Feb 06
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Jean-Marc,
Thanks a lot for reviewing this huge assembly function!
silk_warped_autocorrelation_FIX_c()'s kernel part is
for( n = 0; n < length; n++ ) {
tmp1_QS = silk_LSHIFT32( (opus_int32)input[ n ], QS );
/* Loop over allpass sections */
for( i = 0; i < order; i++ ) {
/* Output of allpass section */
tmp2_QS = silk_SMLAWB(
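The quoted kernel is cut off in this excerpt. As a rough illustration only, here is a floating-point sketch of how a warped autocorrelation over a chain of first-order allpass sections can be computed. It is not the Opus fixed-point code: the function name, the parameter names, and the use of plain floats instead of the Q-format shifts and multiply-accumulate macros are all assumptions made for this example.

```python
def warped_autocorrelation(signal, warping, order):
    """Floating-point sketch (hypothetical helper, not the Opus API).

    Returns order + 1 correlation values, one per warped lag.
    """
    state = [0.0] * (order + 1)  # memory of the allpass chain
    corr = [0.0] * (order + 1)   # accumulated correlation per lag
    for sample in signal:
        tmp1 = float(sample)
        # Loop over allpass sections
        for i in range(order):
            # Output of allpass section i
            tmp2 = state[i] + warping * (state[i + 1] - tmp1)
            state[i] = tmp1
            corr[i] += state[0] * tmp1
            tmp1 = tmp2
        state[order] = tmp1
        corr[order] += state[0] * tmp1
    return corr


# With warping = 0 every allpass section is a pure unit delay, so the
# result reduces to the ordinary (unwarped) autocorrelation.
plain = warped_autocorrelation([1, 2, 3], 0.0, 2)
```

With a warping coefficient of zero the sketch degenerates to a plain delay line, which gives a simple sanity check against a hand-computed autocorrelation.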
2017 Feb 07
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
This is a great idea. But the order (psEncC->shapingLPCOrder) can be
configured to 12, 14, 16, 20 and 24 according to the complexity parameter.
It's hard to get a universal function to handle all these orders
efficiently. Any suggestions?
Thanks,
Linfeng
On Mon, Feb 6, 2017 at 12:40 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote:
> Hi Linfeng,
>
> On 06/02/17 02:51 PM,
2017 Feb 07
3
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Jean-Marc,
Thanks for your suggestions. Will get back to you once we have some updates.
Linfeng
On Mon, Feb 6, 2017 at 5:47 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote:
> Hi Linfeng,
>
> On 06/02/17 07:18 PM, Linfeng Zhang wrote:
> > This is a great idea. But the order (psEncC->shapingLPCOrder) can be
> > configured to 12, 14, 16, 20 and 24 according to
2017 Apr 05
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
I attached a new patch with a small cleanup (the disassembly is identical to
that of the last patch). We have done the same internal testing as usual.
Also attached are two unsuccessful temporary versions that try to reduce code
size (for code review reference only).
The new patch of silk_warped_autocorrelation_FIX_neon() has a code size of
3,228 bytes (with gcc).
smaller_slower.c has a code size of 2,304
2016 Aug 29
2
GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
Hello everyone,
I think I have found a GVN / alias analysis related bug, but before
opening an issue on the tracker I wanted to see if I am missing something.
I have the following testcase:
define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32* %out) {
> entry:
> ; Just some temporary storage
> %tmp.0 = alloca i32
> %tmp.1 = alloca i32
> %tmp.i =
2017 Apr 05
4
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Thanks, Jean-Marc!
The speedup percentages are all relative to the entire encoder.
Compared to master, this optimization patch speeds up the fixed-point SILK
encoder on NEON as follows: Complexity 5: 6.1% Complexity 6: 5.8%
Complexity 8: 5.5% Complexity 10: 4.0%
when testing on an Acer Chromebook, ARMv7 Processor rev 3 (v7l), CPU max
MHz: 2116.5
Thanks,
Linfeng
On Wed, Apr 5, 2017 at 11:02 AM,
2016 Aug 29
2
GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
this is definitely a bug in AA.
for (auto I = CS2.arg_begin(), E = CS2.arg_end(); I != E; ++I) {
  const Value *Arg = *I;
  if (!Arg->getType()->isPointerTy())
    continue;  // marked "->" in the original listing
  unsigned CS2ArgIdx = std::distance(CS2.arg_begin(), I);
  auto CS2ArgLoc = MemoryLocation::getForArgument(CS2, CS2ArgIdx, TLI);
2017 Jan 31
6
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi,
Attached is a patch with ARM NEON optimizations for
silk_warped_autocorrelation_FIX(). Please review.
Thanks,
Felicia
2016 Aug 29
2
GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
+ a few others.
After following this rabbit hole a bit, there are a lot of mutually
recursive calls, etc, that may or may not do the right thing with vectors
of pointers.
I can fix *this* particular bug with the attached patch.
However, it's mostly papering over stuff. Nothing seems to know what to do
with a memorylocation that is a vector of pointers. They all expect
memorylocation to be a
2016 Aug 29
2
GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
Okay, so then it sounds like, for now, the right fix is to stop marking
masked.gather and masked.scatter with intrarg* options.
On Mon, Aug 29, 2016, 1:26 PM Philip Reames <listmail at philipreames.com>
wrote:
> We might have a specification bug here, but we appear to implement what we
> specified. argmemonly is specified as only considering pointer typed
> arguments. It's
2013 Feb 14
2
[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates
Hello,
While investigating one of the existing tests
(test/CodeGen/X86/tailcallpic2.ll), I ran into IR that produces some
interesting code. The IR is very straightforward:
define protected fastcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 %a4) {
entry:
ret i32 %a3
}
define fastcc i32 @tailcaller(i32 %in1, i32 %in2) {
entry:
%tmp11 = tail call fastcc i32 @tailcallee( i32 %in1, i32 %in2, i32
2016 Aug 30
2
GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
----- Original Message -----
> From: "Daniel Berlin" <dberlin at dberlin.org>
> To: "Philip Reames" <listmail at philipreames.com>, "Davide Italiano"
> <davide at freebsd.org>, "Chandler Carruth" <chandlerc at gmail.com>
> Cc: "Chris Sakalis" <chrissakalis at gmail.com>, "David Majnemer"
>
2016 Aug 31
2
GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
Thank you for the quick fix, I can no longer reproduce the issue. As far as
releases go, I am guessing that this is going to be in 4.0?
Best,
Chris
On Tue, Aug 30, 2016 at 9:26 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
> Yeah, i just hope it doesn't regress scatter/gather vector code badly.
> But at least it's correct now?
>
>
> On Tue, Aug 30, 2016 at 1:11
2016 Aug 31
2
GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
Great, thank you!
On Wed, Aug 31, 2016 at 2:07 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> ------------------------------
>
> *From: *"Chris Sakalis" <chrissakalis at gmail.com>
> *To: *"Daniel Berlin" <dberlin at dberlin.org>
> *Cc: *"Hal Finkel" <hfinkel at anl.gov>, "David Majnemer" <
> david.majnemer
2018 Apr 03
4
SCEV and LoopStrengthReduction Formulae
I am attempting to implement a minor loop strength reduction optimization for
targets that support compare and jump fusion, specifically
TTI::canMacroFuseCmp(). My approach might be wrong; however, I am soliciting
the idea for feedback, so that I can implement this correctly. My plan is to
add a Supplemental LSR formula to LoopStrengthReduce.cpp that optimizes the
following case, but perhaps
2015 Dec 01
11
[PATCH 1/6] x86: Add VMWare Host Communication Macros
These macros will be used by multiple VMWare modules for handling
host communication.
v2:
* Keeping only the minimal common platform defines
* added vmware_platform() check function
v3:
* Added new field to handle different hypervisor magic values
Signed-off-by: Sinclair Yeh <syeh at vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom at vmware.com>
Reviewed-by: Alok N Kataria
2013 Feb 15
2
[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates
>> While investigating one of the existing tests
>> (test/CodeGen/X86/tailcallpic2.ll), I ran into IR that produces some
>> interesting code. The IR is very straightforward:
>>
>> define protected fastcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32
>> %a4) {
>> entry:
>> ret i32 %a3
>> }
>>
>> define fastcc i32 @tailcaller(i32
2011 Nov 24
4
I cannot get species scores to plot with site scores in MDS when I use a distance matrix as input. Problems with NA's?
Hi, First I should note I am relatively new to R so I would appreciate answers that take this into account.
I am trying to perform an MDS ordination using the function "metaMDS" of the "vegan" package. I want to ordinate species according to a set of functional traits. "Species" here refers to "sites" in traditional vegetation analyses while "traits" here correspond to "species" in such
2013 Feb 15
0
[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates
Hey Eli,
On Thu, Feb 14, 2013 at 5:45 PM, Eli Bendersky <eliben at google.com> wrote:
> Hello,
>
> While investigating one of the existing tests
> (test/CodeGen/X86/tailcallpic2.ll), I ran into IR that produces some
> interesting code. The IR is very straightforward:
>
> define protected fastcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32
> %a4) {
> entry:
>