thr3ads.net - similar to: "merging issue........."

Displaying 20 results from an estimated 3000 matches similar to: "merging issue........."

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Feb 06

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

Hi Jean-Marc, Thanks a lot for reviewing this huge assembly function! silk_warped_autocorrelation_FIX_c()'s kernel part is for( n = 0; n < length; n++ ) { tmp1_QS = silk_LSHIFT32( (opus_int32)input[ n ], QS ); /* Loop over allpass sections */ for( i = 0; i < order; i++ ) { /* Output of allpass section */ tmp2_QS = silk_SMLAWB(

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Feb 07

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

This is a great idea. But the order (psEncC->shapingLPCOrder) can be configured to 12, 14, 16, 20 and 24 according to complexity parameter. It's hard to get a universal function to handle all these orders efficiently. Any suggestions? Thanks, Linfeng On Mon, Feb 6, 2017 at 12:40 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > Hi Linfeng, > > On 06/02/17 02:51 PM,

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Feb 07

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

Hi Jean-Marc, Thanks for your suggestions. Will get back to you once we have some updates. Linfeng On Mon, Feb 6, 2017 at 5:47 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > Hi Linfeng, > > On 06/02/17 07:18 PM, Linfeng Zhang wrote: > > This is a great idea. But the order (psEncC->shapingLPCOrder) can be > > configured to 12, 14, 16, 20 and 24 according to

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Apr 05

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

I attached a new patch with small cleanup (disassembly is identical as the last patch). We have done the same internal testing as usual. Also, attached 2 failed temporary versions which try to reduce code size (just for code review reference purpose). The new patch of silk_warped_autocorrelation_FIX_neon() has a code size of 3,228 bytes (with gcc). smaller_slower.c has a code size of 2,304

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

2016 Aug 29

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

Hello everyone, I think I have found an gvn / alias analysis related bug, but before opening an issue on the tracker I wanted to see if I am missing something. I have the following testcase: define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32* %out) { > entry: > ; Just some temporary storage > %tmp.0 = alloca i32 > %tmp.1 = alloca i32 > %tmp.i =

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Apr 05

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

Thank Jean-Marc! The speedup percentages are all relative to the entire encoder. Comparing to master, this optimization patch speeds up fixed-point SILK encoder on NEON as following: Complexity 5: 6.1% Complexity 6: 5.8% Complexity 8: 5.5% Complexity 10: 4.0% when testing on an Acer Chromebook, ARMv7 Processor rev 3 (v7l), CPU max MHz: 2116.5 Thanks, Linfeng On Wed, Apr 5, 2017 at 11:02 AM,

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

2016 Aug 29

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

this is definitely a bug in AA. 225 for (auto I = CS2.arg_begin(), E = CS2.arg_end(); I != E; ++I) { 226 const Value *Arg = *I; 227 if (!Arg->getType()->isPointerTy()) -> 228 continue; 229 unsigned CS2ArgIdx = std::distance(CS2.arg_begin(), I); 230 auto CS2ArgLoc = MemoryLocation::getForArgument(CS2, CS2ArgIdx, TLI);

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Jan 31

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

Hi, Attached is a patch with arm neon optimizations for silk_warped_autocorrelation_FIX(). Please review. Thanks, Felicia -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.xiph.org/pipermail/opus/attachments/20170131/9a912bb4/attachment-0001.html> -------------- next part -------------- A non-text attachment was scrubbed... Name:

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

2016 Aug 29

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

+ a few others. After following this rabbit hole a bit, there are a lot of mutually recursive calls, etc, that may or may not do the right thing with vectors of pointers. I can fix *this* particular bug with the attached patch. However, it's mostly papering over stuff. Nothing seems to know what to do with a memorylocation that is a vector of pointers. They all expect memorylocation to be a

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

2016 Aug 29

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

Okay, so then it sounds like, for now, the right fix is to stop marking masked.gather and masked.scatter with intrarg* options. On Mon, Aug 29, 2016, 1:26 PM Philip Reames <listmail at philipreames.com> wrote: > We might have specification bug here, but we appear to implement what we > specified. argmemonly is specified as only considering pointer typed > arguments. It's

[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates

2013 Feb 14

[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates

Hello, While investigating one of the existing tests (test/CodeGen/X86/tailcallpic2.ll), I ran into IR that produces some interesting code. The IR is very straightforward: define protected fastcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 %a4) { entry: ret i32 %a3 } define fastcc i32 @tailcaller(i32 %in1, i32 %in2) { entry: %tmp11 = tail call fastcc i32 @tailcallee( i32 %in1, i32 %in2, i32

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

2016 Aug 30

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

----- Original Message ----- > From: "Daniel Berlin" <dberlin at dberlin.org> > To: "Philip Reames" <listmail at philipreames.com>, "Davide Italiano" > <davide at freebsd.org>, "Chandler Carruth" <chandlerc at gmail.com> > Cc: "Chris Sakalis" <chrissakalis at gmail.com>, "David Majnemer" >

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

2016 Aug 31

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

Thank you for the quick fix, I can no longer reproduce the issue. As far a releases go, I am guessing that this is going to be in 4.0? Best, Chris On Tue, Aug 30, 2016 at 9:26 PM, Daniel Berlin <dberlin at dberlin.org> wrote: > Yeah, i just hope it doesn't regress scatter/gather vector code badly. > But at least it's correct now? > > > On Tue, Aug 30, 2016 at 1:11

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

2016 Aug 31

GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

Great, thank you! On Wed, Aug 31, 2016 at 2:07 PM, Hal Finkel <hfinkel at anl.gov> wrote: > > ------------------------------ > > *From: *"Chris Sakalis" <chrissakalis at gmail.com> > *To: *"Daniel Berlin" <dberlin at dberlin.org> > *Cc: *"Hal Finkel" <hfinkel at anl.gov>, "David Majnemer" < > david.majnemer

SCEV and LoopStrengthReduction Formulae

2018 Apr 03

SCEV and LoopStrengthReduction Formulae

I am attempting to implement a minor loop strength reduction optimization for targets that support compare and jump fusion, specifically TTI::canMacroFuseCmp(). My approach might be wrong; however, I am soliciting the idea for feedback, so that I can implement this correctly. My plan is to add a Supplemental LSR formula to LoopStrengthReduce.cpp that optimizes the following case, but perhaps

[PATCH 1/6] x86: Add VMWare Host Communication Macros

2015 Dec 01

[PATCH 1/6] x86: Add VMWare Host Communication Macros

These macros will be used by multiple VMWare modules for handling host communication. v2: * Keeping only the minimal common platform defines * added vmware_platform() check function v3: * Added new field to handle different hypervisor magic values Signed-off-by: Sinclair Yeh <syeh at vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom at vmware.com> Reviewed-by: Alok N Kataria

[PATCH 1/6] x86: Add VMWare Host Communication Macros

2015 Dec 01

[PATCH 1/6] x86: Add VMWare Host Communication Macros

[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates

2013 Feb 15

[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates

>> While investigating one of the existing tests >> (test/CodeGen/X86/tailcallpic2.ll), I ran into IR that produces some >> interesting code. The IR is very straightforward: >> >> define protected fastcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 >> %a4) { >> entry: >> ret i32 %a3 >> } >> >> define fastcc i32 @tailcaller(i32

I cannot get species scores to plot with site scores in MDS when I use a distance matrix as input. Problems with NA's?

2011 Nov 24

I cannot get species scores to plot with site scores in MDS when I use a distance matrix as input. Problems with NA's?

Hi, First I should note I am relatively new to R so I would appreciate answers that take this into account. I am trying to perform an MDS ordination using the function ?metaMDS? of the ?vegan? package. I want to ordinate species according to a set of functional traits. ?Species? here refers to ?sites? in traditional vegetation analyses while ?traits? here correspond to ?species? in such

[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates

2013 Feb 15

[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates

Hey Eli, On Thu, Feb 14, 2013 at 5:45 PM, Eli Bendersky <eliben at google.com> wrote: > Hello, > > While investigating one of the existing tests > (test/CodeGen/X86/tailcallpic2.ll), I ran into IR that produces some > interesting code. The IR is very straightforward: > > define protected fastcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 > %a4) { > entry: >

similar to: merging issue.........