
Displaying 20 results from an estimated 63 matches for "accumal".

2009 Oct 26
1
[PATCH] Fix miscompile of SSE resampler
From: Thorvald Natvig <slicer at users.sourceforge.net> Some optimizing compilers miscompile the current SSE optimizations when full optimizations are enabled. By using output value pointer instead of a return value, we can bypass this misbehaviour. --- libspeex/resample.c | 8 ++++---- libspeex/resample_sse.h | 24 ++++++++---------------- 2 files changed, 12 insertions(+), 20
2008 May 03
2
Resampler (no api)
.. And a version without the API changes. -------------- next part -------------- Index: libspeex/resample_sse.h =================================================================== --- libspeex/resample_sse.h (revision 0) +++ libspeex/resample_sse.h (revision 0) @@ -0,0 +1,128 @@ +/* Copyright (C) 2002-2008 Jean-Marc Valin + * Copyright (C) 2008 Thorvald Natvig + */ +/** + @file resample_sse.h +
2008 May 03
0
Resampler, memory only variant
Hi, Here's the (hopefully) final version of the resampler, now always using st->mem as the buffer area. It only allocates buffers on the stack when it's necessary to convert the output between int and float. -------------- next part -------------- Index: include/speex/speex_resampler.h =================================================================== ---
2018 Apr 07
0
SCEV and LoopStrengthReduction Formulae
> > I realize this is a micro-op saving a single cycle. But this reduces the instruction count, one less > instr to decode in a potentially hot path. If this all makes sense, and seems like a reasonable addition > to llvm, would it make sense to implement this as a supplemental LSR formula, or as a separate pass? This seems reasonable to me so long as rbx has no other uses that
2009 Sep 11
2
Accumulating results from "for" loop in a list/array
Dear R users, I would like to accumulate objects generated from 'for' loop to a list or array. To illustrate the problem, arbitrary data set and script is shown below, x <- data.frame(a = c(rep("n",3),rep("y",2),rep("n",3),rep("y",2)), b = c(rep("y",2),rep("n",4),rep("y",3),"n"), c =
2012 Dec 10
3
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
Hello all, I wanted to get some feedback on this patch for ScalarEvolution. It addresses a performance problem I am seeing for simple benchmark. Starting with this C code: 01: signed char foo(void) 02: { 03: const int count = 8000; 04: signed char result = 0; 05: int j; 06: 07: for (j = 0; j < count; ++j) { 08: result += (result_t)(3); 09: } 10: 11: return result; 12: } I
2014 Jan 02
0
[PATCH] drm/nvc0-: Fix voltage obtained from vbios.
Coefficients are based on the formula: uV = 0.1 * arg[0] + 150.5 * arg[1] + 22.65025 * arg[2] It seems to be rounded downwards. I have no idea why the voltage isn't specified in the bios directly. Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com> ---- diff --git a/drivers/gpu/drm/nouveau/core/subdev/bios/vmap.c b/drivers/gpu/drm/nouveau/core/subdev/bios/vmap.c
2003 May 20
2
mdct_backward with fused muladd?
Can anybody point me at any resources that would explain how to optimize mdct_backward for a cpu with a fused multiply-accumulate unit? From what I understand from responses to my older postings, Tremor's mdct_backward could be rewritten to take advantage of a muladd. My target machine can do either two-wide 32x32 + Accum(64) -> Accum(64) integer muladd or eight-wide 16x16 + Accum(32)
2008 May 14
6
PWGL in wine, problems
Hello, I'm new on this list. First of all, thank you to all the developers of this great project! At the moment there is only an application that keeps me on both macos and windows, its name is PWGL a free environment for computer assisted composition in openGL. (http://www2.siba.fi/PWGL/) I'm running Ubuntu 8.04 and wine 0.9.59. I have to say that I also installed vcrun2005 and
2011 Sep 01
0
[PATCH 3/5] resample: Add NEON optimized inner_product_single for fixed point
From: Jyri Sarha <jsarha at ti.com> Semantics of inner_product_single have also been changed to contain the final right shift and saturation so it can also be implemented in the optimal way for the used platform. This change affects fixed point calculations only. I also added a new fixed point macro SATURATE32PSHR(x, shift, a). It does pretty much the same thing as SATURATE32(PSHR32(x,
2004 Aug 18
1
Hangups - SIGFPE in dsp.c
Hi, I'm running the latest CVS HEAD version of asterisk, and I'm experiencing hangups during voice conversation. This happens quite regularly and often. The problem is in dsp.c, line 1235, where it says accum /= len; But `len', at this point, is 0, resulting in a SIGFPE. The routine ast_frame *i4l_read() in channels/chan_modem_i4l.c:411 is setting p->fr.datalen to
2018 Apr 03
4
SCEV and LoopStrengthReduction Formulae
I am attempting to implement a minor loop strength reduction optimization for targets that support compare and jump fusion, specifically TTI::canMacroFuseCmp(). My approach might be wrong; however, I am soliciting the idea for feedback, so that I can implement this correctly. My plan is to add a Supplemental LSR formula to LoopStrengthReduce.cpp that optimizes the following case, but perhaps
2010 Jan 20
2
Plot frame border to start at zero?
Hello, I am creating plots of hourly precipitation and accumulated precipitation (on different axis, see attached image). I was wondering how can I have the plot frame (black border) start at zero, it looks like it is plotted less than zero? The code I use to create the png files is below: CairoPNG(PNG_file,width=1000, height=600, pointsize=14, bg="white") opar <-
2011 May 20
1
[LLVMdev] subregisters, def-kill
If I write %reg16506<def> = INSERT_SUBREG %reg16506, %reg16445, hi16; #1 %reg16506<def> = INSERT_SUBREG %reg16506, %reg16468, lo16; #2 store %reg16506 #3 it will not coalesce, as LiveVariables: on #2: %16506 gets #2 as a kill #3: %16506 gets #3 as an additional kill LiveIntervalAnalysis:
2013 Dec 18
0
[LLVMdev] LLVM ARM VMLA instruction
> http://llvm.org/bugs/show_bug.cgi?id=17188 > http://llvm.org/bugs/show_bug.cgi?id=17211 Ah, thanks. That makes a lot more sense now. > Correct - clang is different than gcc, icc, msvc, xlc, etc. on this. Still > haven't seen any explanation for how this is better though... That would be because it follows what C tells us a compiler has to do by default but provides overrides
2012 Apr 05
1
[PATCH] remove unnecesary typedef in bitwriter.c
--- src/libFLAC/bitwriter.c | 31 +++++++++++++++---------------- 1 file changed, 15 insertions(+), 16 deletions(-) diff --git a/src/libFLAC/bitwriter.c b/src/libFLAC/bitwriter.c index 651440d..7da4b15 100644 --- a/src/libFLAC/bitwriter.c +++ b/src/libFLAC/bitwriter.c @@ -43,12 +43,11 @@ /* Things should be fastest when this matches the machine word size */ /* WATCHOUT: if you change this
2013 Dec 19
2
[LLVMdev] LLVM ARM VMLA instruction
Thanks for the explanation, Tim! gcc 4.8.1 *does* generate an fma for your code example for an x86 target that supports fma. I'd bet that the HW vendors' compilers do the same, but I don't have any of those installed at the moment to test that theory. So this is a bug in those compilers? Do you know how they justify it? I see section 6.5 "Expressions" in the C standard, and
2004 Aug 19
2
Floating point exception help
Hi Manfred, I applied the patch and recompiled and reinstalled and I got the following warning during my first test call: Aug 19 12:26:51 WARNING[294927]: dsp.c:1234 __ast_dsp_silence: zero length packet It looks like that could be the problem... and the fix! I'll let you know if the problem reoccurs. Might it be an idea to submit the patch to the bugtracker? Thanks, Gary ----- Original
2013 Dec 18
2
[LLVMdev] LLVM ARM VMLA instruction
> "-ffp-contract=fast" is needed Correct - clang is different than gcc, icc, msvc, xlc, etc. on this. Still haven't seen any explanation for how this is better though... http://llvm.org/bugs/show_bug.cgi?id=17188 http://llvm.org/bugs/show_bug.cgi?id=17211 On Wed, Dec 18, 2013 at 6:02 AM, Tim Northover <t.p.northover at gmail.com> wrote: > > I believe that's the
2011 Sep 01
6
[PATCH 0/5] ARM NEON optimization for samplerate converter
From: Jyri Sarha <jsarha at ti.com> I optimized Speex resampler for NEON capable ARM CPUs. The first patch should speed up resampling on any platform that can spare the increased memory usage. It would be nice to have these merged to the master branch. Please let me know if there is anything I can do to help the merge. The patches have been rebased on top of master branch in