Displaying 20 results from an estimated 63 matches for "accumated".
2009 Oct 26
1
[PATCH] Fix miscompile of SSE resampler
From: Thorvald Natvig <slicer at users.sourceforge.net>
Some optimizing compilers miscompile the current SSE optimizations when
full optimizations are enabled. By using an output value pointer instead of
a return value, we can bypass this misbehaviour.
---
libspeex/resample.c | 8 ++++----
libspeex/resample_sse.h | 24 ++++++++----------------
2 files changed, 12 insertions(+), 20
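The change described in this patch can be illustrated with a minimal sketch. It is plain C without SSE intrinsics, and the function names inner_product_ret and inner_product_out are hypothetical, not taken from libspeex. It only shows the difference between returning the accumulated value and writing it through a caller-supplied pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return-value form: the accumulated sum is the function result. */
float inner_product_ret(const float *a, const float *b, size_t len)
{
    float sum = 0.0f;
    for (size_t i = 0; i < len; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Hypothetical output-pointer form: the caller passes the address
 * where the accumulated sum is stored, and nothing is returned. */
void inner_product_out(const float *a, const float *b, size_t len, float *ret)
{
    assert(ret != NULL);
    float sum = 0.0f;
    for (size_t i = 0; i < len; i++)
        sum += a[i] * b[i];
    *ret = sum;
}
```

Both forms compute the same value; only the way the result reaches the caller differs, which is the part the patch description says sidesteps the miscompile.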
2008 May 03
2
Resampler (no api)
.. And a version without the API changes.
-------------- next part --------------
Index: libspeex/resample_sse.h
===================================================================
--- libspeex/resample_sse.h (revision 0)
+++ libspeex/resample_sse.h (revision 0)
@@ -0,0 +1,128 @@
+/* Copyright (C) 2002-2008 Jean-Marc Valin
+ * Copyright (C) 2008 Thorvald Natvig
+ */
+/**
+ @file resample_sse.h
+
2008 May 03
0
Resampler, memory only variant
Hi,
Here's the (hopefully) final version of the resampler, now always using
st->mem as the buffer area. It only allocates buffers on the stack when
it's necessary to convert the output between int and float.
-------------- next part --------------
Index: include/speex/speex_resampler.h
===================================================================
---
2018 Apr 07
0
SCEV and LoopStrengthReduction Formulae
>
> I realize this is a micro-op saving a single cycle. But this reduces the instruction count, one less
> instr to decode in a potentially hot path. If this all makes sense, and seems like a reasonable addition
> to llvm, would it make sense to implement this as a supplemental LSR formula, or as a separate pass?
This seems reasonable to me so long as rbx has no other uses that
2009 Sep 11
2
Accumulating results from "for" loop in a list/array
Dear R users,
I would like to accumulate objects generated from a 'for' loop in a list or
array.
To illustrate the problem, an arbitrary data set and script are shown below,
x <- data.frame(a = c(rep("n",3),rep("y",2),rep("n",3),rep("y",2)), b =
c(rep("y",2),rep("n",4),rep("y",3),"n"), c =
2012 Dec 10
3
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
Hello all,
I wanted to get some feedback on this patch for ScalarEvolution.
It addresses a performance problem I am seeing for a simple benchmark.
Starting with this C code:
signed char foo(void)
{
    const int count = 8000;
    signed char result = 0;
    int j;

    for (j = 0; j < count; ++j) {
        result += (signed char)(3);
    }

    return result;
}
I
2014 Jan 02
0
[PATCH] drm/nvc0-: Fix voltage obtained from vbios.
Coefficients are based on the formula:
uV = 0.1 * arg[0] + 150.5 * arg[1] + 22.65025 * arg[2]
It seems to be rounded downwards. I have no idea why the voltage isn't
specified in the bios directly.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com>
----
diff --git a/drivers/gpu/drm/nouveau/core/subdev/bios/vmap.c b/drivers/gpu/drm/nouveau/core/subdev/bios/vmap.c
2003 May 20
2
mdct_backward with fused muladd?
Can anybody point me at any resources that would explain how to optimize
mdct_backward for a cpu with a fused multiply-accumulate unit?
From what I understand from responses to my older postings, Tremor's
mdct_backward could be rewritten to take advantage of a muladd.
My target machine can do either two-wide 32x32 + Accum(64) -> Accum(64)
integer muladd or eight-wide 16x16 + Accum(32)
2008 May 14
6
PWGL in wine, problems
Hello,
I'm new on this list. First of all, thank you to all the developers of this
great project!
At the moment there is only one application that keeps me on both macOS and
Windows: PWGL, a free environment for computer-assisted composition in
OpenGL (http://www2.siba.fi/PWGL/).
I'm running Ubuntu 8.04 and wine 0.9.59.
I have to say that I also installed vcrun2005 and
2011 Sep 01
0
[PATCH 3/5] resample: Add NEON optimized inner_product_single for fixed point
From: Jyri Sarha <jsarha at ti.com>
Semantics of inner_product_single have also been changed to contain
the final right shift and saturation so it can also be implemented in
the optimal way for the used platform. This change affects fixed point
calculations only.
I also added a new fixed point macro SATURATE32PSHR(x, shift, a). It
does pretty much the same thing as SATURATE32(PSHR32(x,
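As a rough sketch of the relationship described above, the following C functions stand in for the fixed-point macros. The lowercase names pshr32, saturate32 and saturate32_pshr are hypothetical and this is not the Speex source; the sketch assumes a non-negative limit a and an arithmetic right shift for negative values, as on mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Rounding right shift: add half of the divisor, then shift.
 * The sum is done in 64 bits so it cannot overflow. */
int32_t pshr32(int32_t x, int shift)
{
    assert(shift > 0 && shift < 31);
    int64_t rounded = (int64_t)x + ((int64_t)1 << (shift - 1));
    return (int32_t)(rounded >> shift);
}

/* Clamp x to the range [-a, a]; a is assumed non-negative. */
int32_t saturate32(int32_t x, int32_t a)
{
    return x > a ? a : (x < -a ? -a : x);
}

/* Fused form: compare against the limit scaled up by the shift,
 * and only shift when the value is inside the range. */
int32_t saturate32_pshr(int32_t x, int shift, int32_t a)
{
    assert(shift > 0 && shift < 31);
    int64_t limit = (int64_t)a << shift;
    if (x >= limit)
        return a;
    if (x <= -limit)
        return -a;
    return pshr32(x, shift);
}
```

Folding the comparison into one step lets a platform with a saturating shift instruction replace the whole operation, which appears to be the motivation given in the patch description.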
2004 Aug 18
1
Hangups - SIGFPE in dsp.c
Hi,
I'm running the latest CVS HEAD version of asterisk, and I'm experiencing
hangups during voice conversation. This happens quite regularly and
often.
The problem is in dsp.c, line 1235, where it says
accum /= len;
But `len', at this point, is 0, resulting in a SIGFPE. The routine
ast_frame *i4l_read() in channels/chan_modem_i4l.c:411 is
setting p->fr.datalen to
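The failure mode reported here, an integer division by a zero length, can be guarded against as in the sketch below. The function mean_abs_amplitude is hypothetical and is not the Asterisk code in dsp.c; it only shows the check being placed before the division.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: mean absolute amplitude of a block of
 * 16-bit samples. An empty block yields 0 instead of dividing by zero. */
int mean_abs_amplitude(const short *samples, int len)
{
    assert(len <= 0 || samples != NULL);
    if (len <= 0)
        return 0;               /* zero-length packet: skip the division */

    int accum = 0;
    for (int i = 0; i < len; i++)
        accum += abs(samples[i]);
    accum /= len;               /* safe: len is known to be positive here */
    return accum;
}
```

Whether an empty frame should be treated as silence, skipped, or logged is a design choice for the caller; the sketch simply returns 0.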
2018 Apr 03
4
SCEV and LoopStrengthReduction Formulae
I am attempting to implement a minor loop strength reduction optimization for
targets that support compare and jump fusion, specifically
TTI::canMacroFuseCmp(). My approach might be wrong; however, I am soliciting
feedback on the idea, so that I can implement this correctly. My plan is to
add a Supplemental LSR formula to LoopStrengthReduce.cpp that optimizes the
following case, but perhaps
2010 Jan 20
2
Plot frame border to start at zero?
Hello,
I am creating plots of hourly precipitation and accumulated
precipitation (on different axes, see attached image). I was wondering
how I can have the plot frame (black border) start at zero; it looks
like it is plotted below zero.
The code I use to create the png files is below:
CairoPNG(PNG_file,width=1000, height=600, pointsize=14, bg="white")
opar <-
2011 May 20
1
[LLVMdev] subregisters, def-kill
If I write
%reg16506<def> = INSERT_SUBREG %reg16506, %reg16445, hi16; #1
%reg16506<def> = INSERT_SUBREG %reg16506, %reg16468, lo16; #2
store %reg16506 #3
it will not coalesce, as
LiveVariables:
on
#2: %16506 gets #2 as a kill
#3: %16506 gets #3 as an additional kill
LiveIntervalAnalysis:
2013 Dec 18
0
[LLVMdev] LLVM ARM VMLA instruction
> http://llvm.org/bugs/show_bug.cgi?id=17188
> http://llvm.org/bugs/show_bug.cgi?id=17211
Ah, thanks. That makes a lot more sense now.
> Correct - clang is different than gcc, icc, msvc, xlc, etc. on this. Still
> haven't seen any explanation for how this is better though...
That would be because it follows what C tells us a compiler has to do
by default but provides overrides
2012 Apr 05
1
[PATCH] remove unnecesary typedef in bitwriter.c
---
src/libFLAC/bitwriter.c | 31 +++++++++++++++----------------
1 file changed, 15 insertions(+), 16 deletions(-)
diff --git a/src/libFLAC/bitwriter.c b/src/libFLAC/bitwriter.c
index 651440d..7da4b15 100644
--- a/src/libFLAC/bitwriter.c
+++ b/src/libFLAC/bitwriter.c
@@ -43,12 +43,11 @@
/* Things should be fastest when this matches the machine word size */
/* WATCHOUT: if you change this
2013 Dec 19
2
[LLVMdev] LLVM ARM VMLA instruction
Thanks for the explanation, Tim!
gcc 4.8.1 *does* generate an fma for your code example for an x86 target
that supports fma. I'd bet that the HW vendors' compilers do the same, but
I don't have any of those installed at the moment to test that theory. So
this is a bug in those compilers? Do you know how they justify it?
I see section 6.5 "Expressions" in the C standard, and
2004 Aug 19
2
Floating point exception help
Hi Manfred,
I applied the patch and recompiled and reinstalled and I got the following
warning during my first test call:
Aug 19 12:26:51 WARNING[294927]: dsp.c:1234 __ast_dsp_silence: zero length
packet
It looks like that could be the problem... and the fix! I'll let you know if
the problem reoccurs. Might it be an idea to submit the patch to the
bugtracker?
Thanks,
Gary
----- Original
2013 Dec 18
2
[LLVMdev] LLVM ARM VMLA instruction
> "-ffp-contract=fast" is needed
Correct - clang is different than gcc, icc, msvc, xlc, etc. on this. Still
haven't seen any explanation for how this is better though...
http://llvm.org/bugs/show_bug.cgi?id=17188
http://llvm.org/bugs/show_bug.cgi?id=17211
On Wed, Dec 18, 2013 at 6:02 AM, Tim Northover <t.p.northover at gmail.com> wrote:
> > I believe that's the
2011 Sep 01
6
[PATCH 0/5] ARM NEON optimization for samplerate converter
From: Jyri Sarha <jsarha at ti.com>
I optimized the Speex resampler for NEON capable ARM CPUs. The first patch
should speed up resampling on any platform that can spare the
increased memory usage. It would be nice to have these merged to the
master branch. Please let me know if there is anything I can do to
help the merge. The patches have been rebased on top of master
branch in