similar to: [LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)

Displaying 20 results from an estimated 400 matches similar to: "[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)"

2012 Dec 17
0
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
On Mon, Dec 10, 2012 at 2:13 PM, Matthew Curtis <mcurtis at codeaurora.org> wrote: > Hello all, > > I wanted to get some feedback on this patch for ScalarEvolution. > > It addresses a performance problem I am seeing for a simple benchmark. > > Starting with this C code: > > 01: signed char foo(void) > 02: { > 03: const int count = 8000; > 04: signed char
2012 Dec 18
2
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
Dan, Thanks for the response ... On 12/17/2012 1:53 PM, Dan Gohman wrote: > On Mon, Dec 10, 2012 at 2:13 PM, Matthew Curtis <mcurtis at codeaurora.org> wrote: >> Hello all, >> >> I wanted to get some feedback on this patch for ScalarEvolution. >> >> It addresses a performance problem I am seeing for a simple benchmark. >> >> Starting with this C
2012 Dec 18
0
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
On Tue, Dec 18, 2012 at 9:56 AM, Matthew Curtis <mcurtis at codeaurora.org> wrote: > > Here's how I'm evaluating the expression (in my head): > > 00: Add(ZeroExtend(Truncate(Minus(AddRec(Start=0,Step=3)[n],3), i8), i32),3) > | > 01: Add(ZeroExtend(Truncate(Minus(AddRec(Start=0,Step=3)[0],3), i8), i32),3) >
2012 Dec 20
2
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
Ok, so I think I've mis-represented what's really happening. Ignore my previous statements concerning %add :) Again, given: 05: for.body: ; preds = %entry, %for.body 06: %j.04 = phi i32 [ 0, %entry ], [ %inc, %for.body ] 07: %result.03 = phi i32 [ 0, %entry ], [ %add, %for.body ] 08: %conv2 = and i32 %result.03, 255 09: %add = add nsw
2010 Sep 29
1
nlminb and optim
I am using both nlminb and optim to get MLEs from a likelihood function I have developed. AFAIK, the model I have has not been previously used in this way and so I am struggling a bit to unit test my code since I don't have another data set to compare this kind of estimation to. The likelihood I have is (in TeX below) \begin{equation} \label{eqn:marginal} L(\beta) = \prod_{s=1}^N \int
2009 Nov 29
1
optim or nlminb for minimization, which to believe?
I have constructed the function mml2 (below) based on the likelihood function described in the minimal latex I have pasted below for anyone who wants to look at it. This function finds parameter estimates for a basic Rasch (IRT) model. Using the function without the gradient, using either nlminb or optim returns the correct parameter estimates and, in the case of optim, the correct standard
2008 Feb 24
0
problem with ML estimation
Dear list, as a part of my problem, I have to estimate some parameters using ML estimation. The form of the likelihood function is not straightforward and I had to use a for loop to define the function. I used "optim" to maximise the result but was not sure of the programme. To validate my results, I tried to write a function to obtain the MLE of a bivariate normal in the same manner. On
2008 May 03
2
Resampler (no api)
.. And a version without the API changes. -------------- next part -------------- Index: libspeex/resample_sse.h =================================================================== --- libspeex/resample_sse.h (revision 0) +++ libspeex/resample_sse.h (revision 0) @@ -0,0 +1,128 @@ +/* Copyright (C) 2002-2008 Jean-Marc Valin + * Copyright (C) 2008 Thorvald Natvig + */ +/** + @file resample_sse.h +
2008 May 03
0
Resampler, memory only variant
Hi, Here's the (hopefully) final version of the resampler, now always using st->mem as the buffer area. It only allocates buffers on the stack when it's necessary to convert the output between int and float. -------------- next part -------------- Index: include/speex/speex_resampler.h =================================================================== ---
2011 Apr 08
0
[LLVMdev] [GSoC] Increase the coverage of Polly
Hi, 2011/4/8 Vlad Krylov <krvladislav at gmail.com>: > Hi. > > I see that to detect scops firstly we search for regions in CFG ( by > RegionInfo ) and then select regions that answer some requirements ( > in ScopDetection ). Because only affine expressions in conditions and > bounds are permissible, we are trying to get scalar expressions into > affine form by
2009 Oct 26
1
[PATCH] Fix miscompile of SSE resampler
From: Thorvald Natvig <slicer at users.sourceforge.net> Some optimizing compilers miscompile the current SSE optimizations when full optimizations are enabled. By using output value pointer instead of a return value, we can bypass this misbehaviour. --- libspeex/resample.c | 8 ++++---- libspeex/resample_sse.h | 24 ++++++++---------------- 2 files changed, 12 insertions(+), 20
2011 Apr 07
3
[LLVMdev] [GSoC] Increase the coverage of Polly
Hi. I see that to detect scops, we first search for regions in the CFG (by RegionInfo) and then select regions that meet some requirements (in ScopDetection). Because only affine expressions in conditions and bounds are permissible, we are trying to get scalar expressions into affine form by AffineSCEVIterator. At present there are plugs for scev types Truncate, ZeroExtend, SignExtend, UDivExpr,
2011 Apr 08
2
[LLVMdev] [GSoC] Increase the coverage of Polly
2011/4/8 ether zhhb <etherzhhb at gmail.com>: > Hi, > > 2011/4/8 Vlad Krylov <krvladislav at gmail.com>: >> Hi. >> >> I see that to detect scops firstly we search for regions in CFG ( by >> RegionInfo ) and then select regions that answer some requirements ( >> in ScopDetection ). Because only affine expressions in conditions and >> bounds
2018 Apr 07
0
SCEV and LoopStrengthReduction Formulae
> > I realize this is a micro-op saving a single cycle. But this reduces the instruction count, one less > instr to decode in a potentially hot path. If this all makes sense, and seems like a reasonable addition > to llvm, would it make sense to implement this as a supplemental LSR formula, or as a separate pass? This seems reasonable to me so long as rbx has no other uses that
2011 Apr 13
0
ddply and nlminb
Hello, I'm new to R (one week), so please excuse any obvious mistakes in my code or posting. I am attempting to fit a nonlinear function defining the relationship between dependent variable A and the variables PAR and T, grouped by the condition Di. The following steps are taken in the R code below: 1) load the data (not shown) 2) define the function to be fit 3) define the starting values
2014 Jan 02
0
[PATCH] drm/nvc0-: Fix voltage obtained from vbios.
Coefficients are based on the formula: uV = 0.1 * arg[0] + 150.5 * arg[1] + 22.65025 * arg[2] It seems to be rounded downwards. I have no idea why the voltage isn't specified in the bios directly. Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com> ---- diff --git a/drivers/gpu/drm/nouveau/core/subdev/bios/vmap.c b/drivers/gpu/drm/nouveau/core/subdev/bios/vmap.c
2018 May 10
2
LLVM SCEV isAddRecNeverPoison and strength reduction
+CC llvm-dev On Tue, May 8, 2018 at 2:34 AM, Gal Zohar <Gal.Zohar at ceva-dsp.com> wrote: > I noticed that SCEV, when trying to perform strength reduction, doesn’t use > the ability to prove an induction variable does not signed/unsigned wrap due > to infinite loops. > > Is there an easy way to use the isAddRecNeverPoison function when > determining if strength reduction
2015 Sep 30
2
InstCombine wrongful (?) optimization on BinOp with SameOperands
Hi all, I have been looking at the way LLVM optimizes code before forwarding it to the backend I develop for my company and while building define i32 @test_extract_subreg_func(i32 %x, i32 %y) #0 { entry: %conv = zext i32 %x to i64 %conv1 = zext i32 %y to i64 %mul = mul nuw i64 %conv1, %conv %shr = lshr i64 %mul, 32 %xor = xor i64 %shr, %mul %conv2 = trunc i64 %xor to i32
2011 Apr 08
0
[LLVMdev] [GSoC] Increase the coverage of Polly
On 04/08/2011 08:35 PM, Vlad Krylov wrote: > 2011/4/8 ether zhhb<etherzhhb at gmail.com>: >> Hi, >> >> 2011/4/8 Vlad Krylov<krvladislav at gmail.com>: >>> Hi. >>> >>> I see that to detect scops firstly we search for regions in CFG ( by >>> RegionInfo ) and then select regions that answer some requirements ( >>> in
2011 Sep 01
0
[PATCH 3/5] resample: Add NEON optimized inner_product_single for fixed point
From: Jyri Sarha <jsarha at ti.com> Semantics of inner_product_single have also been changed to contain the final right shift and saturation so it can also be implemented in the optimal way for the used platform. This change affects fixed point calculations only. I also added a new fixed point macro SATURATE32PSHR(x, shift, a). It does pretty much the same thing as SATURATE32(PSHR32(x,