similar to: [LLVMdev] TLI vs. TTI

Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] TLI vs. TTI"

2013 Jan 25
0
[LLVMdev] TargetLowering vs. TargetTransform
Hi Renato, I think that we need to improve ::isTruncateFree, ::isZextFree, etc to include all of the free conversions. Vector and Scalar. Non-free conversions are marked with setOperationAction so the generic parts of TTI should be able to give a reasonable cost estimation. The cost tables should contain cases that are not handled by TTI. So, if we have a clever DAGCombine optimization (that
2013 Jan 25
2
[LLVMdev] TargetLowering vs. TargetTransform
Hi all, I'm looking for a place where to put the costs of vector (and scalar) cast operations for ARM, but I noticed the TargetTransform methods call the TargetLowering ones when unsure. Now, I'm not sure... Many casts on ARM are free, and I could build a list of cases where it is true, but should I put this on the lowering or the transform? My main motivation is to get the costs right
2007 Nov 13
2
[LLVMdev] BasicAliasAnalysis and out-of-bound GEP indices
Hi! While investigating into the PR1782 I spent some time analyzing BasicAliasAnalysis.cpp. While the mentioned problem should be fixed now (I hope), I have discovered some other possibilities for a bug to occur. In the case of checking for aliasing of two pointer values, where at least one of them is a GEP instruction with out-of-bound indices, BasicAliasAnalysis can return NoAlias, even if the
2013 Jan 25
2
[LLVMdev] TargetLowering vs. TargetTransform
On 25 January 2013 17:48, Nadav Rotem <nrotem at apple.com> wrote: > I think that we need to improve ::isTruncateFree, ::isZextFree, etc to > include all of the free conversions. Vector and Scalar. > Hi Nadav, Yes, and the question is: TargetLowering's isZExtFree or TargetTransform's isZExtFree? TargetTransform (TT) only has the free checks on types, while TargetLowering
2007 Nov 13
0
[LLVMdev] BasicAliasAnalysis and out-of-bound GEP indices
It's an optimization opportunity! When behavior is undefined, we're free to interpret it to be "whatever makes optimization easiest." If the two do actually happen to alias, well, it's the programmer's fault anyways, because they were doing something undefined! --Owen On Nov 13, 2007, at 4:13 PM, Wojciech Matyjewicz wrote: > Hi! > > While investigating
1997 Sep 05
2
R-beta: help with R simulation
[[this bounced first, because it has 'help' in the Subject line ... -- Martin Maechler ]] I am a complete novice R programmer. (Though I know C quite well) I am trying to write some R code to do the following simulation. There is a 2-frame "movie" of noise and signal dots. the noise dots have random positions in each frame. The signal dots are placed randomly in frame 1,
2014 Jan 22
2
[LLVMdev] MergeFunctions: reduce complexity to O(log(N))
On 2014 Jan 22, at 07:35, Stepan Dyatkovskiy <stpworld at narod.ru> wrote: > Hi Raul and Duncan, > > Duncan, > Thank you for review. I hope to present fixed patch tomorrow. > > First, I would like to show few performance results: > > command: "time opt -mergefunc <test>" > > File: tramp3d-v4.ll, 12963 functions > Current
2013 Mar 22
4
[LLVMdev] proposed change to class BasicTTI
Just realized that BasicTransformInfoClass is an immutable pass. Not sure how to reconcile this with fact that there will be different answers needed depending on the subtarget. Seems like BasicTansformInfoClass should become a function pass that does not modify anything. On 03/22/2013 09:43 AM, Reed Kotler wrote: > Another way to do this would to be to have a reset virtual function >
2015 Sep 30
2
InstCombine wrongful (?) optimization on BinOp with SameOperands
Hi all, I have been looking at the way LLVM optimizes code before forwarding it to the backend I develop for my company and while building define i32 @test_extract_subreg_func(i32 %x, i32 %y) #0 { entry: %conv = zext i32 %x to i64 %conv1 = zext i32 %y to i64 %mul = mul nuw i64 %conv1, %conv %shr = lshr i64 %mul, 32 %xor = xor i64 %shr, %mul %conv2 = trunc i64 %xor to i32
2013 Mar 22
0
[LLVMdev] proposed change to class BasicTTI
Hi Reed, We will need to reconstruct the target machine and the TTI chain when the function attributes change. We currently don't have code for doing that but I suggest that you talk with Bill Wendling about the best way to implement this. Thanks, Nadav On Mar 22, 2013, at 11:30 AM, Reed Kotler <rkotler at mips.com> wrote: > Just realized that BasicTransformInfoClass is an
2014 Feb 27
3
[LLVMdev] MergeFunctions: reduce complexity to O(log(N))
Hi Nick, I tried to rework changes as you requested. One of patches (0004 with extra assertions) has been removed. > + bool isEquivalentType(Type *Ty1, Type *Ty2) const { > + return cmpType(Ty1, Ty2) == 0; > + } > > Why do we still need isEquivalentType? Can we nuke this? Yup. After applying all the patches isEquivalentType will be totally replaced with cmpType. All
2013 Jan 09
2
[LLVMdev] ARM vectorizer cost model
Hi Nadav, I'm interested in knowing how you'll work up the ARM cost model and how easy it'd be to split the work. As far as I can see, LoopVectorizationCostModel is the class that does all the work, with assistance from the target transform info. Do you think that updating ARMTTI would be the best course of action now, and inspect the differences in the CostModel later? I also
2016 Oct 16
2
[PATCH] exa: add GM10x acceleration support
rendercheck -f a8r8g8b8 passes as much as on a GK208, and xv appears to work. Very lightly tested. Instead of sticking coordinates into pushbufs, the vertex shader is modified to read them from a constbuf, indexed by vertex id. This approach could be used for all nvc0 generations, but I didn't want to rock the boat. Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- Note: this
2014 Mar 13
2
[LLVMdev] MergeFunctions: reduce complexity to O(log(N))
Hi Nick, I have committed 0001 as r203788. I'm working on fixes for 0002 - 0014. > After reading through this patch series, I feel like I'm missing > something important. Where's the sort function? It looks like we're > still comparing all functions to all other functions. When you insert functions into std::set or its analogs it does all the job for you. Since
2013 Jan 09
0
[LLVMdev] ARM vectorizer cost model
Hi Renato, > I'm interested in knowing how you'll work up the ARM cost model and how easy it'd be to split the work. Yes, I am starting to work on the ARM cost model and I would appreciate any help in the form of: advice, performance measurements, patches, etc. I tune the cost model by running the cost model analysis pass and I compare the output of the analysis to the output
2013 Jan 10
2
[LLVMdev] ARM vectorizer cost model
On 9 January 2013 17:10, Nadav Rotem <nrotem at apple.com> wrote: > For example: > "opt -cost-model -analyze dumper.ll -mtriple=thumbv7 > -mcpu=cortex-a15" > > I also run the vectorizer with -debug-only=loop-vectorize because it dumps > the costs of all of the instructions with different vectorization factors, > and it also detects the different kinds
2015 May 11
2
[LLVMdev] about MemoryDependenceAnalysis usage
add -basicaa to your command line :) On Mon, May 11, 2015 at 7:15 AM, Willy WOLFF <willy.mh.wolff at gmail.com> wrote: > I play a bit more with MemoryDependenceAnalysis by wrapping my pass, and > call explicitely BasicAliasAnalysis. Its still using No Alias Analysis. > > How can I let MemoryDependenceAnalysis use BasicAliasAnalysis? > > Please, find attached my pass. >
2017 Dec 14
2
[RFC] Add TargetTransformInfo::isAllocaPtrValueNonZero and let ValueTracking depend on TargetTransformInfo
Some optimizations depend on whether alloca instruction always has non-zero value. Currently, this checking is done by isKnownNonZero() in ValueTracking, and it assumes alloca in address space 0 always has non-zero value but alloca in non-zero address spaces does not always have non-zero value. However, this assumption is incorrect for certain targets. For example, amdgcn---amdgiz target has
2015 Jan 17
3
[LLVMdev] loop multiversioning
Does LLVM have loop multiversioning ? it seems it does not with clang++ -O3 -mllvm -debug-pass=Arguments program.c -c bash-4.1$ clang++ -O3 -mllvm -debug-pass=Arguments fast_algorithms.c -c clang-3.6: warning: treating 'c' input as 'c++' when in C++ mode, this behavior is deprecated Pass Arguments: -datalayout -notti -basictti -x86tti -targetlibinfo -no-aa -tbaa -scoped-noalias
2017 Dec 14
3
[RFC] Add TargetTransformInfo::isAllocaPtrValueNonZero and let ValueTracking depend on TargetTransformInfo
Hal, Thanks for your suggestion. I think that makes sense. Currently, non-zero alloca address space is already represented by data layout, e.g., the last component of the data layout of amdgcn---amdgiz target is -A5, which means alloca is in address space 5. How about adding a letter z to -A5 to indicate alloca may have zero value? i.e. -A5 means alloca is in address space 5 and always has