similar to: [LLVMdev] About the partial update clearence / dependency breaking mechanism

2013 Mar 25
0
[LLVMdev] About the partial update clearence / dependency breaking mechanism
On Mar 25, 2013, at 5:02 AM, Silviu Baranga <silbar01 at arm.com> wrote: > Hello, > > I am currently looking into the advantages of using the > partial update clearance / dependency breaking mechanism > for some ARM cores. > > It seems that the ARM specific code for this will always > return a clearance of 0 for VLD1LNd32 because of the following > code in
2016 Apr 18
2
[cfe-dev] [libunwind] __ELF__ macro for arm-none-eabi
On 18 April 2016 at 16:33, Silviu Baranga <Silviu.Baranga at arm.com> wrote: > Doing a grep "eabi" * -R | grep darwin in llvm I found the test divmod-eabi.ll > which uses the triple armv7-apple-darwin-eabi. What format does that have? Certainly not ELF. :) But I didn't mean "has eabi on triple", but "is in none-eabi mode", which may have to check a
2016 Apr 18
2
[cfe-dev] [libunwind] __ELF__ macro for arm-none-eabi
On 18 April 2016 at 16:18, Silviu Baranga <Silviu.Baranga at arm.com> wrote: > This doesn't look like something ACLE specific (I can't find it in the ACLE doc). Sorry, I didn't mean it was ACLE, only that you guys were fiddling with macros. :) > This seems to be a generic macro. I think it would make sense to define it > if we know we're emitting ELF. Since the
2014 Aug 15
2
[LLVMdev] Help with definition of subregisters; spill, rematerialization and implicit uses
Hi, I have a problem regarding sub-register definitions and LiveIntervals on our target. When a subregister is defined, other parts of the register are always left untouched - they are neither read nor defined. However, it seems that Codegen treats subregister definitions as somehow clobbering the whole register. The SSA-code looks like this after isel: (Reg0 and Reg1 are 16bit registers. Reg2,
2015 Apr 29
2
[LLVMdev] [RFC][Float2Int] Converting (fcmp Pred, x * F, y) to (ICmp ...)
Hi, I'm trying to expand the Float2Int pass in order to make it able to optimize expressions like f * x > y, where x and y are integers (we'll assume unsigned for simplicity) and f is a floating point constant. The optimization would convert the expression to something like: (a * x)/b > y where a and b are integers guessed by the compiler (currently using continued
2015 Apr 29
2
[LLVMdev] [RFC][Float2Int] Converting (fcmp Pred, x * F, y) to (ICmp ...)
> On Apr 29, 2015, at 2:33 PM, Matt Arsenault <arsenm2 at gmail.com> wrote: > >> On Apr 29, 2015, at 10:06 AM, Silviu Baranga <Silviu.Baranga at arm.com> wrote: >> >> Note that dividing by an integer constant should be a cheap operation >> compared to FP multiplication and comparison as this would get lowered to a
2016 Apr 16
2
[cfe-dev] [libunwind] __ELF__ macro for arm-none-eabi
On 16 April 2016 at 01:44, Zhao, Weiming via cfe-dev <cfe-dev at lists.llvm.org> wrote: > I'm building libunwind for ARM baremetal using clang. > I notice that __ELF__ is used in libunwind and the macro is only defined for > Linux target on ARM. > Should we also predefine that for arm-none-eabi target? Do you mean in Clang's ARMTargetInfo::getTargetDefines() ? I think
2014 Aug 19
2
[LLVMdev] Help with definition of subregisters; spill, rematerialization and implicit uses
Hi Quentin, On 08/15/14 19:01, Quentin Colombet wrote: [...] >> The question is: How should true subregister definitions be >> expressed so that they do not interfere with each other? See the >> detailed problem description below. > > We do have a limitation in our current liveness tracking for > sub-register. Therefore, I am not sure that is possible. > >
2014 Aug 22
2
[LLVMdev] Help with definition of subregisters; spill, rematerialization and implicit uses
Hi Quentin, On 08/19/14 18:58, Quentin Colombet wrote: [...] > It seems that you will have to debug further the *** Bad machine code: Instruction loads from dead spill slot *** before we can be of any help. Yes, I've done some more digging. Sorry for the long mail... I get: Inline spilling aN40_0_7:%vreg1954 [5000r,5056r:0)[5056r,5348r:1) 0 at 5000r 1 at 5056r At this point I have
2015 Apr 29
2
[LLVMdev] [LoopVectorizer] Missed vectorization opportunities caused by sext/zext operations
Hi, This is somewhat similar to the previous thread regarding missed vectorization opportunities (http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084765.html), but maybe different enough to require a new thread. I'm seeing some missed vectorization opportunities in the loop vectorizer because SCEV is not able to fold sext/zext expressions into recurrence expressions (AddRecExpr). This
2012 Jul 05
2
[LLVMdev] MachineOperand: Subreg defines and the Undef flag
Hi, This question relates to the undef flag in the context of sub-register def operands. 1) Firstly, the documentation (comments in the source code) says that in a sub-register def operand, the "IsUndef" flag refers to the part of the register that is not written. 2) Further, the documentation about readsReg() states that a sub-register def implicitly reads the other parts of the
2016 Nov 27
5
Extending Register Rematerialization
Hello LLVM Developers, We are working on extending the currently available register rematerialization to include cases where a sequence of multiple instructions is required to rematerialize a value. We had a discussion on this on the community mailing list, and the link is here: http://lists.llvm.org/pipermail/llvm-dev/2016-September/subject.html#104777 From the above discussion and studying the code we
2013 Jun 07
2
[LLVMdev] NEON vector instructions and the fast math IR flags
>> Darwin uses NEON for floating point, but does *not* (and should not) >> globally enable fast math flags. Use of NEON for FP needs to remain >> achievable without globally setting the fast math flags. Fast math may >> reasonably imply NEON, but the opposite direction is not accurate. | Good point. Fast math is probably too tough a requirement. I need to | look
2018 Mar 02
0
[RFC] llvm-mca: a static performance analysis tool
+Matthias > On Mar 2, 2018, at 6:42 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote: > >> Known limitations on X86 processors >> ----------------------------------- >> >> 1) Partial register updates versus full register updates. >> <snip> > > MachineOperand handles this. You just need to create the machine instrs. > >
2013 Jun 07
0
[LLVMdev] NEON vector instructions and the fast math IR flags
> |I just looked again at the +neonfp flag. Compiling with and without > |+neonfp flag seems to only affect scalar types in the attached test > |case. If e.g. the LLVM vectorizer introduces vector instructions on > |LLVM-IR level floating point vectors still yield NEON assembly even if > |compiled with "-mattr=+neon,-neonfp". Is this expected? > > I'm virtually
2016 Jun 30
1
[Proposal][RFC] Strided Memory Access Vectorization
As a strong advocate of logical vector representation, I'm counting on community liking Michael's RFC and that'll proceed sooner than later. I plan to pitch in (e.g., perf experiments). >Probably can depend on the support provided by below RFC by Michael: > "Allow loop vectorizer to choose vector widths that generate illegal types" >In that case Loop Vectorizer will
2016 Jun 30
0
[Proposal][RFC] Strided Memory Access Vectorization
One common concern raised for cases where the Loop Vectorizer generates bigger types than the target supports: based on VF, we currently check the cost and generate the expected set of instruction[s] for the bigger type. This has two challenges: for bigger types the cost is not always correct, and code generation may not generate efficient instruction[s]. Probably can depend on the support provided by below RFC by
2018 Mar 02
5
[RFC] llvm-mca: a static performance analysis tool
Hi Andrew, Thanks for the feedback! On Fri, Mar 2, 2018 at 1:16 AM, Andrew Trick <atrick at apple.com> wrote: > > On Mar 1, 2018, at 9:22 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> > wrote: > > Hi all, > > At Sony we developed an LLVM based performance analysis tool named > llvm-mca. We > currently use it internally to statically measure the
2012 Jul 05
0
[LLVMdev] MachineOperand: Subreg defines and the Undef flag
On Jul 4, 2012, at 10:45 PM, Pranav Bhandarkar <pranavb at codeaurora.org> wrote: > Hi, > > This question relates to the undef flag in the context of sub-register def > operands. > > 1) Firstly, the documentation (comments in the source code) says that in a > sub-register def operand, the "IsUndef" flag refers to the part of the > register that is not
2016 Jun 18
2
[Proposal][RFC] Strided Memory Access Vectorization
>Vectorizer's output should be as clean as vector code can be so that analyses and optimizers downstream can >do a great job optimizing. Guess I should clarify this philosophical position of mine. In terms of vector code optimization that complicates the output of vectorizer: If vectorizer is the best place to perform the optimization, it should do so. This includes the cases like