Displaying 20 results from an estimated 100 matches similar to: "[LLVMdev] About the partial update clearence / dependency breaking mechanism"
2013 Mar 25
0
[LLVMdev] About the partial update clearence / dependency breaking mechanism
On Mar 25, 2013, at 5:02 AM, Silviu Baranga <silbar01 at arm.com> wrote:
> Hello,
>
> I am currently looking into the advantages of using the
> partial update clearance / dependency breaking mechanism
> for some ARM cores.
>
> It seems that the ARM specific code for this will always
> return a clearance of 0 for VLD1LNd32 because of the following
> code in
2016 Apr 18
2
[cfe-dev] [libunwind] __ELF__ macro for arm-none-eabi
On 18 April 2016 at 16:33, Silviu Baranga <Silviu.Baranga at arm.com> wrote:
> Doing a grep "eabi" * -R | grep darwin in llvm I found the test divmod-eabi.ll
> which uses the triple armv7-apple-darwin-eabi. What format does that have?
Certainly not ELF. :)
But I didn't mean "has eabi on triple", but "is in none-eabi mode",
which may have to check a
2016 Apr 18
2
[cfe-dev] [libunwind] __ELF__ macro for arm-none-eabi
On 18 April 2016 at 16:18, Silviu Baranga <Silviu.Baranga at arm.com> wrote:
> This doesn't look like something ACLE specific (I can't find it in the ACLE doc).
Sorry, I didn't mean it was ACLE, only that you guys were fiddling
with macros. :)
> This seems to be a generic macro. I think it would make sense to define it
> if we know we're emitting ELF.
Since the
2014 Aug 15
2
[LLVMdev] Help with definition of subregisters; spill, rematerialization and implicit uses
Hi,
I have a problem regarding sub-register definitions and LiveIntervals on
our target. When a subregister is defined, other parts of the register
are always left untouched - they are neither read nor defined.
It however seems that Codegen treats subregister definitions as somehow
clobbering the whole register.
The SSA-code looks like this after isel:
(Reg0 and Reg1 are 16bit registers. Reg2,
2015 Apr 29
2
[LLVMdev] [RFC][Float2Int] Converting (fcmp Pred, x * F, y) to (ICmp ...)
Hi,
I'm trying to expand the Float2Int pass so that it can optimize
expressions like f * x > y, where x and y are integers (we'll assume
unsigned for simplicity) and f is a floating point constant. The
optimization would convert the expression to something like:
(a * x)/b > y
where a and b are integers guessed by the compiler (currently using
continued
2015 Apr 29
2
[LLVMdev] [RFC][Float2Int] Converting (fcmp Pred, x * F, y) to (ICmp ...)
> On Apr 29, 2015, at 2:33 PM, Matt Arsenault <arsenm2 at gmail.com> wrote:
>
>> On Apr 29, 2015, at 10:06 AM, Silviu Baranga <Silviu.Baranga at arm.com> wrote:
>>
>> Note that dividing by an integer constant should be a cheap operation
>> compared to FP multiplication and comparison as this would get lowered to a
2016 Apr 16
2
[cfe-dev] [libunwind] __ELF__ macro for arm-none-eabi
On 16 April 2016 at 01:44, Zhao, Weiming via cfe-dev
<cfe-dev at lists.llvm.org> wrote:
> I'm building libunwind for ARM baremetal using clang.
> I notice that __ELF__ is used in libunwind and the macro is only defined for
> Linux target on ARM.
> Should we also predefine that for arm-none-eabi target?
Do you mean in Clang's ARMTargetInfo::getTargetDefines() ?
I think
2014 Aug 19
2
[LLVMdev] Help with definition of subregisters; spill, rematerialization and implicit uses
Hi Quentin,
On 08/15/14 19:01, Quentin Colombet wrote:
[...]
>> The question is: How should true subregister definitions be
>> expressed so that they do not interfere with each other? See the
>> detailed problem description below.
>
> We do have a limitation in our current liveness tracking for
> sub-register. Therefore, I am not sure that is possible.
>
>
2014 Aug 22
2
[LLVMdev] Help with definition of subregisters; spill, rematerialization and implicit uses
Hi Quentin,
On 08/19/14 18:58, Quentin Colombet wrote:
[...]
> It seems that you will have to debug further the *** Bad machine code: Instruction loads from dead spill slot *** before we can be of any help.
Yes, I've done some more digging. Sorry for the long mail...
I get:
Inline spilling aN40_0_7:%vreg1954 [5000r,5056r:0)[5056r,5348r:1)
0 at 5000r 1 at 5056r
At this point I have
2015 Apr 29
2
[LLVMdev] [LoopVectorizer] Missed vectorization opportunities caused by sext/zext operations
Hi,
This is somewhat similar to the previous thread regarding missed vectorization
opportunities (http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084765.html),
but maybe different enough to require a new thread.
I'm seeing some missed vectorization opportunities in the loop vectorizer because SCEV
is not able to fold sext/zext expressions into recurrence expressions (AddRecExpr).
This
2012 Jul 05
2
[LLVMdev] MachineOperand: Subreg defines and the Undef flag
Hi,
This question relates to the undef flag in the context of sub-register def
operands.
1) Firstly, the documentation (comments in the source code) says that in a
sub-register def operand, the "IsUndef" flag refers to the part of the
register that is not written.
2) Further, the documentation about readsReg() states that a sub-register
def implicitly reads the other parts of the
2016 Nov 27
5
Extending Register Rematerialization
Hello LLVM Developers,
We are working on extending currently available register rematerialization
to include cases where sequence of multiple instructions is required to
rematerialize a value.
We had a discussion on this in community mailing list and link is here:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/subject.html#104777
From the above discussion and studying the code we
2013 Jun 07
2
[LLVMdev] NEON vector instructions and the fast math IR flags
>> Darwin uses NEON for floating point, but does *not* (and should not)
>> globally enable fast math flags. Use of NEON for FP needs to remain
>> achievable without globally setting the fast math flags. Fast math may
>> reasonably imply NEON, but the opposite direction is not accurate.
| Good point. Fast math is probably a too tough requirement. I need to
| look
2018 Mar 02
0
[RFC] llvm-mca: a static performance analysis tool
+Matthias
> On Mar 2, 2018, at 6:42 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
>
>> Known limitations on X86 processors
>> -----------------------------------
>>
>> 1) Partial register updates versus full register updates.
>> <snip>
>
> MachineOperand handles this. You just need to create the machine instrs.
>
>
2013 Jun 07
0
[LLVMdev] NEON vector instructions and the fast math IR flags
> |I just looked again at the +neonfp flag. Compiling with and without
> |+neonfp flag seems to only affect scalar types in the attached test
> |case. If e.g. the LLVM vectorizer introduces vector instructions on
> |LLVM-IR level floating point vectors still yield NEON assembly even if
> |compiled with "-mattr=+neon,-neonfp". Is this expected?
>
> I'm virtually
2016 Jun 30
1
[Proposal][RFC] Strided Memory Access Vectorization
As a strong advocate of logical vector representation, I'm counting on the community liking Michael's RFC and on it proceeding sooner rather than later.
I plan to pitch in (e.g., perf experiments).
>Probably can depend on the support provided by below RFC by Michael:
> "Allow loop vectorizer to choose vector widths that generate illegal types"
>In that case Loop Vectorizer will
2016 Jun 30
0
[Proposal][RFC] Strided Memory Access Vectorization
One common concern has been raised for cases where the Loop Vectorizer generates
bigger types than the target supports:
Based on VF, we currently check the cost and generate the expected set of
instruction[s] for the bigger type. This has two challenges: for bigger types the cost
is not always correct, and code generation may not generate efficient
instruction[s].
Probably can depend on the support provided by below RFC by
2018 Mar 02
5
[RFC] llvm-mca: a static performance analysis tool
Hi Andrew,
Thanks for the feedback!
On Fri, Mar 2, 2018 at 1:16 AM, Andrew Trick <atrick at apple.com> wrote:
>
> On Mar 1, 2018, at 9:22 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com>
> wrote:
>
> Hi all,
>
> At Sony we developed an LLVM based performance analysis tool named
> llvm-mca. We
> currently use it internally to statically measure the
2012 Jul 05
0
[LLVMdev] MachineOperand: Subreg defines and the Undef flag
On Jul 4, 2012, at 10:45 PM, Pranav Bhandarkar <pranavb at codeaurora.org> wrote:
> Hi,
>
> This question relates to the undef flag in the context of sub-register def
> operands.
>
> 1) Firstly, the documentation (comments in the source code) says that in a
> sub-register def operand, the "IsUndef" flag refers to the part of the
> register that is not
2016 Jun 18
2
[Proposal][RFC] Strided Memory Access Vectorization
>Vectorizer's output should be as clean as vector code can be so that analyses and optimizers downstream can
>do a great job optimizing.
Guess I should clarify this philosophical position of mine. In terms of vector code optimization that complicates
the output of vectorizer:
If vectorizer is the best place to perform the optimization, it should do so.
This includes the cases like