Displaying 12 results from an estimated 12 matches for "fglaser".
2016 Apr 12
2
Implementing a proposed InstCombine optimization
Good point. The same argument seems to apply to copy() too so I suppose it depends how strict we want to be about it.
From: fglaser at apple.com [mailto:fglaser at apple.com] On Behalf Of escha at apple.com
Sent: 11 April 2016 20:55
To: Daniel Sanders
Cc: Alex Rosenberg; llvm-dev at lists.llvm.org; Carlos Liam
Subject: Re: [llvm-dev] Implementing a proposed InstCombine optimization
On Apr 11, 2016, at 4:23 AM, Daniel Sanders...
2018 Apr 09
1
SCEV and LoopStrengthReduction Formulae
> From: fglaser at apple.com <fglaser at apple.com> On Behalf Of escha at apple.com
> Sent: Saturday, April 7, 2018 8:22 AM
>
>> I realize this is a micro-op saving a single cycle. But this reduces the instruction count, one less
>> instr to decode in a potentially hot path. If this all m...
2017 Jan 31
0
Intercepting lowering of stack adjustments
> On Jan 30, 2017, at 6:18 AM, Martin J. O'Riordan via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> For a long time we have had code for custom lowering of adjustments to the stack pointer. But until recently we did not realise that we were handling only places that provided a fixed-value for such adjustments, and the ISD nodes ‘ADJCALLSTACKDOWN’ and ‘ADJCALLSTACKUP’ are
2018 Apr 07
0
SCEV and LoopStrengthReduction Formulae
>
> I realize this is a micro-op saving a single cycle. But this reduces the instruction count, one less
> instr to decode in a potentially hot path. If this all makes sense, and seems like a reasonable addition
> to llvm, would it make sense to implement this as a supplemental LSR formula, or as a separate pass?
This seems reasonable to me so long as rbx has no other uses that
2017 Jan 30
2
Intercepting lowering of stack adjustments
For a long time we have had code for custom lowering of adjustments to the
stack pointer. But until recently we did not realise that we were handling
only places that provided a fixed-value for such adjustments, and the ISD
nodes 'ADJCALLSTACKDOWN' and 'ADJCALLSTACKUP' are only described in our
TableGen descriptions for immediates. This hasn't previously mattered as LLVM
2015 Jan 16
3
[LLVMdev] git-svn authorship (was: Howdy + GIT)
Erik> I am surprised no one has mentioned one of the biggest
Erik> advantages of Git, which is proper author attribution for
Erik> non-core and drive-by patch contributors.
From what I can make of the git-svn docs, LLVM committers should
be adding a "From: <email>" field to commit messages instead of "Patch
by <name>". If the original author is
2018 Apr 03
4
SCEV and LoopStrengthReduction Formulae
I am attempting to implement a minor loop strength reduction optimization for
targets that support compare and jump fusion, specifically
TTI::canMacroFuseCmp(). My approach might be wrong, so I am soliciting
feedback on the idea in order to implement this correctly. My plan is to
add a Supplemental LSR formula to LoopStrengthReduce.cpp that optimizes the
following case, but perhaps
2016 Apr 11
4
Implementing a proposed InstCombine optimization
> I am not entirely sure this is safe. Transforming this to an fsub could change the value stored on platforms that implement negates using arithmetic instead of with bitmath (such as ours)
I think it's probably safe for IEEE754-2008 conformant platforms because negation was clarified to be a non-arithmetic bit flip that cannot cause exceptions in that specification. However, I'm sure
2015 Jan 22
2
[LLVMdev] X86TargetLowering::LowerToBT
> On Jan 22, 2015, at 1:22 PM, Fiona Glaser <fglaser at apple.com> wrote:
>
> According to Agner’s docs, many CPUs have slower BT than TEST; Haswell has only 0.5 inverse throughput as opposed to 0.25, Atom has 1 instead of 0.5, and Silvermont can’t even dual-issue BT (it locks both ALUs). So while BT does seem to have a shorter instruction enc...
2015 Jan 19
6
[LLVMdev] X86TargetLowering::LowerToBT
I'm tracking down an X86 code generation malfeasance regarding BT (bit
test) and I have some questions.
This IR *matches*, and then X86TargetLowering::LowerToBT *is called*:
%and = and i64 %shl, %val ; (val & (1 << index)) != 0 ; bit test with a register index
This IR *does not match*, and so X86TargetLowering::LowerToBT *is not called*:
%and = lshr i64 %val, 25
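For orientation, here is a rough C++ sketch of the two source-level bit-test forms that the IR above appears to contrast. The function names are hypothetical, and the constant-index form assumes the truncated IR goes on to mask the low bit after the shift; the snippet itself does not show that.

```cpp
#include <cstdint>

// Bit test with a run-time (register) index: (val & (1 << index)) != 0.
// A 64-bit one is shifted so that indices 0..63 are well defined.
bool test_bit_variable(std::uint64_t val, unsigned index) {
    return (val & (std::uint64_t{1} << index)) != 0;
}

// Bit test with a constant index, written as shift-then-mask.
// Assumption: the shift by 25 is followed by a mask of the low bit.
bool test_bit_25(std::uint64_t val) {
    return ((val >> 25) & 1u) != 0;
}
```

Both functions compute the same predicate for index 25; whether a given compiler version lowers either one to BT is not something this sketch can show.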
2015 Jan 22
3
[LLVMdev] X86TargetLowering::LowerToBT
Is that even a valid instruction? I thought TEST only took 32-bit immediates.
Fiona
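The immediate-width concern can be sketched as a small check, assuming the 64-bit form of TEST takes a 32-bit immediate that is sign-extended to 64 bits. The helper names are hypothetical and are not LLVM APIs.

```cpp
#include <cstdint>
#include <limits>

// True when `mask` can be encoded as a sign-extended 32-bit immediate,
// i.e. its value as a signed 64-bit integer lies in the int32_t range.
bool fits_in_simm32(std::uint64_t mask) {
    const auto s = static_cast<std::int64_t>(mask);
    return s >= std::numeric_limits<std::int32_t>::min() &&
           s <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper: can bit `index` of a 64-bit register be tested
// with a single-bit immediate mask under the assumption above?
bool single_bit_fits_test_imm(unsigned index) {
    return index < 64 && fits_in_simm32(std::uint64_t{1} << index);
}
```

Under that assumption a mask for bit 37 does not fit, and neither does bit 31, because sign extension would set the upper 32 bits as well.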
> On Jan 22, 2015, at 2:48 PM, Chris Sears <chris.sears at gmail.com> wrote:
>
> The problem is that REX TEST reg,#(1<<37) is 10 bytes vs 5 bytes for REX BT reg,37.
> That's a large space penalty to pay for a possible partial update stall.
>
> So the idea of generating BT for
2015 Jan 22
3
[LLVMdev] X86TargetLowering::LowerToBT
Yeah, the alternative is to do movabs and then test, which is doable but I’m not sure if it’s worth it (surely BT + risk of flags merging penalty has to be better than two ops, one of which is ~9-10 bytes).
Fiona
> On Jan 22, 2015, at 2:59 PM, Chris Sears <chris.sears at gmail.com> wrote:
>
> My bad on that. So that's what the comment meant.
> That means BT is pretty much