Displaying 12 results from an estimated 12 matches for "fglaser".

2016 Apr 12
2
Implementing a proposed InstCombine optimization
Good point. The same argument seems to apply to copy() too so I suppose it depends how strict we want to be about it. From: fglaser at apple.com [mailto:fglaser at apple.com] On Behalf Of escha at apple.com Sent: 11 April 2016 20:55 To: Daniel Sanders Cc: Alex Rosenberg; llvm-dev at lists.llvm.org; Carlos Liam Subject: Re: [llvm-dev] Implementing a proposed InstCombine optimization On Apr 11, 2016, at 4:23 AM, Daniel Sanders...
2018 Apr 09
1
SCEV and LoopStrengthReduction Formulae
> From: fglaser at apple.com <fglaser at apple.com> On Behalf Of escha at apple.com > Sent: Saturday, April 7, 2018 8:22 AM > >> I realize this is a micro-op saving a single cycle.  But this reduces the instruction count, one less >> instr to decode in a potentially hot path. If this all m...
2017 Jan 31
0
Intercepting lowering of stack adjustments
> On Jan 30, 2017, at 6:18 AM, Martin J. O'Riordan via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > For a long time we have had code for custom lowering of adjustments to the stack pointer. But until recently we did not realise that we were handling only places that provided a fixed-value for such adjustments, and the ISD nodes ‘ADJCALLSTACKDOWN’ and ‘ADJCALLSTACKUP’ are
2018 Apr 07
0
SCEV and LoopStrengthReduction Formulae
> > I realize this is a micro-op saving a single cycle. But this reduces the instruction count, one less > instr to decode in a potentially hot path. If this all makes sense, and seems like a reasonable addition > to llvm, would it make sense to implement this as a supplemental LSR formula, or as a separate pass? This seems reasonable to me so long as rbx has no other uses that
2017 Jan 30
2
Intercepting lowering of stack adjustments
For a long time we have had code for custom lowering of adjustments to the stack pointer. But until recently we did not realise that we were handling only places that provided a fixed-value for such adjustments, and the ISD nodes 'ADJCALLSTACKDOWN' and 'ADJCALLSTACKUP' are only described in our TableGen descriptions for immediates. This hasn't previously mattered as LLVM
2015 Jan 16
3
[LLVMdev] git-svn authorship (was: Howdy + GIT)
Erik> I am surprised no one has mentioned one of the biggest Erik> advantages of Git, which is proper author attribution for Erik> non-core and drive-by patch contributors. From what I can make of the git-svn docs, LLVM committers should be adding a "From: <email>" field to commit messages instead of "Patch by <name>". If the original author is
2018 Apr 03
4
SCEV and LoopStrengthReduction Formulae
I am attempting to implement a minor loop strength reduction optimization for targets that support compare and jump fusion, specifically TTI::canMacroFuseCmp(). My approach might be wrong; however, I am soliciting the idea for feedback, so that I can implement this correctly. My plan is to add a Supplemental LSR formula to LoopStrengthReduce.cpp that optimizes the following case, but perhaps
2016 Apr 11
4
Implementing a proposed InstCombine optimization
> I am not entirely sure this is safe. Transforming this to an fsub could change the value stored on platforms that implement negates using arithmetic instead of with bitmath (such as ours) I think it's probably safe for IEEE754-2008 conformant platforms because negation was clarified to be a non-arithmetic bit flip that cannot cause exceptions in that specification. However, I'm sure
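The distinction drawn in this snippet, between negation as a sign-bit flip and negation done with arithmetic, can be sketched in C. This is only an illustration under stated assumptions: `double` is taken to be an IEEE 754 binary64 value, and the function names `negate_bitflip`, `negate_fsub`, and `bits_of` are hypothetical, not anything from LLVM or the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double is a 64-bit IEEE 754 value. */
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Return the raw bit pattern so results can be compared exactly. */
uint64_t bits_of(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Negate by flipping only the sign bit; no floating-point
   arithmetic is performed on the value. */
double negate_bitflip(double x) {
    uint64_t bits = bits_of(x) ^ (UINT64_C(1) << 63);
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Negate with a subtraction, the way an arithmetic-based
   lowering of negation might do it. */
double negate_fsub(double x) {
    return -0.0 - x;
}
```

For ordinary values and zeros the two functions should produce the same bits. Where they may part ways is NaN input: the bit flip always toggles the sign and keeps the payload, while a subtraction is an arithmetic operation whose NaN result and exception behaviour depend on the platform.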
2015 Jan 22
2
[LLVMdev] X86TargetLowering::LowerToBT
> On Jan 22, 2015, at 1:22 PM, Fiona Glaser <fglaser at apple.com> wrote: > > According to Agner’s docs, many CPUs have slower BT than TEST; Haswell has only 0.5 inverse throughput as opposed to 0.25, Atom has 1 instead of 0.5, and Silvermont can’t even dual-issue BT (it locks both ALUs). So while BT does seem to have a shorter instruction enc...
2015 Jan 19
6
[LLVMdev] X86TargetLowering::LowerToBT
I'm tracking down an X86 code generation malfeasance regarding BT (bit test) and I have some questions. This IR matches, and then X86TargetLowering::LowerToBT is called: %and = and i64 %shl, %val ; (val & (1 << index)) != 0 ; bit test with a register index This IR does not match, and so X86TargetLowering::LowerToBT is not called: %and = lshr i64 %val, 25
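The two source-level shapes behind the IR in this snippet can be sketched in C. The helper names `test_bit_mask` and `test_bit_shift` are hypothetical, and nothing here claims which instruction a given compiler will pick; the sketch only shows that the mask form and the shift form compute the same answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask form: (val & (1 << index)) != 0, with a variable index. */
bool test_bit_mask(uint64_t val, unsigned index) {
    assert(index < 64); /* shifting by 64 or more is undefined */
    return (val & (UINT64_C(1) << index)) != 0;
}

/* Shift form: move the wanted bit down to position 0, then mask. */
bool test_bit_shift(uint64_t val, unsigned index) {
    assert(index < 64);
    return ((val >> index) & 1) != 0;
}
```

Calling either helper with a constant index such as 25 corresponds to the second, non-matching case quoted above, where the index is an immediate rather than a register.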
2015 Jan 22
3
[LLVMdev] X86TargetLowering::LowerToBT
Is that even a valid instruction? I thought TEST only took 32-bit immediates. Fiona > On Jan 22, 2015, at 2:48 PM, Chris Sears <chris.sears at gmail.com> wrote: > > The problem is that REX TEST reg,#(1<<37) is 10 bytes vs 5 bytes for REX BT reg,37. > That's a large space penalty to pay for a possible partial update stall. > > So the idea of generating BT for
2015 Jan 22
3
[LLVMdev] X86TargetLowering::LowerToBT
Yeah, the alternative is to do movabs and then test, which is doable but I’m not sure if it’s worth it (surely BT + risk of flags merging penalty has to be better than two ops, one of which is ~9-10 bytes). Fiona > On Jan 22, 2015, at 2:59 PM, Chris Sears <chris.sears at gmail.com> wrote: > > My bad on that. So that's what the comment meant. > That means BT is pretty much
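The immediate-size point in this exchange can be checked with a small C sketch. It assumes, as the thread suggests, that a 64-bit TEST accepts only a 32-bit immediate, and further assumes that this immediate is sign-extended to 64 bits. The names `single_bit_mask` and `fits_simm32` are hypothetical helpers for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build the mask for testing one bit of a 64-bit value. */
uint64_t single_bit_mask(unsigned index) {
    assert(index < 64); /* shifting by 64 or more is undefined */
    return UINT64_C(1) << index;
}

/* True if the 64-bit mask equals some 32-bit value after sign
   extension, i.e. it lies in [0, 2^31) or in [2^64 - 2^31, 2^64). */
bool fits_simm32(uint64_t mask) {
    return mask <= UINT64_C(0x7FFFFFFF) ||
           mask >= UINT64_C(0xFFFFFFFF80000000);
}
```

Under these assumptions, `fits_simm32(single_bit_mask(37))` is false, which is why testing bit 37 with a mask would need the constant loaded into a register first, whereas BT takes the bit index directly.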