similar to: [LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz

Displaying 20 results from an estimated 700 matches similar to: "[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz"

2010 Jan 15
0
[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
On Jan 14, 2010, at 10:13 PM, David Conrad wrote: > Hi, > > On ARMv6T2 this turns cttz into rbit, clz instead of the 4 > instruction sequence it is now. > > I'm not sure if adding RBIT to ARMISD and doing this optimization in > the legalize pass is the best option, but the only better way I > could think of doing it was to add a bitreverse intrinsic to llvm
2010 Jan 15
1
[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
On Fri, Jan 15, 2010 at 6:03 PM, Chris Lattner <clattner at apple.com> wrote: > > On Jan 14, 2010, at 10:13 PM, David Conrad wrote: > >> Hi, >> >> On ARMv6T2 this turns cttz into rbit, clz instead of the 4 >> instruction sequence it is now. >> >> I'm not sure if adding RBIT to ARMISD and doing this optimization in >> the legalize pass is
2010 Jan 15
2
[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
On 15 Jan 2010, at 18:03, Chris Lattner wrote: > On Jan 14, 2010, at 10:13 PM, David Conrad wrote: > >> Other targets that I know of that could potentially benefit from >> this optimization being global (that have a clz and bitreverse >> instruction but not ctz) are AVR32 and C64x, neither of which llvm >> has backends for yet. > > When/if another
2010 Jan 15
0
[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
On Jan 15, 2010, at 11:37 AM, Richard Osborne wrote: > > On 15 Jan 2010, at 18:03, Chris Lattner wrote: > >> On Jan 14, 2010, at 10:13 PM, David Conrad wrote: >> >>> Other targets that I know of that could potentially benefit from >>> this optimization being global (that have a clz and bitreverse >>> instruction but not ctz) are AVR32 and C64x,
2010 Jan 18
1
[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
On Jan 15, 2010, at 2:52 PM, Jim Grosbach wrote: > > On Jan 15, 2010, at 11:37 AM, Richard Osborne wrote: > >> >> On 15 Jan 2010, at 18:03, Chris Lattner wrote: >> >>> On Jan 14, 2010, at 10:13 PM, David Conrad wrote: >>> >>>> Other targets that I know of that could potentially benefit from >>>> this optimization being
2010 Jan 19
1
[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
On Jan 15, 2010, at 10:03 AM, Chris Lattner wrote: > > When/if another target wants this, we could add a ISD::RBIT operation, > it doesn't need to be added at the llvm ir level, Blackfin can add with backwards carry, essentially doing (rbit (add (rbit a), (rbit b))) This is used for FFTs. I wasn't hoping to be able to pattern-match something so complicated.
2020 Jul 05
8
[RFC] carry-less multiplication instruction
<div> </div><div><div><p>Carry-less multiplication[1] instructions exist (at least optionally) on many architectures: armv8, RISC-V, x86_64, POWER, SPARC, C64x, and possibly more.</p><p>This proposal is to add a <code>llvm.clmul</code> instruction. Or if that is contentious, <code>llvm.experimental.bitmanip.clmul</code> instruction.
2020 Jul 09
2
[RFC] carry-less multiplication instruction
05.07.2020, 05:22, "Roman Lebedev" <lebedev.ri at gmail.com>: > On Sun, Jul 5, 2020 at 12:18 PM Shawn Landden via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >>  Carry-less multiplication[1] instructions exist (at least optionally) on many architectures: armv8, RISC-V, x86_64, POWER, SPARC, C64x, and possibly more. >> >>  This proposal is to add a
2020 May 18
2
Use Galois field New Instructions (GFNI) to combine affine instructions
On 5/18/20 8:24 PM, Craig Topper wrote: > I can tell you that your avx512 issue is that v64i8 gfni instructions also > require avx512bw to be enabled to make v64i8 a supported type. The C > intrinsics handling in the front end know this rule. But since you > generated your own intrinsics you bypassed that. Indeed that's the issue... I was stick with what Intel announces here
2019 Feb 20
2
proposal for optimization method
Hello everyone, I discovered a way to perform optimization on the following code (I gave an example that uses 32bit integer, but it works with any size.): const uint32 d,r;//d is an odd number //d is the divisor, r is the remainder bool check_remainder(uint32 x) { return x%d==r; } if we know d and r at compile time, and d is an odd integer, we can use modular multiplicative inverse to bypass the
2019 Aug 14
3
[RFC][RISCV] Selection of complex codegen patterns into RISCV bit manipulation instructions
Hi all, I'm currently working on the implementation for LLVM of the RISCV Bit Manipulation ISA extension described by Clifford Wolf in the following presentation: https://content.riscv.org/wp-content/uploads/2019/06/17.10-b_wolf.pdf and the following document: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.90.pdf The aim is to provide the intrinsic functions to the user in
2015 Nov 16
2
LLVM Weekly - #98, Nov 16th 2015
LLVM Weekly - #98, Nov 16th 2015 ================================ If you prefer, you can read a HTML version of this email at <http://llvmweekly.org/issue/98>. Welcome to the ninety-eighth issue of LLVM Weekly, a weekly newsletter (published every Monday) covering developments in LLVM, Clang, and related projects. LLVM Weekly is brought to you by [Alex Bradbury](http://asbradbury.org).
2009 Jun 12
2
[LLVMdev] Proposal for pluggable intrinsics
Hi all, Greetings. I'm a Ph.D. student in UIUC. Now I'm working on SAFECode, a research compiler based on LLVM which insert necessary runtime checks to guarantee memory-safety of programs. SAFECode needs to insert checks into the programs (say, please check this load instruction for me). Currently SAFECode inserts these checks as normal call instructions. It would be great that LLVM can
2014 Jan 06
5
[LLVMdev] LLVM Weekly - #1, Jan 6th 2014
LLVM Weekly - #1, Jan 6th 2014 ============================== Welcome to the inaugural issue of LLVM Weekly, a weekly newsletter (published every Monday) covering developments in LLVM, Clang, and related projects. I've been a long time lurker on the LLVM and Clang mailing lists and have been using LLVM extensively in my PhD research for the past 4 years. I thought it might be worthwhile to
2009 Jun 12
0
[LLVMdev] Proposal for pluggable intrinsics
On Jun 12, 2009, at 2:52 PM, Mai, Haohui wrote: > Greetings. I'm a Ph.D. student in UIUC. Now I'm working on SAFECode, a > research compiler based on LLVM which insert necessary runtime checks > to guarantee memory-safety of programs. SAFECode needs to insert > checks > into the programs (say, please check this load instruction for me). Hi. > Currently SAFECode
2018 May 16
2
Rotates, once again
On 2018-05-16 00:34, Sanjay Patel via llvm-dev wrote: > Vectorization goes overboard because the throughput cost model used by > the > vectorizers doesn't match the 6 IR instructions that correspond to 1 > x86 > rotate instruction. Instead, we have: > > [...] > > The broken cost model also affects unrolling and inlining. Size costs > are > overestimated
2016 May 03
3
Reasoning about known bits of the absolute value of a signed integer
I'm trying to reason about how to find certain bit positions of the absolute value of a given integer value. Specifically, I want to know the highest possibly set bit and lowest possibly set bit of the absolute value, in order to find the range between the two. Note that I'm specifically trying to be as conservative as possible. This is what I have so far: If the sign bit of the
2020 Jul 09
2
[RFC] carry-less multiplication instruction
(As per IRC discussion) I understand that the carry-less multiplication algorithm has it's uses since/and it is implemented as an instruction in many architectures and that adding it as a general-purpose intrinsic will allow us to drop target-specific intrinsics as by-product. What i do *NOT* understand is: what is the actual/main goal/driving factor of adding an LLVM intrinsic for it? The
2016 Sep 28
3
Load combine pass
Hi, I'm trying to optimize a pattern like this into a single i16 load: %1 = bitcast i16* %pData to i8* %2 = load i8, i8* %1, align 1 %3 = zext i8 %2 to i16 %4 = shl nuw i16 %3, 8 %5 = getelementptr inbounds i8, i8* %1, i16 1 %6 = load i8, i8* %5, align 1 %7 = zext i8 %6 to i16 %8 = shl nuw nsw i16 %7, 0 %9 = or i16 %8, %4 I came across load combine pass which is motivated
2010 Aug 12
2
[LLVMdev] Experimental C64X backend
Hi, Over the past few months I've been developing a LLVM backend for TIs C64X family of DSPs. It can be found as a co-processor in a variety of OMAP-based devices such as gumstix, beagleboard and even Nokia's N900 phone. A project I'm working on [0] has had need to put code on it, and we wanted to avoid TIs proprietary compiler. The DSP itself is a VLIW machine, with 64 32-bit