thr3ads.net - similar to: "[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin

Displaying 20 results from an estimated 700 matches similar to: "[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz"

[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz

2010 Jan 15

On Jan 14, 2010, at 10:13 PM, David Conrad wrote:
> Hi,
>
> On ARMv6T2 this turns cttz into rbit, clz instead of the 4
> instruction sequence it is now.
>
> I'm not sure if adding RBIT to ARMISD and doing this optimization in
> the legalize pass is the best option, but the only better way I
> could think of doing it was to add a bitreverse intrinsic to llvm
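The identity behind "rbit, clz" is that the trailing zeros of x become the leading zeros of the bit-reversed x. The following is a minimal portable C sketch of that identity, not the patch itself: rbit32, clz32 and ctz32_via_rbit_clz are hypothetical helper names that stand in for the ARM RBIT and CLZ instructions.

```c
#include <stdint.h>

/* Portable stand-in for RBIT: reverse the order of all 32 bits. */
static uint32_t rbit32(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}

/* Portable stand-in for CLZ: count leading zeros, 32 for x == 0. */
static unsigned clz32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Count trailing zeros as "reverse the bits, then count leading zeros". */
static unsigned ctz32_via_rbit_clz(uint32_t x)
{
    return clz32(rbit32(x));
}
```

With these definitions a zero input yields 32, whereas C's __builtin_ctz leaves the zero case undefined.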

[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz

2010 Jan 15

On Fri, Jan 15, 2010 at 6:03 PM, Chris Lattner <clattner at apple.com> wrote:
>
> On Jan 14, 2010, at 10:13 PM, David Conrad wrote:
>
>> Hi,
>>
>> On ARMv6T2 this turns cttz into rbit, clz instead of the 4
>> instruction sequence it is now.
>>
>> I'm not sure if adding RBIT to ARMISD and doing this optimization in
>> the legalize pass is

[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz

2010 Jan 15

On 15 Jan 2010, at 18:03, Chris Lattner wrote:
> On Jan 14, 2010, at 10:13 PM, David Conrad wrote:
>
>> Other targets that I know of that could potentially benefit from
>> this optimization being global (that have a clz and bitreverse
>> instruction but not ctz) are AVR32 and C64x, neither of which llvm
>> has backends for yet.
>
> When/if another

[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz

2010 Jan 15

On Jan 15, 2010, at 11:37 AM, Richard Osborne wrote:
>
> On 15 Jan 2010, at 18:03, Chris Lattner wrote:
>
>> On Jan 14, 2010, at 10:13 PM, David Conrad wrote:
>>
>>> Other targets that I know of that could potentially benefit from
>>> this optimization being global (that have a clz and bitreverse
>>> instruction but not ctz) are AVR32 and C64x,

[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz

2010 Jan 18

On Jan 15, 2010, at 2:52 PM, Jim Grosbach wrote:
>
> On Jan 15, 2010, at 11:37 AM, Richard Osborne wrote:
>
>>
>> On 15 Jan 2010, at 18:03, Chris Lattner wrote:
>>
>>> On Jan 14, 2010, at 10:13 PM, David Conrad wrote:
>>>
>>>> Other targets that I know of that could potentially benefit from
>>>> this optimization being

[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz

2010 Jan 19

On Jan 15, 2010, at 10:03 AM, Chris Lattner wrote:
>
> When/if another target wants this, we could add a ISD::RBIT operation,
> it doesn't need to be added at the llvm ir level,

Blackfin can add with backwards carry, essentially doing (rbit (add (rbit a), (rbit b))). This is used for FFTs. I wasn't hoping to be able to pattern-match something so complicated.
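The (rbit (add (rbit a), (rbit b))) expression above can be written out in portable C as a rough illustration. This is a sketch under assumptions, not a description of the Blackfin instruction: rbit32 and rev_carry_add32 are hypothetical names, and the operands are assumed to be 32 bits wide. Carries propagate from the high bit toward the low bit, which is what generates bit-reversed index sequences for an FFT.

```c
#include <stdint.h>

/* Reverse the order of all 32 bits (portable stand-in for a bit-reverse op). */
static uint32_t rbit32(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}

/* Add with the carry running from high bits to low bits:
 * reverse both operands, add normally, reverse the sum back. */
static uint32_t rev_carry_add32(uint32_t a, uint32_t b)
{
    return rbit32(rbit32(a) + rbit32(b));
}
```

For an 8-point FFT, starting at index 0 and repeatedly adding 4 (half the size) with this operation visits 0, 4, 2, 6, 1, 5, 3, 7, because a carry out of bit 0 simply falls off the bottom.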

[RFC] carry-less multiplication instruction

2020 Jul 05

Carry-less multiplication[1] instructions exist (at least optionally) on many architectures: armv8, RISC-V, x86_64, POWER, SPARC, C64x, and possibly more.

This proposal is to add a llvm.clmul instruction. Or if that is contentious, llvm.experimental.bitmanip.clmul instruction.
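For readers unfamiliar with the operation, here is a minimal software sketch of carry-less multiplication in C. It shows the usual mathematical definition (multiplication of polynomials over GF(2)), not the semantics of the proposed intrinsic, which the snippet does not spell out. The name clmul32 and the 32x32 to 64-bit shape are assumptions made for illustration.

```c
#include <stdint.h>

/* Carry-less multiply: like schoolbook binary multiplication, but the
 * shifted partial products are combined with XOR instead of addition,
 * so no carries propagate between bit positions. */
static uint64_t clmul32(uint32_t a, uint32_t b)
{
    uint64_t result = 0;
    for (unsigned i = 0; i < 32; i++) {
        if ((b >> i) & 1u)
            result ^= (uint64_t)a << i;
    }
    return result;
}
```

For example, clmul32(3, 3) is 5 rather than 9, since (x + 1)^2 = x^2 + 1 over GF(2).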

[RFC] carry-less multiplication instruction

2020 Jul 09

05.07.2020, 05:22, "Roman Lebedev" <lebedev.ri at gmail.com>:
> On Sun, Jul 5, 2020 at 12:18 PM Shawn Landden via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> Carry-less multiplication[1] instructions exist (at least optionally) on many architectures: armv8, RISC-V, x86_64, POWER, SPARC, C64x, and possibly more.
>>
>> This proposal is to add a

Use Galois field New Instructions (GFNI) to combine affine instructions

2020 May 18

On 5/18/20 8:24 PM, Craig Topper wrote:
> I can tell you that your avx512 issue is that v64i8 gfni instructions also
> require avx512bw to be enabled to make v64i8 a supported type. The C
> intrinsics handling in the front end know this rule. But since you
> generated your own intrinsics you bypassed that.

Indeed that's the issue... I was sticking with what Intel announces here

proposal for optimization method

2019 Feb 20

Hello everyone, I discovered a way to perform optimization on the following code (I gave an example that uses 32bit integer, but it works with any size.):

    const uint32 d,r; //d is an odd number
    //d is the divisor, r is the remainder
    bool check_remainder(uint32 x)
    {
        return x%d==r;
    }

If we know d and r at compile time, and d is an odd integer, we can use modular multiplicative inverse to bypass the
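The snippet is cut off before the method is given, so the following C sketch shows one well-known way to use a modular multiplicative inverse for this test; it is not necessarily the poster's exact formulation. It assumes d is odd and r < d. The names inverse_mod_2_32 and check_remainder_no_div are hypothetical. In a compiler, inv and limit would be folded to constants, leaving a subtract, a multiply and a compare at run time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Multiplicative inverse of an odd d modulo 2^32, by Newton iteration.
 * Starting from inv = d gives 3 correct low bits (d*d == 1 mod 8);
 * each step doubles that, so 4 steps cover all 32 bits. */
static uint32_t inverse_mod_2_32(uint32_t d)
{
    uint32_t inv = d;
    for (int i = 0; i < 4; i++)
        inv *= 2u - d * inv;
    return inv;
}

/* Same answer as (x % d == r), assuming d is odd and r < d.
 * If x = k*d + r, then (x - r) * inv == k (mod 2^32), and k can be at
 * most (UINT32_MAX - r) / d. Every other x maps to a larger product. */
static bool check_remainder_no_div(uint32_t x, uint32_t d, uint32_t r)
{
    uint32_t inv = inverse_mod_2_32(d);     /* constant if d is known */
    uint32_t limit = (UINT32_MAX - r) / d;  /* constant if d, r are known */
    return (uint32_t)((x - r) * inv) <= limit;
}
```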

[RFC][RISCV] Selection of complex codegen patterns into RISCV bit manipulation instructions

2019 Aug 14

Hi all, I'm currently working on the implementation for LLVM of the RISCV Bit Manipulation ISA extension described by Clifford Wolf in the following presentation: https://content.riscv.org/wp-content/uploads/2019/06/17.10-b_wolf.pdf and the following document: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.90.pdf The aim is to provide the intrinsic functions to the user in

LLVM Weekly - #98, Nov 16th 2015

2015 Nov 16

LLVM Weekly - #98, Nov 16th 2015
================================

If you prefer, you can read a HTML version of this email at <http://llvmweekly.org/issue/98>.

Welcome to the ninety-eighth issue of LLVM Weekly, a weekly newsletter (published every Monday) covering developments in LLVM, Clang, and related projects. LLVM Weekly is brought to you by [Alex Bradbury](http://asbradbury.org).

[LLVMdev] Proposal for pluggable intrinsics

2009 Jun 12

Hi all, Greetings. I'm a Ph.D. student at UIUC. I'm now working on SAFECode, a research compiler based on LLVM which inserts the necessary runtime checks to guarantee memory safety of programs. SAFECode needs to insert checks into the programs (say, please check this load instruction for me). Currently SAFECode inserts these checks as normal call instructions. It would be great if LLVM could

[LLVMdev] LLVM Weekly - #1, Jan 6th 2014

2014 Jan 06

LLVM Weekly - #1, Jan 6th 2014
==============================

Welcome to the inaugural issue of LLVM Weekly, a weekly newsletter (published every Monday) covering developments in LLVM, Clang, and related projects. I've been a long time lurker on the LLVM and Clang mailing lists and have been using LLVM extensively in my PhD research for the past 4 years. I thought it might be worthwhile to

[LLVMdev] Proposal for pluggable intrinsics

2009 Jun 12

On Jun 12, 2009, at 2:52 PM, Mai, Haohui wrote:
> Greetings. I'm a Ph.D. student at UIUC. I'm now working on SAFECode, a
> research compiler based on LLVM which inserts the necessary runtime checks
> to guarantee memory safety of programs. SAFECode needs to insert checks
> into the programs (say, please check this load instruction for me).

Hi.

> Currently SAFECode

Rotates, once again

2018 May 16

On 2018-05-16 00:34, Sanjay Patel via llvm-dev wrote:
> Vectorization goes overboard because the throughput cost model used by the
> vectorizers doesn't match the 6 IR instructions that correspond to 1 x86
> rotate instruction. Instead, we have:
>
> [...]
>
> The broken cost model also affects unrolling and inlining. Size costs are
> overestimated

Reasoning about known bits of the absolute value of a signed integer

2016 May 03

I'm trying to reason about how to find certain bit positions of the absolute value of a given integer value. Specifically, I want to know the highest possibly set bit and lowest possibly set bit of the absolute value, in order to find the range between the two. Note that I'm specifically trying to be as conservative as possible. This is what I have so far: If the sign bit of the

[RFC] carry-less multiplication instruction

2020 Jul 09

(As per IRC discussion) I understand that the carry-less multiplication algorithm has its uses since/and it is implemented as an instruction in many architectures and that adding it as a general-purpose intrinsic will allow us to drop target-specific intrinsics as a by-product. What I do *NOT* understand is: what is the actual/main goal/driving factor of adding an LLVM intrinsic for it? The

Load combine pass

2016 Sep 28

Hi, I'm trying to optimize a pattern like this into a single i16 load:

    %1 = bitcast i16* %pData to i8*
    %2 = load i8, i8* %1, align 1
    %3 = zext i8 %2 to i16
    %4 = shl nuw i16 %3, 8
    %5 = getelementptr inbounds i8, i8* %1, i16 1
    %6 = load i8, i8* %5, align 1
    %7 = zext i8 %6 to i16
    %8 = shl nuw nsw i16 %7, 0
    %9 = or i16 %8, %4

I came across load combine pass which is motivated
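At the C level, the IR above corresponds roughly to assembling a 16-bit value from two byte loads, with the byte at offset 0 in the high half. The sketch below puts that byte-wise form next to a combined form that does a single 16-bit load. It is an illustration only, not the pass's implementation: load16_bytewise and load16_combined are hypothetical names, and the endianness probe is an assumption about how to keep the two forms equivalent on any host.

```c
#include <stdint.h>
#include <string.h>

/* Byte-wise form, mirroring the IR: byte 0 is shifted into the high
 * half, byte 1 is placed in the low half, and the two are OR-ed. */
static uint16_t load16_bytewise(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | (uint16_t)p[1]);
}

/* Combined form: one unaligned 16-bit load. The result already matches
 * on a big-endian host; a little-endian host needs a byte swap. */
static uint16_t load16_combined(const uint8_t *p)
{
    const uint16_t probe = 1;
    uint16_t v;
    memcpy(&v, p, sizeof v);              /* single 2-byte load, any alignment */
    if (*(const uint8_t *)&probe == 1)    /* little-endian host */
        v = (uint16_t)((v << 8) | (v >> 8));
    return v;
}
```

So whether the pattern folds to a plain i16 load or to a load plus byte swap depends on the target's byte order.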

[LLVMdev] Experimental C64X backend

2010 Aug 12

Hi, Over the past few months I've been developing an LLVM backend for TI's C64X family of DSPs. It can be found as a co-processor in a variety of OMAP-based devices such as gumstix, beagleboard and even Nokia's N900 phone. A project I'm working on [0] has had need to put code on it, and we wanted to avoid TI's proprietary compiler. The DSP itself is a VLIW machine, with 64 32-bit
