Displaying 20 results from an estimated 700 matches similar to: "[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz"
2010 Jan 15
0
[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
On Jan 14, 2010, at 10:13 PM, David Conrad wrote:
> Hi,
>
> On ARMv6T2 this turns cttz into rbit, clz instead of the 4
> instruction sequence it is now.
>
> I'm not sure if adding RBIT to ARMISD and doing this optimization in
> the legalize pass is the best option, but the only better way I
> could think of doing it was to add a bitreverse intrinsic to llvm
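The identity behind the patch is that the number of trailing zeros of a word equals the number of leading zeros of its bit-reversed value. A minimal C sketch follows, using portable stand-ins for the two ARM instructions. The helper names `rbit32`, `clz32` and `ctz32` are hypothetical and do not come from the patch.

```c
#include <stdint.h>

/* Portable 32-bit bit reversal: swaps adjacent bits, then pairs,
   nibbles, bytes and finally half-words. Stand-in for ARM rbit. */
uint32_t rbit32(uint32_t x) {
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}

/* Portable count-leading-zeros. Returns 32 for x == 0, matching
   the behaviour of the ARM clz instruction. */
unsigned clz32(uint32_t x) {
    unsigned n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Count trailing zeros as clz of the bit-reversed input. */
unsigned ctz32(uint32_t x) {
    return clz32(rbit32(x));
}
```

With these definitions a zero input yields 32, whereas `__builtin_ctz(0)` is undefined in GCC and Clang.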
2010 Jan 15
1
[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
On Fri, Jan 15, 2010 at 6:03 PM, Chris Lattner <clattner at apple.com> wrote:
>
> On Jan 14, 2010, at 10:13 PM, David Conrad wrote:
>
>> Hi,
>>
>> On ARMv6T2 this turns cttz into rbit, clz instead of the 4
>> instruction sequence it is now.
>>
>> I'm not sure if adding RBIT to ARMISD and doing this optimization in
>> the legalize pass is
2010 Jan 15
2
[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
On 15 Jan 2010, at 18:03, Chris Lattner wrote:
> On Jan 14, 2010, at 10:13 PM, David Conrad wrote:
>
>> Other targets that I know of that could potentially benefit from
>> this optimization being global (that have a clz and bitreverse
>> instruction but not ctz) are AVR32 and C64x, neither of which llvm
>> has backends for yet.
>
> When/if another
2010 Jan 15
0
[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
On Jan 15, 2010, at 11:37 AM, Richard Osborne wrote:
>
> On 15 Jan 2010, at 18:03, Chris Lattner wrote:
>
>> On Jan 14, 2010, at 10:13 PM, David Conrad wrote:
>>
>>> Other targets that I know of that could potentially benefit from
>>> this optimization being global (that have a clz and bitreverse
>>> instruction but not ctz) are AVR32 and C64x,
2010 Jan 18
1
[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
On Jan 15, 2010, at 2:52 PM, Jim Grosbach wrote:
>
> On Jan 15, 2010, at 11:37 AM, Richard Osborne wrote:
>
>>
>> On 15 Jan 2010, at 18:03, Chris Lattner wrote:
>>
>>> On Jan 14, 2010, at 10:13 PM, David Conrad wrote:
>>>
>>>> Other targets that I know of that could potentially benefit from
>>>> this optimization being
2010 Jan 19
1
[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
On Jan 15, 2010, at 10:03 AM, Chris Lattner wrote:
>
> When/if another target wants this, we could add an ISD::RBIT operation,
> it doesn't need to be added at the llvm ir level,
Blackfin can add with backwards carry, essentially doing
(rbit (add (rbit a), (rbit b)))
This is used for FFTs.
I wasn't hoping to be able to pattern-match something so complicated.
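The expression `(rbit (add (rbit a), (rbit b)))` can be sketched in portable C as below. This is only an illustration of the arithmetic, not Blackfin code; the names `rbit32` and `add_reverse_carry` are hypothetical. Carries propagate from the most significant bit towards the least significant one, which is how a bit-reversed index can be stepped through an FFT buffer.

```c
#include <stdint.h>

/* Portable 32-bit bit reversal (stand-in for a hardware rbit). */
uint32_t rbit32(uint32_t x) {
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}

/* Add with backwards carry: reverse both operands, add normally,
   then reverse the sum. */
uint32_t add_reverse_carry(uint32_t a, uint32_t b) {
    return rbit32(rbit32(a) + rbit32(b));
}
```

Repeatedly adding `0x80000000` with this function starting from zero gives `0x80000000`, `0x40000000`, `0xC0000000`, `0x20000000`, which is an ordinary counter read in bit-reversed order.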
2020 Jul 05
8
[RFC] carry-less multiplication instruction
Carry-less multiplication[1] instructions exist (at least optionally) on many architectures: armv8, RISC-V, x86_64, POWER, SPARC, C64x, and possibly more.

This proposal is to add a llvm.clmul instruction. Or if that is contentious, llvm.experimental.bitmanip.clmul instruction.
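For readers unfamiliar with the operation, carry-less multiplication is ordinary shift-and-add multiplication with XOR in place of addition, so no carries cross bit positions. The following C function is a reference sketch of that definition only; the name `clmul32` is hypothetical and is not the proposed intrinsic's signature.

```c
#include <stdint.h>

/* Carry-less multiply of two 32-bit values into a 64-bit result.
   For every set bit i of b, XOR in a shifted copy of a. */
uint64_t clmul32(uint32_t a, uint32_t b) {
    uint64_t result = 0;
    for (int i = 0; i < 32; i++) {
        if ((b >> i) & 1u)
            result ^= (uint64_t)a << i;
    }
    return result;
}
```

For example, `clmul32(3, 3)` is 5 rather than 9, because the two middle partial-product bits cancel under XOR instead of carrying.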
2020 Jul 09
2
[RFC] carry-less multiplication instruction
05.07.2020, 05:22, "Roman Lebedev" <lebedev.ri at gmail.com>:
> On Sun, Jul 5, 2020 at 12:18 PM Shawn Landden via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> Carry-less multiplication[1] instructions exist (at least optionally) on many architectures: armv8, RISC-V, x86_64, POWER, SPARC, C64x, and possibly more.
>>
>> This proposal is to add a
2020 May 18
2
Use Galois field New Instructions (GFNI) to combine affine instructions
On 5/18/20 8:24 PM, Craig Topper wrote:
> I can tell you that your avx512 issue is that v64i8 gfni instructions also
> require avx512bw to be enabled to make v64i8 a supported type. The C
> intrinsics handling in the front end know this rule. But since you
> generated your own intrinsics you bypassed that.
Indeed that's the issue... I was stuck with what Intel announces here
2019 Feb 20
2
proposal for optimization method
Hello everyone, I discovered a way to optimize the following code (the
example uses a 32-bit integer, but it works with any size):
const uint32 d, r; // d is an odd number
// d is the divisor, r is the remainder
bool check_remainder(uint32 x)
{
    return x % d == r;
}
If we know d and r at compile time, and d is an odd integer, we can use
the modular multiplicative inverse to bypass the
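The message is cut off here, so the poster's exact formulation is not available. The sketch below shows one well-known way to apply the idea and may differ from what was proposed: for odd d, multiplying by the inverse of d modulo 2^32 maps exact multiples of d onto the smallest values, so the remainder test becomes one multiply and one compare. The function names are hypothetical, and the code assumes r < d.

```c
#include <stdint.h>

/* Inverse of an odd d modulo 2^32 by Newton iteration.
   d itself is correct to 3 bits because d*d == 1 (mod 8) for odd d,
   and each step doubles the number of correct bits. */
uint32_t inverse_odd_u32(uint32_t d) {
    uint32_t inv = d;
    for (int i = 0; i < 4; i++)
        inv *= 2u - d * inv;
    return inv;
}

/* Tests x % d == r without a division on x.
   Assumes d is odd and r < d. In a compiler, inv and limit would be
   constants computed once at compile time. */
int check_remainder_fast(uint32_t x, uint32_t d, uint32_t r) {
    uint32_t inv = inverse_odd_u32(d);
    uint32_t limit = (UINT32_MAX - r) / d;
    return (uint32_t)((x - r) * inv) <= limit;
}
```

The subtraction wraps when x < r, but the wrapped value then lies above `UINT32_MAX - r`, so its quotient exceeds `limit` and the test correctly fails.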
2019 Aug 14
3
[RFC][RISCV] Selection of complex codegen patterns into RISCV bit manipulation instructions
Hi all,
I'm currently working on the implementation for LLVM of the RISCV Bit
Manipulation ISA extension described by Clifford Wolf in the following
presentation:
https://content.riscv.org/wp-content/uploads/2019/06/17.10-b_wolf.pdf
and the following document:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.90.pdf
The aim is to provide the intrinsic functions to the user in
2015 Nov 16
2
LLVM Weekly - #98, Nov 16th 2015
LLVM Weekly - #98, Nov 16th 2015
================================
If you prefer, you can read an HTML version of this email at
<http://llvmweekly.org/issue/98>.
Welcome to the ninety-eighth issue of LLVM Weekly, a weekly newsletter
(published every Monday) covering developments in LLVM, Clang, and related
projects. LLVM Weekly is brought to you by [Alex
Bradbury](http://asbradbury.org).
2009 Jun 12
2
[LLVMdev] Proposal for pluggable intrinsics
Hi all,
Greetings. I'm a Ph.D. student at UIUC, currently working on SAFECode, a
research compiler based on LLVM which inserts the necessary runtime checks
to guarantee memory-safety of programs. SAFECode needs to insert checks
into the programs (say, please check this load instruction for me).
Currently SAFECode inserts these checks as normal call instructions. It
would be great that LLVM can
2014 Jan 06
5
[LLVMdev] LLVM Weekly - #1, Jan 6th 2014
LLVM Weekly - #1, Jan 6th 2014
==============================
Welcome to the inaugural issue of LLVM Weekly, a weekly newsletter (published
every Monday) covering developments in LLVM, Clang, and related projects. I've
been a long time lurker on the LLVM and Clang mailing lists and have been
using LLVM extensively in my PhD research for the past 4 years. I thought it
might be worthwhile to
2009 Jun 12
0
[LLVMdev] Proposal for pluggable intrinsics
On Jun 12, 2009, at 2:52 PM, Mai, Haohui wrote:
> Greetings. I'm a Ph.D. student at UIUC, currently working on SAFECode, a
> research compiler based on LLVM which inserts the necessary runtime checks
> to guarantee memory-safety of programs. SAFECode needs to insert checks
> into the programs (say, please check this load instruction for me).
Hi.
> Currently SAFECode
2018 May 16
2
Rotates, once again
On 2018-05-16 00:34, Sanjay Patel via llvm-dev wrote:
> Vectorization goes overboard because the throughput cost model used by the
> vectorizers doesn't match the 6 IR instructions that correspond to 1 x86
> rotate instruction. Instead, we have:
>
> [...]
>
> The broken cost model also affects unrolling and inlining. Size costs are
> overestimated
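As an illustration of why a single rotate can expand to several IR operations, here is the common branch-free rotate idiom in C. It is a sketch only: the name `rotl32` is hypothetical, and the two masks, the negation, the two shifts and the OR are an assumed breakdown that may not be the exact six instructions the quoted message refers to.

```c
#include <stdint.h>

/* Rotate x left by n bits. Masking both shift amounts with 31 keeps
   every shift in range, so n == 0 and n >= 32 are well defined. */
uint32_t rotl32(uint32_t x, unsigned n) {
    unsigned left = n & 31u;          /* and */
    unsigned right = (0u - n) & 31u;  /* sub, and */
    return (x << left) | (x >> right); /* shl, lshr, or */
}
```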
2016 May 03
3
Reasoning about known bits of the absolute value of a signed integer
I'm trying to reason about how to find certain bit positions of the absolute value of a given integer value. Specifically, I want to know the highest possibly set bit and lowest possibly set bit of the absolute value, in order to find the range between the two.
Note that I'm specifically trying to be as conservative as possible.
This is what I have so far:
If the sign bit of the
2020 Jul 09
2
[RFC] carry-less multiplication instruction
(As per IRC discussion)
I understand that the carry-less multiplication algorithm has its uses
since/and it is implemented as an instruction in many architectures
and that adding it as a general-purpose intrinsic will allow us
to drop target-specific intrinsics as by-product.
What I do *NOT* understand is: what is the actual/main goal/driving
factor of adding an LLVM intrinsic for it?
The
2016 Sep 28
3
Load combine pass
Hi,
I'm trying to optimize a pattern like this into a single i16 load:
%1 = bitcast i16* %pData to i8*
%2 = load i8, i8* %1, align 1
%3 = zext i8 %2 to i16
%4 = shl nuw i16 %3, 8
%5 = getelementptr inbounds i8, i8* %1, i16 1
%6 = load i8, i8* %5, align 1
%7 = zext i8 %6 to i16
%8 = shl nuw nsw i16 %7, 0
%9 = or i16 %8, %4
I came across load combine pass which is motivated
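In source terms, the IR above places the byte at offset 0 in the high half and the byte at offset 1 in the low half, which is a big-endian 16-bit read. The C sketch below shows the byte-wise form next to the single-load form it could be combined into. The function names are hypothetical, and the runtime endianness probe is an assumption made for portability, not something taken from the pass.

```c
#include <stdint.h>
#include <string.h>

/* Byte-wise form mirroring the IR: two 8-bit loads, a shift and an OR. */
uint16_t load16_bytewise(const uint8_t *p) {
    return (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);
}

/* Combined form: one 16-bit load (memcpy avoids alignment and
   aliasing problems), followed by a byte swap on little-endian hosts. */
uint16_t load16_combined(const uint8_t *p) {
    uint16_t value;
    const uint16_t probe = 1;
    uint8_t first_byte;

    memcpy(&value, p, sizeof value);
    memcpy(&first_byte, &probe, 1);
    if (first_byte == 1) /* host is little-endian */
        value = (uint16_t)((value << 8) | (value >> 8));
    return value;
}
```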
2010 Aug 12
2
[LLVMdev] Experimental C64X backend
Hi,
Over the past few months I've been developing an LLVM backend for TI's
C64X family of DSPs. It can be found as a co-processor in a variety of
OMAP-based devices such as gumstix, beagleboard and even Nokia's N900
phone. A project I'm working on [0] has had need to put code on it, and
we wanted to avoid TI's proprietary compiler.
The DSP itself is a VLIW machine, with 64 32-bit