thr3ads.net - similar to: "Rotates, once again"

2018 May 15

Rotates, once again

Thanks for writing this up. I'd like to have this intrinsic too.

Another argument for having the intrinsic is shown in PR37426:
https://bugs.llvm.org/show_bug.cgi?id=37426

Vectorization goes overboard because the throughput cost model used by the vectorizers doesn't match the 6 IR instructions that correspond to 1 x86 rotate instruction. Instead, we have:

$ opt 37426prevectorize.ll -S
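For illustration only, here is a minimal C sketch of the kind of rotate idiom the message is talking about. The names `rotl32` and `rotl32_self_check` are hypothetical and are not taken from PR37426. The comments mark the separate operations that a compiler would typically see in the IR, while a target with a native rotate can do the whole thing in one instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate x left by n bits. Both shift counts are reduced modulo 32,
   so no shift is ever out of range, including n == 0. */
static uint32_t rotl32(uint32_t x, uint32_t n)
{
    uint32_t left  = n & 31u;           /* and       */
    uint32_t right = (0u - n) & 31u;    /* sub, and  */
    return (x << left) | (x >> right);  /* shl, lshr, or */
}

/* Small sanity check of the sketch above. */
static void rotl32_self_check(void)
{
    assert(rotl32(0x80000001u, 1u) == 0x00000003u);
    assert(rotl32(0x12345678u, 0u) == 0x12345678u);
    assert(rotl32(0x12345678u, 32u) == 0x12345678u);
}
```

A cost model that prices each commented operation separately will rate this function as several times more expensive than the single rotate it usually becomes, which is the mismatch the message describes.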

Rotates, once again

2018 Jul 02

1. I'm not sure what you mean by "full vector" here - using the same shift distance for all lanes (as opposed to per-lane distances), or doing a treat-the-vector-as-bag-of-bits shift that doesn't have any internal lane boundaries? If the latter, that doesn't really help you much with implementing a per-lane rotate. I think the most useful generalization of a vector

Rotates, once again

2018 Jul 02

On 7/2/2018 3:16 PM, Sanjay Patel wrote:
> I also agree that the per-element rotate for vectors is what we want for
> this intrinsic.
>
> So I have this so far:
>
> declare i32 @llvm.catshift.i32(i32 %a, i32 %b, i32 %shift_amount)
> declare <2 x i32> @llvm.catshift.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %shift_amount)
>
> For
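The snippet is cut off before it defines the proposed intrinsic, so the following C sketch is only one plausible reading of a "concatenate and shift" operation. It is modelled on the funnel-shift-left behaviour that LLVM documents for `llvm.fshl` (shift amount taken modulo the bit width), and the thread's `catshift` may have differed. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed scalar model: concatenate a (high half) and b (low half),
   shift the 64-bit value left by s modulo 32, keep the high 32 bits. */
static uint32_t funnel_shl32(uint32_t a, uint32_t b, uint32_t s)
{
    uint64_t cat = ((uint64_t)a << 32) | b;
    return (uint32_t)((cat << (s & 31u)) >> 32);
}

/* Per-element vector form: every lane uses its own shift amount. */
static void funnel_shl32_lanes(const uint32_t *a, const uint32_t *b,
                               const uint32_t *s, uint32_t *out, size_t lanes)
{
    for (size_t i = 0; i < lanes; ++i)
        out[i] = funnel_shl32(a[i], b[i], s[i]);
}

/* Passing the same value for both inputs gives a rotate left. */
static void funnel_self_check(void)
{
    assert(funnel_shl32(0x80000001u, 0x80000001u, 1u) == 0x00000003u);
    assert(funnel_shl32(0x00000001u, 0x80000000u, 1u) == 0x00000003u);
}
```

Under this model a rotate is just the special case where both data operands are the same value, which is why one intrinsic can cover both uses.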

Rotates, once again

2018 May 17

Thanks Sanjay! At this point the cost/benefit tradeoff for rotate intrinsics seems pretty good.

John

On 05/17/2018 11:14 AM, Sanjay Patel wrote:
> A rotate intrinsic should be relatively close in cost/complexity to the
> existing bswap.
>
> A grep of intrinsic::bswap says we'd probably add code in:
> InstCombine
> InstructionSimplify
> ConstantFolding
>

Rotates, once again

2018 May 16

On 2018-05-16 00:34, Sanjay Patel via llvm-dev wrote:
> Vectorization goes overboard because the throughput cost model used by the
> vectorizers doesn't match the 6 IR instructions that correspond to 1 x86
> rotate instruction. Instead, we have:
>
> [...]
>
> The broken cost model also affects unrolling and inlining. Size costs are
> overestimated

[LLVMdev] rol/ror llvm instruction set

2009 Feb 04

On Tue, Feb 3, 2009 at 3:54 PM, Kasra <kasra_n500 at yahoo.com> wrote:
>
> I guess the backends could know about the instructions. But I am not convinced why it is beneficial not to have ROR and ROL instructions within llvm.
>

I guess I could ask you the opposite question: What is the benefit of having these? They would have to be mappable to the source language in some way. I'm

Rotates, once again

2018 May 16

On 5/16/18 1:58 PM, Sanjay Patel via llvm-dev wrote:
> An informal metric might be: if the operation is supported as a
> primitive op or built-in in source languages and it is supported as a
> single target instruction, can we guarantee that 1-to-1 translation
> through optimization?

It seems perfectly reasonable for LLVM users to expect this to happen reliably. I'd like to

funnel shift, select, and poison

2019 Feb 25

Don't we need to distinguish funnel shift from the more specific rotate? I'm not seeing how rotate (a single input op shifted by some amount) gets into trouble like funnel shift (two variables concatenated and shifted by some amount).

Eg, if in pseudo IR we have:

  %funnel_shift = fshl %x, %y, %sh ; this is problematic because either x or y can be poison, but we may not touch the poison when

Rotates, once again

2018 May 16

On Wed, May 16, 2018 at 11:27 AM, Manuel Jacob <me at manueljacob.de> wrote:
> On 2018-05-16 00:34, Sanjay Patel via llvm-dev wrote:
>
>> Vectorization goes overboard because the throughput cost model used by the
>> vectorizers doesn't match the 6 IR instructions that correspond to 1 x86
>> rotate instruction. Instead, we have:
>>
>> [...]
>>

funnel shift, select, and poison

2019 Feb 25

We have these transforms from funnel shift to a simpler shift op:

  // fshl(X, 0, C) -> shl X, C
  // fshl(X, undef, C) -> shl X, C
  // fshl(0, X, C) -> lshr X, (BW-C)
  // fshl(undef, X, C) -> lshr X, (BW-C)

These were part of: https://reviews.llvm.org/D54778

In all cases, one operand must be 0 or undef and the shift amount is a constant, so I think these are safe.
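As a rough sanity check of the two zero-operand folds, the C sketch below models an 8-bit fshl and compares it against the plain shifts for every input. This is a hypothetical model, not code from D54778: the helper names are made up, the restriction to 0 < C < BW is my assumption (the message only says C is a constant), and the undef cases are not modelled at all.

```c
#include <assert.h>
#include <stdint.h>

enum { BW = 8 };  /* bit width of the toy model */

/* Assumed fshl model: concatenate x:y, shift left by sh modulo BW,
   keep the high BW bits. */
static uint8_t fshl8(uint8_t x, uint8_t y, unsigned sh)
{
    uint32_t cat = ((uint32_t)x << BW) | y;
    return (uint8_t)((cat << (sh % BW)) >> BW);
}

/* Returns 1 if both folds hold for every X and every 0 < C < BW. */
static int fshl_folds_hold(void)
{
    for (unsigned x = 0; x < 256; ++x) {
        for (unsigned c = 1; c < BW; ++c) {
            /* fshl(X, 0, C) -> shl X, C */
            if (fshl8((uint8_t)x, 0, c) != (uint8_t)(x << c))
                return 0;
            /* fshl(0, X, C) -> lshr X, (BW-C) */
            if (fshl8(0, (uint8_t)x, c) != (uint8_t)(x >> (BW - c)))
                return 0;
        }
    }
    return 1;
}

static void fshl_folds_self_check(void)
{
    assert(fshl_folds_hold());
}
```

C = 0 is left out on purpose: in this model fshl(0, X, 0) is 0, while a shift right by the full width BW would be out of range.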

[LLVMdev] rol/ror llvm instruction set

2009 Feb 03

--- On Tue, 2/3/09, Bill Wendling <isanbard at gmail.com> wrote:
> From: Bill Wendling <isanbard at gmail.com>
> Subject: Re: [LLVMdev] rol/ror llvm instruction set
> To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Cc: kasra_n500 at yahoo.com
> Date: Tuesday, February 3, 2009, 2:52 PM
> On Tue, Feb 3, 2009 at 2:45 PM, Dale Johannesen

funnel shift, select, and poison

2019 Feb 26

If I got poison propagation right, it's probably only by luck! Hopefully, the funnel shift bug is fixed here: https://reviews.llvm.org/rL354905 Nuno, IIUC this means that you do *not* need to change the funnel shift semantics in Alive. So I think that means we're still on track to go with John's suggestion that only select and phi can block poison? (I don't know of any

Rotates, once again

2018 May 17

A rotate intrinsic should be relatively close in cost/complexity to the existing bswap.

A grep of intrinsic::bswap says we'd probably add code in:
InstCombine
InstructionSimplify
ConstantFolding
DemandedBits
ValueTracking
VectorUtils
SelectionDAGBuilder

But I don't think it's fair to view those additions as pure added cost. As an example, consider that we have to add hacks to

[LLVMdev] rotate

2012 Jul 29

Nice! Clever compiler..

On 07/28/2012 08:55 PM, Michael Gottesman wrote:
> I can get clang/llvm to emit a rotate instruction on x86-64 when compiling C by just using -Os and the rotate from Hacker's Delight i.e.,
>
> ======
> #include<stdlib.h>
> #include<stdint.h>
>
> uint32_t ror(uint32_t input, size_t rot_bits)
> {
> return (input>>

funnel shift, select, and poison

2019 Feb 25

There's a question about the behavior of funnel shift [1] + select and poison here that reminds me of previous discussions about select and poison [2]:
https://github.com/AliveToolkit/alive2/pull/32#discussion_r257528880

Example:

  define i8 @fshl_zero_shift_guard(i8 %x, i8 %y, i8 %sh) {
    %c = icmp eq i8 %sh, 0
    %f = fshl i8 %x, i8 %y, i8 %sh
    %s = select i1 %c, i8 %x, i8 %f ; shift amount is 0
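To make the concern concrete, here is a toy C model of the guarded pattern. It is only a sketch under stated assumptions: poison is represented by a boolean flag, fshl is assumed to propagate poison from any operand, and select is assumed to look only at its condition and the arm it picks. The type `val8` and the function names are hypothetical and do not come from Alive2 or LLVM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy value: 8 payload bits plus a flag saying the value is poison. */
typedef struct { uint8_t bits; bool poison; } val8;

/* fshl as an ordinary operation: poison in any operand poisons the result. */
static val8 fshl8(val8 x, val8 y, val8 sh)
{
    uint32_t cat = ((uint32_t)x.bits << 8) | y.bits;
    val8 r = { (uint8_t)((cat << (sh.bits & 7u)) >> 8),
               x.poison || y.poison || sh.poison };
    return r;
}

/* The guarded form: select (sh == 0) ? x : fshl(x, y, sh). */
static val8 guarded(val8 x, val8 y, val8 sh)
{
    if (sh.poison) {            /* a poison condition poisons the select */
        val8 p = { 0, true };
        return p;
    }
    return sh.bits == 0 ? x : fshl8(x, y, sh);
}

static void guard_self_check(void)
{
    val8 x = { 0x5A, false }, y = { 0x00, true }, sh = { 0, false };
    assert(!guarded(x, y, sh).poison);  /* select returns x untouched */
    assert(fshl8(x, y, sh).poison);     /* bare fshl sees the poison y */
}
```

In this model the two forms always agree on the payload bits, but replacing the guarded form with the bare fshl turns a well-defined result into poison when the shift amount is 0 and %y is poison.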

[LLVMdev] rol/ror llvm instruction set

2009 Feb 04

--- On Tue, 2/3/09, Bill Wendling <isanbard at gmail.com> wrote:
> From: Bill Wendling <isanbard at gmail.com>
> Subject: Re: [LLVMdev] rol/ror llvm instruction set
> To: kasra_n500 at yahoo.com, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Date: Tuesday, February 3, 2009, 4:17 PM
> On Tue, Feb 3, 2009 at 3:54 PM, Kasra
> <kasra_n500

[LLVMdev] rotate

2012 Jul 29

*NOTE* IIRC compiling this with -O0 on x86-64 can yield the wrong result since clang will emit shifts and on intel shifts are mod the register size:

=====
.section __TEXT,__text,regular,pure_instructions
.globl _ror
.align 4, 0x90
_ror: ## @ror
.cfi_startproc
## BB#0:
pushq %rbp
Ltmp2:
.cfi_def_cfa_offset 16
Ltmp3:
.cfi_offset %rbp, -16
movq %rsp, %rbp

[LLVMdev] rol/ror llvm instruction set

2009 Feb 03

On Feb 3, 2009, at 2:35 PM PST, Mike Stump wrote:
> On Feb 3, 2009, at 2:28 PM, Kasra wrote:
>> I was looking around the LLVM instruction set and I failed to find
>> ROL and ROR instructions. Is there any plans on adding these
>> instructions to LLVM?
>
> Not sure what you mean:

He's referring to the LLVM IR, I think, and it's true that doesn't have

[LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter

2013 Nov 10

I had a similar problem with a backend for the 68HC12 family which also has no barrel shifter. Some 68HC12 CPUs support shift for just one of the 16-bit registers and only support rotation on the 2 8-bit subregs of that 16-bit register. That means the only practical solution for 32-bit shifts is to lower to a libcall but my situation for 16-bit shifts sounds similar to yours for 32-bit shifts. I

[LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter

2013 Nov 10

I forgot to mention that I used EXTRACT_ELEMENT in my backend to get the high and low parts of an SDValue.

On 10 Nov 2013, at 17:50, Steve Montgomery <stephen.montgomery3 at btinternet.com> wrote:
> I had a similar problem with a backend for the 68HC12 family which also has no barrel shifter. Some 68HC12 CPUs support shift for just one of the 16-bit registers and only support rotation