thr3ads.net - similar to: "Bug: Library functions for ISD::SRA, ISD::SHL, and ISD::SRL"

Displaying 20 results from an estimated 500 matches similar to: "Bug: Library functions for ISD::SRA, ISD::SHL, and ISD::SRL"

Bug: Library functions for ISD::SRA, ISD::SHL, and ISD::SRL

2019 Jun 10

Hi Eli, Thanks for pointing to the CTLZ_ZERO_UNDEF “LibCall” implementation. I do not have it in the version that I am currently using, so it’s nice to know that it’s implemented now. Incidentally, the CTLZ… implementation is IDENTICAL to what I am proposing for the Shifts. This is not just adding support for “out-of-tree-targets”, but giving consistency to the fact that we have perfectly defined

Bug: Library functions for ISD::SRA, ISD::SHL, and ISD::SRL

2019 Jun 11

Hi Eli, First of all, I would appreciate it if you tried not to confuse my limited use of English with stupidity or a lack of criteria in other subjects. I’m not a native English speaker, so please keep that in mind. You have been significantly helpful in the recent past so please keep on. Interestingly, you made a mention of a related but not identical issue. It is true that most (or all) processors

Out-of-line atomics implementation ways

2020 Oct 15

Greetings everyone, I am working on AArch64 LSE out-of-line atomics support in LLVM, porting this GCC series: https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01034.html After local design experiments I have some questions about upstream-suitable ways of implementing it. More specifically: 1. Pass to expand atomics to library helper function calls. These helpers test for the presence

[LLVMdev] libcalls for shifts

2012 Jan 07

Hello, my target has libcall support for long long shifts. I already have the following lines in my Lowering constructor: setLibcallName(RTLIB::SHL_I64, "__llshl"); setLibcallName(RTLIB::SRL_I64, "__llshru"); setLibcallName(RTLIB::SRA_I64, "__llshr"); and setOperationAction(ISD::SHL, MVT::i64, Expand); setOperationAction(ISD::SRA, MVT::i64,
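As a rough illustration of what a runtime helper behind a libcall such as RTLIB::SRA_I64 has to compute, here is a minimal sketch of a 64-bit arithmetic right shift built from 32-bit words. The names `Words32` and `sra64_sketch` are hypothetical and are not taken from the post; the real `__llshr` routine, its calling convention, and its behaviour for out-of-range amounts are not shown in the excerpt.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical representation: a 64-bit value split into two 32-bit words.
struct Words32 {
    std::uint32_t lo;
    std::uint32_t hi;
};

// Arithmetic shift right of a 64-bit value by amt (0..63),
// using only 32-bit operations. Illustration only.
Words32 sra64_sketch(Words32 v, unsigned amt) {
    assert(amt < 64 && "shift amount must be below the bit width");
    // Word used to fill vacated bits: all ones if the value is negative.
    const std::uint32_t sign = (v.hi & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    if (amt == 0) {
        return v;
    }
    if (amt >= 32) {
        // The high word moves into the low word; the high word becomes sign fill.
        const unsigned s = amt - 32;
        const std::uint32_t lo =
            (s == 0) ? v.hi : ((v.hi >> s) | (sign << (32 - s)));
        return Words32{lo, sign};
    }
    // 0 < amt < 32: bits shifted out of the high word enter the low word.
    const std::uint32_t lo = (v.lo >> amt) | (v.hi << (32 - amt));
    const std::uint32_t hi = (v.hi >> amt) | (sign << (32 - amt));
    return Words32{lo, hi};
}
```

A logical shift right (the SRL case) would follow the same shape with `sign` fixed at zero.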

[LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter

2013 Nov 10

I had a similar problem with a backend for the 68HC12 family which also has no barrel shifter. Some 68HC12 CPUs support shift for just one of the 16-bit registers and only support rotation on the 2 8-bit subregs of that 16-bit register. That means the only practical solution for 32-bit shifts is to lower to a libcall but my situation for 16-bit shifts sounds similar to yours for 32-bit shifts. I

[LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter

2013 Nov 09

Dear All, I am trying to custom lower 32-bit ISD::SHL and SHR in a backend for 6502 family CPUs. The particular subtarget has 16-bit registers at most, so a 32-bit result is not legal. Normally, if you mark this as "Legal" or "Expand", it will expand the node into several nodes, as in this example: shl i32 %a , 2 => high_sdvalue = (or (shr %b, 14), (shl %c, 2) )
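The expansion quoted in this excerpt can be sketched in plain C++ to show the arithmetic involved. This is only an illustration under assumptions: `Halves16` and `shl32_via16` are hypothetical names, and the code models the shift on two 16-bit halves rather than reproducing any SelectionDAG lowering.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical representation: a 32-bit value held as two 16-bit halves.
struct Halves16 {
    std::uint16_t lo;
    std::uint16_t hi;
};

// Shift a 32-bit value left by amt (0..31) using only 16-bit pieces.
// For amt == 2 the high half is (or (shr lo, 14), (shl hi, 2)),
// which matches the pattern shown in the excerpt.
Halves16 shl32_via16(Halves16 v, unsigned amt) {
    assert(amt < 32 && "shift amount must be below the bit width");
    if (amt == 0) {
        return v;
    }
    if (amt >= 16) {
        // The low half moves entirely into the high half.
        return Halves16{0, static_cast<std::uint16_t>(v.lo << (amt - 16))};
    }
    // 0 < amt < 16: the top bits of the low half carry into the high half.
    const std::uint16_t hi =
        static_cast<std::uint16_t>((v.hi << amt) | (v.lo >> (16 - amt)));
    const std::uint16_t lo = static_cast<std::uint16_t>(v.lo << amt);
    return Halves16{lo, hi};
}
```

Calling `shl32_via16(Halves16{0x8001, 0x0001}, 2)` yields a high half of 0x0006 and a low half of 0x0004, i.e. 0x00018001 shifted left by 2.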

RTLIB and Custom Library calls

2020 Mar 02

Hello LLVM-Dev, Most of the processing for i64 and f64 types in our backend is done through emulation library calls. Some of the library calls are not defined in the RuntimeLibcalls.def Libcall enum, so we have to define custom library calls. What is the ideal way of implementing the custom library calls? Pointing us to a target backend with similar functionality would also help us significantly. Say

[LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter

2013 Nov 11

Hi Steve, Thanks for confirming that EXTRACT_ELEMENT is something I can use. I had seen it in the generated DAGs but was unsure whether I was "allowed" to use it, if that's the right word. I checked up on it more and indeed the mainstream targets like ARM use that node type in custom lowering code, so that should solve that. Perhaps in the future I might submit a patch for

[LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter

2013 Nov 10

I forgot to mention that I used EXTRACT_ELEMENT in my backend to get the high and low parts of an SDValue. On 10 Nov 2013, at 17:50, Steve Montgomery <stephen.montgomery3 at btinternet.com> wrote: > I had a similar problem with a backend for the 68HC12 family which also has no barrel shifter. Some 68HC12 CPUs support shift for just one of the 16-bit registers and only support rotation

Out-of-line atomics implementation ways

2020 Oct 15

The current precedent in the codebase is the __sync_* libcalls. They have essentially the semantics you want, except that they're all seq_cst. On the LLVM side, I'd rather not have two ways to do the same thing, so I'd prefer to extend the existing mechanism. Adding 100 lines to RuntimeLibcalls.def seems a bit unfortunate, but I think you can reduce that using some C macros. On the

[LLVMdev] libcalls for shifts

2012 Jan 08

On Sat, Jan 7, 2012 at 10:18 AM, Johannes Birgmeier <e0902998 at student.tuwien.ac.at> wrote: > Hello, > > my target has libcall support for long long shifts. I already have the > following lines in my Lowering constructor: > > setLibcallName(RTLIB::SHL_I64, "__llshl"); > setLibcallName(RTLIB::SRL_I64, "__llshru"); >

[RFC][VECLIB] how should we legalize VECLIB calls?

2018 Jul 02

It may not be a full solution for the problems you're trying to solve, but I don't know why adding to include/llvm/CodeGen/RuntimeLibcalls.def is a problem in itself. Certainly, it's a mess that could be organized, especially so we're not repeating everything for each data type as we do right now. So yes, I think that would allow us to remove the VecLib mappings because we are

[RFC][VECLIB] how should we legalize VECLIB calls?

2018 Jul 02

On 07/02/2018 04:33 PM, Saito, Hideki wrote: > > > > >It may not be a full solution for the problems you're trying to solve > > > > If we are inventing a new solution, I’d like it also to solve OpenMP > declare simd legalization issue. If a small extension of existing scheme > > works for mathlib only, I’m happy to take that and discuss OpenMP >

[RFC][VECLIB] how should we legalize VECLIB calls?

2018 Jul 02

Adding to Ashutosh's comments, We are also interested in making LLVM generate vector math library calls that are available with glibc (version > 2.22). reference: https://sourceware.org/glibc/wiki/libmvec Using the example case given in the reference, we found there are 2 vector versions for "sin" (4 X double) with same VF namely _ZGVcN4v_sin (avx) version and _ZGVdN4v_sin

[LLVMdev] troubles with ISD::FPOWI

2014 Sep 18

Hi, I'm stumped by how to handle fpowi. Here is the context: my architecture has i64, f32, and f64 registers. No i32. For calls & returns, we promote i32 to i64. There is no support in the architecture to perform fpowi - it has to go through the runtime. I'm using gfortran + dragonegg + llvm3.4 to generate .ll files via plugin. The fortran expression REAL = REAL ** INTEGER*4

[LLVMdev] [PATCH] Add new phase to legalization to handle vector operations

2009 May 21

On Wed, May 20, 2009 at 4:55 PM, Dan Gohman <gohman at apple.com> wrote: > Can you explain why you chose the approach of using a new pass? > I pictured removing LegalizeDAG's type legalization code would > mostly consist of finding all the places that use TLI.getTypeAction > and just deleting code for handling its Expand and Promote. Are you > anticipating something more

[LLVMdev] [PATCH] Add new phase to legalization to handle vector operations

2009 May 20

On May 20, 2009, at 1:34 PM, Eli Friedman wrote: > On Wed, May 20, 2009 at 1:19 PM, Eli Friedman > <eli.friedman at gmail.com> wrote: > >> Per subject, this patch adds an additional pass to handle vector >> >> operations; the idea is that this allows removing the code from >> >> LegalizeDAG that handles illegal types, which should be a significant

[LLVMdev] [PATCH] Add new phase to legalization to handle vector operations

2009 May 21

On Wed, May 20, 2009 at 5:26 PM, Eli Friedman <eli.friedman at gmail.com> wrote: > On Wed, May 20, 2009 at 4:55 PM, Dan Gohman <gohman at apple.com> wrote: >> Can you explain why you chose the approach of using a new pass? >> I pictured removing LegalizeDAG's type legalization code would >> mostly consist of finding all the places that use TLI.getTypeAction

[PATCH] nv50/ir: optimmize shl(a, 0) to a

2017 Apr 29

On Sat, Apr 29, 2017 at 12:46 PM, Karol Herbst <karolherbst at gmail.com> wrote: > helps two alien isolation shaders > > shader-db: > total instructions in shared programs : 4251497 -> 4251494 (-0.00%) > total gprs used in shared programs : 513962 -> 513962 (0.00%) > total local used in shared programs : 29797 -> 29797 (0.00%) > total bytes used in shared

[PATCH v2] nv50/ir: optimize shl(a, 0) to a

2017 Apr 29

On Sat, Apr 29, 2017 at 6:09 PM, Karol Herbst <karolherbst at gmail.com> wrote: > helps two alien isolation shaders > > shader-db: > total instructions in shared programs : 4251497 -> 4251494 (-0.00%) > total gprs used in shared programs : 513962 -> 513962 (0.00%) > total local used in shared programs : 29797 -> 29797 (0.00%) > total bytes used in shared