similar to: MCRegisterClass mandatory vs preferred alignment?

Displaying 20 results from an estimated 1000 matches similar to: "MCRegisterClass mandatory vs preferred alignment?"

2015 Aug 31
2
MCRegisterClass mandatory vs preferred alignment?
On 08/31/2015 03:59 PM, Matthias Braun wrote: > Looks to me like the alignment is specified in tablegen. From Target.td: > > class RegisterClass<string namespace, list<ValueType> regTypes, int alignment, > dag regList, RegAltNameIndex idx = NoRegAltName> > > X86RegisterInfo.td: > > def VR256 : RegisterClass<"X86", [v32i8,
2011 Aug 25
2
[LLVMdev] AVX spill alignment
Hey guys, Are spills/reloads of AVX registers using aligned stores/loads? I can't seem to find the code that aligns the stack slots to 32-bytes. Could someone point me in the right direction? Thanks, Cameron -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110825/b5724dec/attachment.html>
2011 Sep 01
0
[LLVMdev] AVX spill alignment
On Aug 25, 2011, at 4:17 PM, Cameron McInally wrote: > Hey guys, > > Are spills/reloads of AVX registers using aligned stores/loads? Yes. > I can't > seem to find the code that aligns the stack slots to 32-bytes. Could > someone point me in the right direction? The register class has 256-bit spill alignment: def VR256 : RegisterClass<"X86", [v32i8, v16i16,
2018 Jul 23
4
[LoopVectorizer] Improving the performance of dot product reduction loop
~Craig On Mon, Jul 23, 2018 at 4:24 PM Hal Finkel <hfinkel at anl.gov> wrote: > > On 07/23/2018 05:22 PM, Craig Topper wrote: > > Hello all, > > This code https://godbolt.org/g/tTyxpf is a dot product reduction loop > multipying sign extended 16-bit values to produce a 32-bit accumulated > result. The x86 backend is currently not able to optimize it as well as gcc
2018 Jul 24
4
[LoopVectorizer] Improving the performance of dot product reduction loop
On Tue, Jul 24, 2018 at 6:10 AM Hal Finkel <hfinkel at anl.gov> wrote: > > On 07/23/2018 06:37 PM, Craig Topper wrote: > > > ~Craig > > > On Mon, Jul 23, 2018 at 4:24 PM Hal Finkel <hfinkel at anl.gov> wrote: > >> >> On 07/23/2018 05:22 PM, Craig Topper wrote: >> >> Hello all, >> >> This code https://godbolt.org/g/tTyxpf
2018 Jul 23
3
[LoopVectorizer] Improving the performance of dot product reduction loop
Hello all, This code https://godbolt.org/g/tTyxpf is a dot product reduction loop multipying sign extended 16-bit values to produce a 32-bit accumulated result. The x86 backend is currently not able to optimize it as well as gcc and icc. The IR we are getting from the loop vectorizer has several v8i32 adds and muls inside the loop. These are fed by v8i16 loads and sexts from v8i16 to v8i32. The
2020 Jun 30
5
[RFC] Semi-Automatic clang-format of files with low frequency
I 100% get that we might not like the decisions clang-format is making, but how does one overcome this when adding new code? The pre-merge checks enforce clang-formatting before commit and that's a common review comment anyway for those who didn't join the pre-merge checking group. I'm just wondering are we not all following the same guidelines? Concerns of clang-format not being good
2011 Sep 01
1
[LLVMdev] AVX spill alignment
Ah, thanks. That seems easy enough. Sorry to be pedantic, but does that snippet also handle cases where the frame pointer, %rbp, needs to be 32-byte aligned when dynamic allocas are present? I've looked at the ABI, but I don't see any guarantees about 32-byte frame alignment for AVX. That can be trouble when spill slots are based off of the frame pointer, not the stack pointer. Please
2018 Jul 23
2
[LoopVectorizer] Improving the performance of dot product reduction loop
On 07/23/2018 06:23 PM, Hal Finkel via llvm-dev wrote: > > On 07/23/2018 05:22 PM, Craig Topper wrote: >> Hello all, >> >> This code https://godbolt.org/g/tTyxpf is a dot product reduction >> loop multipying sign extended 16-bit values to produce a 32-bit >> accumulated result. The x86 backend is currently not able to optimize >> it as well as gcc and icc.
2012 Jun 25
3
[LLVMdev] Boolean floats and v4i1
Hello, I'm working on support for the SIMD instruction set on our new BG/Q supercomputer. This instruction set is v4f64 (with the exception of some int <-> fp conversions, floating-point only). The vectorized comparisons, logical operations and selects also exclusively use floating-point inputs. For those inputs that are logically vectors of booleans the system uses the following
2012 Jun 25
2
[LLVMdev] Boolean floats and v4i1
On Mon, 25 Jun 2012 05:45:57 +0000 "Rotem, Nadav" <nadav.rotem at intel.com> wrote: > Hi Hal, > > Why do say that the type v4i64 is broken ? You can specify that this > type has no legal operations and the codegen will lower ("legalize") > them to something that works on your platform. For example, the AND operation is really only an AND operation
2012 Jun 25
0
[LLVMdev] Boolean floats and v4i1
You could set the AND operation action to custom. The problem is that you would have no way of knowing if the type 'v4i64' originated from v4i1 or v4i64. And I don't think that you can use SimplifyDemandedBits (to discover if only the high bit is set) during the legalizer because the DAG is in a strange state, but I could be mistaken on this one. Okay, here is another idea.
2012 Jun 25
0
[LLVMdev] Boolean floats and v4i1
Hi Hal, Why do say that the type v4i64 is broken ? You can specify that this type has no legal operations and the codegen will lower ("legalize") them to something that works on your platform. Nadav -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Hal Finkel Sent: Monday, June 25, 2012 06:28 To: LLVM Developers
2019 Jul 18
2
Question about TableGen RegisterClass definition
Hi All, I have a question about TableGen RegisterClass definition. I need to map different size of MVTs into a register class as below. def TestReg : RegisterClass<"Test", [v8i32, v4i32], ...> When I look at TableGen and CodeGen, it looks the types are used as following: 1. MCRegisterClass's RegSize and Alignment 2. SpillSize in TableGen 3. Type constraint for instruction
2012 Mar 06
2
[LLVMdev] Recent changes to MCRegisterClass fields: uint8_t is too narrow
Hi all, in r152019 (from ctopper), the number of available registers of any type in a machine description is decreased to 256 because it needs to be encoded in uint8_t now. I'm trying to support an experimental embedded architecture with more registers (out of tree), but now that becomes impossible. Anyone knows a solution? Thanks, Bjorn De Sutter Computer Systems Lab Ghent University
2012 Mar 06
0
[LLVMdev] Recent changes to MCRegisterClass fields: uint8_t is too narrow
I changed it to uint16_t in r152100. Is that enough for your architecture? On Tue, Mar 6, 2012 at 12:24 AM, Bjorn De Sutter < bjorn.desutter at elis.ugent.be> wrote: > Hi all, > > in r152019 (from ctopper), the number of available registers of any type > in a machine description is decreased to 256 because it needs to be encoded > in uint8_t now. I'm trying to support an
2012 Jul 30
2
[LLVMdev] Vector promotion broken for <2 x [i8|i16]>
Hrmm.... PromoteVectorOp doesn't seem to follow this at all. http://llvm.org/svn/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp SDValue VectorLegalizer::PromoteVectorOp(SDValue Op) { // Vector "promotion" is basically just bitcasting and doing the operation // in a different type. For example, x86 promotes ISD::AND on v2i32 to // v1i64. EVT VT =
2012 Jul 30
2
[LLVMdev] Vector promotion broken for <2 x [i8|i16]>
v4i8 itself is a legal type, just not on the 'AND' operation. So there seems to be multiple problems here. 1) PromoteVectorOp doesn't handle the case where the types are not the same size, this occurs because #2 2) getTypeToPromoteTo doesn't actual check to see if the type it should promote to makes any sense. 3) PromoteVectorOp also doesn't handle the case where
2010 Aug 04
2
[LLVMdev] x86 Vector Shuffle Patterns
I have a few questions about the new vector shuffle matching code in the x86 .td files. It's a big improvement over the old system and provides the context that code generation for AVX needs. This is great! I'm asking because I'm having some trouble converting some AVX patterns over to the new system. I'm getting this error from tblgen: VyPERM2F128PDirrmi: (set:isVoid
2016 Aug 17
3
RFC: Disambiguate RegClass->getSize()
*** Problem The documentation for function "getSize" in both MCRegisterClass and TargetRegisterClass states: /// getSize - Return the size of the register in bytes, which is also the size /// of a stack slot allocated to hold a spilled copy of this register. The problem is that these two values are not always the same. For example, Hexagon has predicate registers that are 8