thr3ads.net - search: "copy_to

Displaying 19 results from an estimated 19 matches for "copy_to_regclass".

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

...: FR32, FR64, and VR128. The sub_ss and sub_sd indexes used to play a role in selecting the right register class, but not any longer. That is all derived from the instruction descriptions now. As far as I can tell, all sub-register operations involving sub_ss and sub_sd can simply be replaced with COPY_TO_REGCLASS: def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)), (VMOVSDrr VR128:$src1, (EXTRACT_SUBREG (v4i32 VR128:$src2), sub_sd))>; Becomes: def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)), (VMOVSDrr VR128:$...

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

...>, DwarfRegNum<[18, 22, 22]>; > ... I'm confused. Below you note that they are used in patterns, so they are certainly mentioned more than just in the code above. > As far as I can tell, all sub-register operations involving sub_ss and > sub_sd can simply be replaced with COPY_TO_REGCLASS: > > def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)), > (VMOVSDrr VR128:$src1, (EXTRACT_SUBREG (v4i32 VR128:$src2), > sub_sd))>; > > Becomes: > > def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$...

Need help figuring out a isNopCopy() assert

2020 Apr 16

Need help figuring out a isNopCopy() assert

I'm trying to fix a bug in the PowerPC SPE backend that prevents a bunch of FreeBSD ports from building, including gtk20. The attached file, generated from the following C source, triggers the "Def == PreviousDef" assertion in isNopCopy(): typedef float a; typedef struct { a b, c; } complex; d(complex *e, complex *h) { double f = h->c, g = h->b; i(g); e->c = g *

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

On Jul 26, 2012, at 9:43 AM, dag at cray.com wrote: > Jakob Stoklund Olesen <jolesen at apple.com> writes: > >> As far as I can tell, all sub-register operations involving sub_ss and >> sub_sd can simply be replaced with COPY_TO_REGCLASS: >> >> def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)), >> (VMOVSDrr VR128:$src1, (EXTRACT_SUBREG (v4i32 VR128:$src2), >> sub_sd))>; >> >> Becomes: >> >> def : Pat<(v4i32...

COPYs between register classes

2020 Feb 22

COPYs between register classes

...of such copies (also at -O0), but this turned out to not be quite simple. Just selecting a pseudo instruction with a custom inserter does not seem to work since CopyToReg/CopyFromReg have special handlings in InstrEmitter. I then tried in SystemZDAGToDAG.cpp to select a CopyToReg to (CopyToReg COPY_TO_REGCLASS), which worked fine. But I could not get the same results with CopyFromReg. (COPY_TO_REGCLASS CopyFromReg) only resulted in a later COPY into GR32, but the COPY from the Access register was still first made to GRX32. One alternative might be to let InstrEmitter deduce the needed register class...

Matching ConstantFPSDNode tablegen

2018 Jun 07

Matching ConstantFPSDNode tablegen

...eral match like: %v = call <4 x float> @foo(i32 15, float %s, float 0.0, <8 x i32> %rsrc, <4 x i32> %samp, i1 0, i32 0, i32 0) ret <4 x float> %v def : XXXPat<(v4f32 (int_foo i32:$mask, f32:$s, 0, v8i32:$rsrc, v4i32:$sampler, i1:$unorm, 0, i32:$cachepolicy)), (FOO_MI (COPY_TO_REGCLASS ?:$s, 32RegClass), ?:$rsrc, ?:$sampler, (as_i32imm ?:$mask), (as_i1imm ?:$unorm), (as_i1imm ?:$cachepolicy), (as_i1imm ?:$cachepolicy), 0, 0, 0, { 0 })>; which would be ideal. This seems to be because OPC_CheckInteger only checks for ConstantSDNode and not ConstantFPSDNode. So it was suggested...

How to specify the RegisterClass of an IMPLICIT_DEF?

2018 Apr 12

How to specify the RegisterClass of an IMPLICIT_DEF?

Hi, I'm implementing the built_vector as an IMPLICIT_DEF followed by INSERT_SUBREGs. This approach is the one of the SPARC architecture. def : Pat<(build_vector (f32 fpimm:$a1), (f32 fpimm:$a2)), (INSERT_SUBREG(INSERT_SUBREG (v2f32 (IMPLICIT_DEF)), (i32 (COPY_TO_REGCLASS (MOVSUTO_A_iSLo (bitcast_fpimm_to_i32 f32:$a1)), FPUaOffsetClass)), A_UNIT_PART), (i32 (COPY_TO_REGCLASS (MOVSUTO_A_iSLo (bitcast_fpimm_to_i32 f32:$a2)), FPUaOffsetClass)), B_UNIT_PART)>; This work quite well: an IMPLICIT_DEF:v2f32 is generated. Selected selection DAG: BB#0 &...

How to specify the RegisterClass of an IMPLICIT_DEF?

2018 Apr 12

How to specify the RegisterClass of an IMPLICIT_DEF?

...it. In instruction selection, it's usually the type of the value that determines the register class of the register holding it. Also, the register classes of instruction operands can restrict it further. If you want to put a certain value into a register from a specific class, you can use COPY_TO_REGCLASS (just like the code you quoted does): Dst = COPY_TO_REGCLASS Src, RegClass -Krzysztof -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

pre-RA scheduling/live register analysis optimization (handle move) forcing spill of registers

2018 Apr 23

pre-RA scheduling/live register analysis optimization (handle move) forcing spill of registers

...mple, the result of FMUL_A_oo is implicitly the register FA_ROUTMUL. I have defined FPUaROUTMULRegisterClass containing only FA_ROUTMUL. During the instruction lowering, in order to avoid frequent spill out of FA_ROUTMUL, I systematically copy the result of FMUL_A_oo to a virtual register through a COPY_TO_REGCLASS. def : Pat<(fdiv f32:$OffsetA, f32:$OffsetB), (COPY_TO_REGCLASS (FDIV_A_oo FPUaOffsetOperand:$OffsetA,FPUaOffsetOperand:$OffsetB),FPUaOffsetClass)>; The instruction lowering goes as expected all instances of FMUL_A_oo are followed by a COPY, freeing the usage of FPUaROUTMU...

[RFC] Half-Precision Support in the Arm Backends

2018 Jan 18

[RFC] Half-Precision Support in the Arm Backends

...10, 0b01, 0, (outs SPR:$Sd), (ins SPR:$Sm), IIC_fpCVTSH, "vcvtb", ".f32.f16\t$Sd, $Sm", []>, Requires<[HasFP16]>, Sched<[WriteFPCVT]>; def : FP16Pat<(f16_to_fp GPR:$a), (VCVTBHS (COPY_TO_REGCLASS GPR:$a, SPR))>; def : FullFP16Pat<(f32 (fpextend HPR:$Sm)), (VCVTBHS (COPY_TO_REGLASS HPR:$Sm, SPR)>; I'm not sure of the COPY_TO_REGLASS semantics, but I would (dangerously) assume that it when it comes to copying the values between registers, it will be noticed...

[RFC] Half-Precision Support in the Arm Backends

2018 Jan 18

[RFC] Half-Precision Support in the Arm Backends

...lr when we don't have the Armv8.2-A FP16 instructions available, and thus only have the conversion instructions. The problem is in the conversion rules, some rewrite rules to be more specific, and I think this is one of the culprits: def : Pat<(f16_to_fp GPR:$a), (VCVTBHS (COPY_TO_REGCLASS GPR:$a, SPR))>; This rewrite rule is supposed to first move the GPR reg in to a S-registers: vmov s0, r0 and then to the conversion: vcvtb.f32.f16 s0, s0 This rewrite rule gets triggered because the ISEL DAG has indeed this funny node f16_to_fp (which models this register move...

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

Jakob Stoklund Olesen <jolesen at apple.com> writes: >> What happens if the result of the above pattern using COPY_TO_REGCLASS >> is spilled? Will we get a 64-bit store or a 128-bit store? > > This behavior isn't affected by the change. FR64 registers are spilled > with 64-bit stores, and VR128 registers are spilled with 128-bit > stores. > > When the register coalescer removes a copy between V...

[RFC] Half-Precision Support in the Arm Backends

2017 Dec 06

[RFC] Half-Precision Support in the Arm Backends

Thanks a lot for the suggestions! I will look into using vld1/vst1, sounds good. I am custom lowering the bitcasts, that's now the only place where FP_TO_FP16 and FP16_TO_FP nodes are created to avoid inefficient code generation. I will double check if I can't achieve the same without using these nodes (because I really would like to get completely rid of them). Cheers, Sjoerd.

[LLVMdev] Constraints with Subregisters

2011 Oct 15

[LLVMdev] Constraints with Subregisters

Hello, is there a way to formulate a constraint like this: let Constraints = "${src:sub_even} != $dst" in { ... } , that is, only if a subregister of $src != $dst then ...? Perhaps this is entirely the wrong way anyway. I'm trying to implement (s/z/any)ext & trunc from 32 to 64 bit integer on a TI C64x processor. 64 bit ints are always stored in two adjacent registers (not

[LLVMdev] Types in TableGen instruction selection patterns

2013 Apr 20

[LLVMdev] Types in TableGen instruction selection patterns

On Sun, Mar 24, 2013 at 12:50:02PM -0700, Jakob Stoklund Olesen wrote: > I have updated TableGen to support a new format for instruction selection patterns. > > Before: > > def : Pat<(subc IntRegs:$b, IntRegs:$c), (SUBCCrr IntRegs:$b, IntRegs:$c)>; > > After: > > def : Pat<(subc i32:$b, i32:$c), (SUBCCrr $b, $c)>; > > Since the pattern matching

InlineAsm and allocation to wrong register for indirect access

2016 Jul 21

InlineAsm and allocation to wrong register for indirect access

Hi, I am seeing a case, in a private port, of an inline asm with indirect memory references being allocated invalid registers (i.e. registers that cannot be used on loads). For example, the inline asm constraint is correct: call void asm sideeffect "MOV $$r0, $0\0AMOV $$r0, $1\0A", "*m,*m,~{r0}"(i16* @a, i16* %b) #1, !srcloc !1 but then $0 and $1 are allocated to registers

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

On Jul 26, 2012, at 10:28 AM, dag at cray.com wrote: > Jakob Stoklund Olesen <jolesen at apple.com> writes: > >>> What happens if the result of the above pattern using COPY_TO_REGCLASS >>> is spilled? Will we get a 64-bit store or a 128-bit store? >> >> This behavior isn't affected by the change. FR64 registers are spilled >> with 64-bit stores, and VR128 registers are spilled with 128-bit >> stores. >> >> When the register coal...

[LLVMdev] Types in TableGen instruction selection patterns

2013 Mar 24

[LLVMdev] Types in TableGen instruction selection patterns

I have updated TableGen to support a new format for instruction selection patterns. Before: def : Pat<(subc IntRegs:$b, IntRegs:$c), (SUBCCrr IntRegs:$b, IntRegs:$c)>; After: def : Pat<(subc i32:$b, i32:$c), (SUBCCrr $b, $c)>; Since the pattern matching happens on a DAG with type labels, not register classes, I think it makes more sense to specify types directly on the input

Lowering ISD::TRUNCATE

2018 Aug 06

Lowering ISD::TRUNCATE

I'm working on defining the instructions and implementing the lowering code for a Z80 backend. For now, the backend supports only the native CPU-supported datatypes, which are 8 and 16 bits wide (i.e. no 32 bit long, float, ... yet). So far, a lot of the simple stuff like immediate loads and return values is very straightforward, but now I got stuck with ISD::TRUNCATE, as in:

search for: copy_to_regclass