thr3ads.net - search: "f64"

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

4

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

...;396>; > > .reg .s16 %rc<396>; > > .reg .s16 %rs<396>; > > .reg .s32 %r<396>; > > .reg .s64 %rl<396>; > > .reg .f32 %f<396>; > > .reg .f64 %fl<396>; > > > > mov.f64 %fl0, examples_2E_mandelbrot_2F_square_param_0; > > mul.f64 %fl0, %fl0, %fl0; > > mov.f64 func_retval0, %fl0; > > ret; > > } > > > >...

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

0

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

...> > > ) > > > { > > > .reg .pred %p<396>; > > > .reg .s16 %rc<396>; > > > .reg .s16 %rs<396>; > > > .reg .s32 %r<396>; > > > .reg .s64 %rl<396>; > > > .reg .f32 %f<396>; > > > .reg .f64 %fl<396>; > > > > > > mov.f64 %fl0, examples_2E_mandelbrot_2F_square_param_0; > > > mul.f64 %fl0, %fl0, %fl0; > > > mov.f64 func_retval0, %fl0; > > > ret; > > > } > > > > > > // .globl ...

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

0

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

....reg .s16 %rc<396>; >> > .reg .s16 %rs<396>; >> > .reg .s32 %r<396>; >> > .reg .s64 %rl<396>; >> > .reg .f32 %f<396>; >> > .reg .f64 %fl<396>; >> > >> > mov.f64 %fl0, examples_2E_mandelbrot_2F_square_param_0; >> > mul.f64 %fl0, %fl0, %fl0; >> > mov.f64 func_retval0, %fl0; >> > ret; >> > } ...

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

1

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

...%rc<396>; > >> > .reg .s16 %rs<396>; > >> > .reg .s32 %r<396>; > >> > .reg .s64 %rl<396>; > >> > .reg .f32 %f<396>; > >> > .reg .f64 %fl<396>; > >> > > >> > mov.f64 %fl0, > examples_2E_mandelbrot_2F_square_param_0; > >> > mul.f64 %fl0, %fl0, %fl0; > >> > mov.f64 func_retval0, %fl0; > >> >...

[LLVMdev] "Cannot select" error in 2.9

2011 Jul 29

2

[LLVMdev] "Cannot select" error in 2.9

...double %f0, %f1 %2 = add double %1, %f2 %3 = add double %2, %f3 %4 = add double %3, %f4 %5 = add double %4, %f5 %6 = add double %5, %f6 %7 = add double %6, %f7 %8 = add double %7, %f8 %9 = add double %8, %f9 ret double %9 } LLVM error I get: LLVM ERROR: Cannot select: 0xd1b720: f64 = add 0xd1b620, 0xd1ae20 [ORD=9] [ID=32] 0xd1b620: f64 = add 0xd1b520, 0xd1ac20 [ORD=8] [ID=31] 0xd1b520: f64 = add 0xd1b420, 0xd1a920 [ORD=7] [ID=30] 0xd1b420: f64 = add 0xd1b320, 0xd1a610 [ORD=6] [ID=29] 0xd1b320: f64 = add 0xd1b220, 0xd1a410 [ORD=5] [ID=28] 0xd1b220...

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

2

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

...andelbrot_2F_square( .reg .b64 examples_2E_mandelbrot_2F_square_param_0 ) { .reg .pred %p<396>; .reg .s16 %rc<396>; .reg .s16 %rs<396>; .reg .s32 %r<396>; .reg .s64 %rl<396>; .reg .f32 %f<396>; .reg .f64 %fl<396>; mov.f64 %fl0, examples_2E_mandelbrot_2F_square_param_0; mul.f64 %fl0, %fl0, %fl0; mov.f64 func_retval0, %fl0; ret; } // .globl examples_2E_mandelbrot_2F_calc_2D_iteration .func (.reg .b64 func_retval0) examples_2E_mandelbrot_2...

[LLVMdev] "Cannot select" error in 2.9

2011 Jul 29

0

[LLVMdev] "Cannot select" error in 2.9

...> %4 = add double %3, %f4 > %5 = add double %4, %f5 > %6 = add double %5, %f6 > %7 = add double %6, %f7 > %8 = add double %7, %f8 > %9 = add double %8, %f9 > ret double %9 > } > > > LLVM error I get: > > LLVM ERROR: Cannot select: 0xd1b720: f64 = add 0xd1b620, 0xd1ae20 [ORD=9] > [ID=32] > 0xd1b620: f64 = add 0xd1b520, 0xd1ac20 [ORD=8] [ID=31] > 0xd1b520: f64 = add 0xd1b420, 0xd1a920 [ORD=7] [ID=30] > 0xd1b420: f64 = add 0xd1b320, 0xd1a610 [ORD=6] [ID=29] > 0xd1b320: f64 = add 0xd1b220, 0xd1a410 [ORD=...

Pseudo-instruction that overwrites its input register

2017 May 30

2

Pseudo-instruction that overwrites its input register

...let BaseName = "XSMADDADP" in { let isCommutable = 1 in def XSMADDADP : XX3Form<60, 33, (outs vsfrc:$XT), (ins vsfrc:$XTi, vsfrc:$XA, vsfrc:$XB), "xsmaddadp $XT, $XA, $XB", IIC_VecFP, [(set f64:$XT, (fma f64:$XA, f64:$XB, f64:$XTi))]>, RegConstraint<"$XTi = $XT">, NoEncode<"$XTi">, AltVSXFMARel; If I'm reading this right, this matches an instruction that updates $XT by taking the current $XT, an...

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

0

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

... .reg .pred %p<396>; > .reg .s16 %rc<396>; > .reg .s16 %rs<396>; > .reg .s32 %r<396>; > .reg .s64 %rl<396>; > .reg .f32 %f<396>; > .reg .f64 %fl<396>; > > mov.f64 %fl0, examples_2E_mandelbrot_2F_square_param_0; > mul.f64 %fl0, %fl0, %fl0; > mov.f64 func_retval0, %fl0; > ret; > } > > // .globl exampl...

lower 64 bits constant

2015 Dec 02

4

lower 64 bits constant

...d 'double'. So I define them like this: def CONSTI64 : InstFOO<(outs GRWideRegs:$dst), (ins i64imm:$src), "const-long $dst, $src", [(set (i64 GRWideRegs:$dst), imm:$src)]>{ let isMoveImm = 1; } def CONSTF64 : InstFOO<(outs GRWideRegs:$dst), (ins f64imm:$src), "const-double $dst, $src", [(set (f64 GRWideRegs:$dst), fpimm:$src)]>{ let isMoveImm = 1; } GRWideRegs can be f64 and i64. However, the 'const-long' w...

[LLVMdev] FP emulation (continued)

2006 Nov 17

2

[LLVMdev] FP emulation (continued)

Hi, I still have some questions about FP emulation for my embedded target. To recap a bit: My target only has integer registers and no hardware support for FP. FP is supported only via emulation. Only f64 is supported. All FP operations should be implemented to use i32 registers. Based on the fruitful discussions on this list I was already able to implement mapping of the FP operations to special library calls. I also implemented a simple version of the register mapping, where I introduced a bogu...

[PATCH v4] nv50/ir: Handle OP_CVT when folding constant expressions

2014 Jul 05

1

[PATCH v4] nv50/ir: Handle OP_CVT when folding constant expressions

Folding for conversions: F32/64->(U16/32, S16/32) and (U16/32, S16/32)->F32 No piglit regressions observed on nv50 and nvc0! Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- V2: fix usage of wrong variable V3: enable F64 support V4: - disable F64 support again - handle saturate flag: clamp to min/max if needed .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 121 +++++++++++++++++++++ 1 file changed, 121 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/driv...

Potential bug in SelectionDAGLegalize::ConvertNodeToLibcall()?

2019 Jan 02

5

Potential bug in SelectionDAGLegalize::ConvertNodeToLibcall()?

Hi, I have a custom lowering operation on ISD::BITCAST for the PowerPC/SPE target, to convert 'f64 bitcast (i64 build_pair i32, i32)' into a 'f64 BUILD_SPE64 i32, i32' node, which can be seen at https://reviews.llvm.org/D54583. However, when building compiler-rt's lib/builtins/divdc3.c an assertion is triggered that BUILD_PAIR is not legal on line 24. There should be no bitcast(...

[LLVMdev] FP emulation (continued)

2006 Nov 20

0

[LLVMdev] FP emulation (continued)

On Fri, 17 Nov 2006, Roman Levenstein wrote: > I still have some questions about FP emulation for my embedded target. > To recap a bit: > My target only has integer registers and no hardware support for FP. FP > is supported only via emulation. Only f64 is supported. All FP > operations should be implemented to use i32 registers. ok > allocation. But anyway, I have an almost working compiler with integer > and FP support for my rather specific embedded target! This shows a > very impressive quality of the LLVM compiler. Great! >...

[LLVMdev] fp_round libcall

2008 Jul 07

5

[LLVMdev] fp_round libcall

Hi, I'm trying to emit FP_ROUND f64 -> f32 for a MIPS target that only supports single-precision floating-point operations. The problem is that f32 is considered legal on this target but f64 isn't, and the only way I can codegen this instruction is using setConvertAction(MVT::f64, MVT::f32, Expand), which issues an EmitStackConvert...

[LLVMdev] Multi instruction pattern help

2008 Oct 07

3

[LLVMdev] Multi instruction pattern help

Chris, Thanks for the help, this will help me with writing more patterns, but I am still hitting another roadblock. I attempted what you suggested and it fixed that issue, but then it started giving a warning that there is an unknown node in the resulting pattern. // unsigned int: f64->i32 ==> f64->f32 + f32->i32 def : Pat<(i32 (fp_to_uint (f64 GPR:$src0))), (i32 (fp_to_uint (f32 (dp_to_fp (f64 GPR:$src0)))))>; 1>Building AMDil.td instruction selector implementation with tblgen 1>(fp_to_uint:i32 (dp_to_fp:f32 GPR:f64:$src0)) 1>f:\hq\main...

Potential bug in SelectionDAGLegalize::ConvertNodeToLibcall()?

2019 Jan 03

3

Potential bug in SelectionDAGLegalize::ConvertNodeToLibcall()?

.... Particularly: * Make LowerCallTo() a virtual function, so it can be wrapped by a subclass. * Implement LowerCallTo() in PPCTargetLowering to wrap TargetLowering::LowerCallTo() and legalize the return node when targeting SPE. * Augment PPCTargetLowering::LowerCall_32SVR4() to legalize MVT::f64 arguments that have been pre-processed into EXTRACT_ELEMENT(i64 BITCAST f64, 0/1) The purpose of this being to legalize intermediate illegal types post-type legalization. Is there a better approach? Comments from anyone else? - Justin On Wed, 2 Jan 2019 11:39:59 -0500 Nemanja Ivanovic <...

[LLVMdev] FP emulation (continued)

2006 Nov 20

3

[LLVMdev] FP emulation (continued)

...underestimated how much LLVM legalizer/expander rely on expandable types to be integers (see my explanations below). --- Chris Lattner <sabre at nondot.org> wrote: > > Another opportunity, as Chris indicated in his previous mails (see > > below), would be to expose the fact that f64 regs really are > integer > > registers. > > Right. > > >> The target independent parts would need to know how to do this. > >> Specifically it would need to know how to "expand" f64 to 2x i32. > > > > I tried to implement it, but I still...
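
For readers unfamiliar with what "expanding" f64 to 2x i32 means at the bit level, here is a minimal C++ sketch. It is not LLVM code and does not come from the thread: the names `F64Halves`, `splitF64` and `joinF64` are hypothetical, and it assumes `double` is a 64-bit IEEE-754 value. It only shows that the 64-bit pattern of a double can be carried in two 32-bit integer halves and rebuilt without loss.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper type for illustration only; not part of LLVM.
struct F64Halves {
    std::uint32_t lo;  // low 32 bits of the bit pattern
    std::uint32_t hi;  // high 32 bits: sign, exponent, top of the mantissa
};

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");

// Split a double into two 32-bit integer halves.
F64Halves splitF64(double value) {
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined type punning
    return {static_cast<std::uint32_t>(bits),
            static_cast<std::uint32_t>(bits >> 32)};
}

// Rebuild the double from its two halves.
double joinF64(F64Halves halves) {
    std::uint64_t bits =
        (static_cast<std::uint64_t>(halves.hi) << 32) | halves.lo;
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Calling `joinF64(splitF64(x))` returns `x` with the same bit pattern; a target with only i32 registers would keep the two halves in a register pair in much the same spirit.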

Potential bug in SelectionDAGLegalize::ConvertNodeToLibcall()?

2019 Jan 04

2

Potential bug in SelectionDAGLegalize::ConvertNodeToLibcall()?

...ual function, so it can be wrapped by a >> subclass. >> * Implement LowerCallTo() in PPCTargetLowering to wrap >> TargetLowering::LowerCallTo() and legalize the return node when >> targeting SPE. >> * Augment PPCTargetLowering::LowerCall_32SVR4() to legalize MVT::f64 >> arguments that have been pre-processed into >> EXTRACT_ELEMENT(i64 BITCAST f64, 0/1) >> >> The purpose of this being to legalize intermediate illegal types >> post-type legalization. >> >> Is there a better approach? Comments from anyone else? >...

Pseudo-instruction that overwrites its input register

2017 May 28

2

Pseudo-instruction that overwrites its input register

On Sun, 28 May 2017, David Chisnall wrote: >> let Constraints = "@earlyclobber $reg" in >> def LDWRdPtr : Pseudo<(outs DREGS:$reg), >> (ins PTRREGS:$ptrreg), >> "ldw\t$reg, $ptrreg", >> [(set i16:$reg, (load i16:$ptrreg))]>, >>
