thr3ads.net - search: "ogt"

Displaying 20 results from an estimated 47 matches for "ogt".


[LLVMdev] Invalid comparison instruction generation

2008 Nov 11


Eli,
Using the variables from the original IR, assuming tmp == tmp1 and assume the value is not nan
ogt(tmp, tmp1) is !isnan(tmp) && !isnan(tmp1) && tmp > tmp1, or false
ule(tmp, tmp1) is isnan(tmp) || isnan(tmp1) || tmp <= tmp1, or true
So, this is invalid, or am I misunderstanding what ogt and ule stand for? Assuming this is valid, why convert comparison instructions instead...

[LLVMdev] Invalid comparison instruction generation

2008 Nov 10


...dr
  store double %y, double* %y.addr
  store double addrspace(11)* %result, double addrspace(11)** %result.addr
  %tmp = load double* %x.addr ; <double> [#uses=1]
  %tmp1 = load double* %y.addr ; <double> [#uses=1]
  %cmp = fcmp ogt double %tmp, %tmp1 ; <i1> [#uses=1]
  br i1 %cmp, label %ifthen, label %ifend
ifthen: ; preds = %entry
  %tmp2 = load double addrspace(11)** %result.addr ; <double addrspace(11)*> [#uses=1]
  %tmp3 = load double* %x.addr ; <dou...

[LLVMdev] Invalid comparison instruction generation

2008 Nov 11


On Mon, Nov 10, 2008 at 5:00 PM, Villmow, Micah <Micah.Villmow at amd.com> wrote:
> Eli,
> Using the variables from the original IR,
> assuming tmp == tmp1 and assume the value is not nan
> ogt(tmp, tmp1) is !isnan(tmp) && !isnan(tmp1) && tmp > tmp1, or false
> ule(tmp, tmp1) is isnan(tmp) || isnan(tmp1) || tmp <= tmp1, or true
Correct; in fact, ogt and ule are exact opposites.
> So, this is invalid, or am I misunderstanding what ogt and ule stand
> for?...
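The claim that ogt and ule are exact opposites can be checked with a small Python sketch of the two predicates exactly as they are spelled out in the quoted message. The helper names `fcmp_ogt`, `fcmp_ule`, and `complementary` are made up for this illustration and are not LLVM APIs; the sample values are arbitrary.

```python
import math

def fcmp_ogt(a: float, b: float) -> bool:
    """Ordered greater-than: true only if neither operand is NaN and a > b."""
    return not math.isnan(a) and not math.isnan(b) and a > b

def fcmp_ule(a: float, b: float) -> bool:
    """Unordered less-or-equal: true if either operand is NaN or a <= b."""
    return math.isnan(a) or math.isnan(b) or a <= b

# Arbitrary sample values, including infinities and NaN.
SAMPLES = [0.0, -1.5, 2.0, math.inf, -math.inf, math.nan]

def complementary() -> bool:
    """Check that exactly one of ogt/ule holds for every pair of samples."""
    return all(fcmp_ogt(a, b) != fcmp_ule(a, b)
               for a in SAMPLES for b in SAMPLES)
```

Under these definitions, rewriting `ogt` as `ule` is only equivalent if the consumer of the result is inverted as well, for example by swapping the two branch targets.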

[LLVMdev] Invalid comparison instruction generation

2008 Nov 11


On Mon, Nov 10, 2008 at 3:06 PM, Villmow, Micah <Micah.Villmow at amd.com> wrote:
> With the above kernel run through llc with -march=x86
> -view-dag-combine1-dags I still see the ogt as the comparison operation, but
> when I run it with llc -march=x86 -view-legalize-dags the ogt node has been
> transformed into a ule.
Okay... I can see that in the attached graph.
> So, my question is, how do I get llvm to stop doing invalid translation of
> comparison instructions...

[StructurizeCFG] Trouble with branches out of a loop

2015 Nov 02


...ithm used? As an aside, is there any documentation for the algorithm used? Is it based on a published paper? The input IR I have is the following:
define <4 x float> @structurizer_test(<4 x float> %inp.coerce) {
  %1 = extractelement <4 x float> %inp.coerce, i32 0
  %2 = fcmp ogt float %1, 0.000000e+00
  br i1 %2, label %.lr.ph.i, label %._crit_edge.i
.lr.ph.i: ; preds = %7, %0
  %i.03.i = phi float [ %8, %7 ], [ 0.000000e+00, %0 ]
  %ret.02.i = phi <4 x float> [ %5, %7 ], [ <float 1.000000e+00, float 1.000000e+00, float 1.0...

[LLVMdev] Irreducible CFG from tail duplication

2008 Jul 24


...oks like there was once an open project for a pass to make irreducible graphs reducible. Was that ever implemented?
- Mark
; "opt -inline -tailduplicate" makes an irreducible CFG from this code
@x = weak global float 0.0
define internal fastcc void @foo(float %f) {
entry:
  %b = fcmp ogt float %f, 0.0
  br i1 %b, label %then, label %continue
then:
  store float 0.0, float* @x
  br label %continue
continue:
  ret void
}
define void @test() {
entry:
  %x = load float* @x
  call fastcc void @foo( float %x )
  %neg = sub float 0.0, %x
  call fastcc void @foo( float %neg )
  ret void
}

[LLVMdev] some thoughts on the semantics of !fpmath

2012 Apr 17


...; With !fpmath, theoretically there are ways this can fail.
>
> I don't understand how it can not be safe: if the metadata is dropped
> then the optimizers have to be more strict, thus it is safe.
Here's an example:
  %z = fadd float %x, %y, !fpmath !{ float 1000.0 }
  %p = fcmp ogt %z, 0.0
  br i1 %p, label %true, label %false
true:
  %d = call float @llvm.sqrt.f32(float %z)
This ought to be safe, since no matter how imprecise the fadd is, the compare and branch protects the sqrt. Now suppose a metadata-unaware optimizer can do value-range analysis and can prove that the fa...

[LLVMdev] Irreducible CFG from tail duplication

2008 Jul 24


...hat "then.i2" and "continue.i3" both have predecessors
; "then.i" and "foo.exit", making an irreducible "X" in the control-flow graph.
@x = weak global float 0.000000e+00
define void @test() {
entry:
  %x = load float* @x
  %b.i = fcmp ogt float %x, 0.000000e+00
  %neg = sub float 0.000000e+00, %x
  %b.i1 = fcmp ogt float %neg, 0.000000e+00
  br i1 %b.i, label %then.i, label %continue.i
then.i: ; preds = %entry
  store float 0.000000e+00, float* @x
  br i1 %b.i1, label %then.i2, label %c...

[LLVMdev] Error when cond of select instruction is a vector

2011 Oct 19


...loca <2 x float>
  %Cy119 = alloca <2 x float>
  br label %B1
B1: ; preds = %entry
  %0 = load <2 x float>* %Cy119
  %1 = fptosi <2 x float> %0 to <2 x i32>
  %2 = sitofp <2 x i32> %1 to <2 x float>
  %3 = fcmp ogt <2 x float> %0, zeroinitializer
  %4 = fadd <2 x float> %2, <float 1.000000e+00, float 1.000000e+00>
  %5 = select <2 x i1> %3, <2 x float> %4, <2 x float> %2
  %6 = fcmp oeq <2 x float> %2, %0
  %7 = select <2 x i1> %6, <2 x float> %0, <2 x...

rsync error: protocol incompatibility (code 2) at rsync.c, using --iconv=. and (code 2) and rsync-3.0.0pre8

2008 Feb 07


...pre8] caused by the attached document file. I'm using libiconv-1.9.1 and a linux-2.4.31 Kernel System based on an open embedded linux similar to openslug libc-2.3.90 system. My rsync command line is as follows:
/sbin/rsync -v --log-file=/var/log/rsync.status.log --delete-before --partial -y -ogt -vaxH /mnt/download /export/backup
Thank you very much for any help, Edmond
-------------- next part --------------
A non-text attachment was scrubbed... Name: Release Notes - Microsoft Internet Explorer 5.5 - German Language settings.zip Type: application/x-zip-compressed Size: 10812 bytes Desc...

[LLVMdev] branch on vector compare?

2012 Sep 02


...based on a vector compare. I've found a slow way (below) which goes through memory. Is there some idiom I'm missing so that it would use for instance movmsk for SSE or vcmpgt & cr6 for altivec? Or do I need to resort to calling the intrinsic directly?
Thanks, Stephen.
  %16 = fcmp ogt <4 x float> %15, %cr
  %17 = extractelement <4 x i1> %16, i32 0
  %18 = extractelement <4 x i1> %16, i32 1
  %19 = extractelement <4 x i1> %16, i32 2
  %20 = extractelement <4 x i1> %16, i32 3
  %21 = or i1 %17, %18
  %22 = or i1 %19, %20
  %23 = or i1 %21, %22
  br i1...

[LLVMdev] Irreducible CFG from tail duplication

2008 Jul 24


On Thu, Jul 24, 2008 at 2:00 PM, Mark Leone <markleone at gmail.com> wrote:
> Is irreducibility a problem for existing LLVM passes?
There aren't any LLVM passes that expect a reducible CFG at the moment; of course, some passes are more effective with reducible CFGs.
> It looks like
> there was once an open project for a pass to make irreducible graphs
> reducible. Was that

[LLVMdev] branch on vector compare?

2012 Sep 04


Roland Scheidegger <sroland <at> vmware.com> writes:
> This looks quite similar to something I filed a bug on (12312). Michael
> Liao submitted fixes for this, so I think
> if you change it to
> %16 = fcmp ogt <4 x float> %15, %cr
> %17 = sext <4 x i1> %16 to <4 x i32>
> %18 = bitcast <4 x i32> %17 to i128
> %19 = icmp ne i128 %18, 0
> br i1 %19, label %true1, label %false2
>
> should do the trick (one cmpps + one ptest + one br instruction).
> This,...

[LLVMdev] [Help] How can we call an object's virtual function inside IR?

2010 Jan 05


...fine i1 @expr(double* %record) {
entry:
  %0 = getelementptr double* %record, i32 0 ; <double*> [#uses=1]
  %1 = load double* %0 ; <double> [#uses=1]
  %2 = frem double %1, 1.000000e+01 ; <double> [#uses=1]
  %3 = fcmp ogt double %2, 3.000000e+00 ; <i1> [#uses=1]
  ret i1 %3
}
Now, I would like to change the type of the argument from double * to a pointer to an object ( C++ class ) like ClassA * %record, and inside the function body, I would like to call the virtual functions of the ClassA %rec...

[LLVMdev] branch on vector compare?

2012 Sep 04


...both i1 types, it will be allowed and will result
> in something which can be branched on.
>
> I have quite a bit of reading ahead it seems!
This looks quite similar to something I filed a bug on (12312). Michael Liao submitted fixes for this, so I think if you change it to
  %16 = fcmp ogt <4 x float> %15, %cr
  %17 = sext <4 x i1> %16 to <4 x i32>
  %18 = bitcast <4 x i32> %17 to i128
  %19 = icmp ne i128 %18, 0
  br i1 %19, label %true1, label %false2
should do the trick (one cmpps + one ptest + one br instruction). This, however, requires sse41 which I don...
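As a rough illustration of why the sext / bitcast / icmp sequence suggested here answers "is any lane true", the following Python sketch models each step on plain integers and compares it with the lane-by-lane OR from the original question in this thread. All function names are hypothetical, and the lane order chosen for the bitcast (lane 0 in the low bits) is an assumption that does not affect the result.

```python
def sext_lanes(mask_bits):
    """Model sext <4 x i1> to <4 x i32>: True -> 0xFFFFFFFF, False -> 0."""
    return [0xFFFFFFFF if bit else 0 for bit in mask_bits]

def bitcast_to_i128(lanes):
    """Model bitcast <4 x i32> to i128 (assumes lane 0 sits in the low bits)."""
    value = 0
    for i, lane in enumerate(lanes):
        value |= lane << (32 * i)
    return value

def any_lane_gt(xs, ys):
    """Compare lanes, widen the mask, pack it, and test the whole value against 0."""
    mask = [x > y for x, y in zip(xs, ys)]          # per-lane ordered compare
    return bitcast_to_i128(sext_lanes(mask)) != 0   # icmp ne i128 ..., 0

def any_lane_gt_slow(xs, ys):
    """Lane-by-lane OR, mirroring the extractelement / or chain."""
    mask = [x > y for x, y in zip(xs, ys)]
    return (mask[0] or mask[1]) or (mask[2] or mask[3])
```

Both functions return the same answer for any four-lane input; the packed form simply lets one wide comparison replace the three scalar `or` operations.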

Unsafe floating point operation (FDiv & FRem) in LoopVectorizer

2018 Sep 25


...body, %vector.ph
  %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
  %0 = getelementptr inbounds float, float* %C, i64 %index
  %1 = bitcast float* %0 to <8 x float>*
  %wide.load = load <8 x float>, <8 x float>* %1, align 4, !tbaa !2, !alias.scope !6
  %2 = fcmp ogt <8 x float> %wide.load, %broadcast.splat30
  %3 = getelementptr inbounds float, float* %B, i64 %index
  %4 = bitcast float* %3 to <8 x float>*
  %wide.masked.load = call <8 x float> @llvm.masked.load.v8f32.p0v8f32(<8 x float>* %4, i32 4, <8 x i1> %2, <8 x float>...

[LLVMdev] Error when cond of select instruction is a vector

2011 Oct 19


[LLVMdev] branch on vector compare?

2012 Sep 03


> > which goes through memory. Is there some idiom I'm missing so that it would use
> > for instance movmsk for SSE or vcmpgt & cr6 for altivec?
>
> I don't think you are missing anything: LLVM IR has no support for horizontal
> operations like or'ing the elements of a vector of boolean together. The code
> generators do try to recognize a few idioms and

Speculative execution of FP divide Instructions - Call-Graph Simplify

2017 Mar 15


Hi all, I came across an issue caused by the Call-Graph Simplify Pass. Here is a small repro:
```
define double @foo(double %a1, double %a2, double %a3) #0 {
entry:
  %a_mul = fmul double %a1, %a2
  %a_cmp = fcmp ogt double %a3, %a_mul
  br i1 %a_cmp, label %a.then, label %a.end
a.then:
  %a_div = fdiv double %a_mul, %a3
  br label %a.end
a.end:
  %a_factor = phi double [ %a_div, %a.then ], [ 1.000000e+00, %entry ]
  ret double %a_factor
}
```
Here, the conditional is guarding a possible division by zero. How...

[Codegen bug in LLVM 3.8?] br following `fcmp une` is present in ll, absent in asm

2017 Mar 01


...602, align 1
  %rtb_Sum3_737 = load double, double* %rtb_Sum3_, align 8
  %_rtP_738 = load %P_repro_T.2*, %P_repro_T.2** %_rtP_, align 8
  %603 = getelementptr inbounds %P_repro_T.2, %P_repro_T.2* %_rtP_738, i64 0, i32 154
  %_rtP__Switch_Threshold = load double, double* %603, align 1
  %604 = fcmp ogt double %rtb_Sum3_737, %_rtP__Switch_Threshold
  %_rtB_740 = load %B_repro_T.0*, %B_repro_T.0** %_rtB_, align 8
  br i1 %604, label %true73, label %false74
working.ll is a slightly different model from broken.ll, in that it loads the "zero value" from memory and does fcmp ogt instead of f...
