Tobias Klausmann
2015-Jan-11 19:56 UTC
[Nouveau] [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions
On 11.01.2015 20:19, Ilia Mirkin wrote:> On Sun, Jan 11, 2015 at 12:27 PM, Tobias Klausmann > <tobias.johannes.klausmann at mni.thm.de> wrote: >> >> On 11.01.2015 01:58, Ilia Mirkin wrote: >>> On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann >>> <tobias.johannes.klausmann at mni.thm.de> wrote: >>>> Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, >>>> {S16/32})->F32 >>>> >>>> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> >>>> --- >>>> V2: beat me, whip me, split out F64 >>>> >>>> .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 81 >>>> ++++++++++++++++++++++ >>>> 1 file changed, 81 insertions(+) >>>> >>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>> index 9a0bb60..741c74f 100644 >>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>> @@ -997,6 +997,87 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue >>>> &imm0, int s) >>>> i->op = OP_MOV; >>>> break; >>>> } >>>> + case OP_CVT: { >>>> + Storage res; >>>> + bld.setPosition(i, true); /* make sure bld is init'ed */ >>>> + switch(i->dType) { >>>> + case TYPE_U16: >>>> + switch (i->sType) { >>>> + case TYPE_F32: >>>> + if (i->saturate) >>>> + res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0, >>>> + UINT16_MAX)); >>> Where did this saturate stuff come from? It doesn't make sense to >>> saturate to a non-float dtype. I'd go ahead and just >>> assert(!i->saturate) in the int dtype cases. >>> >>> One does wonder what the hw does if the float doesn't fit in the >>> destination... whether it saturates or not. I don't hugely care >>> though. >> Actually i can't remember why that was added in the first place, i'll go >> ahead and follow your advice here. > Oh wait... this was to support saturating an array access into a u16... > > const int sat = (i->op == OP_TXF) ? 1 : 0; > DataType sTy = (i->op == OP_TXF) ? TYPE_U32 : TYPE_F32; > bld.mkCvt(OP_CVT, TYPE_U16, layer, sTy, src)->saturate = sat; > > So... basically if the source is a U32 and the dest is a U16, we want > to saturate there? IMO this is such a minor use-case that it doesn't > really matter. However I guess you can keep the saturate bits around > if you like.We can do it with or without the saturate if we rely on the test, assert(!i->saturate)'ing is the only thing that breaks the test you sure meant: glsl-resource-not-bound 1DArray glsl-resource-not-bound 2DArray glsl-resource-not-bound 2DMSArray> > -ilia
Ilia Mirkin
2015-Jan-11 19:57 UTC
[Nouveau] [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions
On Sun, Jan 11, 2015 at 2:56 PM, Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> wrote:> > > On 11.01.2015 20:19, Ilia Mirkin wrote: >> >> On Sun, Jan 11, 2015 at 12:27 PM, Tobias Klausmann >> <tobias.johannes.klausmann at mni.thm.de> wrote: >>> >>> >>> On 11.01.2015 01:58, Ilia Mirkin wrote: >>>> >>>> On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann >>>> <tobias.johannes.klausmann at mni.thm.de> wrote: >>>>> >>>>> Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, >>>>> {S16/32})->F32 >>>>> >>>>> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> >>>>> --- >>>>> V2: beat me, whip me, split out F64 >>>>> >>>>> .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 81 >>>>> ++++++++++++++++++++++ >>>>> 1 file changed, 81 insertions(+) >>>>> >>>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>> index 9a0bb60..741c74f 100644 >>>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>> @@ -997,6 +997,87 @@ ConstantFolding::opnd(Instruction *i, >>>>> ImmediateValue >>>>> &imm0, int s) >>>>> i->op = OP_MOV; >>>>> break; >>>>> } >>>>> + case OP_CVT: { >>>>> + Storage res; >>>>> + bld.setPosition(i, true); /* make sure bld is init'ed */ >>>>> + switch(i->dType) { >>>>> + case TYPE_U16: >>>>> + switch (i->sType) { >>>>> + case TYPE_F32: >>>>> + if (i->saturate) >>>>> + res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0, >>>>> + UINT16_MAX)); >>>> >>>> Where did this saturate stuff come from? It doesn't make sense to >>>> saturate to a non-float dtype. I'd go ahead and just >>>> assert(!i->saturate) in the int dtype cases. >>>> >>>> One does wonder what the hw does if the float doesn't fit in the >>>> destination... whether it saturates or not. I don't hugely care >>>> though. >>> >>> Actually i can't remember why that was added in the first place, i'll go >>> ahead and follow your advice here. >> >> Oh wait... this was to support saturating an array access into a u16... >> >> const int sat = (i->op == OP_TXF) ? 1 : 0; >> DataType sTy = (i->op == OP_TXF) ? TYPE_U32 : TYPE_F32; >> bld.mkCvt(OP_CVT, TYPE_U16, layer, sTy, src)->saturate = sat; >> >> So... basically if the source is a U32 and the dest is a U16, we want >> to saturate there? IMO this is such a minor use-case that it doesn't >> really matter. However I guess you can keep the saturate bits around >> if you like. > > We can do it with or without the saturate if we rely on the test, > assert(!i->saturate)'ing is the only thing that breaks the test you sure > meant: > > glsl-resource-not-bound 1DArray > glsl-resource-not-bound 2DArray > glsl-resource-not-bound 2DMSArrayHm, those are the only times that a texelFetch is done in piglit with a constant layer index, I guess.
Tobias Klausmann
2015-Jan-11 20:17 UTC
[Nouveau] [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions
On 11.01.2015 20:57, Ilia Mirkin wrote:> On Sun, Jan 11, 2015 at 2:56 PM, Tobias Klausmann > <tobias.johannes.klausmann at mni.thm.de> wrote: >> >> On 11.01.2015 20:19, Ilia Mirkin wrote: >>> On Sun, Jan 11, 2015 at 12:27 PM, Tobias Klausmann >>> <tobias.johannes.klausmann at mni.thm.de> wrote: >>>> >>>> On 11.01.2015 01:58, Ilia Mirkin wrote: >>>>> On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann >>>>> <tobias.johannes.klausmann at mni.thm.de> wrote: >>>>>> Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, >>>>>> {S16/32})->F32 >>>>>> >>>>>> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> >>>>>> --- >>>>>> V2: beat me, whip me, split out F64 >>>>>> >>>>>> .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 81 >>>>>> ++++++++++++++++++++++ >>>>>> 1 file changed, 81 insertions(+) >>>>>> >>>>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>>> index 9a0bb60..741c74f 100644 >>>>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>>> @@ -997,6 +997,87 @@ ConstantFolding::opnd(Instruction *i, >>>>>> ImmediateValue >>>>>> &imm0, int s) >>>>>> i->op = OP_MOV; >>>>>> break; >>>>>> } >>>>>> + case OP_CVT: { >>>>>> + Storage res; >>>>>> + bld.setPosition(i, true); /* make sure bld is init'ed */ >>>>>> + switch(i->dType) { >>>>>> + case TYPE_U16: >>>>>> + switch (i->sType) { >>>>>> + case TYPE_F32: >>>>>> + if (i->saturate) >>>>>> + res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0, >>>>>> + UINT16_MAX)); >>>>> Where did this saturate stuff come from? It doesn't make sense to >>>>> saturate to a non-float dtype. I'd go ahead and just >>>>> assert(!i->saturate) in the int dtype cases. >>>>> >>>>> One does wonder what the hw does if the float doesn't fit in the >>>>> destination... whether it saturates or not. I don't hugely care >>>>> though. >>>> Actually i can't remember why that was added in the first place, i'll go >>>> ahead and follow your advice here. >>> Oh wait... this was to support saturating an array access into a u16... >>> >>> const int sat = (i->op == OP_TXF) ? 1 : 0; >>> DataType sTy = (i->op == OP_TXF) ? TYPE_U32 : TYPE_F32; >>> bld.mkCvt(OP_CVT, TYPE_U16, layer, sTy, src)->saturate = sat; >>> >>> So... basically if the source is a U32 and the dest is a U16, we want >>> to saturate there? IMO this is such a minor use-case that it doesn't >>> really matter. However I guess you can keep the saturate bits around >>> if you like. >> We can do it with or without the saturate if we rely on the test, >> assert(!i->saturate)'ing is the only thing that breaks the test you sure >> meant: >> >> glsl-resource-not-bound 1DArray >> glsl-resource-not-bound 2DArray >> glsl-resource-not-bound 2DMSArray > Hm, those are the only times that a texelFetch is done in piglit with > a constant layer index, I guess.Ok, i'll keep the saturates for (U/S)16 to for once satisfy the "dependency" you posted up there and to be future proof if somebody implements something similar(?) for the S16 one!
Apparently Analagous Threads
- [RESEND/PATCH] nv50/ir: Handle OP_CVT when folding constant expressions
- [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions
- [PATCH v4] nv50/ir: Handle OP_CVT when folding constant expressions
- [PATCH] nv50/ir: Handle OP_CVT when folding constant expressions
- [PATCH v3 1/2] nv50/ir: Add support for the double Type to BuildUtil