thr3ads.net - Nouveau - [Nouveau] [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions [Jan 2015]

If this information is useful, please help other people find it:
Share via:

Tobias Klausmann

2015-Jan-11 19:56 UTC

[Nouveau] [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

On 11.01.2015 20:19, Ilia Mirkin wrote:> On Sun, Jan 11, 2015 at 12:27 PM, Tobias Klausmann
> <tobias.johannes.klausmann at mni.thm.de> wrote:
>>
>> On 11.01.2015 01:58, Ilia Mirkin wrote:
>>> On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann
>>> <tobias.johannes.klausmann at mni.thm.de> wrote:
>>>> Folding for conversions: F32->(U{16/32}, S{16/32}) and
(U{16/32},
>>>> {S16/32})->F32
>>>>
>>>> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann
at mni.thm.de>
>>>> ---
>>>> V2: beat me, whip me, split out F64
>>>>
>>>>    .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 81
>>>> ++++++++++++++++++++++
>>>>    1 file changed, 81 insertions(+)
>>>>
>>>> diff --git
a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>> index 9a0bb60..741c74f 100644
>>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>> @@ -997,6 +997,87 @@ ConstantFolding::opnd(Instruction *i,
ImmediateValue
>>>> &imm0, int s)
>>>>          i->op = OP_MOV;
>>>>          break;
>>>>       }
>>>> +   case OP_CVT: {
>>>> +      Storage res;
>>>> +      bld.setPosition(i, true); /* make sure bld is
init'ed */
>>>> +      switch(i->dType) {
>>>> +      case TYPE_U16:
>>>> +         switch (i->sType) {
>>>> +         case TYPE_F32:
>>>> +            if (i->saturate)
>>>> +               res.data.u16 =
util_iround(CLAMP(imm0.reg.data.f32, 0,
>>>> +                                                UINT16_MAX));
>>> Where did this saturate stuff come from? It doesn't make sense
to
>>> saturate to a non-float dtype. I'd go ahead and just
>>> assert(!i->saturate) in the int dtype cases.
>>>
>>> One does wonder what the hw does if the float doesn't fit in
the
>>> destination... whether it saturates or not. I don't hugely care
>>> though.
>> Actually i can't remember why that was added in the first place,
i'll go
>> ahead and follow your advice here.
> Oh wait... this was to support saturating an array access into a u16...
>
>           const int sat = (i->op == OP_TXF) ? 1 : 0;
>           DataType sTy = (i->op == OP_TXF) ? TYPE_U32 : TYPE_F32;
>           bld.mkCvt(OP_CVT, TYPE_U16, layer, sTy, src)->saturate = sat;
>
> So... basically if the source is a U32 and the dest is a U16, we want
> to saturate there? IMO this is such a minor use-case that it doesn't
> really matter. However I guess you can keep the saturate bits around
> if you like.We can do it with or without the saturate if we rely on the test, 
assert(!i->saturate)'ing is the only thing that breaks the test you sure 
meant:

glsl-resource-not-bound 1DArray
glsl-resource-not-bound 2DArray
glsl-resource-not-bound 2DMSArray
>
>    -ilia

Ilia Mirkin

2015-Jan-11 19:57 UTC

head link

[Nouveau] [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

On Sun, Jan 11, 2015 at 2:56 PM, Tobias Klausmann
<tobias.johannes.klausmann at mni.thm.de> wrote:>
>
> On 11.01.2015 20:19, Ilia Mirkin wrote:
>>
>> On Sun, Jan 11, 2015 at 12:27 PM, Tobias Klausmann
>> <tobias.johannes.klausmann at mni.thm.de> wrote:
>>>
>>>
>>> On 11.01.2015 01:58, Ilia Mirkin wrote:
>>>>
>>>> On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann
>>>> <tobias.johannes.klausmann at mni.thm.de> wrote:
>>>>>
>>>>> Folding for conversions: F32->(U{16/32}, S{16/32}) and
(U{16/32},
>>>>> {S16/32})->F32
>>>>>
>>>>> Signed-off-by: Tobias Klausmann
<tobias.johannes.klausmann at mni.thm.de>
>>>>> ---
>>>>> V2: beat me, whip me, split out F64
>>>>>
>>>>>    .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 81
>>>>> ++++++++++++++++++++++
>>>>>    1 file changed, 81 insertions(+)
>>>>>
>>>>> diff --git
a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>>> index 9a0bb60..741c74f 100644
>>>>> ---
a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>>> +++
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>>> @@ -997,6 +997,87 @@ ConstantFolding::opnd(Instruction *i,
>>>>> ImmediateValue
>>>>> &imm0, int s)
>>>>>          i->op = OP_MOV;
>>>>>          break;
>>>>>       }
>>>>> +   case OP_CVT: {
>>>>> +      Storage res;
>>>>> +      bld.setPosition(i, true); /* make sure bld is
init'ed */
>>>>> +      switch(i->dType) {
>>>>> +      case TYPE_U16:
>>>>> +         switch (i->sType) {
>>>>> +         case TYPE_F32:
>>>>> +            if (i->saturate)
>>>>> +               res.data.u16 =
util_iround(CLAMP(imm0.reg.data.f32, 0,
>>>>> +                                               
UINT16_MAX));
>>>>
>>>> Where did this saturate stuff come from? It doesn't make
sense to
>>>> saturate to a non-float dtype. I'd go ahead and just
>>>> assert(!i->saturate) in the int dtype cases.
>>>>
>>>> One does wonder what the hw does if the float doesn't fit
in the
>>>> destination... whether it saturates or not. I don't hugely
care
>>>> though.
>>>
>>> Actually i can't remember why that was added in the first
place, i'll go
>>> ahead and follow your advice here.
>>
>> Oh wait... this was to support saturating an array access into a u16...
>>
>>           const int sat = (i->op == OP_TXF) ? 1 : 0;
>>           DataType sTy = (i->op == OP_TXF) ? TYPE_U32 : TYPE_F32;
>>           bld.mkCvt(OP_CVT, TYPE_U16, layer, sTy, src)->saturate =
sat;
>>
>> So... basically if the source is a U32 and the dest is a U16, we want
>> to saturate there? IMO this is such a minor use-case that it
doesn't
>> really matter. However I guess you can keep the saturate bits around
>> if you like.
>
> We can do it with or without the saturate if we rely on the test,
> assert(!i->saturate)'ing is the only thing that breaks the test you
sure
> meant:
>
> glsl-resource-not-bound 1DArray
> glsl-resource-not-bound 2DArray
> glsl-resource-not-bound 2DMSArray
Hm, those are the only times that a texelFetch is done in piglit with
a constant layer index, I guess.

Tobias Klausmann

2015-Jan-11 20:17 UTC

head link

[Nouveau] [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

On 11.01.2015 20:57, Ilia Mirkin wrote:> On Sun, Jan 11, 2015 at 2:56 PM, Tobias Klausmann
> <tobias.johannes.klausmann at mni.thm.de> wrote:
>>
>> On 11.01.2015 20:19, Ilia Mirkin wrote:
>>> On Sun, Jan 11, 2015 at 12:27 PM, Tobias Klausmann
>>> <tobias.johannes.klausmann at mni.thm.de> wrote:
>>>>
>>>> On 11.01.2015 01:58, Ilia Mirkin wrote:
>>>>> On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann
>>>>> <tobias.johannes.klausmann at mni.thm.de> wrote:
>>>>>> Folding for conversions: F32->(U{16/32}, S{16/32})
and (U{16/32},
>>>>>> {S16/32})->F32
>>>>>>
>>>>>> Signed-off-by: Tobias Klausmann
<tobias.johannes.klausmann at mni.thm.de>
>>>>>> ---
>>>>>> V2: beat me, whip me, split out F64
>>>>>>
>>>>>>     .../drivers/nouveau/codegen/nv50_ir_peephole.cpp  
| 81
>>>>>> ++++++++++++++++++++++
>>>>>>     1 file changed, 81 insertions(+)
>>>>>>
>>>>>> diff --git
a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>>>>
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>>>> index 9a0bb60..741c74f 100644
>>>>>> ---
a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>>>> +++
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>>>> @@ -997,6 +997,87 @@ ConstantFolding::opnd(Instruction
*i,
>>>>>> ImmediateValue
>>>>>> &imm0, int s)
>>>>>>           i->op = OP_MOV;
>>>>>>           break;
>>>>>>        }
>>>>>> +   case OP_CVT: {
>>>>>> +      Storage res;
>>>>>> +      bld.setPosition(i, true); /* make sure bld is
init'ed */
>>>>>> +      switch(i->dType) {
>>>>>> +      case TYPE_U16:
>>>>>> +         switch (i->sType) {
>>>>>> +         case TYPE_F32:
>>>>>> +            if (i->saturate)
>>>>>> +               res.data.u16 =
util_iround(CLAMP(imm0.reg.data.f32, 0,
>>>>>> +                                               
UINT16_MAX));
>>>>> Where did this saturate stuff come from? It doesn't
make sense to
>>>>> saturate to a non-float dtype. I'd go ahead and just
>>>>> assert(!i->saturate) in the int dtype cases.
>>>>>
>>>>> One does wonder what the hw does if the float doesn't
fit in the
>>>>> destination... whether it saturates or not. I don't
hugely care
>>>>> though.
>>>> Actually i can't remember why that was added in the first
place, i'll go
>>>> ahead and follow your advice here.
>>> Oh wait... this was to support saturating an array access into a
u16...
>>>
>>>            const int sat = (i->op == OP_TXF) ? 1 : 0;
>>>            DataType sTy = (i->op == OP_TXF) ? TYPE_U32 :
TYPE_F32;
>>>            bld.mkCvt(OP_CVT, TYPE_U16, layer, sTy,
src)->saturate = sat;
>>>
>>> So... basically if the source is a U32 and the dest is a U16, we
want
>>> to saturate there? IMO this is such a minor use-case that it
doesn't
>>> really matter. However I guess you can keep the saturate bits
around
>>> if you like.
>> We can do it with or without the saturate if we rely on the test,
>> assert(!i->saturate)'ing is the only thing that breaks the test
you sure
>> meant:
>>
>> glsl-resource-not-bound 1DArray
>> glsl-resource-not-bound 2DArray
>> glsl-resource-not-bound 2DMSArray
> Hm, those are the only times that a texelFetch is done in piglit with
> a constant layer index, I guess.Ok, i'll keep the saturates for (U/S)16 to for once satisfy the 
"dependency" you posted up there and to be future proof if somebody 
implements something similar(?) for the S16 one!

Possibly Parallel Threads

Search for more maybe matching threads

Nouveau - Jan 2015 - [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

[Nouveau] [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

[Nouveau] [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

[Nouveau] [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

Possibly Parallel Threads