2016 Oct 03
5
Is this undefined behavior optimization legal?
...a common way to implement this function on a target with 32-bit registers would be to zero-initialize a 32-bit register to hold the initial vector and then 'mask' and 'or' the inserted value with the initial vector. In AMDGPU assembly it would look something like:

    v_mov_b32 v0, 0
    v_cvt_u32_f32_e32 v1, s0
    v_and_b32 v1, v1, 0x000000ff
    v_or_b32 v0, v0, v1

The optimization the SelectionDAG does for us in this function, though, ends up removing the mask operation, which gives us:

    v_mov_b32 v0, 0
    v_cvt_u32_f32_e32 v1, s0
    v_or_b32 v0, v0, v1

The reason the SelectionDAG is doing this is because...
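To make the difference between the two instruction sequences concrete, here is a minimal Python sketch that models each one on plain 32-bit integers. It is an illustration only: the function names are hypothetical, and the input is assumed to be the 32-bit integer that v_cvt_u32_f32_e32 has already written to v1, so the conversion itself is not modeled.

```python
MASK32 = 0xFFFFFFFF  # keep Python ints within a 32-bit register width


def insert_with_mask(converted: int) -> int:
    """Model of the first sequence (hypothetical helper, not LLVM code)."""
    v0 = 0                    # v_mov_b32 v0, 0
    v1 = converted & MASK32   # assumed result of v_cvt_u32_f32_e32 v1, s0
    v1 = v1 & 0x000000FF      # v_and_b32 v1, v1, 0x000000ff
    v0 = v0 | v1              # v_or_b32 v0, v0, v1
    return v0


def insert_without_mask(converted: int) -> int:
    """Model of the second sequence, with the mask removed."""
    v0 = 0                    # v_mov_b32 v0, 0
    v1 = converted & MASK32   # assumed result of v_cvt_u32_f32_e32 v1, s0
    v0 = v0 | v1              # v_or_b32 v0, v0, v1
    return v0


if __name__ == "__main__":
    for value in (0x42, 0xFF, 0x1234):
        print(hex(value),
              hex(insert_with_mask(value)),
              hex(insert_without_mask(value)))
```

In this model the two sequences agree whenever the converted value fits in the low 8 bits, and differ only when it does not: without the mask, the upper bits of the converted value spill into the rest of the register.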