thr3ads.net - search: "fabiang"

Displaying 3 results from an estimated 3 matches for "fabiang".

Did you mean: fabian

2018 Aug 21

different output with fast-math flag

This is of course not homework. I am trying to understand how fast math optimizations work in llvm. When I compared IR for both the programs, the only thing I have noticed is that fdiv and fmul are replaced with fdiv fast and fmul fast. Not sure what happens in fdiv fast and fmul fast. I feel that its because d/max is really small number and fast-math does not care about small numbers and consider

Rotates, once again

2018 Jul 02

Rotates, once again

1. I'm not sure what you mean by "full vector" here - using the same shift distance for all lanes (as opposed to per-lane distances), or doing a treat-the-vector-as-bag-of-bits shift that doesn't have any internal lane boundaries? If the latter, that doesn't really help you much with implementing a per-lane rotate. I think the most useful generalization of a vector

BUGS n code generated for target i386 compiling __bswapdi3, and for target x86-64 compiling __bswapsi2()

2018 Nov 25

BUGS n code generated for target i386 compiling __bswapdi3, and for target x86-64 compiling __bswapsi2()

bswapdi2 for i386 is correct Bits 31:0 of the source are loaded into edx. Bits 63:32 are loaded into eax. Those are each bswapped. The ABI for the return is edx contains bits [63:32] and eax contains [31:0]. This is opposite of how the register were loaded. ~Craig On Sun, Nov 25, 2018 at 10:36 AM Craig Topper <craig.topper at gmail.com> wrote: > bswapsi2 on the x86-64 isn't using

search for: fabiang