Eli Friedman <eli.friedman at gmail.com> writes:

> int_x86_avx_vhadd_pd_xmm doesn't exist on trunk. Why does it exist on
> your branch if the semantics are exactly equivalent to
> int_x86_sse3_hadd_pd? The register allocator can handle converting to
> three-address form if the target provides the appropriate hooks.

Because in some cases users may want to explicitly use non-VEX encoded
instructions. So we need to differentiate.

-Dave

On Mon, Sep 13, 2010 at 8:27 AM, David A. Greene <dag at cray.com> wrote:
> Eli Friedman <eli.friedman at gmail.com> writes:
>
>> int_x86_avx_vhadd_pd_xmm doesn't exist on trunk. Why does it exist on
>> your branch if the semantics are exactly equivalent to
>> int_x86_sse3_hadd_pd? The register allocator can handle converting to
>> three-address form if the target provides the appropriate hooks.
>
> Because in some cases users may want to explicitly use non-VEX encoded
> instructions. So we need to differentiate.

Can you give an example of such a scenario?

In answer to your original question, it's probably just a matter of
messing with the relevant generator in TableGen, relatively
straightforward. Your syntax is probably insufficient, though: how
will the table generator decide which intrinsic to use for
__builtin_ia32_haddpd coming from a frontend?

-Eli

On Mon, Sep 13, 2010 at 8:27 AM, David A. Greene <dag at cray.com> wrote:
> Eli Friedman <eli.friedman at gmail.com> writes:
>
>> int_x86_avx_vhadd_pd_xmm doesn't exist on trunk. Why does it exist on
>> your branch if the semantics are exactly equivalent to
>> int_x86_sse3_hadd_pd? The register allocator can handle converting to
>> three-address form if the target provides the appropriate hooks.
>
> Because in some cases users may want to explicitly use non-VEX encoded
> instructions. So we need to differentiate.

I don't see why one would like to emit 256-bit wide reg instructions and at the
same time non-VEX encoded 128-bit ones. For cases like this one can compile
the former alone and then link with regular sse code, right?

-- 
Bruno Cardoso Lopes
http://www.brunocardoso.cc

On Sep 13, 2010, at 1:24 PM, Bruno Cardoso Lopes wrote:

> On Mon, Sep 13, 2010 at 8:27 AM, David A. Greene <dag at cray.com> wrote:
>> Eli Friedman <eli.friedman at gmail.com> writes:
>>
>>> int_x86_avx_vhadd_pd_xmm doesn't exist on trunk. Why does it exist on
>>> your branch if the semantics are exactly equivalent to
>>> int_x86_sse3_hadd_pd? The register allocator can handle converting to
>>> three-address form if the target provides the appropriate hooks.
>>
>> Because in some cases users may want to explicitly use non-VEX encoded
>> instructions. So we need to differentiate.
>
> I don't see why one would like to emit 256-bit wide reg instructions and at the
> same time non-VEX encoded 128-bit ones. For cases like this one can compile
> the former alone and then link with regular sse code, right?
>

Yep. I don't see any reason either.

-eric

Eli Friedman <eli.friedman at gmail.com> writes:

> On Mon, Sep 13, 2010 at 8:27 AM, David A. Greene <dag at cray.com> wrote:
>> Eli Friedman <eli.friedman at gmail.com> writes:
>>
>>> int_x86_avx_vhadd_pd_xmm doesn't exist on trunk. Why does it exist on
>>> your branch if the semantics are exactly equivalent to
>>> int_x86_sse3_hadd_pd? The register allocator can handle converting to
>>> three-address form if the target provides the appropriate hooks.
>>
>> Because in some cases users may want to explicitly use non-VEX encoded
>> instructions. So we need to differentiate.
>
> Can you give an example of such a scenario?

Simulator validation, for example.

> In answer to your original question, it's probably just a matter of
> messing with the relevant generator in TableGen, relatively
> straightforward. Your syntax is probably insufficient, though: how
> will the table generator decide which intrinsic to use for
> __builtin_ia32_haddpd coming from a frontend?

These are GCC builtins. They only get used by the C generating backend
and I use them in debug scenarios. In those cases I don't really care
which gets chosen. For "normal" usage the codegen knows exactly what to
emit (the actual machine instruction).

-Dave
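
For readers following the thread, here is a rough illustration of the
"exactly equivalent semantics" point raised at the start. It is only a
sketch in plain Python, not LLVM or TableGen code: the function names
haddpd_two_address and vhaddpd_three_address are hypothetical, and the
128-bit registers are modeled as 2-tuples of floats. It shows that the
legacy two-address form and the VEX three-address form compute the same
lanes, and differ only in whether the first source is overwritten.

```python
from typing import Tuple

# Hypothetical model: a 128-bit XMM register holding two doubles.
Vec2 = Tuple[float, float]


def haddpd_two_address(dst: Vec2, src: Vec2) -> Vec2:
    """Model of the legacy (non-VEX) form: dst is both the first source
    and the destination, so its old value is lost after the operation."""
    return (dst[0] + dst[1], src[0] + src[1])


def vhaddpd_three_address(src1: Vec2, src2: Vec2) -> Vec2:
    """Model of the VEX.128 form: the result goes to a separate
    destination and both sources are left intact."""
    return (src1[0] + src1[1], src2[0] + src2[1])


a = (1.0, 2.0)
b = (10.0, 20.0)

legacy = haddpd_two_address(a, b)
vex = vhaddpd_three_address(a, b)

# Same lanes either way; only the operand form (and encoding) differs.
print(legacy == vex)
```

The model deliberately ignores encoding and the upper half of the YMM
register, which is where the two forms actually differ at the machine
level.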