Displaying 3 results from an estimated 3 matches for "mbmi2".
2018 Dec 30
[cfe-dev] Portable multiplication 64 x 64 -> 128 for int128 reimplementation
...lso has the
same check so it doesn't improve portability much.
~Craig
On Sat, Dec 29, 2018 at 4:44 PM Arthur O'Dwyer via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi Pawel,
>
> There is the _mulx_u64 intrinsic, but it currently requires the hardware
> flag "-mbmi2".
>
> https://github.com/Quuxplusone/WideIntProofOfConcept/blob/master/wider.h#L89-L99
>
> On Clang 3.8.1 and earlier, the _addcarry_u64 and _subborrow_u64
> intrinsics required the hardware flag `-madx`, even though they didn't use
> the hardware ADX/ADOX instructions. M...
2018 Dec 29
Portable multiplication 64 x 64 -> 128 for int128 reimplementation
Hi,
For some maybe dumb reasons I am trying to write a portable version of int128.
What is very valuable for this implementation is access to the MUL instruction
on x86, which provides a full 64 x 64 -> 128 bit multiplication. An equally
useful instruction on ARM would be UMULH.
Well, the way you can access this on clang / GCC is to use the __int128 type or
to use inline assembly. MSVC provides an intrinsic for
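As a rough illustration of the options discussed in this thread, here is a minimal sketch in C of a full 64 x 64 -> 128 multiply. The function names mul_64x64_128 and mul_64x64_128_portable are hypothetical and do not come from any of the posts. The sketch assumes the compiler defines __SIZEOF_INT128__ when unsigned __int128 is available (as GCC and Clang do on 64-bit targets), uses MSVC's _umul128 on x64, and otherwise falls back to a schoolbook product of 32-bit halves, which needs no intrinsic or hardware flag.

```c
#include <stdint.h>

#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>
#endif

/* Hypothetical fallback: schoolbook multiply on 32-bit halves.
   Returns the low 64 bits of a * b and stores the high 64 bits in *hi. */
static uint64_t mul_64x64_128_portable(uint64_t a, uint64_t b, uint64_t *hi)
{
    uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;   /* contributes to bits   0..63  */
    uint64_t p1 = a_lo * b_hi;   /* contributes to bits  32..95  */
    uint64_t p2 = a_hi * b_lo;   /* contributes to bits  32..95  */
    uint64_t p3 = a_hi * b_hi;   /* contributes to bits  64..127 */

    /* Sum of three values below 2^32 each, so it cannot overflow 64 bits. */
    uint64_t mid = (p0 >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu);

    *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return (mid << 32) | (p0 & 0xFFFFFFFFu);
}

/* Hypothetical wrapper: picks the best mechanism the compiler offers. */
static uint64_t mul_64x64_128(uint64_t a, uint64_t b, uint64_t *hi)
{
#if defined(__SIZEOF_INT128__)
    /* GCC / Clang: the compiler is expected to lower this to one wide multiply. */
    unsigned __int128 p = (unsigned __int128)a * b;
    *hi = (uint64_t)(p >> 64);
    return (uint64_t)p;
#elif defined(_MSC_VER) && defined(_M_X64)
    /* MSVC on x64: _umul128 returns the low half and writes the high half. */
    return _umul128(a, b, hi);
#else
    return mul_64x64_128_portable(a, b, hi);
#endif
}
```

Calling mul_64x64_128(x, y, &hi) returns the low half and leaves the high half in hi. Whether a given compiler turns the fallback path into a single MUL or UMULH is not guaranteed and would need to be checked in the generated assembly.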
2018 Dec 31
[cfe-dev] Portable multiplication 64 x 64 -> 128 for int128 reimplementation
...they might be better alternative to inline assembly.
>> Is there a one for regular MUL?
>>
>
> I'm not sure, but I think there currently does not exist any intrinsic to
> generate the top half of a 64x64=128 multiply, except for `_mulx_u64`.
> If Clang stopped requiring `-mbmi2`, I would then expect the `_mulx_u64`
> intrinsic to generate a regular MUL instruction; similar to
> how _addcarry_u64 generates ADCX/ADOX when available/useful and a regular
> ADC otherwise.
> MSVC calls this intrinsic `_umul128
> <https://docs.microsoft.com/en-us/cpp/intrinsics/u...