search for: mbmi2

Displaying 3 results from an estimated 3 matches for "mbmi2".

2018 Dec 30
3
[cfe-dev] Portable multiplication 64 x 64 -> 128 for int128 reimplementation
...lso has the same check so it doesn't improve portability much. ~Craig On Sat, Dec 29, 2018 at 4:44 PM Arthur O'Dwyer via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi Pawel, > > There is the _mulx_u64 intrinsic, but it currently requires the hardware > flag "-mbmi2". > > https://github.com/Quuxplusone/WideIntProofOfConcept/blob/master/wider.h#L89-L99 > > On Clang 3.8.1 and earlier, the _addcarry_u64 and _subborrow_u64 > intrinsics required the hardware flag `-madx`, even though they didn't use > the hardware ADX/ADOX instructions. M...
2018 Dec 29
2
Portable multiplication 64 x 64 -> 128 for int128 reimplementation
Hi, For some perhaps dumb reasons, I am trying to write a portable version of int128. What is very valuable for this implementation is access to the MUL instruction on x86, which provides a full 64 x 64 -> 128 bit multiplication. An equally useful instruction on ARM would be UMULH. The way to access this on Clang / GCC is to use the __int128 type or inline assembly. MSVC provides an intrinsic for
2018 Dec 31
0
[cfe-dev] Portable multiplication 64 x 64 -> 128 for int128 reimplementation
...they might be a better alternative to inline assembly. >> Is there one for regular MUL? >> > > I'm not sure, but I think there currently does not exist any intrinsic to > generate the top half of a 64x64=128 multiply, except for `_mulx_u64`. > If Clang stopped requiring `-mbmi2`, I would then expect the `_mulx_u64` > intrinsic to generate a regular MUL instruction; similar to > how _addcarry_u64 generates ADCX/ADOX when available/useful and a regular > ADC otherwise. > MSVC calls this intrinsic `_umul128 > <https://docs.microsoft.com/en-us/cpp/intrinsics/u...
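
For readers following the thread, here is a minimal C sketch of the kind of 64 x 64 -> 128 multiply being discussed. It is not code from any of the messages above. The function names `mul64x64` and `mul64x64_portable` are hypothetical. The sketch assumes that `unsigned __int128` is available whenever the compiler defines `__SIZEOF_INT128__` (as GCC and Clang do on 64-bit targets) and otherwise falls back to a schoolbook multiply on 32-bit halves, which needs no intrinsic or hardware flag.

```c
#include <stdint.h>

/* Hypothetical helper: 64 x 64 -> 128 multiply built from four
   32 x 32 -> 64 partial products. Returns the low 64 bits and
   stores the high 64 bits in *hi. */
static uint64_t mul64x64_portable(uint64_t a, uint64_t b, uint64_t *hi)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;   /* contributes to bits   0..63  */
    uint64_t p1 = a_lo * b_hi;   /* contributes to bits  32..95  */
    uint64_t p2 = a_hi * b_lo;   /* contributes to bits  32..95  */
    uint64_t p3 = a_hi * b_hi;   /* contributes to bits  64..127 */

    /* Sum of three values below 2^32 each: cannot overflow 64 bits. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return (mid << 32) | (uint32_t)p0;
}

/* Hypothetical wrapper: use the compiler's 128-bit type when it is
   available (assumption: __SIZEOF_INT128__ signals this), otherwise
   use the portable fallback above. */
static uint64_t mul64x64(uint64_t a, uint64_t b, uint64_t *hi)
{
#if defined(__SIZEOF_INT128__)
    unsigned __int128 p = (unsigned __int128)a * b;
    *hi = (uint64_t)(p >> 64);
    return (uint64_t)p;
#else
    return mul64x64_portable(a, b, hi);
#endif
}
```

On MSVC for x64, the `_umul128` intrinsic mentioned in the thread would be the natural third branch; it is left out here to keep the sketch compiler-neutral.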