Hello everyone, I discovered a way to perform optimization on the following code (I gave an example that uses 32bit integer, but it works with any size.): const uint32 d,r;//d is an odd number //d is the divisor, r is the remainder bool check_remainder(uint32 x) { return x%d==r; } if we know d and r at compile time, and d is an odd integer, we can use modular multiplicative inverse to bypass the division operation. I wrote the following code to calculate the M.M.I of a number over 2^b. I give anyone the permission to use it (unlicensed). uint64_t find_mod_mul_inverse(uint64_t x, uint64_t bits) { if (bits > 64 || ((x&1)==0)) return 0;// invalid parameters uint64_t mask; if (bits == 64) mask = -1; else { mask = 1; mask<<=bits; mask--; } x&=mask; uint64_t result=1, state=x, ctz=0; while(state!=1ULL) { ctz=__builtin_ctzll(state^1); result|=1ULL<<ctz; state+=x<<ctz; state&=mask; } return result; } now consider the following steps: from the 2 constants (d and r) we create 3 constants (with the same bit length): constants uint32 s,u,mmi; mmi = find_mod_mul_inverse(d,32); s = (r*mmi); u = (UINT32_MAX-r)/d; // UINT32_MAX corresponds to pow(2,32)-1. the idea behind these constants is the following formula: mmi_of(d)*x=x/d+(x%d)*mmi_of(d) now after we generated the constants, we will just emit the following code instead of the former: bool check_remainder(uint32 x) { return ((x*mmi)-s)<=u; } Anyway, I looked at the file IntegerDivision.cpp, it seems to me that this new optimization is more effective then the optimization used there. However, I have no experience with compiler development, so I can just give you my idea. if further explanation is needed, just ask. I tested my method and it gives the correct results. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190220/9e9e5aec/attachment.html>
Craig Topper via llvm-dev
2019-Feb-21 05:27 UTC
[llvm-dev] proposal for optimization method
It looks like gcc already does something like this. https://godbolt.org/z/JWljkL ~Craig On Wed, Feb 20, 2019 at 8:45 PM or dv via llvm-dev <llvm-dev at lists.llvm.org> wrote:> Hello everyone, I discovered a way to perform optimization on the > following code (I gave an example that uses 32bit integer, but it works > with any size.): > > const uint32 d,r;//d is an odd number > //d is the divisor, r is the remainder > bool check_remainder(uint32 x) > { > return x%d==r; > } > > if we know d and r at compile time, and d is an odd integer, we can use > modular multiplicative inverse to bypass the division operation. > I wrote the following code to calculate the M.M.I of a number over 2^b. I > give anyone the permission to use it (unlicensed). > > uint64_t find_mod_mul_inverse(uint64_t x, uint64_t bits) > { > if (bits > 64 || ((x&1)==0)) > return 0;// invalid parameters > uint64_t mask; > if (bits == 64) > mask = -1; > else > { > mask = 1; > mask<<=bits; > mask--; > } > x&=mask; > uint64_t result=1, state=x, ctz=0; > while(state!=1ULL) > { > ctz=__builtin_ctzll(state^1); > result|=1ULL<<ctz; > state+=x<<ctz; > state&=mask; > } > return result; > } > > now consider the following steps: > from the 2 constants (d and r) we create 3 constants (with the same bit length): > constants uint32 s,u,mmi; > mmi = find_mod_mul_inverse(d,32); > s = (r*mmi); > u = (UINT32_MAX-r)/d; // UINT32_MAX corresponds to pow(2,32)-1. > the idea behind these constants is the following formula: > mmi_of(d)*x=x/d+(x%d)*mmi_of(d) > > now after we generated the constants, we will just emit the following code instead of the former: > bool check_remainder(uint32 x) > { > return ((x*mmi)-s)<=u; > } > > Anyway, I looked at the file IntegerDivision.cpp, it seems to me that this new optimization is more effective then the optimization used there. However, I have no experience with compiler development, so I can just give you my idea. if further explanation is needed, just ask. I tested my method and it gives the correct results. > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190220/6a922c0c/attachment.html>
Eli Friedman via llvm-dev
2019-Feb-21 19:55 UTC
[llvm-dev] proposal for optimization method
There’s a work-in-progress patch for the case r=0: https://reviews.llvm.org/D50222 . (Not sure what the current status of that patch is.) -Eli From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Craig Topper via llvm-dev Sent: Wednesday, February 20, 2019 9:27 PM To: or dv <dvoreader at gmail.com> Cc: llvm-dev <llvm-dev at lists.llvm.org> Subject: [EXT] Re: [llvm-dev] proposal for optimization method It looks like gcc already does something like this. https://godbolt.org/z/JWljkL ~Craig On Wed, Feb 20, 2019 at 8:45 PM or dv via llvm-dev <llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>> wrote: Hello everyone, I discovered a way to perform optimization on the following code (I gave an example that uses 32bit integer, but it works with any size.): const uint32 d,r;//d is an odd number //d is the divisor, r is the remainder bool check_remainder(uint32 x) { return x%d==r; } if we know d and r at compile time, and d is an odd integer, we can use modular multiplicative inverse to bypass the division operation. I wrote the following code to calculate the M.M.I of a number over 2^b. I give anyone the permission to use it (unlicensed). uint64_t find_mod_mul_inverse(uint64_t x, uint64_t bits) { if (bits > 64 || ((x&1)==0)) return 0;// invalid parameters uint64_t mask; if (bits == 64) mask = -1; else { mask = 1; mask<<=bits; mask--; } x&=mask; uint64_t result=1, state=x, ctz=0; while(state!=1ULL) { ctz=__builtin_ctzll(state^1); result|=1ULL<<ctz; state+=x<<ctz; state&=mask; } return result; } now consider the following steps: from the 2 constants (d and r) we create 3 constants (with the same bit length): constants uint32 s,u,mmi; mmi = find_mod_mul_inverse(d,32); s = (r*mmi); u = (UINT32_MAX-r)/d; // UINT32_MAX corresponds to pow(2,32)-1. the idea behind these constants is the following formula: mmi_of(d)*x=x/d+(x%d)*mmi_of(d) now after we generated the constants, we will just emit the following code instead of the former: bool check_remainder(uint32 x) { return ((x*mmi)-s)<=u; } Anyway, I looked at the file IntegerDivision.cpp, it seems to me that this new optimization is more effective then the optimization used there. However, I have no experience with compiler development, so I can just give you my idea. if further explanation is needed, just ask. I tested my method and it gives the correct results. _______________________________________________ LLVM Developers mailing list llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190221/9d0485b8/attachment-0001.html>
Possibly Parallel Threads
- [LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
- [LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
- [LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
- [LLVMdev] Proposal for pluggable intrinsics
- [LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz