Michael McConville via llvm-dev
2016-Feb-07 04:05 UTC
[llvm-dev] [PATCH] strlen -> strnlen optimization
This addition converts strlen() calls to strnlen() when the result is compared to a constant. For example, the following: strlen(s) < 5 Becomes: strnlen(s, 5) < 5 That way, we don't have to walk through the entire string. There is the added overhead of maintaining a counter when using strnlen(), but I thought I'd start with the general case. It may make sense to only use this optimization with small constants. On the other hand, the impact of the counter may be negligible in many or most cases due to hardware-level optimizations. I often notice the idiom of comparing a packet or buffer against a minimum length as a sanity check. I was surprised to see that this isn't optimized, so I thought I'd give it a try. nlewycky on IRC seemed to think it was a good idea. I'm interested to hear what others think. I have little C++ experience, so there are likely improvements to be made to my patch. I tried to adhere to the style and conventions of the surrounding code. This reintroduces emitStrNLen(), which was removed a couple months ago in r253514. The only change is a getParent()->getParent() --> getModule() conversion, which was done in emitStrLen() after emitStrNLen() was removed. This tests successfully for me on Ubuntu 14.04.3. Thanks for your time, Michael Index: lib/Transforms/Utils/BuildLibCalls.cpp ==================================================================--- lib/Transforms/Utils/BuildLibCalls.cpp (revision 260011) +++ lib/Transforms/Utils/BuildLibCalls.cpp (working copy) @@ -52,6 +52,28 @@ return CI; } +Value *llvm::emitStrNLen(Value *Ptr, Value *MaxLen, IRBuilder<> &B, + const DataLayout &DL, const TargetLibraryInfo *TLI) { + if (!TLI->has(LibFunc::strlen)) + return nullptr; + + Module *M = B.GetInsertBlock()->getModule(); + AttributeSet AS[2]; + AS[0] = AttributeSet::get(M->getContext(), 1, Attribute::NoCapture); + Attribute::AttrKind AVs[2] = { Attribute::ReadOnly, Attribute::NoUnwind }; + AS[1] = AttributeSet::get(M->getContext(), AttributeSet::FunctionIndex, AVs); + + LLVMContext &Context = B.GetInsertBlock()->getContext(); + Constant *StrNLen = M->getOrInsertFunction( + "strnlen", AttributeSet::get(M->getContext(), AS), + DL.getIntPtrType(Context), B.getInt8PtrTy(), DL.getIntPtrType(Context), nullptr); + CallInst *CI = B.CreateCall(StrNLen, {castToCStr(Ptr, B), MaxLen}, "strnlen"); + if (const Function *F = dyn_cast<Function>(StrNLen->stripPointerCasts())) + CI->setCallingConv(F->getCallingConv()); + + return CI; +} + Value *llvm::emitStrChr(Value *Ptr, char C, IRBuilder<> &B, const TargetLibraryInfo *TLI) { if (!TLI->has(LibFunc::strchr)) Index: lib/Transforms/Utils/SimplifyLibCalls.cpp ==================================================================--- lib/Transforms/Utils/SimplifyLibCalls.cpp (revision 260011) +++ lib/Transforms/Utils/SimplifyLibCalls.cpp (working copy) @@ -555,6 +555,49 @@ if (isOnlyUsedInZeroEqualityComparison(CI)) return B.CreateZExt(B.CreateLoad(Src, "strlenfirst"), CI->getType()); + // strlen(x) < y --> strnlen(x, y+1) < y + // + // We ensure that there is only one user to avoid interfering with + // CSE. + if (!CI->hasOneUse() || !TLI->has(LibFunc::strnlen)) + return nullptr; + User *U = CI->user_back(); + ICmpInst *IC; + if (!(IC = dyn_cast<ICmpInst>(U))) + return nullptr; + IntegerType *SizeType = DL.getIntPtrType(B.GetInsertBlock()->getContext()); + Value *LHS = IC->getOperand(0), *RHS = IC->getOperand(1); + + ConstantInt *Con; + if (!((Con = dyn_cast<ConstantInt>(LHS)) || (Con = dyn_cast<ConstantInt>(RHS)))) + return nullptr; + uint64_t con_val = Con->getZExtValue(); + + if (RHS == CI) + IC->swapOperands(); + + switch (IC->getPredicate()) { + case ICmpInst::ICMP_EQ: + case ICmpInst::ICMP_NE: + case ICmpInst::ICMP_UGT: + case ICmpInst::ICMP_UGE: + case ICmpInst::ICMP_ULE: + case ICmpInst::ICMP_SGT: + case ICmpInst::ICMP_SGE: + case ICmpInst::ICMP_SLE: + // XXX: check for wrapping + if (con_val == UINT64_MAX) + return nullptr; + return emitStrNLen(Src, ConstantInt::get(SizeType, con_val + 1), + B, DL, TLI); + case ICmpInst::ICMP_ULT: + case ICmpInst::ICMP_SLT: + return emitStrNLen(Src, Con, + B, DL, TLI); + default: + return nullptr; + } + return nullptr; }
Mehdi Amini via llvm-dev
2016-Feb-07 06:47 UTC
[llvm-dev] [PATCH] strlen -> strnlen optimization
Hi, (llvm-dev to BCC) Thanks for this! Patches are supposed to be sent to llvm-commits and not llvm-dev (http://llvm.org/docs/DeveloperPolicy.html ). Also it seems that your patch was inline in the email and not attached? I couldn't apply it for some reason. First off, you're missing test cases for each ICMP possibility, and one for the wrapping case (see test/Transforms/InstCombine/simplify-libcalls.ll if you need examples). Also, see other comments inline below.> On Feb 6, 2016, at 8:05 PM, Michael McConville via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > This addition converts strlen() calls to strnlen() when the result is > compared to a constant. For example, the following: > > strlen(s) < 5 > > Becomes: > > strnlen(s, 5) < 5 > > That way, we don't have to walk through the entire string. There is the > added overhead of maintaining a counter when using strnlen(), but I > thought I'd start with the general case. It may make sense to only use > this optimization with small constants. On the other hand, the impact of > the counter may be negligible in many or most cases due to > hardware-level optimizations. > > I often notice the idiom of comparing a packet or buffer against a > minimum length as a sanity check. I was surprised to see that this isn't > optimized, so I thought I'd give it a try. > > nlewycky on IRC seemed to think it was a good idea. I'm interested to > hear what others think. > > I have little C++ experience, so there are likely improvements to be > made to my patch. I tried to adhere to the style and conventions of the > surrounding code. > > This reintroduces emitStrNLen(), which was removed a couple months ago > in r253514. The only change is a getParent()->getParent() --> > getModule() conversion, which was done in emitStrLen() after > emitStrNLen() was removed. > > This tests successfully for me on Ubuntu 14.04.3. > > Thanks for your time, > Michael > > > Index: lib/Transforms/Utils/BuildLibCalls.cpp > ==================================================================> --- lib/Transforms/Utils/BuildLibCalls.cpp (revision 260011) > +++ lib/Transforms/Utils/BuildLibCalls.cpp (working copy) > @@ -52,6 +52,28 @@ > return CI; > } > > +Value *llvm::emitStrNLen(Value *Ptr, Value *MaxLen, IRBuilder<> &B, > + const DataLayout &DL, const TargetLibraryInfo *TLI) {Interestingly it was removed from the implementation but not from the header...> + if (!TLI->has(LibFunc::strlen)) > + return nullptr; > + > + Module *M = B.GetInsertBlock()->getModule(); > + AttributeSet AS[2]; > + AS[0] = AttributeSet::get(M->getContext(), 1, Attribute::NoCapture); > + Attribute::AttrKind AVs[2] = { Attribute::ReadOnly, Attribute::NoUnwind }; > + AS[1] = AttributeSet::get(M->getContext(), AttributeSet::FunctionIndex, AVs); > + > + LLVMContext &Context = B.GetInsertBlock()->getContext(); > + Constant *StrNLen = M->getOrInsertFunction( > + "strnlen", AttributeSet::get(M->getContext(), AS), > + DL.getIntPtrType(Context), B.getInt8PtrTy(), DL.getIntPtrType(Context), nullptr); > + CallInst *CI = B.CreateCall(StrNLen, {castToCStr(Ptr, B), MaxLen}, "strnlen"); > + if (const Function *F = dyn_cast<Function>(StrNLen->stripPointerCasts())) > + CI->setCallingConv(F->getCallingConv()); > + > + return CI; > +} > + > Value *llvm::emitStrChr(Value *Ptr, char C, IRBuilder<> &B, > const TargetLibraryInfo *TLI) { > if (!TLI->has(LibFunc::strchr)) > Index: lib/Transforms/Utils/SimplifyLibCalls.cpp > ==================================================================> --- lib/Transforms/Utils/SimplifyLibCalls.cpp (revision 260011) > +++ lib/Transforms/Utils/SimplifyLibCalls.cpp (working copy) > @@ -555,6 +555,49 @@ > if (isOnlyUsedInZeroEqualityComparison(CI)) > return B.CreateZExt(B.CreateLoad(Src, "strlenfirst"), CI->getType()); > > + // strlen(x) < y --> strnlen(x, y+1) < y > + // > + // We ensure that there is only one user to avoid interfering with > + // CSE. > + if (!CI->hasOneUse() || !TLI->has(LibFunc::strnlen)) > + return nullptr;You are adding this optimization inside a function that try multiple optimizations in sequence. An early return is not nice in this context because it prevent from adding other transformation in the same function afterwards. The cleanest way to keep early exits and avoid deep nesting is to extract this transformation in a static helper function.> + User *U = CI->user_back(); > + ICmpInst *IC; > + if (!(IC = dyn_cast<ICmpInst>(U))) > + return nullptr;ICmpInst *IC = dyn_cast<ICmpInst>(U); if (!IC) return nullptr;> + IntegerType *SizeType = DL.getIntPtrType(B.GetInsertBlock()->getContext()); > + Value *LHS = IC->getOperand(0), *RHS = IC->getOperand(1); > + > + ConstantInt *Con; > + if (!((Con = dyn_cast<ConstantInt>(LHS)) || (Con = dyn_cast<ConstantInt>(RHS)))) > + return nullptr;I think you can assume the constant to be on the right (and remove the swapOperands): ConstantInt *Con = dyn_cast<ConstantInt>(IC->getOperand(1)); if(!Con) return nullptr;> + uint64_t con_val = Con->getZExtValue(); > + > + if (RHS == CI) > + IC->swapOperands(); > + > + switch (IC->getPredicate()) { > + case ICmpInst::ICMP_EQ: > + case ICmpInst::ICMP_NE: > + case ICmpInst::ICMP_UGT: > + case ICmpInst::ICMP_UGE: > + case ICmpInst::ICMP_ULE: > + case ICmpInst::ICMP_SGT: > + case ICmpInst::ICMP_SGE: > + case ICmpInst::ICMP_SLE: > + // XXX: check for wrapping > + if (con_val == UINT64_MAX) > + return nullptr;ISTM that the wrapping check assumes that "sizeof(size_t) == 8", which is not always true. Best, Mehdi> + return emitStrNLen(Src, ConstantInt::get(SizeType, con_val + 1), > + B, DL, TLI); > + case ICmpInst::ICMP_ULT: > + case ICmpInst::ICMP_SLT: > + return emitStrNLen(Src, Con, > + B, DL, TLI); > + default: > + return nullptr; > + } > + > return nullptr; > } > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Joerg Sonnenberger via llvm-dev
2016-Feb-07 12:04 UTC
[llvm-dev] [PATCH] strlen -> strnlen optimization
On Sat, Feb 06, 2016 at 11:05:14PM -0500, Michael McConville via llvm-dev wrote:> This addition converts strlen() calls to strnlen() when the result is > compared to a constant. For example, the following: > > strlen(s) < 5 > > Becomes: > > strnlen(s, 5) < 5 > > That way, we don't have to walk through the entire string. There is the > added overhead of maintaining a counter when using strnlen(), but I > thought I'd start with the general case. It may make sense to only use > this optimization with small constants. On the other hand, the impact of > the counter may be negligible in many or most cases due to > hardware-level optimizations.This is an optimisation I am very vary of two two reasons: (1) strnlen is only POSIX2008, so missing on many slightly older systems. (2) I do believe that the counting is quite harmful to the performance. Additionally, I wouldn't be surprised if many systems don't consider strnlen to be the fast path, so it would be even worse than using e.g. memchr for this. Joerg
Michael McConville via llvm-dev
2016-Feb-07 18:21 UTC
[llvm-dev] [PATCH] strlen -> strnlen optimization
Joerg Sonnenberger wrote:> On Sat, Feb 06, 2016 at 11:05:14PM -0500, Michael McConville via llvm-dev wrote: > > This addition converts strlen() calls to strnlen() when the result is > > compared to a constant. For example, the following: > > > > strlen(s) < 5 > > > > Becomes: > > > > strnlen(s, 5) < 5 > > > > That way, we don't have to walk through the entire string. There is the > > added overhead of maintaining a counter when using strnlen(), but I > > thought I'd start with the general case. It may make sense to only use > > this optimization with small constants. On the other hand, the impact of > > the counter may be negligible in many or most cases due to > > hardware-level optimizations. > > This is an optimisation I am very vary of two two reasons: > (1) strnlen is only POSIX2008, so missing on many slightly older > systems.That's why we check whether it exists. glibc has had strnlen since 1996, OpenBSD since 2010, FreeBSD since 2009, OS X since ~2010. I expect that the vast majority of Unix-like systems have it. Regardless, I don't think this optimization does any significant harm to systems that lack strnlen.> (2) I do believe that the counting is quite harmful to the performance.I'll benchmark. If we limit the optimization to cases where the constant is (for example) < 20, a small performance hit from the counter may not matter.> Additionally, I wouldn't be surprised if many systems don't consider > strnlen to be the fast path, so it would be even worse than using e.g. > memchr for this.The strnlen implementations I've looked at don't seem much different from their strlen counterparts in the common libc's I looked at. For example, glibc has asm implementations of both for common platforms. FreeBSD only has asm implementations of strlen, but those are only for ARM and MIPS. OpenBSD doesn't use asm for either. I wouldn't be surprised if the extremely simple nature of the functions means that asm implementations aren't noticeably faster.