Gautam Chakrabarti via llvm-dev
2020-Jan-07 06:36 UTC
[llvm-dev] Semantics of any out-of-bounds address returned by GEP
Hi,

I am looking for clarification on the out-of-bounds addresses that a GEP may return. My understanding is that, in the absence of the inbounds keyword, a GEP may return a pointer that is out of bounds of the underlying allocated object. Is it correct that any such out-of-bounds address must not be dereferenced? I am a bit confused by the current wording on this.

https://llvm.org/docs/LangRef.html#getelementptr-instruction says:

"If the inbounds keyword is not present, .... The result value of the getelementptr may be outside the object pointed to by the base pointer. The result value may not necessarily be used to access memory though, even if it happens to point into allocated storage."

This seems to say that the out-of-bounds pointer may or may not be used to access memory.

However, https://llvm.org/docs/GetElementPtr.html#how-is-gep-different-from-ptrtoint-arithmetic-and-inttoptr says:

"Also, GEP carries additional pointer aliasing rules. It's invalid to take a GEP from one object, address into a different separately allocated object, and dereference it. IR producers (front-ends) must follow this rule, and consumers (optimizers, specifically alias analysis) benefit from being able to rely on it."

This quite clearly says that the out-of-bounds address shall not be dereferenced. This also seems to be the assumption in the AliasAnalysis implementation (as the documentation above already states).

Thanks,
Gautam
Nuno Lopes via llvm-dev
2020-Jan-07 15:19 UTC
[llvm-dev] Semantics of any out-of-bounds address returned by GEP
You're right: non-inbounds GEPs may return OOB pointers. OOB pointers cannot be dereferenced. OOB pointers can still be used as input to GEP to bring them back to being within bounds. OOB pointers can also be used for pointer comparisons.

Nuno
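The three rules in this reply (an OOB pointer cannot be dereferenced, can be moved back in bounds by another GEP, and can be compared) can be sketched with a toy Python model. This is only an illustration, not LLVM code: the names `Ptr`, `gep`, `in_bounds`, and `load` are hypothetical, and the model raises an exception where real IR would simply have undefined behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ptr:
    """Toy pointer: the object it was derived from plus an element offset."""
    obj: tuple   # the underlying allocated object (hypothetical model)
    offset: int


def gep(p: Ptr, delta: int) -> Ptr:
    """Model of a non-inbounds GEP: pure arithmetic, no bounds check."""
    return Ptr(p.obj, p.offset + delta)


def in_bounds(p: Ptr) -> bool:
    """True if the pointer addresses an element of its underlying object."""
    return 0 <= p.offset < len(p.obj)


def load(p: Ptr):
    """Dereference; this model rejects any out-of-bounds access."""
    if not in_bounds(p):
        # In real IR this would be undefined behavior, not an exception.
        raise ValueError("dereference of out-of-bounds pointer")
    return p.obj[p.offset]


buf = (10, 20, 30, 40)
base = Ptr(buf, 0)
oob = gep(base, 6)     # allowed: the result is out of bounds
back = gep(oob, -4)    # allowed: back in bounds, at element 2
```

In this sketch `load(back)` succeeds while `load(oob)` is rejected, and `oob` can still be compared with `base` (by equality or by offset) without ever being dereferenced.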
Michael Kruse via llvm-dev
2020-Jan-07 17:11 UTC
[llvm-dev] Semantics of any out-of-bounds address returned by GEP
On Tue, Jan 7, 2020 at 00:37, Gautam Chakrabarti via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> However, https://llvm.org/docs/GetElementPtr.html#how-is-gep-different-from-ptrtoint-arithmetic-and-inttoptr says:
>
> “Also, GEP carries additional pointer aliasing rules. It’s invalid to take a GEP from one object, address into a different separately allocated object, and dereference it. IR producers (front-ends) must follow this rule, and consumers (optimizers, specifically alias analysis) benefit from being able to rely on it.”
>
> This quite clearly says the out-of-bounds address shall not be dereferenced through. This also seems to be the assumption in AliasAnalysis implementation (as already stated in above documentation).

I think this is not a property of the GEP instruction but of pointer provenance [1]. Basically, compilers can assume that the result of arithmetic on a pointer to a memory object (e.g. a malloc result, some other allocation, or a stack variable) still points to the same memory object.

[1] https://blog.regehr.org/archives/1621

Michael
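The provenance idea can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not how LLVM represents pointers: `Alloc`, `Ptr`, `gep`, and `may_access` are hypothetical names, and the two objects are assumed to sit back to back in a flat address space. Each pointer remembers the allocation it was derived from, arithmetic never changes that, and an access is only allowed inside that allocation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alloc:
    """A separately allocated object occupying [start, start + size)."""
    name: str
    start: int
    size: int


@dataclass(frozen=True)
class Ptr:
    """Toy pointer: a numeric address plus the allocation it came from."""
    addr: int
    prov: Alloc


def gep(p: Ptr, delta: int) -> Ptr:
    """Pointer arithmetic changes the address but never the provenance."""
    return Ptr(p.addr + delta, p.prov)


def may_access(p: Ptr) -> bool:
    """An access is allowed only inside the object the pointer came from."""
    return p.prov.start <= p.addr < p.prov.start + p.prov.size


a = Alloc("a", start=0, size=4)
b = Alloc("b", start=4, size=4)   # assumed to sit right after `a`

p = gep(Ptr(a.start, a), 5)       # derived from `a`, numerically inside `b`
q = Ptr(b.start + 1, b)           # derived from `b`, same numeric address
```

Here `p` and `q` hold the same address, yet only `q` may be used to access `b`. That matches the LangRef sentence quoted earlier in the thread: the GEP result may not be used to access memory "even if it happens to point into allocated storage".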