James Y Knight via llvm-dev
2016-Jan-15 06:28 UTC
[llvm-dev] [cfe-dev] RFC: Enforcing pointer type alignment in Clang
(Sorry for the duplicate mail, Richard, I accidentally sent a copy only to you before.) On Thu, Jan 14, 2016 at 10:26 PM, Richard Smith via cfe-dev < cfe-dev at lists.llvm.org> wrote:> [1]: That's not completely true, as it's possible to create an object > that is underaligned for its type: > > struct __attribute__((packed)) A { > char k; > struct B { int n; } b; > } a; > A::B *p = &a.b; // Misaligned pointer, now guaranteed OK > int main() { > int *q = &p->n; // UB? UBSan diagnoses this member access > return *q; // Obviously UB > } > > It seems that we do need to have some syntactic rules for how far the > known alignment propagates to handle this case; your proposed rules don't > do the right thing here. > >I don't think any new rules are really needed here. With your example, I expect even doing "int x = a.b.n;" to be okay. The value "a.b" is actually (effectively) typed "struct A::B __attribute__((aligned(1))) *", and thus fine to access a sub-object of, as the pointer is NOT misaligned for its type! And similarly, the type of &a.b.n is "int __attribute__((aligned(1))) *", and also thus not misaligned. While casting the result of "&a.b" to "A::B *" in the expression "A::B *p &a.b" is fine (well, becomes more _explicitly_ fine with your suggested modification to the standard text), once you've done that, the new type _does_ require an alignment of alignof(int), and thus, the value of "p" is misaligned for its type. And so, you can no longer dereference it nor access members. That seems all fine and sensible to me. If you need to give a name "p" to the value, you can still do like this: typedef struct A::B __attribute__((aligned(1))) misaligned_A_B; misaligned_A_B *p = &a.b; int x = p->n; AFAICT, this all works today, as well. Now, of course, the "aligned" attribute being able to reduce the alignment of a type -- but only when used in a typedef (!!!) -- is a non-standard GCC extension which is not even properly documented anywhere. And it fails to work in some places, like trying to use it as a template parameter (GCC actually diagnoses that as not working; clang silently ignores it). So, I'm not saying everything is completely hunky-dory here. But I do think it's essentially a workable and sane model. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160115/11cb3580/attachment.html>
John McCall via llvm-dev
2016-Jan-15 08:14 UTC
[llvm-dev] [cfe-dev] RFC: Enforcing pointer type alignment in Clang
> On Jan 14, 2016, at 10:28 PM, James Y Knight <jyknight at google.com> wrote: > > (Sorry for the duplicate mail, Richard, I accidentally sent a copy only to you before.) > > On Thu, Jan 14, 2016 at 10:26 PM, Richard Smith via cfe-dev <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote: > [1]: That's not completely true, as it's possible to create an object that is underaligned for its type: > > struct __attribute__((packed)) A { > char k; > struct B { int n; } b; > } a; > A::B *p = &a.b; // Misaligned pointer, now guaranteed OK > int main() { > int *q = &p->n; // UB? UBSan diagnoses this member access > return *q; // Obviously UB > } > > It seems that we do need to have some syntactic rules for how far the known alignment propagates to handle this case; your proposed rules don't do the right thing here. > > > I don't think any new rules are really needed here. > > With your example, I expect even doing "int x = a.b.n;" to be okay. The value "a.b" is actually (effectively) typed "struct A::B __attribute__((aligned(1))) *", and thus fine to access a sub-object of, as the pointer is NOT misaligned for its type! And similarly, the type of &a.b.n is "int __attribute__((aligned(1))) *", and also thus not misaligned. > > While casting the result of "&a.b" to "A::B *" in the expression "A::B *p = &a.b" is fine (well, becomes more _explicitly_ fine with your suggested modification to the standard text), once you've done that, the new type _does_ require an alignment of alignof(int), and thus, the value of "p" is misaligned for its type. And so, you can no longer dereference it nor access members. That seems all fine and sensible to me. > > If you need to give a name "p" to the value, you can still do like this: > typedef struct A::B __attribute__((aligned(1))) misaligned_A_B; > misaligned_A_B *p = &a.b; > int x = p->n; > > AFAICT, this all works today, as well. > > Now, of course, the "aligned" attribute being able to reduce the alignment of a type -- but only when used in a typedef (!!!) -- is a non-standard GCC extension which is not even properly documented anywhere. And it fails to work in some places, like trying to use it as a template parameter (GCC actually diagnoses that as not working; clang silently ignores it). > > So, I'm not saying everything is completely hunky-dory here. But I do think it's essentially a workable and sane model.The question at hand is whether we should require the user to write this: misaligned_A_B *p = &a.b; instead of, say: A::B *p = &a.b; int x = *(misaligned_int*) &p->n; because we want to reserve the right to invoke undefined behavior and propagate our “knowledge" that p is 4-byte-aligned to “improve” the 1-byte-aligned access on the next line. My contention is that this is a clean and elegantly simple formal model that is disastrously bad for actual users because it is no longer possible to locally work around a mis-alignment bug without tracking the entire history of the pointer. It is the sort of compiler policy that gets people to roll their eyes and ask for new options called things like -fno-strict-type-alignment which gradually get adopted by 90% of projects. John. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160115/0db745b4/attachment.html>
David Chisnall via llvm-dev
2016-Jan-18 11:18 UTC
[llvm-dev] [cfe-dev] RFC: Enforcing pointer type alignment in Clang
Hi John, On 15 Jan 2016, at 08:14, John McCall via llvm-dev <llvm-dev at lists.llvm.org> wrote:> > The question at hand is whether we should require the user to write this: > misaligned_A_B *p = &a.b; > instead of, say: > A::B *p = &a.b; > int x = *(misaligned_int*) &p->n; > because we want to reserve the right to invoke undefined behavior and propagate our “knowledge" that p is 4-byte-aligned to “improve” the 1-byte-aligned access on the next line. > > My contention is that this is a clean and elegantly simple formal model that is disastrously bad for actual users because it is no longer possible to locally work around a mis-alignment bug without tracking the entire history of the pointer. It is the sort of compiler policy that gets people to roll their eyes and ask for new options called things like -fno-strict-type-alignment which gradually get adopted by 90% of projects.I’ve had the misfortune to look at a lot of code that does unaligned access over the last few years. By far the most common reason for it that I’ve seen is networking code that uses packed structures to represent packets. For example: __attribute__((packed)) struct somePacket { uint8_t a; uint32_t b; // ... }; In your model, what happens when: - I use field b directly? - I take the address of field b and store it in an int* variable? David