thr3ads.net - llvm dev - [llvm-dev] [cfe-dev] RFC: Enforcing pointer type alignment in Clang [Jan 2016]

If this information is useful, please help other people find it:
Share via:

James Y Knight via llvm-dev

2016-Jan-15 06:28 UTC

[llvm-dev] [cfe-dev] RFC: Enforcing pointer type alignment in Clang

(Sorry for the duplicate mail, Richard, I accidentally sent a copy only to
you before.)

On Thu, Jan 14, 2016 at 10:26 PM, Richard Smith via cfe-dev <
cfe-dev at lists.llvm.org> wrote:
>  [1]: That's not completely true, as it's possible to create an
object
> that is underaligned for its type:
>
> struct __attribute__((packed)) A {
>   char k;
>   struct B { int n; } b;
> } a;
> A::B *p = &a.b; // Misaligned pointer, now guaranteed OK
> int main() {
>   int *q = &p->n; // UB? UBSan diagnoses this member access
>   return *q; // Obviously UB
> }
>
> It seems that we do need to have some syntactic rules for how far the
> known alignment propagates to handle this case; your proposed rules
don't
> do the right thing here.
>
>I don't think any new rules are really needed here.

With your example, I expect even doing "int x = a.b.n;" to be okay.
The
value "a.b" is actually (effectively) typed "struct A::B
__attribute__((aligned(1))) *", and thus fine to access a sub-object of, as
the pointer is NOT misaligned for its type! And similarly, the type of
&a.b.n is "int __attribute__((aligned(1))) *", and also thus not
misaligned.

While casting the result of "&a.b" to "A::B *" in the
expression "A::B *p &a.b" is fine (well, becomes more _explicitly_
fine with your suggested
modification to the standard text), once you've done that, the new type
_does_ require an alignment of alignof(int), and thus, the value of
"p" is
misaligned for its type. And so, you can no longer dereference it nor
access members. That seems all fine and sensible to me.

If you need to give a name "p" to the value, you can still do like
this:
  typedef struct A::B __attribute__((aligned(1))) misaligned_A_B;
  misaligned_A_B *p = &a.b;
  int x = p->n;

AFAICT, this all works today, as well.

Now, of course, the "aligned" attribute being able to reduce the
alignment
of a type -- but only when used in a typedef (!!!) -- is a non-standard GCC
extension which is not even properly documented anywhere. And it fails to
work in some places, like trying to use it as a template parameter (GCC
actually diagnoses that as not working; clang silently ignores it).

So, I'm not saying everything is completely hunky-dory here. But I do think
it's essentially a workable and sane model.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160115/11cb3580/attachment.html>

John McCall via llvm-dev

2016-Jan-15 08:14 UTC

head link

[llvm-dev] [cfe-dev] RFC: Enforcing pointer type alignment in Clang

> On Jan 14, 2016, at 10:28 PM, James Y Knight <jyknight at google.com>
wrote:
> 
> (Sorry for the duplicate mail, Richard, I accidentally sent a copy only to
you before.)
> 
> On Thu, Jan 14, 2016 at 10:26 PM, Richard Smith via cfe-dev <cfe-dev at
lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
>  [1]: That's not completely true, as it's possible to create an
object that is underaligned for its type:
> 
> struct __attribute__((packed)) A {
>   char k;
>   struct B { int n; } b;
> } a;
> A::B *p = &a.b; // Misaligned pointer, now guaranteed OK
> int main() {
>   int *q = &p->n; // UB? UBSan diagnoses this member access
>   return *q; // Obviously UB
> }
> 
> It seems that we do need to have some syntactic rules for how far the known
alignment propagates to handle this case; your proposed rules don't do the
right thing here.
> 
> 
> I don't think any new rules are really needed here.
> 
> With your example, I expect even doing "int x = a.b.n;" to be
okay. The value "a.b" is actually (effectively) typed "struct
A::B __attribute__((aligned(1))) *", and thus fine to access a sub-object
of, as the pointer is NOT misaligned for its type! And similarly, the type of
&a.b.n is "int __attribute__((aligned(1))) *", and also thus not
misaligned.
> 
> While casting the result of "&a.b" to "A::B *" in
the expression "A::B *p = &a.b" is fine (well, becomes more
_explicitly_ fine with your suggested modification to the standard text), once
you've done that, the new type _does_ require an alignment of alignof(int),
and thus, the value of "p" is misaligned for its type. And so, you can
no longer dereference it nor access members. That seems all fine and sensible to
me.
> 
> If you need to give a name "p" to the value, you can still do
like this:
>   typedef struct A::B __attribute__((aligned(1))) misaligned_A_B;
>   misaligned_A_B *p = &a.b;
>   int x = p->n;
> 
> AFAICT, this all works today, as well.
> 
> Now, of course, the "aligned" attribute being able to reduce the
alignment of a type -- but only when used in a typedef (!!!) -- is a
non-standard GCC extension which is not even properly documented anywhere. And
it fails to work in some places, like trying to use it as a template parameter
(GCC actually diagnoses that as not working; clang silently ignores it).
> 
> So, I'm not saying everything is completely hunky-dory here. But I do
think it's essentially a workable and sane model.
The question at hand is whether we should require the user to write this:
  misaligned_A_B *p = &a.b;
instead of, say:
  A::B *p = &a.b;
  int x = *(misaligned_int*) &p->n;
because we want to reserve the right to invoke undefined behavior and propagate
our “knowledge" that p is 4-byte-aligned to “improve” the 1-byte-aligned
access on the next line.

My contention is that this is a clean and elegantly simple formal model that is
disastrously bad for actual users because it is no longer possible to locally
work around a mis-alignment bug without tracking the entire history of the
pointer.  It is the sort of compiler policy that gets people to roll their eyes
and ask for new options called things like -fno-strict-type-alignment which
gradually get adopted by 90% of projects.

John.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160115/0db745b4/attachment.html>

David Chisnall via llvm-dev

2016-Jan-18 11:18 UTC

head link

[llvm-dev] [cfe-dev] RFC: Enforcing pointer type alignment in Clang

Hi John,

On 15 Jan 2016, at 08:14, John McCall via llvm-dev <llvm-dev at
lists.llvm.org> wrote:> 
> The question at hand is whether we should require the user to write this:
>   misaligned_A_B *p = &a.b;
> instead of, say:
>   A::B *p = &a.b;
>   int x = *(misaligned_int*) &p->n;
> because we want to reserve the right to invoke undefined behavior and
propagate our “knowledge" that p is 4-byte-aligned to “improve” the
1-byte-aligned access on the next line.
> 
> My contention is that this is a clean and elegantly simple formal model
that is disastrously bad for actual users because it is no longer possible to
locally work around a mis-alignment bug without tracking the entire history of
the pointer.  It is the sort of compiler policy that gets people to roll their
eyes and ask for new options called things like -fno-strict-type-alignment which
gradually get adopted by 90% of projects.
I’ve had the misfortune to look at a lot of code that does unaligned access over
the last few years.  By far the most common reason for it that I’ve seen is
networking code that uses packed structures to represent packets.  For example:

__attribute__((packed))
struct somePacket
{
	uint8_t a;
	uint32_t b;
	// ...
};

In your model, what happens when:

- I use field b directly?
- I take the address of field b and store it in an int* variable?

David

llvm dev - Jan 2016 - [cfe-dev] RFC: Enforcing pointer type alignment in Clang

[llvm-dev] [cfe-dev] RFC: Enforcing pointer type alignment in Clang

[llvm-dev] [cfe-dev] RFC: Enforcing pointer type alignment in Clang

[llvm-dev] [cfe-dev] RFC: Enforcing pointer type alignment in Clang