On Jul 25, 2011, at 8:25 AM, Duncan Sands wrote:> Hi Chris,
> 
>>> Chris made major changes to the LLVM type system, which requires
major changes
>>> to dragonegg (c.f. the patch Chris applied to llvm-gcc).  I'm
on holiday which
>>> is why I haven't taken care of it yet.
>> 
>> Yeah, sorry about that Duncan.  I can't directly hack on the code,
but I'm happy to answer questions when you get back.
> 
> there seems to be a nasty flaw in the scheme you devised for llvm-gcc,
involving
> arrays.
I believe it.  I just did the minimal possible thing to get llvm-gcc to
bootstrap for me, because some people have trouble letting go of the past,
I'm sorry for the impact that had on dragonegg.  Clang is at least working
now :)
>  Consider the following mutually recursive arrays of structs:
> 
>  struct S has a field which is a pointer to array type A.
>  The element type of array type A is struct T.
>  struct T has a field which is a pointer to array type B.
>  The element type of array type B is struct S.
> 
> When trying to convert one of the structs or arrays, with the current
scheme you
> inevitably end up forming an array of opaque struct types.  This
immediately
> blows up dragonegg because it wants to compute the array size to see if it
needs
> to append padding.  It can blow up llvm-gcc too, because it also wants to
get
> the size of arrays in more exotic circumstances.
> 
> Do you have any suggestions on the best way to handle this?
Right, there are similar cases involving function pointers etc.  For example,
here's an evil case where converting a struct converts a function pointer
that wants the struct being converted (kudos to Eli):
struct foo {
  void (*FP)(struct foo);
} g;
If you run this through clang, you get:
%struct.foo = type { {}* }
@g = common global %struct.foo zeroinitializer, align 8
and any uses of "FP" do a bitcast to the right type.  The way this
works in clang (and the attempt in llvm-gcc which probably is pretty close just
in need of debugging) is that when a struct is being converted, if we go to
convert a pointer underneath it, we disable converting the pointee (just
returning {}).  At one stage in its development, this was pretty hardcore: we
were lowering stuff like:
struct x { int a; };
struct y {
  struct x *P;
} h;
to y = "{ {}* }".
This is effectively your #3 below.
> (3) convert all pointers to i8*.  I'm not sure how much this would get
in
> the way of the optimizers.  Note that dragonegg distinguishes between
> register types and in-memory types (I think clang does the same), and
pointer
> types used for registers would still be sensible (struct S* rather than
i8*).
Yep, I did this to beat out all of the assumptions in the clang IRGen that was
assuming that the type return by a field access load was sane (they'd blow
up when they got an unexpected {}* instead of a function pointer or whatever).
Once everything worked, I added the 'isSafeToConvert' logic in
clang/lib/CodeGen/CodeGenTypes.cpp. The idea is that when converting the first
"g" example, there is no way we can do better than {}*.  OTOH, in the
second "h" example there is no problem recursively converting X to a
nice type, and isSafeToConvert returns true for it.  Because we can recurse into
it, we get pretty IR for that case (and this handles almost every other common
case):
%struct.y = type { %struct.x* }
%struct.x = type { i32 }
@h = common global %struct.y zeroinitializer, align 8
This gets clang the benefit of getting pretty types that are maximally friendly
to the optimizer (and the compiler hacking reading IR!) while still being able
to fall back to an ugly type when it is required by conversion.
-Chris