thr3ads.net - llvm dev - [LLVMdev] PROPOSAL: IR representation of detailed struct assignment information [Aug 2012]

Chris Lattner

2012-Aug-28 05:22 UTC

[LLVMdev] PROPOSAL: IR representation of detailed struct assignment information

<moving this to llvmdev now that the lists are back up!>

On Aug 23, 2012, at 4:37 PM, Dan Gohman <gohman at apple.com> wrote:
> On Aug 23, 2012, at 4:05 PM, Chris Lattner <clattner at apple.com> wrote:
>> On Aug 23, 2012, at 3:59 PM, Dan Gohman <gohman at apple.com> wrote:
>>> On Aug 23, 2012, at 3:31 PM, Chris Lattner <clattner at apple.com> wrote:
>>>> Interesting approach.  The IR type for a struct may or may not be
>>>> enough to describe holes (think unions and other cases), have you
>>>> considered a more explicit MDNode that describes the ranges of any holes?
>>>
>>> What's the issue with unions? Do you mean unions containing structs
>>> containing holes?
>>
>> Unions don't lower to a unique or useful IR type.  In general, I'm
>> skeptical of anything that uses IR types to reason about source level
>> types (except primitives like integers and floats).
>
> I'm confused. It seems a big difference here between your expectations
> and my understanding is that you're expecting to see source level types
> here, whereas it hadn't even occurred to me that we should try to represent
> source level types.

My point here is that the frontend reasons about two things: 1) a source level
construct of a type, and 2) LLVM IR types.   The LLVM IR type lowering is not
guaranteed to cover all fields in the source type (e.g. in the case of unions).

Let me give you a dumb example.  Consider:

union x {
  struct { char b;  int c; } a;
  short b;
} u;

On my system, Clang codegen's this to:

%union.x = type { %struct.anon }
%struct.anon = type { i8, i32 }

This isn't a safe IR type to use to describe a memcpy (because it
wouldn't copy all of "b"), so implementing your proposal would
require implementing yet-another conversion from AST types to LLVM types that
*is* guaranteed to cover all the fields.
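
As an illustration that is not part of the original message, the following minimal C sketch shows what goes wrong if a copy touches only the bytes named as fields by the IR type { i8, i32 }. It assumes a typical ABI in which char is 1 byte and short is at least 2 bytes; the names `struct anon`, `copy_fields_only` and `short_member_survives` are hypothetical helpers chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same layout as the union in the message; the inner struct is named
   here only so that offsetof can be applied to it. */
struct anon { char b; int c; };
union x {
  struct anon a;
  short b;
};

/* Copies only the bytes that the IR type { i8, i32 } describes as
   fields: the char at offset 0 and the int after the padding. */
static void copy_fields_only(union x *dst, const union x *src) {
  memcpy((char *)dst + offsetof(struct anon, b),
         (const char *)src + offsetof(struct anon, b), sizeof(char));
  memcpy((char *)dst + offsetof(struct anon, c),
         (const char *)src + offsetof(struct anon, c), sizeof(int));
}

/* Returns 1 if the short member survives a field-wise copy, 0 if not. */
static int short_member_survives(void) {
  union x src, dst;
  memset(&src, 0, sizeof src);
  memset(&dst, 0, sizeof dst);
  src.b = 0x1234;               /* both bytes of the short are non-zero */
  copy_fields_only(&dst, &src);
  return dst.b == src.b;
}
```

Under those assumptions `short_member_survives()` returns 0: the second byte of the short sits in what the struct member treats as padding, so a copy driven by the struct's fields drops it.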

Instead of implementing this, it would be a lot easier for clang to walk a type
and produce a mask describing all the holes in a type, using a simple recursive
algorithm (where union intersects the member "hole sets", finding that
byte 3/4 of the union is a hole).
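
A rough sketch of such a recursive walk, written in C rather than against Clang's real AST, could look like the following. Everything here is an assumption made for illustration: the `struct type` / `struct member` description, the 64-byte size limit imposed by using a `uint64_t` bitmask, and the hard-coded offsets (a typical ABI with a 4-byte, 4-byte-aligned int). With explicit member offsets, structs and unions can share one rule: each member contributes a hole set over the parent's extent, and the parent's holes are the intersection of those sets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum kind { SCALAR, AGGREGATE };   /* AGGREGATE covers structs and unions */

struct member;
struct type {
  enum kind kind;
  size_t size;                     /* size in bytes, at most 64 here */
  size_t num_members;
  const struct member *members;
};
struct member {
  size_t offset;                   /* byte offset within the parent */
  const struct type *type;
};

/* Mask with the low n bits set, for n <= 64. */
static uint64_t low_bits(size_t n) {
  return n >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/* Bit i of the result is set when byte i of the type is a hole. */
static uint64_t hole_mask(const struct type *t) {
  assert(t->size <= 64);
  if (t->kind == SCALAR)
    return 0;                      /* every byte of a scalar is data */
  uint64_t holes = low_bits(t->size);
  for (size_t i = 0; i < t->num_members; i++) {
    const struct member *m = &t->members[i];
    assert(m->offset < 64 && m->offset + m->type->size <= t->size);
    uint64_t extent = low_bits(m->type->size) << m->offset;
    uint64_t member_holes = hole_mask(m->type) << m->offset;
    /* Seen from the parent, this member's hole set is everything outside
       its extent plus its own holes; intersect it with the running set. */
    holes &= ~extent | member_holes;
  }
  return holes;
}

/* Builds the union from the message and returns its hole mask. */
static uint64_t example_union_holes(void) {
  static const struct type i8  = { SCALAR, 1, 0, NULL };
  static const struct type i16 = { SCALAR, 2, 0, NULL };
  static const struct type i32 = { SCALAR, 4, 0, NULL };
  static const struct member anon_members[] = { { 0, &i8 }, { 4, &i32 } };
  static const struct type anon = { AGGREGATE, 8, 2, anon_members };
  static const struct member x_members[] = { { 0, &anon }, { 0, &i16 } };
  static const struct type x = { AGGREGATE, 8, 2, x_members };
  return hole_mask(&x);
}
```

For the example union this yields the mask 0x0C, i.e. the bytes at offsets 2 and 3 (the third and fourth bytes), which matches the "byte 3/4" result above.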

Given this, it makes a lot more sense to explicitly model this hole set in an
MDNode (e.g. by using a list of byte ranges?) instead of representing the holes
with a null pointer constant of some IR type.
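
To show what "a list of byte ranges" might contain, here is a small C sketch that turns a per-byte hole mask into half-open [begin, end) ranges. It says nothing about how the metadata would actually be encoded in LLVM; `struct byte_range` and `holes_to_ranges` are hypothetical names, and the 64-byte limit comes only from using a `uint64_t` mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct byte_range {
  size_t begin;   /* first hole byte */
  size_t end;     /* one past the last hole byte */
};

/* Writes up to cap ranges into out and returns the number of ranges
   found, which may exceed cap. Bit i of holes set means byte i of a
   type of the given size is a hole. */
static size_t holes_to_ranges(uint64_t holes, size_t size,
                              struct byte_range *out, size_t cap) {
  assert(size <= 64);
  size_t n = 0;
  size_t i = 0;
  while (i < size) {
    if (!((holes >> i) & 1)) {     /* data byte, keep scanning */
      i++;
      continue;
    }
    size_t begin = i;
    while (i < size && ((holes >> i) & 1))
      i++;                         /* extend the run of hole bytes */
    if (n < cap) {
      out[n].begin = begin;
      out[n].end = i;
    }
    n++;
  }
  return n;
}
```

For an 8-byte type with hole mask 0x0C (holes at offsets 2 and 3), this produces the single range [2, 4).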

Does this make sense?

-Chris

Dan Gohman

2012-Aug-28 18:50 UTC

On Aug 27, 2012, at 10:22 PM, Chris Lattner <clattner at apple.com> wrote:
> <moving this to llvmdev now that the lists are back up!>
> 
> On Aug 23, 2012, at 4:37 PM, Dan Gohman <gohman at apple.com> wrote:
>> On Aug 23, 2012, at 4:05 PM, Chris Lattner <clattner at apple.com> wrote:
>>> On Aug 23, 2012, at 3:59 PM, Dan Gohman <gohman at apple.com> wrote:
>>>> On Aug 23, 2012, at 3:31 PM, Chris Lattner <clattner at apple.com> wrote:
>>>>> Interesting approach.  The IR type for a struct may or may not be
>>>>> enough to describe holes (think unions and other cases), have you
>>>>> considered a more explicit MDNode that describes the ranges of any holes?
>>>>
>>>> What's the issue with unions? Do you mean unions containing structs
>>>> containing holes?
>>>
>>> Unions don't lower to a unique or useful IR type.  In general, I'm
>>> skeptical of anything that uses IR types to reason about source level
>>> types (except primitives like integers and floats).
>>
>> I'm confused. It seems a big difference here between your expectations
>> and my understanding is that you're expecting to see source level types
>> here, whereas it hadn't even occurred to me that we should try to represent
>> source level types.
> 
> My point here is that the frontend reasons about two things: 1) a source
> level construct of a type, and 2) LLVM IR types.   The LLVM IR type lowering
> is not guaranteed to cover all fields in the source type (e.g. in the case
> of unions).
> 
> Let me give you a dumb example.  Consider:
> 
> union x {
>  struct { char b;  int c; } a;
>  short b;
> } u;

Ok, so the answer to my question above is, yes, you are talking about
unions containing structs containing holes.

> On my system, Clang codegen's this to:
> 
> %union.x = type { %struct.anon }
> %struct.anon = type { i8, i32 }
> 
> This isn't a safe IR type to use to describe a memcpy (because it
> wouldn't copy all of "b"), so implementing your proposal would
> require implementing yet-another conversion from AST types to LLVM types
> that *is* guaranteed to cover all the fields.
> 
> Instead of implementing this, it would be a lot easier for clang to walk a
> type and produce a mask describing all the holes in a type, using a simple
> recursive algorithm (where union intersects the member "hole sets",
> finding that byte 3/4 of the union is a hole).
>
> Given this, it makes a lot more sense to explicitly model this hole set in
> an MDNode (e.g. by using a list of byte ranges?) instead of representing the
> holes with a null pointer constant of some IR type.

I'll send out a new proposal according to this design.

Dan

Krzysztof Parzyszek

2012-Aug-30 20:30 UTC

On 8/28/2012 12:22 AM, Chris Lattner wrote:
> Instead of implementing this, it would be a lot easier for clang to walk a
> type and produce a mask describing all the holes in a type, using a simple
> recursive algorithm (where union intersects the member "hole sets",
> finding that byte 3/4 of the union is a hole).
>
> Given this, it makes a lot more sense to explicitly model this hole set in
> an MDNode (e.g. by using a list of byte ranges?) instead of representing the
> holes with a null pointer constant of some IR type.

I guess I'm late to the party, but another possibility would be to model 
structure types as lists of members with their offsets from the 
beginning of the parent aggregate.  This would require extensive changes 
to LLVM, so I'm not sure if it's an option.


-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The
Linux Foundation

Renato Golin

2012-Aug-31 08:15 UTC

On 30 August 2012 21:30, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:
> I guess I'm late to the party, but another possibility would be to model
> structure types as lists of members with their offsets from the beginning of
> the parent aggregate.  This would require extensive changes to LLVM, so I'm
> not sure if it's an option.

This has been proposed already, and could also be used for bitfields,
but it required too many changes and was not accepted.

I think the biggest reason against it was that it was strongly based on
C++ semantics and not generic enough to be considered IR material.

-- 
cheers,
--renato

http://systemcall.org/
