thr3ads.net - llvm dev - [LLVMdev] Advice on field access, adding a Modula-3 front end [Apr 2014]

Rodney M. Bates

2014-Apr-11 01:40 UTC

[LLVMdev] Advice on field access, adding a Modula-3 front end

I am doing some preliminary investigation into splicing the Modula-3
compiler front end onto llvm.  I have a number of questions and will
no doubt have more, but will start by asking for advice on this one.

The M3 front end has lowered things farther than the llvm IR expects.
Whereas llvm accesses fields/data members of records/structs by field
number, M3 has already laid out the format of records, and its IR
accesses fields by bit offsets.

I could probably create llvm IR in this style by generating explicit
address arithmetic, but I suspect that might hurt the optimization
possibilities, perhaps a lot.  It looks like re-raising the level to
field numbers would not be horribly difficult, but it would require
using information in the M3 IR that is apparently intended to be debug
info only.  Also, it looks like M3 IR follows the same principle that
llvm does, i.e., that debug information should not affect translation.
I presume llvm does its own memory layout for structs?

It is worse with global variables and constants.  Here, in the M3 IR,
for each compilation unit, these have been collected into two records,
one for constants and one for variables, with the memory layout within
them already done.  These are accessed with byte offsets within the
two records.  What makes it more complicated is that some of the
fields are in a fixed layout that the runtime system expects.  So to
use field number access, I would still need to force llvm to accept
the memory layout I supply.  Can I do that?

Local variables come through at a matching level, so are not a
problem.

Any advice would be greatly appreciated.


-- 
Rodney Bates
rodney.m.bates at acm.org

Krzysztof Parzyszek

2014-Apr-11 02:02 UTC

On 4/10/2014 8:40 PM, Rodney M. Bates wrote:
> I could probably create llvm IR in this style by generating explicit
> address arithmetic, but I suspect that might hurt the optimization
> possibilities, perhaps a lot.  It looks like re-raising the level to
> field numbers would not be horribly difficult, but it would require
> using information in the M3 IR that is apparently intended to be debug
> info only.
You don't have to "re-raise" it, you may simply manufacture struct types
that correspond to the data being used, which shouldn't be too hard if 
the data accesses to a specific member are always of the same size and 
type.  To avoid problems with the layout differing between targets, you 
could make the type "packed" and make the padding explicit.  This does
not solve problems with unions, for which address arithmetic and type 
casting may be necessary.
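As a minimal sketch of the "packed plus explicit padding" idea (not code from this thread), the function below turns a list of already-laid-out fields into the text of a packed LLVM struct type, filling each gap with an `[N x i8]` member. The `FieldInfo` record and the `packedStructType` name are hypothetical; a real front end would more likely build the type through the LLVM C++ API than as a string.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one field as the front end knows it:
// a byte offset, a byte size, and the LLVM type chosen for the field.
struct FieldInfo {
    uint64_t offset;       // byte offset from the start of the record
    uint64_t size;         // size of the field in bytes
    std::string llvmType;  // e.g. "i32", "i64"
};

// Builds the text of a packed LLVM struct type, inserting [N x i8] padding
// members wherever consecutive fields leave a gap, and at the tail if the
// record is larger than its last field. Assumes the fields are sorted by
// offset and do not overlap.
std::string packedStructType(const std::vector<FieldInfo>& fields,
                             uint64_t totalSize) {
    std::string out = "<{ ";
    uint64_t cursor = 0;  // first byte not yet covered by an element
    bool first = true;
    auto emit = [&](const std::string& ty) {
        if (!first) out += ", ";
        out += ty;
        first = false;
    };
    for (const FieldInfo& f : fields) {
        if (f.offset > cursor)
            emit("[" + std::to_string(f.offset - cursor) + " x i8]");
        emit(f.llvmType);
        cursor = f.offset + f.size;
    }
    if (totalSize > cursor)
        emit("[" + std::to_string(totalSize - cursor) + " x i8]");
    out += " }>";
    return out;
}
```

For a 16-byte record with an `i32` at offset 0 and an `i64` at offset 8, this yields `<{ i32, [4 x i8], i64 }>`. Because the type is packed, the offsets no longer depend on the target's alignment rules.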

> I presume llvm does its own memory layout for structs?
It uses the data layout that is provided when you create the 
TargetMachine for a given target.  In other words, it can be different 
for each supported target.

> It is worse with global variables and constants.  Here, in the M3 IR,
> for each compilation unit, these have been collected into two records,
> one for constants and one for variables, with the memory layout within
> them already done.  These are accessed with byte offsets within the
> two records.  What makes it more complicated is that some of the
> fields are in a fixed layout that the runtime system expects.  So to
> use field number access, I would still need to force llvm to accept
> the memory layout I supply.  Can I do that?
Yes. Make it "packed" and add explicit padding.

The only problem may be with translating the debug information.  If all 
global variables became members of some aggregate, then I'm not sure how 
to generate debug information for them that would preserve original 
names and other relevant information.


-Krzysztof


-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, 
hosted by The Linux Foundation

Krzysztof Parzyszek

2014-Apr-11 02:07 UTC

Check out these files.  There is a class StructLayout there, whose 
constructor generates the member offset information.

include/llvm/IR/DataLayout.h
lib/IR/DataLayout.cpp

-Krzysztof

Rodney M. Bates

2014-Apr-11 23:23 UTC

On 04/10/2014 09:02 PM, Krzysztof Parzyszek wrote:
> On 4/10/2014 8:40 PM, Rodney M. Bates wrote:
>>
>> I could probably create llvm IR in this style by generating explicit
>> address arithmetic, but I suspect that might hurt the optimization
>> possibilities, perhaps a lot.  It looks like re-raising the level to
>> field numbers would not be horribly difficult, but it would require
>> using information in the M3 IR that is apparently intended to be debug
>> info only.
>
> You don't have to "re-raise" it, you may simply manufacture struct
> types that correspond to the data being used, which shouldn't be too
> hard if the data accesses to a specific member are always of the same
> size and type.  To avoid problems with the layout differing between
> targets, you could make the type "packed" and make the padding
> explicit.  This does not solve problems with unions, for which address
> arithmetic and type casting may be necessary.
>
Yeah, I think that's kind of what I meant by "re-raise".  The field access
operators I get have no field-identifying information other than the offset,
so I have to go backwards somewhere to find a field sequence number.
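One way to picture the "go backwards" step, as a sketch under assumptions rather than what the M3 front end actually does: keep the byte offset of every element of the manufactured struct (padding members included, since they occupy element numbers too) and binary-search that table for each access. The name `fieldIndexForOffset` is made up for illustration. Bit offsets from the M3 IR would first need converting to bytes, which only works directly for byte-aligned fields.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// elementOffsets[i] is the byte offset of struct element i, in strictly
// increasing order. Returns the element number whose offset equals
// byteOffset, or -1 if no element starts exactly there (for example, an
// access into the middle of a field, which would still need explicit
// address arithmetic).
long fieldIndexForOffset(const std::vector<uint64_t>& elementOffsets,
                         uint64_t byteOffset) {
    auto it = std::lower_bound(elementOffsets.begin(), elementOffsets.end(),
                               byteOffset);
    if (it == elementOffsets.end() || *it != byteOffset)
        return -1;
    return static_cast<long>(it - elementOffsets.begin());
}
```

With element offsets `{0, 4, 8}` (an `i32`, four bytes of padding, an `i64`), an access at byte 8 maps to element 2, and an access at byte 6 is reported as not found.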
>
>> I presume llvm does its own memory layout for structs?
>
> It uses the data layout that is provided when you create the
> TargetMachine for a given target.  In other words, it can be different
> for each supported target.
>
So it looks like StructLayout::StructLayout does _not_ reorder non-packed fields.
Can I rely on this?  I thought I remembered reading something to the contrary
somewhere in the documentation.
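For reference, an in-order layout of the kind being asked about can be sketched as below. This is a standalone illustration, not LLVM's code: the `Member`, `Layout`, and `layoutInOrder` names are invented here, and the sizes and alignments are supplied by the caller, where LLVM would take them from the target's data layout. Each member is placed in declaration order at the next offset that satisfies its alignment; packed structs use an alignment of 1 throughout. Whether the real constructor behaves this way is best confirmed against lib/IR/DataLayout.cpp.

```cpp
#include <cstdint>
#include <vector>

// One struct member: size and ABI alignment in bytes (alignment >= 1).
struct Member {
    uint64_t size;
    uint64_t align;
};

// Result of laying out a struct: per-member offsets, total size, alignment.
struct Layout {
    std::vector<uint64_t> offsets;
    uint64_t size;
    uint64_t align;
};

// Rounds value up to the next multiple of alignment.
static uint64_t alignTo(uint64_t value, uint64_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Places members strictly in the order given; nothing is reordered.
Layout layoutInOrder(const std::vector<Member>& members, bool packed) {
    Layout result{{}, 0, 1};
    for (const Member& m : members) {
        uint64_t a = packed ? 1 : m.align;
        result.size = alignTo(result.size, a);  // pad up to this member
        result.offsets.push_back(result.size);
        result.size += m.size;
        if (a > result.align) result.align = a;
    }
    result.size = alignTo(result.size, result.align);  // tail padding
    return result;
}
```

For members of sizes 4, 8, and 1 with matching alignments, the non-packed offsets come out as 0, 8, 16 with a total size of 24, and the packed offsets as 0, 4, 12 with a total size of 13.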
>
>> It is worse with global variables and constants.  Here, in the M3 IR,
>> for each compilation unit, these have been collected into two records,
>> one for constants and one for variables, with the memory layout within
>> them already done.  These are accessed with byte offsets within the
>> two records.  What makes it more complicated is that some of the
>> fields are in a fixed layout that the runtime system expects.  So to
>> use field number access, I would still need to force llvm to accept
>> the memory layout I supply.  Can I do that?
>
> Yes. Make it "packed" and add explicit padding.
>
> The only problem may be with translating the debug information.  If all
> global variables became members of some aggregate, then I'm not sure how
> to generate debug information for them that would preserve original
> names and other relevant information.
>
I think I can handle that eventually, in the debugger itself.  We already have
a modified gdb that, among many other things, unscrambles access to a global so
that it looks normal to the source programmer, using a horribly cobbled-up stabs
variant.  Getting better debug info, in DWARF, is one of my personal motives for
this idea.
>
> -Krzysztof
>
>
-- 
Rodney Bates
rodney.m.bates at acm.org
