thr3ads.net - llvm dev - [LLVMdev] make DataLayout a mandatory part of Module [Jan 2014]

Nick Lewycky

2014-Jan-29 23:40 UTC

[LLVMdev] make DataLayout a mandatory part of Module

The LLVM Module has an optional target triple and target datalayout.
Without them, an llvm::DataLayout can't be constructed with meaningful
data. The benefit to making them optional is to permit optimization that
would work across all possible DataLayouts, then allow us to commit to a
particular one at a later point in time, thereby performing more
optimization in advance.
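
To make concrete what a pass gains or loses depending on whether the module carries a datalayout, here is a minimal Python sketch (not LLVM code) that pulls two facts out of a `target datalayout` string. The function name and the returned dictionary are hypothetical, and it reads only the `e`/`E` and `p` fields. Anything the string does not state comes back as `None`, which is roughly the position a pass is in when no datalayout is present.

```python
def parse_datalayout(spec):
    """Return endianness and address-space-0 pointer size from a datalayout string.

    Fields the string does not mention are reported as None rather than guessed.
    """
    info = {"endian": None, "pointer_bits": None}
    for field in filter(None, spec.split("-")):
        if field == "e":
            info["endian"] = "little"
        elif field == "E":
            info["endian"] = "big"
        elif field.startswith("p"):
            # Pointer field: p[addrspace]:<size>:<abi>[:<pref>]
            head, *rest = field.split(":")
            addr_space = head[1:] or "0"
            if addr_space == "0" and rest:
                info["pointer_bits"] = int(rest[0])
    return info
```

An empty string, standing in for a module with no datalayout, yields `None` for both entries.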

This feature is not being used. Instead, every user of LLVM IR in a
portability system defines one or more standardized datalayouts for their
platform, and shims to place calls with the outside world. The primary
reason for this is that independence from DataLayout is not sufficient to
achieve portability because it doesn't also represent ABI lowering
constraints. If you have a system that attempts to use LLVM IR in a
portable fashion and does it without standardizing on a datalayout, please
share your experience.

The cost to keeping this feature around is that we have to pass around the
DataLayout object in many places, test for its presence, in some cases
write different optimizations depending on whether we have DataLayout, and
in the worst case I can think of, we have two different canonical forms for
constant expressions depending on whether DL is present. Our canonical IR
is different with and without datalayout, and we have two canonicalizers
fighting it out (IR/ConstantFold.cpp and Analysis/ConstantFolding.cpp).
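
The "two canonical forms" problem can be illustrated with a toy folder for the size of a type. This is a Python sketch under assumptions, not the logic of either file: `fold_sizeof` and the `type_sizes` dictionary (standing in for a DataLayout) are hypothetical, and the IR string uses 2014-era typed-pointer syntax purely for illustration.

```python
def fold_sizeof(type_name, type_sizes=None):
    """Fold 'size of type_name' with or without layout information.

    type_sizes stands in for a DataLayout: a dict of type name -> size in
    bytes, or None when the module carries no datalayout.
    """
    if type_sizes is None:
        # Layout-independent form: the size stays a symbolic constant expression.
        return (f"ptrtoint ({type_name}* getelementptr "
                f"({type_name}* null, i32 1) to i64)")
    # Layout-aware form: the same value collapses to an integer literal.
    return f"i64 {type_sizes[type_name]}"
```

The same source-level value ends up with two different "simplest" spellings, so any code that pattern-matches on constants has to recognise both.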

I'm trying to force the issue. Either this is a useful feature to maintain
in which case I want to see a design on how to defer ABI decisions until a
later point in time, or else we do not support it and target triple and
target datalayout become a mandatory part of a valid Module again. I think
the correct direction is to make them mandatory, but this is a large change
that warrants debate.

If we decide that target information should be a mandatory part of a
module, there's another question about what we should do with existing .bc
and .ll files that don't have one. Load in a default of the host machine?

Nick

Jim Grosbach

2014-Jan-29 23:53 UTC

Hi Nick,

The main use case I’ve seen is that it makes writing generic test cases for
‘opt’ easier in that it’s not necessary to specify a target triple on the
command line or have a data layout in the .ll/.bc file. That is, in my
experience, it’s more for convenience and perhaps historical layering
considerations.

I have no philosophical objection to the direction you’re suggesting.

For modules without a data layout, use the host machine as you suggest. That’s
consistent with what already happens with llc, so extending that to opt and
other such tools seems reasonable to me.

-Jim


Nick Lewycky

2014-Jan-30 00:04 UTC

On 29 January 2014 15:53, Jim Grosbach <grosbach at apple.com> wrote:
> Hi Nick,
>
> The main use case I’ve seen is that it makes writing generic test cases
> for ‘opt’ easier in that it’s not necessary to specify a target triple on
> the command line or have a data layout in the .ll/.bc file. That is, in my
> experience, it’s more for convenience and perhaps historical layering
> considerations.
>
> I have no philosophical objection to the direction you’re suggesting.
>
> For modules without a data layout, use the host machine as you suggest.
> That’s consistent with what already happens with llc, so extending that to
> opt and other such tools seems reasonable to me.
>
This is also what many clang tests do, where TUs get parsed using the host
triple. If we keep target datalayout out of the test files and fill it in
with the host's information, then our test coverage expands as our buildbot
diversity grows, which is a neat property.

Nick


Philip Reames

2014-Jan-30 17:55 UTC

On 1/29/14 3:40 PM, Nick Lewycky wrote:
> The LLVM Module has an optional target triple and target datalayout.
> Without them, an llvm::DataLayout can't be constructed with meaningful
> data. The benefit to making them optional is to permit optimization
> that would work across all possible DataLayouts, then allow us to
> commit to a particular one at a later point in time, thereby
> performing more optimization in advance.
>
> This feature is not being used. Instead, every user of LLVM IR in a
> portability system defines one or more standardized datalayouts for
> their platform, and shims to place calls with the outside world. The
> primary reason for this is that independence from DataLayout is not
> sufficient to achieve portability because it doesn't also represent
> ABI lowering constraints. If you have a system that attempts to use
> LLVM IR in a portable fashion and does it without standardizing on a
> datalayout, please share your experience.

Nick, I don't have a current system in place, but I do want to put
forward an alternate perspective.

We've been looking at doing late insertion of safepoints for garbage 
collection.  One of the properties that we end up needing to preserve 
through all the optimizations which precede our custom rewriting phase 
is that the optimizer has not chosen to "hide" pointers from us by using
ptrtoint and integer math tricks. Currently, we're simply running a
verification pass before our rewrite, but I'm very interested long term 
in constructing ways to ensure a "gc safe" set of optimization passes.
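
As an illustration of the kind of check described above, here is a minimal Python sketch that scans textual IR for pointer/integer casts. It is not the verification pass mentioned in this message, whose details are not given; the function name is hypothetical, and a real pass would walk the in-memory IR rather than match text.

```python
import re

# Opcodes that move a pointer into or out of the integer domain.
_CAST = re.compile(r"\b(ptrtoint|inttoptr)\b")

def find_pointer_int_casts(ir_text):
    """Return (line_number, opcode) for each ptrtoint/inttoptr in textual IR."""
    hits = []
    for lineno, line in enumerate(ir_text.splitlines(), start=1):
        # Naively drop trailing IR comments, which start with ';'.
        code = line.split(";", 1)[0]
        for match in _CAST.finditer(code):
            hits.append((lineno, match.group(1)))
    return hits
```

An empty result means no such cast appears in the text. It says nothing about pointers hidden by other means, such as stores through differently typed memory.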

One of the ways I've been thinking about - but haven't actually 
implemented yet - is to deny the optimization passes information about 
pointer sizing.  Under the assumption that an opto pass can't insert a
ptrtoint cast without knowing a safe integer size to use, this seems
like it would outlaw a class of optimizations we'd be broken by.

My understanding is that the only current way to do this would be to not 
specify a DataLayout.  (And hack a few places with built in 
assumptions.  Let's ignore that for the moment.)  With your proposed 
change, would there be a clean way to express something like this?

p.s. From reading the mailing list a while back, I suspect that the SPIR 
folks might have similar needs (i.e. hiding pointer sizes, etc.).
Pure speculation on my part though.

Philip

Rafael Espíndola

2014-Jan-30 21:07 UTC

On 29 January 2014 18:40, Nick Lewycky <nlewycky at google.com> wrote:
> The LLVM Module has an optional target triple and target datalayout. Without
> them, an llvm::DataLayout can't be constructed with meaningful data. The
> benefit to making them optional is to permit optimization that would work
> across all possible DataLayouts, then allow us to commit to a particular one
> at a later point in time, thereby performing more optimization in advance.
>
> This feature is not being used. Instead, every user of LLVM IR in a
> portability system defines one or more standardized datalayouts for their
> platform, and shims to place calls with the outside world. The primary
> reason for this is that independence from DataLayout is not sufficient to
> achieve portability because it doesn't also represent ABI lowering
> constraints. If you have a system that attempts to use LLVM IR in a portable
> fashion and does it without standardizing on a datalayout, please share your
> experience.
>
> The cost to keeping this feature around is that we have to pass around the
> DataLayout object in many places, test for its presence, in some cases write
> different optimizations depending on whether we have DataLayout, and in the
> worst case I can think of, we have two different canonical forms for
> constant expressions depending on whether DL is present. Our canonical IR is
> different with and without datalayout, and we have two canonicalizers
> fighting it out (IR/ConstantFold.cpp and Analysis/ConstantFolding.cpp).
>
> I'm trying to force the issue. Either this is a useful feature to maintain
> in which case I want to see a design on how to defer ABI decisions until a
> later point in time, or else we do not support it and target triple and
> target datalayout become a mandatory part of a valid Module again. I think
> the correct direction is to make them mandatory, but this is a large change
> that warrants debate.

I don't think we can reasonably express all the information needed by
ABIs at the LLVM level. Given that, I would *love* to see DataLayout
become a mandatory part of the IR!

> If we decide that target information should be a mandatory part of a module,
> there's another question about what we should do with existing .bc and .ll
> files that don't have one. Load in a default of the host machine?

For tools that don't link with target (llvm-as and llvm-dis being the
most extreme cases) it would have to be the default "". For opt I
would be ok with "" or the host triple.

Thanks,
Rafael

Nick Lewycky

2014-Feb-01 01:23 UTC

On 30 January 2014 09:55, Philip Reames <listmail at philipreames.com> wrote:
> On 1/29/14 3:40 PM, Nick Lewycky wrote:
>
>> The LLVM Module has an optional target triple and target datalayout.
>> Without them, an llvm::DataLayout can't be constructed with meaningful
>> data. The benefit to making them optional is to permit optimization that
>> would work across all possible DataLayouts, then allow us to commit to a
>> particular one at a later point in time, thereby performing more
>> optimization in advance.
>>
>> This feature is not being used. Instead, every user of LLVM IR in a
>> portability system defines one or more standardized datalayouts for their
>> platform, and shims to place calls with the outside world. The primary
>> reason for this is that independence from DataLayout is not sufficient to
>> achieve portability because it doesn't also represent ABI lowering
>> constraints. If you have a system that attempts to use LLVM IR in a
>> portable fashion and does it without standardizing on a datalayout, please
>> share your experience.
>>
> Nick, I don't have a current system in place, but I do want to put forward
> an alternate perspective.
>
> We've been looking at doing late insertion of safepoints for garbage
> collection.  One of the properties that we end up needing to preserve
> through all the optimizations which precede our custom rewriting phase is
> that the optimizer has not chosen to "hide" pointers from us by using
> ptrtoint and integer math tricks. Currently, we're simply running a
> verification pass before our rewrite, but I'm very interested long term in
> constructing ways to ensure a "gc safe" set of optimization passes.
>
As a general rule passes need to support the whole of what the IR can
support. Trying to operate on a subset of IR seems like a losing battle,
unless you can show a mapping from one to the other (e.g., using code
duplication to remove all unnatural loops from IR, or collapsing a function
to having a single exit node).

What language were you planning to do this for? Does the language permit
the user to convert pointers to integers and vice versa? If so, what do you
do if the user program writes a pointer out to a file, reads it back in
later, and uses it?

> One of the ways I've been thinking about - but haven't actually implemented
> yet - is to deny the optimization passes information about pointer sizing.
>

Right, pointer size (address space size) will become known to all parts of
the compiler. It's not even going to be just the optimizations,
ConstantExpr::get is going to grow smarter because of this, as
lib/Analysis/ConstantFolding.cpp merges into lib/IR/ConstantFold.cpp. That
is one of the major benefits that's driving this. (All parts of the
compiler will also know endian-ness, which means we can constant fold
loads, too.)
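
To see why folding a load from constant data needs the byte order, consider this small Python sketch. It is an illustration only, not LLVM's folding code: `fold_load` is a hypothetical helper, and the "unknown endianness" case is deliberately conservative, folding nothing wider than one byte.

```python
def fold_load(initializer, offset, width_bytes, endian=None):
    """Fold an integer load from a constant byte initializer.

    endian is "little", "big", or None when the byte order is unknown.
    Returns the loaded integer, or None when this sketch declines to fold.
    """
    chunk = initializer[offset:offset + width_bytes]
    if len(chunk) != width_bytes:
        raise ValueError("load reads past the end of the initializer")
    if endian is None:
        # Conservative: only a single-byte load is folded without a byte order.
        return chunk[0] if width_bytes == 1 else None
    return int.from_bytes(chunk, byteorder=endian)
```

For the initializer bytes 01 02 03 04, a four-byte load folds to 0x04030201 on a little-endian target and to 0x01020304 on a big-endian one, so no single constant is correct until the datalayout is known.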

> Under the assumption that an opto pass can't insert a ptrtoint cast
> without knowing a safe integer size to use, this seems like it would outlaw
> a class of optimizations we'd be broken by.
>
Optimization passes generally prefer converting ptrtoint and inttoptr to
GEPs whenever possible. I expect that we'll end up with *fewer* ptr<->int
conversions with this change, because we'll know enough about the target to
convert them into GEPs.

> My understanding is that the only current way to do this would be to not
> specify a DataLayout.  (And hack a few places with built in assumptions.
> Let's ignore that for the moment.)  With your proposed change, would there
> be a clean way to express something like this?
>
I think your GC placement algorithm needs to handle inttoptr and ptrtoint,
whichever way this discussion goes. Sorry. I'd be happy to hear others
chime in -- I know I'm not an expert in this area or about GCs -- but I
don't find this rationale compelling.

> p.s. From reading the mailing list a while back, I suspect that the SPIR
> folks might have similar needs.  (i.e. hiding pointer sizes, etc.)  Pure
> speculation on my part though.
>
The SPIR spec specifies two target datalayouts, one for 32 bits and one for
64 bits.

Nick