thr3ads.net - llvm dev - [llvm-dev] RFC: Absolute or "fixed address" symbols as immediate operands [Oct 2016]

If this information is useful, please help other people find it:
Share via:

Peter Collingbourne via llvm-dev

2016-Oct-11 22:15 UTC

[llvm-dev] RFC: Absolute or "fixed address" symbols as immediate operands

On Tue, Oct 11, 2016 at 2:48 PM, Chris Lattner <clattner at apple.com>
wrote:
> On Oct 11, 2016, at 12:04 AM, Peter Collingbourne <peter at
pcc.me.uk> wrote:
>
> I have been experimenting with a number of approaches to representation in
>> SDAG, and I have found one that seems to work best, and would be the
least
>> intrusive (unfortunately most approaches to this problem are somewhat
>> intrusive).
>>
>> Specifically, I want to:
>> 1) move most of the body of ConstantSDNode to a new class,
>> ConstantIntSDNode, which would derives from ConstantSDNode.
ConstantSDNode
>> would act as the base class for immediates-post-static-linking. Change
>> most references to ConstantSDNode in C++ code to refer to
>> ConstantIntSDNode. However, "imm" in tblgen code would
continue to match
>> ConstantSDNode.
>> 2) introduce a new derived class of ConstantSDNode for references to
>> globals with !range metadata, and teach SDAG to use this new derived
class
>> for fixed address references
>>
>>
>> ConstantSDNode is poorly named, and renaming it to ConstantIntSDNode is
>> probably the right thing to do independently of the other changes.
>>
>> That said, I don’t understand why you’d keep ConstantSDNode around and
>> introduce a new derived class of it.  This seems like something that a
new
>> “imm" immediate matcher would handle: it would match constants in
a certain
>> range, or a GlobalAddressSDNode known-to-be-small.
>>
>
> To begin with: I'm not sure that GlobalAddressSDNode is the right node
to
> use for these types of immediates. It seems that we have two broad classes
> of globals here: those with a fixed-at-link-time address (e.g. regular
> non-PIC symbols, absolute symbols) and those where the address needs to be
> computed (e.g. PC-relative addresses, TLS variables). To me it seems like
> the first class is much more similar to immediates than to the second
> class. That suggested to me that there ought to be two separate
> representations for global variables, where the former are
"morally"
> immediates, and the latter are not (i.e. the existing GlobalAddressSDNode).
>
>
> I understand what you’re saying, but I don’t think that is the key issue
> here.  The relevant SDNode subclasses are concerned with representing the
> structural input code (in this case a GlobalValue*) not about representing
> the target-specific concept at work here (this particular GV has an address
> known to fit in this specific relocation).  The structure of SelectionDAG
> types like SDNode needs to be target independent, and target specific
> matchers are the ones that handle discrepancies.
>
>
> I went over a couple of approaches for representing "moral"
immediates in
> my llvm-commits post. The first one seems to be more like what you're
> suggesting:
>
> > - Introduce a new opcode for absolute symbol constants.
>
>
> If you mean a new ISD opcode, then I don’t think this makes sense.  We
> already have an opcode for that represents the address of a global value,
> we should use it.  “absolute symbol constants” are a special case of them,
> and using a predicate to handle matching them should work fine.  What am I
> missing?
>
> This intuitively seemed like the least risky approach, as individual
> instructions could "opt in" to the new absolute symbol
references. However,
> this seems hard to fit into the existing SDAG pattern matching engine, as
> the engine expects each "variable" to have a specific opcode. I
tried
> adding special support for "either of the two constant opcodes"
to the
> matcher, but I could not see a good way to do it without making fundamental
> changes to how patterns are matched.
>
>
> I think you’ll have to define the matcher in C++ with ComplexPattern,
> analogously to how the addressing mode selection logic works.  This allows
> you to specify multiple ISD nodes that it can match.
>
> > - Use the ISD::Constant opcode for absolute symbol constants, but
> introduce a separate class for them. This also seemed problematic, as there
> is a strong assumption (both in existing SDAG code and in generated code)
> of a many-to-one mapping from opcodes to classes.
>
>
> This also doesn’t make sense to me.  The fundamental issue you’re
> grappling with is that you have two different “input” concepts (small
> immediates, and globals whose absolute address fits in that range) that you
> want to handle the same way.  You need to do something like ComplexPattern
> to handle this.
>
Thanks Chris. I will take a closer look at ComplexPattern, I was somehow
unaware of it until now. From a brief look it does seem to allow me to do
what I need here while being less intrusive.

I think my main concerns with it are that using ComplexPattern pervasively
in instruction patterns may cause bloat in the pattern matching tables and
that it may cause FastISel to bail out more often. Those don't seem like
insurmountable problems though, we may just need a specialized variant of
it.

Thanks,
-- 
-- 
Peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161011/1eae7b8f/attachment-0001.html>

Peter Collingbourne via llvm-dev

2016-Oct-12 03:13 UTC

head link

[llvm-dev] RFC: Absolute or "fixed address" symbols as immediate operands

On Tue, Oct 11, 2016 at 3:15 PM, Peter Collingbourne <peter at pcc.me.uk>
wrote:
>
>
> On Tue, Oct 11, 2016 at 2:48 PM, Chris Lattner <clattner at
apple.com> wrote:
>
>> On Oct 11, 2016, at 12:04 AM, Peter Collingbourne <peter at
pcc.me.uk>
>> wrote:
>>
>> I have been experimenting with a number of approaches to representation
>>> in SDAG, and I have found one that seems to work best, and would be
the
>>> least intrusive (unfortunately most approaches to this problem are
somewhat
>>> intrusive).
>>>
>>> Specifically, I want to:
>>> 1) move most of the body of ConstantSDNode to a new class,
>>> ConstantIntSDNode, which would derives from ConstantSDNode.
ConstantSDNode
>>> would act as the base class for immediates-post-static-linking.
Change
>>> most references to ConstantSDNode in C++ code to refer to
>>> ConstantIntSDNode. However, "imm" in tblgen code would
continue to match
>>> ConstantSDNode.
>>> 2) introduce a new derived class of ConstantSDNode for references
to
>>> globals with !range metadata, and teach SDAG to use this new
derived class
>>> for fixed address references
>>>
>>>
>>> ConstantSDNode is poorly named, and renaming it to
ConstantIntSDNode is
>>> probably the right thing to do independently of the other changes.
>>>
>>> That said, I don’t understand why you’d keep ConstantSDNode around
and
>>> introduce a new derived class of it.  This seems like something
that a new
>>> “imm" immediate matcher would handle: it would match constants
in a certain
>>> range, or a GlobalAddressSDNode known-to-be-small.
>>>
>>
>> To begin with: I'm not sure that GlobalAddressSDNode is the right
node to
>> use for these types of immediates. It seems that we have two broad
classes
>> of globals here: those with a fixed-at-link-time address (e.g. regular
>> non-PIC symbols, absolute symbols) and those where the address needs to
be
>> computed (e.g. PC-relative addresses, TLS variables). To me it seems
like
>> the first class is much more similar to immediates than to the second
>> class. That suggested to me that there ought to be two separate
>> representations for global variables, where the former are
"morally"
>> immediates, and the latter are not (i.e. the existing
GlobalAddressSDNode).
>>
>>
>> I understand what you’re saying, but I don’t think that is the key
issue
>> here.  The relevant SDNode subclasses are concerned with representing
the
>> structural input code (in this case a GlobalValue*) not about
representing
>> the target-specific concept at work here (this particular GV has an
address
>> known to fit in this specific relocation).  The structure of
SelectionDAG
>> types like SDNode needs to be target independent, and target specific
>> matchers are the ones that handle discrepancies.
>>
>>
>> I went over a couple of approaches for representing "moral"
immediates in
>> my llvm-commits post. The first one seems to be more like what
you're
>> suggesting:
>>
>> > - Introduce a new opcode for absolute symbol constants.
>>
>>
>> If you mean a new ISD opcode, then I don’t think this makes sense.  We
>> already have an opcode for that represents the address of a global
value,
>> we should use it.  “absolute symbol constants” are a special case of
them,
>> and using a predicate to handle matching them should work fine.  What
am I
>> missing?
>>
>> This intuitively seemed like the least risky approach, as individual
>> instructions could "opt in" to the new absolute symbol
references. However,
>> this seems hard to fit into the existing SDAG pattern matching engine,
as
>> the engine expects each "variable" to have a specific opcode.
I tried
>> adding special support for "either of the two constant
opcodes" to the
>> matcher, but I could not see a good way to do it without making
fundamental
>> changes to how patterns are matched.
>>
>>
>> I think you’ll have to define the matcher in C++ with ComplexPattern,
>> analogously to how the addressing mode selection logic works.  This
allows
>> you to specify multiple ISD nodes that it can match.
>>
>> > - Use the ISD::Constant opcode for absolute symbol constants, but
>> introduce a separate class for them. This also seemed problematic, as
there
>> is a strong assumption (both in existing SDAG code and in generated
code)
>> of a many-to-one mapping from opcodes to classes.
>>
>>
>> This also doesn’t make sense to me.  The fundamental issue you’re
>> grappling with is that you have two different “input” concepts (small
>> immediates, and globals whose absolute address fits in that range) that
you
>> want to handle the same way.  You need to do something like
ComplexPattern
>> to handle this.
>>
>
> Thanks Chris. I will take a closer look at ComplexPattern, I was somehow
> unaware of it until now. From a brief look it does seem to allow me to do
> what I need here while being less intrusive.
>
> I think my main concerns with it are that using ComplexPattern pervasively
> in instruction patterns may cause bloat in the pattern matching tables and
> that it may cause FastISel to bail out more often. Those don't seem
like
> insurmountable problems though, we may just need a specialized variant of
> it.
>
I'm afraid I might be misunderstanding the basics here -- I tried to start
off by replacing an "imm" matcher in one of the x86 patterns with a
ComplexPattern that simply matches ConstantSDNodes, as shown in the
attached patch. I would have expected this to be a no-op change, but I
found that this change broke (i.e. caused assertion failures in) a large
number of test cases, e.g. "test/CodeGen/X86/2012-07-16-fp2ui-i1.ll".

Is there anything I'm doing wrong here, or are there just bugs that need to
be fixed?

Thanks,
-- 
-- 
Peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161011/db7df421/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-fancyimm.patch
Type: text/x-patch
Size: 2132 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161011/db7df421/attachment.bin>

Krzysztof Parzyszek via llvm-dev

2016-Oct-13 00:00 UTC

head link

[llvm-dev] RFC: Absolute or "fixed address" symbols as immediate operands

On 10/11/2016 10:13 PM, Peter Collingbourne via llvm-dev
wrote:> +    bool SelectFancyImm(SDValue N, SDValue &Op) {
> +      if (isa<ConstantSDNode>(N)) {
> +        Op = N;
> +        return true;
> +      } else {
> +        return false;
> +      }
> +    }
The result should be a TargetConstant.  Something like

bool SelectFancyImm(SDValue N, SDValue &Op) {
   auto *CN = dyn_cast<ConstantSDNode>(N);
   if (!CN)
     return false;
   Op = CurDAG->getTargetConstant(CN->getZExtValue(), MVT::i32);
   return true;
}

-Krzysztof

llvm dev - Oct 2016 - RFC: Absolute or "fixed address" symbols as immediate operands

[llvm-dev] RFC: Absolute or "fixed address" symbols as immediate operands

[llvm-dev] RFC: Absolute or "fixed address" symbols as immediate operands

[llvm-dev] RFC: Absolute or "fixed address" symbols as immediate operands