thr3ads.net - llvm dev - [LLVMdev] Two labels around one instruction in Codegen [Nov 2007]

Nicolas Geoffray

2007-Nov-05 19:52 UTC

[LLVMdev] Two labels around one instruction in Codegen

Hi everyone,

In order to have exceptions for non-call instructions (such as sdiv,
load or stores), I'm modifying codegen so that it generates a BeginLabel
and an EndLabel between the "may throwing" instruction. This is what the
codegen of an InvokeInst does.

However, when generating native code, only BeginLabel is generated, and
it is generated after the instruction. I'm not familiar with DAGs in the
codegen library, so here are my 2-cents thoughts why:

1) BeginLabel and EndLabel are generated with:
  DAG.setRoot(DAG.getNode(ISD::LABEL, MVT::Other, getRoot(),
                            DAG.getConstant({Begin|End}Label, MVT::i32)));

This seems to work with InvokeInst instructions, because the root of the
DAG is modified by the instruction. With instructions such as sdiv, the
root is not modified: the instruction only lowers itself to:
DAG.getNode(OpCode, Op1.getValueType(), Op1, Op2)

Which probably makes the codegen think EndLabel and BeginLabel are in
the same place

2) Since there is no ordering between the node for the sdiv instruction
and the labels, the sdiv instruction can be generated anywhere.

These assumptions may be wrong, but it's the best I could come up with ;-).

If someone could correct me and help me find how to correctly generate
two labels between one instruction, that would be great! :)

Thanks,
Nicolas

Duncan Sands

2007-Nov-06 09:33 UTC

[LLVMdev] Two labels around one instruction in Codegen

Hi Nicolas,
> In order to have exceptions for non-call instructions (such as sdiv,
> load or stores), I'm modifying codegen so that it generates a BeginLabel
> and an EndLabel between the "may throwing" instruction. This is what the
> codegen of an InvokeInst does.
the rule is that all instructions between eh begin labelN and eh end labelN
must unwind to the same landing pad.  This is why invokes are bracketed by
such labels.  There are also two other cases to consider: (1) potentially
throwing instructions which are not allowed to throw (nounwind), (2) throwing
instructions for which any thrown exception will not be processed in this
function.  In case (1) the instruction should have no entry in the final
dwarf exception table, while in case (2) it should have an entry.  We don't
handle (1) right now, however the plan is that nounwind calls will also be
bracketed by labels but will have no associated landing pad.  As for (2),
the dwarf writer scans all instructions in the function and if it sees a
call that is not bracketed by labels then it generates an appropriate entry
in the exception table (this will of course need to be modified to consider
all throwing instructions - note that this means that "maythrow" markings will
have to exist right to the end of code generation!); it is done this way
because labels inhibit optimizations (we used to bracket all calls with
labels, but stopped doing that because of the optimization problem).  I'm
mentioning this because the begin and end labels are not *between* maythrow
instructions, they bracket them.
> However, when generating native code, only BeginLabel is generated, and
> it is generated after the instruction. I'm not familiar with DAGs in the
> codegen library, so here are my 2-cents thoughts why:
> 
> 1) BeginLabel and EndLabel are generated with:
>   DAG.setRoot(DAG.getNode(ISD::LABEL, MVT::Other, getRoot(),
>                             DAG.getConstant({Begin|End}Label, MVT::i32)));
> 
> This seems to work with InvokeInst instructions, because the root of the
> DAG is modified by the instruction. With instructions such as sdiv, the
> root is not modified: the instruction only lowers itself to:
> DAG.getNode(OpCode, Op1.getValueType(), Op1, Op2)
I think that not creating a new root means that the instruction is allowed
to be re-ordered with respect to other instructions, as long as it occurs
before its uses.  Re-ordering is rather dubious for instructions that may
throw, though it's not clear what is acceptable.  I think you probably need
a new selection DAG "throw" node which you wrap throwing instructions in, a
bit like a TokenFactor.  This throw node would be set up in such a way as to
be bracketable by labels.
> Which probably makes the codegen think EndLabel and BeginLabel are in
> the same place
In that case I would expect them both to be deleted...
> 2) Since there is no ordering between the node for the sdiv instruction
> and the labels, the sdiv instruction can be generated anywhere.
> 
> These assumptions may be wrong, but it's the best I could come up with ;-).
> 
> If someone could correct me and help me find how to correctly generate
> two labels between one instruction, that would be great! :)
Ciao,

Duncan.
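
As a rough illustration of the scanning rule Duncan describes, the following self-contained C++ sketch walks a flat instruction list and builds a toy exception table. It does not use LLVM's real classes: `Inst`, `CallSite`, `buildCallSites`, and the convention that an empty landing-pad string means "no landing pad" are all assumptions made for this example. The handling of case (1) follows the plan stated in the message, not something codegen is said to do today.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an instruction at the end of code generation.
enum class Kind { BeginLabel, EndLabel, Call, Other };

struct Inst {
    Kind kind;
    std::string landingPad; // only read on BeginLabel; "" means no landing pad
};

// One row of the toy exception table: a range of instruction indices.
struct CallSite {
    std::size_t first;
    std::size_t last;
    std::string landingPad; // "" means the exception is not handled here
};

std::vector<CallSite> buildCallSites(const std::vector<Inst>& code) {
    std::vector<CallSite> table;
    bool open = false;       // are we inside a begin/end label pair?
    std::size_t begin = 0;   // index of the open begin label
    for (std::size_t i = 0; i < code.size(); ++i) {
        const Inst& in = code[i];
        if (in.kind == Kind::BeginLabel) {
            open = true;
            begin = i;
        } else if (in.kind == Kind::EndLabel && open) {
            // Everything between the labels unwinds to the same landing pad.
            // Case (1): bracketed but no landing pad, so no table entry.
            const std::string& pad = code[begin].landingPad;
            if (!pad.empty() && i > begin + 1)
                table.push_back({begin + 1, i - 1, pad});
            open = false;
        } else if (in.kind == Kind::Call && !open) {
            // Case (2): unbracketed call, entry without a landing pad.
            table.push_back({i, i, ""});
        }
    }
    assert(!open && "every begin label needs a matching end label");
    return table;
}
```

In this model only `Kind::Call` is treated as throwing when unbracketed; extending the scan to sdiv, loads, or stores would require some "maythrow" marking on `Inst` that survives to this point, which is the concern raised in the message.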

Nicolas Geoffray

2007-Nov-06 17:18 UTC

[LLVMdev] Two labels around one instruction in Codegen

Duncan Sands wrote:
> Hi Nicolas,
>
>> In order to have exceptions for non-call instructions (such as sdiv,
>> load or stores), I'm modifying codegen so that it generates a BeginLabel
>> and an EndLabel between the "may throwing" instruction. This is what the
>> codegen of an InvokeInst does.
>
> the rule is that all instructions between eh begin labelN and eh end labelN
> must unwind to the same landing pad.  This is why invokes are bracketed by
> such labels.  There are also two other cases to consider: (1) potentially
> throwing instructions which are not allowed to throw (nounwind), 
What do you mean by "not allowed"? Is this decided by the front-end? Or by
an optimization pass (div may throw, but if we have a = b / 5 we know it
won't throw)?
> (2) throwing
> instructions for which any thrown exception will not be processed in this
> function. 
I'm not sure I understand here.
> In case (1) the instruction should have no entry in the final
> dwarf exception table, while in case (2) it should have an entry.  We don't
> handle (1) right now, however the plan is that nounwind calls will also be
> bracketed by labels but will have no associated landing pad.
Why would they be bracketed by labels if codegen knows they don't throw?
>  As for (2),
> the dwarf writer scans all instructions in the function and if it sees a
> call that is not bracketed by labels then it generates an appropriate entry
> in the exception table 
Do you mean "that _is_ bracketed by labels" ?
> (this will of course need to be modified to consider
> all throwing instructions - note that this means that "maythrow" markings will
> have to exist right to the end of code generation!); it is done this way
> because labels inhibit optimizations (we used to bracket all calls with
> labels, but stopped doing that because of the optimization problem).  I'm
> mentioning this because the begin and end labels are not *between* maythrow
> instructions, they bracket them.
>
Sure, that would be the goal. Which means the labels are not created
around a single instruction, but around the instructions of a basic block.
I'll see if this works. My first implementation bracketed a single
instruction because it was very simple to copy the invoke case for
non-calls.
>> However, when generating native code, only BeginLabel is generated, and
>> it is generated after the instruction. I'm not familiar with DAGs in the
>> codegen library, so here are my 2-cents thoughts why:
>>
>> 1) BeginLabel and EndLabel are generated with:
>>   DAG.setRoot(DAG.getNode(ISD::LABEL, MVT::Other, getRoot(),
>>                             DAG.getConstant({Begin|End}Label, MVT::i32)));
>>
>> This seems to work with InvokeInst instructions, because the root of the
>> DAG is modified by the instruction. With instructions such as sdiv, the
>> root is not modified: the instruction only lowers itself to:
>> DAG.getNode(OpCode, Op1.getValueType(), Op1, Op2)
>
> I think that not creating a new root means that the instruction is allowed
> to be re-ordered with respect to other instructions, as long as it occurs
> before its uses.  Re-ordering is rather dubious for instructions that may
> throw, though it's not clear what is acceptable.  I think you probably need
> a new selection DAG "throw" node which you wrap throwing instructions in, a
> bit like a TokenFactor.  This throw node would be set up in such a way as to
> be bracketable by labels.
>
I need to do some LLVM code reading ;-)
>> Which probably makes the codegen think EndLabel and BeginLabel are in
>> the same place
>>     
>
> In that case I would expect them both to be deleted...
>   
Only one was deleted. Consider the code:

define i32 @test(i32 %argc) {
entry:
        %tmp2 = sdiv i32 2, %argc to label %continue unwind to label %unwindblock ; <i32> [#uses=1]

continue:
        ret i32 %tmp2

unwindblock:
        unwind
}


And here is the resulting x86 code (Llabel1 was supposed to come before
the {cltd, idivl} pair, and Llabel2, which should come after it, is not
generated):

test:
.Leh_func_begin1:
          
.Llabel4:
        movl    $2, %eax
        movl    4(%esp), %ecx
        cltd
        idivl   %ecx
          
.Llabel1:
.LBB1_1:        # continue
        ret
.LBB1_2:        # unwindblock


Thanks Duncan,
Nicolas

Evan Cheng

2007-Nov-07 08:07 UTC

[LLVMdev] Two labels around one instruction in Codegen

On Nov 5, 2007, at 11:52 AM, Nicolas Geoffray wrote:
> Hi everyone,
>
> In order to have exceptions for non-call instructions (such as sdiv,
> load or stores), I'm modifying codegen so that it generates a BeginLabel
> and an EndLabel between the "may throwing" instruction. This is what the
> codegen of an InvokeInst does.
>
> However, when generating native code, only BeginLabel is generated, and
> it is generated after the instruction. I'm not familiar with DAGs in the
> codegen library, so here are my 2-cents thoughts why:
>
> 1) BeginLabel and EndLabel are generated with:
>   DAG.setRoot(DAG.getNode(ISD::LABEL, MVT::Other, getRoot(),
>                             DAG.getConstant({Begin|End}Label, MVT::i32)));
>
> This seems to work with InvokeInst instructions, because the root of the
> DAG is modified by the instruction. With instructions such as sdiv, the
> root is not modified: the instruction only lowers itself to:
> DAG.getNode(OpCode, Op1.getValueType(), Op1, Op2)
DIV does not produce a chain, so the second LABEL's only operand is
the first LABEL. The only ordering that's ensured is between the two
labels, so it's very possible the DIV node will be scheduled after
both labels. Actually this scheme doesn't even ensure there won't be
anything scheduled between the first label and the DIV node. I am
assuming that'd be bad.
>
> Which probably makes the codegen think EndLabel and BeginLabel are in
> the same place
>
> 2) Since there is no ordering between the node for the sdiv instruction
> and the labels, the sdiv instruction can be generated anywhere.
Right.
>
> These assumptions may be wrong, but it's the best I could come up with ;-).
>
> If someone could correct me and help me find how to correctly generate
> two labels between one instruction, that would be great! :)
One way to solve this right now is to use a flag value. But that means
ISD::LABEL, ISD::{S|U}DIV, ISD::LOAD, ISD::STORE will be marked
SDNPOutFlag and SDNPOptInFlag. But that's just yucky. Perhaps we need
to add new variants of these nodes and leave the current opcodes as
non-faulting. But I am not certain that's a very clean solution either.

Evan
>
> Thanks,
> Nicolas
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
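
The ordering argument Evan makes can be checked on a toy dependency graph. The C++ sketch below is not SelectionDAG code: `Node`, `allSchedules`, `countBracketed`, `unchainedDag`, and `chainedDag` are hypothetical names, and the "chained" graph is only one assumed reading of the "new variants" idea, in which the faulting DIV takes the begin label as a chain input and the end label takes the DIV as its chain input. The sketch enumerates every valid topological order and counts how many keep the DIV between the two labels.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy DAG node: a name plus the indices of the nodes it depends on.
struct Node {
    std::string name;
    std::vector<std::size_t> operands;
};

typedef std::vector<std::size_t> Schedule;

// Depth-first enumeration of all topological orders (tiny graphs only).
static void enumerate(const std::vector<Node>& dag, std::vector<bool>& done,
                      Schedule& cur, std::vector<Schedule>& out) {
    if (cur.size() == dag.size()) { out.push_back(cur); return; }
    for (std::size_t i = 0; i < dag.size(); ++i) {
        if (done[i]) continue;
        bool ready = true;
        for (std::size_t op : dag[i].operands) ready = ready && done[op];
        if (!ready) continue;
        done[i] = true;  cur.push_back(i);
        enumerate(dag, done, cur, out);
        cur.pop_back();  done[i] = false;
    }
}

std::vector<Schedule> allSchedules(const std::vector<Node>& dag) {
    std::vector<bool> done(dag.size(), false);
    Schedule cur;
    std::vector<Schedule> out;
    enumerate(dag, done, cur, out);
    return out;
}

// Number of schedules in which `mid` is emitted after `lo` and before `hi`.
std::size_t countBracketed(const std::vector<Node>& dag, std::size_t lo,
                           std::size_t mid, std::size_t hi) {
    std::size_t count = 0;
    for (const Schedule& s : allSchedules(dag)) {
        std::vector<std::size_t> pos(dag.size(), 0);
        for (std::size_t p = 0; p < s.size(); ++p) pos[s[p]] = p;
        if (pos[lo] < pos[mid] && pos[mid] < pos[hi]) ++count;
    }
    return count;
}

// As in the thread: the labels are chained to each other, the DIV is not.
std::vector<Node> unchainedDag() {
    return {{"Entry", {}}, {"BeginLabel", {0}}, {"SDiv", {0}}, {"EndLabel", {1}}};
}

// Assumed variant: the DIV sits on the chain between the two labels.
std::vector<Node> chainedDag() {
    return {{"Entry", {}}, {"BeginLabel", {0}}, {"SDiv", {1}}, {"EndLabel", {2}}};
}
```

With indices 1, 2, and 3 for the begin label, the DIV, and the end label, `unchainedDag()` admits three schedules, of which only one brackets the DIV, while `chainedDag()` admits a single schedule. Note that a chain only fixes relative order; as Evan points out, it does not by itself stop unrelated nodes from being scheduled between a label and the DIV, which is what the flag approach is meant to address.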

Nicolas Geoffray

2007-Nov-07 17:07 UTC

[LLVMdev] Two labels around one instruction in Codegen

Hi Evan,

Evan Cheng wrote:
> One way to solve this right now is to use a flag value. But that means
> ISD::LABEL, ISD::{S|U}DIV, ISD::LOAD, ISD::STORE will be marked
> SDNPOutFlag and SDNPOptInFlag. But that's just yucky. Perhaps we need
> to add new variants of these nodes and leave the current opcodes as
> non-faulting. But I am not certain that's a very clean solution either.
>
I think having variants (1) or differentiating ISD::{S|U}DIV from other
binary instructions (2) is what we would like to avoid.

Following what we discussed with Duncan, what if we generate the labels
around a basic block? Is there a way then to ensure that the begin and
end labels will actually bracket the instructions in the block? I've
found that currently this is not the case, but perhaps it could lead to an
easier solution than (1) or (2).

Thanks,
Nicolas
