thr3ads.net - llvm dev - [llvm-dev] Update control flow graph when splitting a machine basic block? [Nov 2017]

If this information is useful, please help other people find it:
Share via:

章明 via llvm-dev

2017-Nov-11 02:00 UTC

[llvm-dev] Update control flow graph when splitting a machine basic block?

> The right way to update the CFG very much depends on how you're 
> transforming it.
I would like to export the CFG for control flow checking.
Theoretically, it should be possible for a compiler to know every target of
every control flow instruction, except for computed targets that are not known
at compile-time.
When a machine basic block is split between two branches, as shown below:

BB:                 ; Old machine basic block shortened by the split
    ...
    branch to A
------------------- ; Split here
BB':                ; New machine basic block created by the split
    branch to B
    ...

BB' is a successor of BB (if there is no control flow barrier near the end
of BB).
In addition, targets of terminators like A should be added to the set of
successors of BB and possibly removed from the set of successors of BB'.
It is not always safe to remove B from the successors of BB', because A and
B may be the same machine basic block.

If edges in the CFG are interpreted as possible control flow transfers, rather
than definite control flow transfers, the most conservative way to update the
CFG might be:
1) Let BB' have all successors of the original machine basic block before
the split.
2) Add BB' and all successors of BB' to the set of successors of BB.
However, this introduce false control flow transfers into the CFG. False control
flow transfers are harmful to control flow checking, because different basic
blocks have to be assigned the same signature, which is called aliasing.

To update the CFG without introducing any false control flow transfers, one
needs to determine the target(s) of each terminator.
That is what I desire. I am not quite familiar with LLVM, so I am asking whether
it is possible to achieve that with the current LLVM API.
> Your pseudo-code looks similar to ARMBaseInstrInfo::analyzeBranch.
ARMBaseInstrInfo::analyzeBranch also attempts to determine targets of
terminators.
However, that method is not adequate to achieve what I desire, because it only
works for simple conditional/unconditional branches. When it sees something
else, e.g., a jump table, it simply reports that the machine basic block cannot
be analyzed.

Friedman, Eli via llvm-dev

2017-Nov-11 02:57 UTC

head link

[llvm-dev] Update control flow graph when splitting a machine basic block?

On 11/10/2017 6:00 PM, 章明 wrote:>> The right way to update the CFG very much depends on how you're
>> transforming it.
> I would like to export the CFG for control flow checking.
> Theoretically, it should be possible for a compiler to know every target of
every control flow instruction, except for computed targets that are not known
at compile-time.Every MachineBasicBlock has a list of successors; you can access it with 
the successors() accessor.  That's what you should be using for any CFG 
analysis.
> When a machine basic block is split between two branches, as shown below:
>
> BB:                 ; Old machine basic block shortened by the split
>      ...
>      branch to A
> ------------------- ; Split here
> BB':                ; New machine basic block created by the split
>      branch to B
>      ...
>
> BB' is a successor of BB (if there is no control flow barrier near the
end of BB).
> In addition, targets of terminators like A should be added to the set of
successors of BB and possibly removed from the set of successors of BB'.
> It is not always safe to remove B from the successors of BB', because A
and B may be the same machine basic block.
>
> If edges in the CFG are interpreted as possible control flow transfers,
rather than definite control flow transfers, the most conservative way to update
the CFG might be:
> 1) Let BB' have all successors of the original machine basic block
before the split.
> 2) Add BB' and all successors of BB' to the set of successors of
BB.
> However, this introduce false control flow transfers into the CFG. False
control flow transfers are harmful to control flow checking, because different
basic blocks have to be assigned the same signature, which is called aliasing.
>
> To update the CFG without introducing any false control flow transfers, one
needs to determine the target(s) of each terminator.
> That is what I desire. I am not quite familiar with LLVM, so I am asking
whether it is possible to achieve that with the current LLVM API.
I don't think this actually has any impact in practice; I mean, I guess 
it's an issue in theory, but in practice we don't stick branches into 
the middle of basic blocks.

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project

章明 via llvm-dev

2017-Nov-11 08:20 UTC

head link

[llvm-dev] Update control flow graph when splitting a machine basic block?

Thank you for your reply!
> Every MachineBasicBlock has a list of successors; you can access it with 
> the successors() accessor.  That's what you should be using for any CFG
> analysis.
I am aware of these methods of class MachineBasicBlock, which allows one to
access a MachineBasicBlock's successors and predecessors in the CFG.
But the CFG itself may no longer be valid if a MachineBasicBlock is split
between two control flow instructions.
The accessors of class MachineBasicBlock do not automatically update the CFG.
So there is no way to access the up-to-date CFG.
> I don't think this actually has any impact in practice; I mean, I guess
> it's an issue in theory, but in practice we don't stick branches
into
> the middle of basic blocks.
I did not expect a branch in the middle of a basic block either, until yesterday
LLVM Release 4.0.0 produced the following machine basic block before the pass
ARMConstantIslands is run:

  bb.1.if.end:
    successors: %bb.3.for.body(0x80000000)
    liveins: %r4
  
    %r0 = tMOVr %r4, 14, _, debug-location !23
    tBL 14, _, $__aeabi_i2d, csr_aapcs, implicit-def dead %lr, implicit %sp,
implicit %r0, implicit-def %sp, implicit-def %r0, implicit-def %r1,
debug-location !23
    tBL 14, _, @sqrt, csr_aapcs, implicit-def dead %lr, implicit %sp, implicit
%r0, implicit %r1, implicit-def %sp, implicit-def %r0, implicit-def %r1,
debug-location !24
    tBL 14, _, $__aeabi_d2iz, csr_aapcs, implicit-def dead %lr, implicit %sp,
implicit %r0, implicit %r1, implicit-def %sp, implicit-def %r0, debug-location
!25
    DBG_VALUE 2, 0, !17, !18, debug-location !27
    DBG_VALUE debug-use %r0, debug-use _, !16, !18, debug-location !26
    tCMPi8 %r0, 2, 14, _, implicit-def %cpsr, debug-location !32
    t2IT 11, 28, implicit-def %itstate
    %r0 = tMOVi8 _, 1, 11, %cpsr, implicit %r0, implicit %itstate
    tPOP_RET 11, %cpsr, def %r4, def %r6, def %r7, def %pc, implicit %r0,
implicit %r4, implicit killed %itstate, debug-location !44
    %r1 = t2MOVi 2, 14, _, _
    t2B %bb.3.for.body, 14, _

Note that a terminator tPOP_RET is before a non-terminator t2MOVi.
The command line to produce this is as follows:

llc -mtriple=thumbv7m-none-none-eabi -mcpu=cortex-m3 -O1
-stop-before=arm-cp-islands -o prime-factorize.mir prime-factorize.ll

Attached are the input file prime-factorize.ll and output file
prime-factorize.mir.

The machine basic block above violates my previous assumption that only
terminators and debug instructions may appear after the first terminator in a
machine basic block.
I don't know how "bad" this could be, i.e., how many
non-terminators in a program could be generated between two terminators.
So I have to consider split such basic blocks before I could instrument the
program with control flow checks.

I don't know whether this is a bug or not. If it is not a bug, this
apparently isn't a purely theoretical issue.

Best regards!

Ming Zhang

llvm dev - Nov 2017 - Update control flow graph when splitting a machine basic block?

[llvm-dev] Update control flow graph when splitting a machine basic block?

[llvm-dev] Update control flow graph when splitting a machine basic block?

[llvm-dev] Update control flow graph when splitting a machine basic block?