thr3ads.net - llvm dev - [LLVMdev] CFG entry and exit blocks [Mar 2010]

If this information is useful, please help other people find it:
Share via:

Trevor Harmon

2010-Mar-31 01:00 UTC

[LLVMdev] CFG entry and exit blocks

Hi,

I'm confused about the entry and exit blocks of an LLVM CFG. I  
understand that every CFG has one and only one entry block, and this  
is confirmed by the existence of the getEntryBlock function in the  
Function class.

But what about exit (a.k.a. return) blocks? At first I assumed that  
LLVM CFGs have one and only one exit block, too, but the following  
code is a counter-example:

int simple(int j) {
     try {
         if (j > 50) throw 100;
     } catch (int status) {
         j++;
     }

     return j;
}

This produces the CFG shown in the attachment, which appears to have 3  
exit blocks. At least, there are 3 blocks with no successors. (I don't  
yet understand the semantics of the "invcont" and "Unwind"
blocks, so
perhaps I'm misunderstanding something here.)

Anyway, given this code, how would one identify the "real" exit block
(the one labeled "return")? Obviously I can't just search for the
block without successors because there are 3 of them.

Thanks,

Trevor

-------------- next part --------------
A non-text attachment was scrubbed...
Name: cfg._Z6simplei.dot
Type: application/octet-stream
Size: 4665 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20100330/9a780a4f/attachment.obj>
-------------- next part --------------

John Criswell

2010-Mar-31 02:51 UTC

head link

[LLVMdev] CFG entry and exit blocks

Dear Trevor,

I'm too lazy to convert your .dot file into a graph file, but I'll make 
some comments anyway.
:)

First, LLVM does not guarantee that a function has a single exit block.  
Multiple basic blocks can have a return instruction as their terminator 
instruction.  There is a pass (Unify Function Exit nodes i.e., 
-mergereturn <http://llvm.org/docs/Passes.html#mergereturn>) that 
transform a function to have only 1 return instruction.

Now, I haven't looked at LLVM's new exception handling facilities 
recently, but there used to be an unwind instruction that would unwind 
the stack to the nearest dynamically executed invoke instruction (see 
the LLVM Langauge Reference Manual for more details).  Functions that 
use unwind would naturally have multiple exits.  I don't know of a 
transform that would change it to have a single exit (or one return and 
one unwind).

-- John T.

Trevor Harmon wrote:> Hi,
>
> I'm confused about the entry and exit blocks of an LLVM CFG. I  
> understand that every CFG has one and only one entry block, and this  
> is confirmed by the existence of the getEntryBlock function in the  
> Function class.
>
> But what about exit (a.k.a. return) blocks? At first I assumed that  
> LLVM CFGs have one and only one exit block, too, but the following  
> code is a counter-example:
>
> int simple(int j) {
>      try {
>          if (j > 50) throw 100;
>      } catch (int status) {
>          j++;
>      }
>
>      return j;
> }
>
> This produces the CFG shown in the attachment, which appears to have 3  
> exit blocks. At least, there are 3 blocks with no successors. (I don't
> yet understand the semantics of the "invcont" and
"Unwind" blocks, so
> perhaps I'm misunderstanding something here.)
>
> Anyway, given this code, how would one identify the "real" exit
block
> (the one labeled "return")? Obviously I can't just search for
the
> block without successors because there are 3 of them.
>
> Thanks,
>
> Trevor
>
>

Trevor Harmon

2010-Mar-31 18:59 UTC

head link

[LLVMdev] CFG entry and exit blocks

On Mar 30, 2010, at 7:51 PM, John Criswell wrote:
> I'm too lazy to convert your .dot file into a graph file
What format should I have posted? (I'm not sure what you mean by  
"graph file".) I had thought that .dot was the preferred format here,
since that's what LLVM generates (e.g., "opt -dot-cfg ...").
> First, LLVM does not guarantee that a function has a single exit  
> block.
> Multiple basic blocks can have a return instruction as their  
> terminator
> instruction.
Any suggestions on what kind of source code pattern would cause this  
to happen? I've tried feeding C++ code with multiple return statements  
to LLVM, but it always spits out a CFG with a single return block. All  
of the blocks for the return statements simply branch to this final  
return block.
> There is a pass (Unify Function Exit nodes i.e.,
> -mergereturn <http://llvm.org/docs/Passes.html#mergereturn>) that
> transform a function to have only 1 return instruction.
That sounds like a good approach, except that this pass does not  
appear to produce a CFG with exactly one return block, but rather one  
or zero return blocks. The distinction is important because if the  
pass's getReturnBlock function returns null (see  
UnifyFunctionExitNodes.h), I'm back to square one of not having a  
single exit block.

I'm wondering what would cause a CFG not to have a return block. The  
comments in UnifyFunctionExitNodes.cpp say: "If there are no return  
stmts in the Function, a null pointer is returned." But this doesn't  
make sense; even an empty void function has a return block with a ret  
instruction inside it. Is there some other kind of source code pattern  
that would cause an absence of a return block?

Thanks,

Trevor

Reasonably Related Threads

Search for more maybe matching threads

llvm dev - Mar 2010 - [LLVMdev] CFG entry and exit blocks

[LLVMdev] CFG entry and exit blocks

[LLVMdev] CFG entry and exit blocks

[LLVMdev] CFG entry and exit blocks

Reasonably Related Threads