mats petersson via llvm-dev
2017-Oct-23 11:12 UTC
[llvm-dev] Finding the entry point function in a LLVM IR
If you want to know which functions are (or may be) called from where in the entire program, you will need to do some sort of "LLVM-IR linking" (there are tools that will do that for you, such as "llvm-link").

Of course, even then, there are cases where it is impossible to know whether a function is ACTUALLY called until runtime: function pointers, including those in vtables, may or may not actually get called, depending on the exact dynamic behaviour of the code.

Then there will be functions implemented outside of the LLVM-IR for the program anyway. atexit is a good example of a function that takes a function pointer. Your code will not know what (if anything) atexit does with that function pointer, or if/when that function gets called. Of course, we, as humans, know how atexit works and when the function gets called, but your code will not know that unless you write code to understand its behaviour. And there are many types of functions that take a pointer to a function, some of which are much more complex than atexit, for example signal handlers or callbacks that get called on errors. Imagine a function being called only when a memory allocation fails...

As David says, exactly what approach you should take (or whether there is a meaningful approach at all) is highly dependent on what you are trying to achieve.

On 23 October 2017 at 09:03, David Chisnall via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> On 21 Oct 2017, at 12:51, mohie pokhriyal via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >
> > I want to be able to find out that main is the entry point function of the program.
> > main and boo both do not have any predecessors or successors, such that I can make a cfg to figure out who’s calling whom?
> >
> > Is there a way I can achieve this?
>
> The fact that main is the entry point is not known to LLVM (except in a couple of places that special-case main, such as the internalise pass), because it is an artefact of C/C++, not a generic property. On most *NIX platforms, the real entry point for a program is something like __start or _start, which then calls main. In most compilation units, there is no single entry point, because they do not contain the program entry point and so can be entered by any externally visible function.
>
> It might help if you explained why you need this.
>
> David
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
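To make the function-pointer point in the reply above concrete, here is a minimal C++ sketch. The names `on_alloc_failure`, `register_handler`, and `try_allocate` are hypothetical and not from the thread; they stand in for an atexit- or signal-style API. Nothing calls `on_alloc_failure` by name, so a static call graph cannot say whether it ever runs; that depends on runtime values.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical callback type, standing in for atexit/signal-style APIs.
using Handler = void (*)();

static Handler g_handler = nullptr;
static int g_handler_calls = 0;

// Never called by name anywhere; only its address is taken.
void on_alloc_failure() { ++g_handler_calls; }

// In a real program this could live in another library, outside the IR being analysed.
void register_handler(Handler h) {
    assert(h != nullptr);
    g_handler = h;
}

// Simulated allocation: the handler runs only when the request exceeds the limit.
bool try_allocate(std::size_t bytes, std::size_t limit) {
    if (bytes > limit) {
        if (g_handler) g_handler();  // indirect call: target not visible statically
        return false;
    }
    return true;
}
```

After `register_handler(on_alloc_failure)`, a call such as `try_allocate(16, 1024)` never reaches the handler, while `try_allocate(4096, 1024)` does. The function's address has a use, yet no direct call to it exists anywhere.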
mohie pokhriyal via llvm-dev
2017-Oct-25 04:34 UTC
[llvm-dev] Finding the entry point function in a LLVM IR
Thank you David and Mats for the reply.

The reason I need to know that main is the entry point is as follows:

I have a dead code elimination pass that removes the function call to boo. boo was initially called from the main function, but since the return in the main function has no dependency on boo, the call to boo is removed.

Now I want to remove the definitions of the functions that are not called. Notice that neither boo nor main has any function calls to it. So, if I were to traverse all the functions in the entire module and use the API F->use_empty() to check for function calls, both boo and main would return true and be deleted.

I want to save the main function definition from being deleted. As you correctly mentioned above, there could be more functions similar to main, which can be called externally, and I do not want them to be deleted.

I am looking for a solution which tells me that a particular function is an entry point to the program and will have no function calls to it.

Thanks
David Chisnall via llvm-dev
2017-Oct-25 08:37 UTC
[llvm-dev] Finding the entry point function in a LLVM IR
On 25 Oct 2017, at 05:34, mohie pokhriyal <mohie10 at gmail.com> wrote:
>
> Thank You David and Mats for the reply,
>
> The reason I need to know that main is the entry point is as follows:
>
> I have a dead code elimination pass that removes the function call for boo. boo was initially called from the main function, but since the return in the main function has no dependency on boo, boo function call is removed.

This is not sound. The linkage of boo means that it is externally visible. There is no guarantee that it will not be called from another compilation unit. If boo had internal or private linkage, then you would be safe to delete it as soon as its use count dropped to zero (and LLVM’s dead code elimination pass will do exactly that). If you run the Internalize pass first, then it will mark functions that are not reachable from main as internal and then DCE can delete them.

David
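The rule in the reply above can be sketched with a toy model. This is not the LLVM API: `FunctionInfo`, `Linkage`, `internalize`, and `deletable` are hypothetical stand-ins, and the preserve list is an assumption about how an internalize step would be told to keep main. In a real pass the corresponding checks would presumably be `F.hasLocalLinkage()` together with `F.use_empty()`.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for LLVM concepts; none of these types are the real LLVM API.
enum class Linkage { External, Internal, Private };

struct FunctionInfo {
    std::string name;
    Linkage linkage;
    unsigned use_count;  // references remaining inside the module
};

// Models an internalize step: every external symbol not on the preserve list becomes internal.
void internalize(std::vector<FunctionInfo> &module, const std::set<std::string> &preserve) {
    for (auto &f : module)
        if (f.linkage == Linkage::External && preserve.count(f.name) == 0)
            f.linkage = Linkage::Internal;
}

// Only local (internal or private) symbols with no remaining uses are safe to delete.
bool is_safe_to_delete(const FunctionInfo &f) {
    const bool local = f.linkage == Linkage::Internal || f.linkage == Linkage::Private;
    return local && f.use_count == 0;
}

// Collects the names of all functions that the rule above allows deleting.
std::vector<std::string> deletable(const std::vector<FunctionInfo> &module) {
    std::vector<std::string> out;
    for (const auto &f : module)
        if (is_safe_to_delete(f)) out.push_back(f.name);
    return out;
}

// Usage: main and boo both have zero uses, but only boo becomes deletable.
void internalize_example() {
    std::vector<FunctionInfo> module = {
        {"main", Linkage::External, 0},
        {"boo", Linkage::External, 0},
    };
    assert(deletable(module).empty());  // external linkage protects both
    internalize(module, {"main"});
    const auto dead = deletable(module);
    assert(dead.size() == 1 && dead[0] == "boo");
}
```

Checking the use count alone would delete main as well. Adding the linkage check is what keeps every externally visible function, not just main, from being removed.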