Michael Kruse via llvm-dev
2017-Oct-28  00:51 UTC
[llvm-dev] RFC: We need to explicitly state that some functions are reserved by LLVM
2017-10-27 20:31 GMT+02:00 Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org>:
> I agree. Marking external functions from system headers seems like a
> reasonable heuristic. We'd need some heuristic because it's not reasonable
> for the frontend to know about every function the optimizer knows about.
> Over-marking seems okay, however.

Sorry for the naive question, why is it unreasonable for the frontend to know about special functions? It is the frontend who defines a source language function's semantics. Clang also has access (or can be made to get access) to TargetLibraryInfo, no?

The most straightforward solution seems to be to have an intrinsic for every function that has compiler magic, meaning every other function is ordinary without worrying about hitting a special case (e.g. when concatenating strings to create new function names when outlining). Recognizing function names and assuming they represent the semantics from libs seems "unclean", tying LLVM IR more closely to C and a specific platform's libc/libm than necessary.

"malloc" once had an intrinsic. Why was it removed, and recognized by name instead?

Michael
Hal Finkel via llvm-dev
2017-Oct-28  01:30 UTC
[llvm-dev] RFC: We need to explicitly state that some functions are reserved by LLVM
On 10/27/2017 07:51 PM, Michael Kruse wrote:
> 2017-10-27 20:31 GMT+02:00 Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org>:
>> I agree. Marking external functions from system headers seems like a
>> reasonable heuristic. We'd need some heuristic because it's not reasonable
>> for the frontend to know about every function the optimizer knows about.
>> Over-marking seems okay, however.
> Sorry for the naive question, why is it unreasonable for the frontend
> to know about special functions? It is the frontend who defines a
> source language function's semantics. Clang also has access (or can be
> made to get access) to TargetLibraryInfo, no?

I think this depends on how we want to define the separation of concerns. The optimizer has knowledge about many special functions. This list is non-trivial in size and also varies by target/environment. It is not reasonable to duplicate this list both in the optimizer and in all relevant frontends (which include not only things like Clang but also a whole host of other code generators that produce code directly calling system-library functions). Note that the optimizer sometimes likes to create calls to these functions, based only on its knowledge of the target/environment, without them ever having been declared by the frontend.

Now, can the list exist in the optimizer and be queried by the frontend? Sure. (*) It's not clear that this is necessary or useful, however. Clang, for example, would need to distinguish between functions declared in system headers and those that aren't. This, strictly speaking, does not apply to functions that come from the C standard (because those names are always reserved), but names that come from POSIX or other miscellaneous system functions can be used by well-formed programs (so long as, in general, they don't include the associated system headers). As a result, Clang might as well mark functions from system headers in a uniform way and let the optimizer do with them what it will. It could further filter that marking process using some callback to TLI, but I see no added value there. Similarly, a custom code generator can mark functions it believes will be resolved to system functions.

(*) Although we need to be a bit careful to make sure that all optimizations, including custom ones, plugins, etc. register all of their relevant functions with TLI, and TLI isn't really set up for this (yet).

> The most straightforward solution seems to be to have an intrinsic for every
> function that has compiler magic, meaning every other function is
> ordinary without worrying about hitting a special case (e.g. when
> concatenating strings to create new function names when outlining).
> Recognizing function names and assuming they represent the semantics
> from libs seems "unclean", tying LLVM IR more closely to C and a
> specific platform's libc/libm than necessary.
>
> "malloc" once had an intrinsic. Why was it removed, and recognized by
> name instead?

You want to have intrinsics for printf, getenv, and all the rest? TLI currently recognizes nearly 400 functions (see include/llvm/Analysis/TargetLibraryInfo.def).

-Hal

> Michael

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
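
The division of labor described in this message can be sketched as a small self-contained toy model. This is not LLVM's or Clang's actual API: `Decl`, `markFromSystemHeaders`, `knownLibraryFunctions`, and `mayAssumeLibrarySemantics` are hypothetical names, and the three-entry list is only an illustrative stand-in for the roughly 400 functions TLI recognizes. The assumption is that the frontend marks every declaration coming from a system header without consulting any list, and the optimizer applies library semantics only to functions that are both marked and on its own list.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a function declaration seen by a frontend.
struct Decl {
    std::string name;
    bool fromSystemHeader;  // known only to the frontend
    bool markedLibrary;     // the marking handed to the optimizer
};

// Frontend side (assumed behavior): mark uniformly, no list lookup.
// Over-marking is fine because the optimizer filters again below.
void markFromSystemHeaders(std::vector<Decl>& decls) {
    for (Decl& d : decls)
        d.markedLibrary = d.fromSystemHeader;
}

// Optimizer side: illustrative stand-in for the recognized-function list.
const std::set<std::string>& knownLibraryFunctions() {
    static const std::set<std::string> known = {"printf", "getenv", "strdup"};
    return known;
}

// Library semantics are assumed only if the frontend marked the function
// AND the optimizer itself recognizes the name.
bool mayAssumeLibrarySemantics(const Decl& d) {
    return d.markedLibrary && knownLibraryFunctions().count(d.name) != 0;
}
```

In this model a user-defined function that happens to reuse a POSIX name such as `strdup` stays ordinary as long as it was not declared through a system header, and an over-marked system function that the optimizer does not recognize is also treated as ordinary.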
Michael Kruse via llvm-dev
2017-Oct-28  09:45 UTC
[llvm-dev] RFC: We need to explicitly state that some functions are reserved by LLVM
2017-10-28 3:30 GMT+02:00 Hal Finkel <hfinkel at anl.gov>:
>
> On 10/27/2017 07:51 PM, Michael Kruse wrote:
>>
>> 2017-10-27 20:31 GMT+02:00 Hal Finkel via llvm-dev
>> <llvm-dev at lists.llvm.org>:
>>>
>>> I agree. Marking external functions from system headers seems like a
>>> reasonable heuristic. We'd need some heuristic because it's not reasonable
>>> for the frontend to know about every function the optimizer knows about.
>>> Over-marking seems okay, however.
>>
>> Sorry for the naive question, why is it unreasonable for the frontend
>> to know about special functions? It is the frontend who defines a
>> source language function's semantics. Clang also has access (or can be
>> made to get access) to TargetLibraryInfo, no?
>
> I think this depends on how we want to define the separation of concerns.
> The optimizer has knowledge about many special functions. This list is
> non-trivial in size and also varies by target/environment. It is not
> reasonable to duplicate this list both in the optimizer and in all relevant
> frontends (which include not only things like Clang but also a whole host of
> other code generators that produce code directly calling system-library
> functions). Note that the optimizer sometimes likes to create calls to these
> functions, based only on its knowledge of the target/environment, without
> them ever having been declared by the frontend.
>
> Now, can the list exist in the optimizer and be queried by the frontend?
> Sure. (*) It's not clear that this is necessary or useful, however. Clang,
> for example, would need to distinguish between functions declared in system
> headers and those that aren't. This, strictly speaking, does not apply to
> functions that come from the C standard (because those names are always
> reserved), but names that come from POSIX or other miscellaneous system
> functions can be used by well-formed programs (so long as, in general, they
> don't include the associated system headers). As a result, Clang might as
> well mark functions from system headers in a uniform way and let the
> optimizer do with them what it will. It could further filter that marking
> process using some callback to TLI, but I see no added value there.
> Similarly, a custom code generator can mark functions it believes will be
> resolved to system functions.
>
> (*) Although we need to be a bit careful to make sure that all
> optimizations, including custom ones, plugins, etc. register all of their
> relevant functions with TLI, and TLI isn't really set up for this (yet).

Thank you for the answer.

>> The most straightforward solution seems to be to have an intrinsic for every
>> function that has compiler magic, meaning every other function is
>> ordinary without worrying about hitting a special case (e.g. when
>> concatenating strings to create new function names when outlining).
>> Recognizing function names and assuming they represent the semantics
>> from libs seems "unclean", tying LLVM IR more closely to C and a
>> specific platform's libc/libm than necessary.
>>
>> "malloc" once had an intrinsic. Why was it removed, and recognized by
>> name instead?
>
> You want to have intrinsics for printf, getenv, and all the rest? TLI
> currently recognizes nearly 400 functions (see
> include/llvm/Analysis/TargetLibraryInfo.def).

Intrinsics.gen already has 6243 intrinsics (most of them target-dependent). Would 400 additional ones be that significant?

Michael